Research Article Open Access

Taxonomically Clustering Organisms Based on the Profiles of Gene Sequences Using PCA

E. Ramaraj and M. Punithavalli

Abstract

The biological implications of bioinformatics can already be seen in various implementations. Biological taxonomy may seem like a simple science in which biologists merely observe similarities among organisms and construct classifications according to those similarities [1], but it is not so simple. By applying data mining techniques to a gene sequence database, we can cluster the data to find interesting similarities in the gene expression data. One application of this kind of clustering is taxonomically clustering organisms based on their gene sequence profiles. In this study we outline a method for taxonomic clustering of species based on their genetic profiles using Principal Component Analysis and Self-Organizing Neural Networks. We implemented the idea in Matlab and clustered gene sequences taken from the PAUP version of the ML5/ML6 database. The taxa used are some of the basidiomycetous fungi from that database. To study scalability issues, another large gene sequence database was also used. The proposed method clustered the species correctly in almost all cases, and the results obtained were significant and promising.
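The general idea of profiling sequences and projecting the profiles with PCA can be illustrated with a minimal sketch. This is not the authors' Matlab implementation. The abstract does not say how the genetic profile is built, so the dinucleotide (2-mer) frequency profile used here is an assumption, as are all function names. The Self-Organizing Neural Network stage is not reproduced; it is replaced by a simple sign threshold on the first principal component, which can only separate two groups.

```python
from itertools import product


def kmer_profile(seq, k=2):
    """Relative frequency of every DNA k-mer in seq (assumed profile choice)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases
            counts[word] += 1
    total = sum(counts.values()) or 1
    return [counts[m] / total for m in kmers]


def first_principal_component(rows, iterations=200):
    """Column means and leading covariance eigenvector, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    vec = [1.0 / (j + 1) for j in range(d)]  # deterministic, non-uniform start
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in nxt) ** 0.5
        if norm == 0:  # all profiles identical: no direction of variance
            break
        vec = [x / norm for x in nxt]
    return means, vec


def cluster_by_first_pc(sequences, k=2):
    """Two-group labels from the sign of each profile's first-PC score.

    Stand-in for the SOM stage named in the abstract, not a reproduction of it.
    """
    profiles = [kmer_profile(s, k) for s in sequences]
    means, vec = first_principal_component(profiles)
    scores = [sum((p[j] - means[j]) * vec[j] for j in range(len(vec)))
              for p in profiles]
    return [0 if s < 0 else 1 for s in scores]


# Toy input: two AT-rich and two GC-rich sequences (made up for illustration).
toy = ["ATATATTAATAT", "TATAATATTATA", "GCGCGGCCGCGC", "CGCGCCGGCGCG"]
labels = cluster_by_first_pc(toy)
```

The orientation of a principal component is arbitrary, so only the grouping of the labels is meaningful, not which group receives 0 or 1. Real taxonomic data would need more components and a proper clustering step such as the self-organizing map the study describes.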

Journal of Computer Science
Volume 2 No. 3, 2006, 292-296

DOI: https://doi.org/10.3844/jcssp.2006.292.296

Submitted On: 2 December 2005
Published On: 31 March 2006

How to Cite: Ramaraj, E. & Punithavalli, M. (2006). Taxonomically Clustering Organisms Based on the Profiles of Gene Sequences Using PCA. Journal of Computer Science, 2(3), 292-296. https://doi.org/10.3844/jcssp.2006.292.296

Keywords

  • Bioinformatics
  • taxonomy
  • gene sequence classification
  • data mining
  • data classification
  • clustering
  • principal component analysis