• Analysis of gene expression data from non-small cell lung carcinoma cell lines reveals distinct sub-classes from those identified at the phenotype level.

      Dalby, Andrew R; Emam, Ibrahim; Franke, Raimo; Helmholtz Centre for infection research, Inhoffenstr. 7, 38124 Braunschweig, Germany. (2012)
      Microarray data from cell lines of Non-Small Cell Lung Carcinoma (NSCLC) can be used to look for differences in gene expression between the cell lines derived from different tumour samples, and to investigate if these differences can be used to cluster the cell lines into distinct groups. Dividing the cell lines into classes can help to improve diagnosis and the development of screens for new drug candidates. The micro-array data is first subjected to quality control analysis and then subsequently normalised using three alternate methods to reduce the chances of differences being artefacts resulting from the normalisation process. The final clustering into sub-classes was carried out in a conservative manner such that sub-classes were consistent across all three normalisation methods. If there is structure in the cell line population it was expected that this would agree with histological classifications, but this was not found to be the case. To check the biological consistency of the sub-classes the set of most strongly differentially expressed genes was be identified for each pair of clusters to check if the genes that most strongly define sub-classes have biological functions consistent with NSCLC.