Cancer is the second leading cause of death in the United States. While recent efforts at analyzing the transcriptome of cancer patients have focused on studying protein-encoding genes, several long non-coding RNA (lncRNA) transcripts have been found to promote oncogenesis. Previous attempts at identifying these cancer-associated lncRNAs have been limited. Advances in high-throughput sequencing technology and bioinformatics now enable researchers to mine existing libraries of genetic information in an unbiased, computer-automated fashion, uncovering previously unannotated lncRNAs predicted to be associated with cancer. With the increasing popularity and decreasing cost of personal genome sequencing, these newly discovered lncRNA biomarkers could help lead to novel and individualized treatments for cancers.
Using raw data from 7,256 established RNA-Seq libraries from cancer samples, benign tissues, and cell lines, the inventors have developed an automated process in a high-performance computing environment to assemble candidate transcripts and thus generate the “MiTranscriptome.” The MiTranscriptome contains 48,952 predicted lncRNAs that were previously unknown, 7% of which overlap with known disease-associated single nucleotide polymorphisms (SNPs). Using newly developed methods, the inventors have identified lineage- and cancer-specific lncRNAs and classified them as having strong functional potential. Most of these were previously unknown and thus serve as novel targets for therapeutic treatment of specific cancers.
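The SNP-overlap figure above is, at heart, an interval-overlap count. The summary does not describe how the inventors computed it, so the following is only a minimal sketch under stated assumptions: transcripts are represented as whole-span `(chrom, start, end)` intervals in 0-based half-open coordinates, SNPs as `(chrom, pos)` points, and the function name `fraction_overlapping_snps` and the toy data are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def fraction_overlapping_snps(transcripts, snps):
    """Return the fraction of transcripts that contain at least one SNP.

    Assumed inputs (not taken from the source):
      transcripts: iterable of (chrom, start, end), 0-based half-open spans
      snps:        iterable of (chrom, pos), 0-based positions
    """
    # Group SNP positions by chromosome and sort them for binary search.
    positions = defaultdict(list)
    for chrom, pos in snps:
        positions[chrom].append(pos)
    for chrom in positions:
        positions[chrom].sort()

    transcripts = list(transcripts)
    if not transcripts:
        return 0.0

    hits = 0
    for chrom, start, end in transcripts:
        chrom_positions = positions.get(chrom, [])
        # Index of the first SNP at or after the transcript start.
        i = bisect_left(chrom_positions, start)
        # The transcript overlaps a SNP if that SNP lies before its end.
        if i < len(chrom_positions) and chrom_positions[i] < end:
            hits += 1
    return hits / len(transcripts)


# Toy data for illustration only; these are not MiTranscriptome coordinates.
toy_transcripts = [
    ("chr1", 100, 200),
    ("chr1", 300, 400),
    ("chr2", 50, 150),
    ("chr3", 10, 20),
]
toy_snps = [("chr1", 150), ("chr1", 400), ("chr2", 49), ("chr2", 50)]

fraction = fraction_overlapping_snps(toy_transcripts, toy_snps)
print(f"{fraction:.0%} of toy transcripts overlap a SNP")
```

A real analysis would more likely test overlap against individual exons rather than whole transcript spans, and would read coordinates from annotation files instead of hard-coded tuples.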
- Identify biomarkers for disease diagnosis.
- Identify targets for RNA interference-based therapies.
- Comparative analysis of patient tumor transcriptomes with cancer-associated transcripts could individualize treatment options.
- Includes annotation of monoexonic transcripts and intragenic lncRNAs, which were previously difficult to detect and annotate.
- Advanced bioinformatics algorithms quickly process data from a large sample of tumors.