How to get expression levels
The workflow starts from the lincRNAs that we identified and ends with gene expression levels (Figure 1).
First, lincRNAs and Ensembl genes were combined to form a novel annotated gene set.
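As a minimal sketch of this combining step, assuming both gene sets are available as GTF records, the annotations can be concatenated after dropping comment lines. The function name `merge_gtf_lines` and the two example records (identifiers and coordinates) are hypothetical and serve only as an illustration; this is not necessarily how the original gene set was built.

```python
from typing import Iterable, List


def merge_gtf_lines(sources: Iterable[Iterable[str]]) -> List[str]:
    """Concatenate GTF records from several sources, dropping comments and blanks."""
    merged: List[str] = []
    for lines in sources:
        for line in lines:
            record = line.rstrip("\n")
            # Skip header/comment lines (starting with '#') and empty lines.
            if not record or record.startswith("#"):
                continue
            merged.append(record)
    return merged


# Hypothetical records: identifiers and coordinates are made up for illustration.
ensembl = [
    "#!genome-build example",
    'chr1\tensembl\texon\t100\t200\t.\t+\t.\tgene_id "ENSG_EXAMPLE"; transcript_id "ENST_EXAMPLE";',
]
lincrna = [
    'chr1\tlincRNA\texon\t500\t650\t.\t-\t.\tgene_id "LINC_EXAMPLE"; transcript_id "LINC_EXAMPLE.1";',
]

combined = merge_gtf_lines([ensembl, lincrna])
for record in combined:
    print(record)
```

In practice the merged records would be written to a single GTF file that is then passed to the downstream tools.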
Then, RNA-seq reads for each sample were mapped to the genomic sequences with TopHat, which was given the novel annotated gene set (option -G) and restricted to the splice junctions present in that annotation (option --no-novel-juncs).
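The mapping step can be sketched as a small helper that assembles a TopHat command line with the two options named above. The file names, the index prefix, and the helper `tophat_command` are assumptions made for illustration; the options `-G`, `--no-novel-juncs`, and `-o` are standard TopHat options, but the full set of parameters used in the original analysis is not stated in the text.

```python
import shlex
from typing import List, Sequence


def tophat_command(index_prefix: str, reads: Sequence[str], gtf: str, out_dir: str) -> List[str]:
    """Build an argument list for one TopHat run guided by a known annotation."""
    return [
        "tophat",
        "-G", gtf,            # supply the combined annotation
        "--no-novel-juncs",   # only use junctions present in the annotation
        "-o", out_dir,        # per-sample output directory
        index_prefix,         # Bowtie index prefix of the genome
        *reads,               # one FASTQ (single-end) or two (paired-end)
    ]


# Hypothetical paths, for illustration only.
cmd = tophat_command(
    index_prefix="genome_index",
    reads=["sample1_1.fastq", "sample1_2.fastq"],
    gtf="combined.gtf",
    out_dir="tophat_sample1",
)
print(shlex.join(cmd))  # requires Python 3.8+
```

The argument list could be passed to `subprocess.run(cmd, check=True)` once per sample; it is only printed here so the sketch runs without TopHat installed.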
The resulting alignment files were then fed to Cufflinks to generate a transcriptome assembly for each sample.
At the same time, Cufflinks quantified gene abundance with the option -G and wrote the FPKM expression values to the file genes.fpkm_tracking.
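The quantification step and the reading of its output might look like the following sketch. The helper names, paths, and the single example row are hypothetical; the column layout follows the tab-delimited FPKM tracking format that Cufflinks writes, with the gene identifier in `tracking_id` and the estimate in `FPKM`.

```python
import csv
import io
from typing import Dict, List, TextIO


def cufflinks_command(gtf: str, bam: str, out_dir: str) -> List[str]:
    """Build an argument list for Cufflinks quantification against a known annotation."""
    return ["cufflinks", "-G", gtf, "-o", out_dir, bam]


def read_gene_fpkm(handle: TextIO) -> Dict[str, float]:
    """Map each gene identifier to its FPKM value from a genes.fpkm_tracking file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["tracking_id"]: float(row["FPKM"]) for row in reader}


# Hypothetical file content with one made-up gene, for illustration only.
example = io.StringIO(
    "tracking_id\tclass_code\tnearest_ref_id\tgene_id\tgene_short_name\ttss_id\t"
    "locus\tlength\tcoverage\tFPKM\tFPKM_conf_lo\tFPKM_conf_hi\tFPKM_status\n"
    "GENE_A\t-\t-\tGENE_A\tGENE_A\t-\tchr1:99-200\t-\t-\t12.5\t10.0\t15.0\tOK\n"
)

cufflinks_cmd = cufflinks_command("combined.gtf", "tophat_sample1/accepted_hits.bam", "cufflinks_sample1")
fpkm = read_gene_fpkm(example)
print(fpkm)
```

With real data, `read_gene_fpkm` would be called on `open("cufflinks_sample1/genes.fpkm_tracking")` for each sample.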
To obtain count-based expression values, we used the summarizeOverlaps function from the GenomicRanges package (version 1.14.4) with the default counting mode (mode="Union").
Reads were counted from a list of Binary Alignment/Map (BAM) files produced by TopHat, and the counts were returned as a matrix for possible use in downstream analyses such as those offered by DESeq and edgeR.
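The counting rule behind the "Union" mode can be illustrated with a short, simplified sketch: a read is assigned to a gene only if it overlaps exactly one gene, and reads that overlap several genes or none are not counted. This is an illustration of the rule in Python, not the implementation of summarizeOverlaps (which is an R function); it ignores strand, spliced alignments, and read pairing, and all names and coordinates below are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), 1-based inclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two intervals share a chromosome and at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def count_union(features: Dict[str, List[Interval]], reads: List[Interval]) -> Dict[str, int]:
    """Count reads per gene, keeping only reads that hit exactly one gene."""
    counts = {gene: 0 for gene in features}
    for read in reads:
        hits = [
            gene
            for gene, exons in features.items()
            if any(overlaps(read, exon) for exon in exons)
        ]
        if len(hits) == 1:  # ambiguous (more than one) and unassigned (zero) reads are dropped
            counts[hits[0]] += 1
    return counts


# Hypothetical genes and reads, for illustration only.
features = {
    "GENE_A": [("chr1", 100, 200)],
    "GENE_B": [("chr1", 180, 300)],
}
reads = [
    ("chr1", 90, 120),   # overlaps GENE_A only
    ("chr1", 190, 210),  # overlaps both genes, so it is discarded
    ("chr1", 250, 260),  # overlaps GENE_B only
    ("chr2", 100, 120),  # overlaps no gene
]

counts = count_union(features, reads)
print(counts)
```

Running the same function once per BAM file and collecting the per-gene dictionaries as columns would give a genes-by-samples count matrix of the kind described above.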
Figure 1 - Calculation of gene expression levels