Authors: Jay C. Brown
Although LINE1 DNA sequence elements are well known for their ability to replicate and move autonomously within the human genome, these features are observed in only a small proportion (0.02%) of the total human LINE1 population. Nearly all of the ~500,000 LINE1 elements are fragments of full-length LINE1 and are inactive for autonomous replication or movement. Truncated, inactive LINE1 sequences are found throughout the human genome, including within the body of protein-coding genes, and this intragenic population is the subject of the study described here. The goal was to extend what is known about the properties of intragenic LINE1 sequences. The study was carried out with t1519, a truncated LINE1 sequence composed of the 3’ terminal ~1500 bp of the ~6000 bp full-length LINE1 element, and with the sequences of three human chromosomes (16, 17 and 18) that are rich in t1519 sequences. NCBI BLAST was used to identify t1519-containing genes in each chromosome, and the length and expression level of those genes were compared with those of control genes lacking t1519. A striking result was observed in the case of long protein-coding genes (genes longer than 140 kb): all had one or more t1519 sequences in the gene body, all in introns. An effect on the level of gene expression was also observed. Low expression (<50 TPM) was found in all long, t1519-positive genes, while much higher levels (500-600 TPM) were found with genes in the common length range (<140 kb) regardless of the presence of t1519. Similar results were obtained when lncRNA genes were studied instead of protein-coding ones. The results are interpreted to support a strong suppressive effect of t1519 on expression of long protein-coding genes and also on certain lncRNA genes. It is suggested that the suppressive effect is due to a need for the cell to limit the overall level of transcription it can support.
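The comparison described above (genes grouped by length and by presence of t1519, then compared by expression level) can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's actual workflow: the `Gene` record, the function names, and the example values are all hypothetical, and the only number taken from the text is the 140 kb length threshold. It assumes BLAST hits have already been counted per gene.

```python
from dataclasses import dataclass
from statistics import mean

LONG_GENE_KB = 140  # length threshold for "long" genes, taken from the text


@dataclass
class Gene:
    name: str         # hypothetical gene identifier
    length_kb: float  # gene length in kilobases
    tpm: float        # expression level in transcripts per million
    t1519_hits: int   # assumed: number of BLAST hits to t1519 in the gene body


def group_key(gene):
    """Classify a gene by length class and by presence of t1519."""
    length_class = "long" if gene.length_kb > LONG_GENE_KB else "common"
    t1519_class = "t1519+" if gene.t1519_hits > 0 else "t1519-"
    return (length_class, t1519_class)


def mean_tpm_by_group(genes):
    """Return the mean TPM for each (length class, t1519 class) group."""
    groups = {}
    for gene in genes:
        groups.setdefault(group_key(gene), []).append(gene.tpm)
    return {key: mean(values) for key, values in groups.items()}


# Made-up placeholder records for illustration; NOT data from the study.
example = [
    Gene("geneA", 200.0, 10.0, 3),
    Gene("geneB", 300.0, 30.0, 1),
    Gene("geneC", 50.0, 400.0, 0),
    Gene("geneD", 80.0, 600.0, 2),
]

summary = mean_tpm_by_group(example)
for key, value in sorted(summary.items()):
    print(key, value)
```

Grouping on both attributes at once keeps the four possible classes separate, so a class with no members (such as long genes lacking t1519 in the placeholder data) is simply absent from the summary rather than reported as zero.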