ChIP-seq analysis reveals genes regulated by TFIIE and association of TFIIE with various pathways

Serdar Baysal

Research Article, Research Square. This is a preprint; it has not been peer reviewed by a journal. Status: Posted, Version 1.
https://doi.org/10.21203/rs.3.rs-8740923/v1
This work is licensed under a CC BY 4.0 License.

Abstract

Previous studies showed that transcription factor II E (TFIIE) is a general transcription factor involved in the transcription of genes. TFIIE is thought to be recruited to the promoters of genes to be transcribed and to play a role in transcription initiation. The aim of this study was to investigate the disease-specific role of TFIIE, rather than its role as a general transcription factor, and to characterize the cellular pathways and genes occupied by TFIIE. To this end, ChIP-seq analysis was performed.
POLR2A ChIP-seq and eGFP-GTF2E2 ChIP-seq data from the human K562 cell line were obtained from the ENCODE (Encyclopedia of DNA Elements) Project Consortium database. Six significantly enriched KEGG pathways were identified: spliceosome, alcoholism, ATP-dependent chromatin remodeling, necroptosis, neutrophil extracellular trap formation, and systemic lupus erythematosus. In addition, sixteen significantly enriched GO terms were identified: six related to molecular function, three to biological process, and seven to cellular component. The molecular function category contained nucleosomal DNA binding, nucleosome binding, pre-mRNA 5’-splice site binding, pre-mRNA binding, protein heterodimerization activity, and structural constituent of chromatin. The biological process category contained RNA splicing, ribonucleoprotein complex biogenesis, and mRNA 5’-splice site recognition. The cellular component category contained U1 snRNP, nucleosome, spliceosomal snRNP complex, protein-DNA complex, DNA packaging complex, Sm-like protein family complex, and small nuclear ribonucleoprotein complex. Three significantly enriched motifs were also discovered. The current study revealed TFIIE-associated genes and an association of TFIIE with certain groups of cellular pathways. This might shed light on the disease-specific role of TFIIE and help future studies characterize TFIIE-related human diseases.

Keywords: bioinformatics, ChIP-Seq, human disease, TFIIE, transcription

1. Introduction

Transcription, the process by which DNA is used as a template to produce RNA, is highly conserved and tightly controlled across organisms, both prokaryotic and eukaryotic. In eukaryotes, mRNA is produced by RNA polymerase II (RNAPII), which has twelve subunits, the largest of which is POLR2A (Engel et al., 2018).
During transcription, the formation of the pre-initiation complex (PIC) is a critical step for transcription initiation. It requires the recruitment of RNAPII, Mediator, and six general transcription factors (GTFs: TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) to the promoter of the gene to be transcribed (Butler and Kadonaga, 2002).

Among the GTFs, TFIIE, although not as widely characterized, is critical for transcriptional function. The crystal structure of TFIIE, which consists of the TFIIEα and TFIIEβ subunits, has been determined at atomic resolution (Miwa et al., 2016). TFIIEα is a polypeptide of 439 amino acids with a molecular mass of 56 kDa, and TFIIEβ is 291 amino acids long with a molecular mass of 34 kDa (Peterson et al., 1991). The α and β subunits interact through the N-terminal domain of TFIIEα and the region spanning amino acids 193–238 of TFIIEβ (Ohkuma et al., 1995). In addition, the C-terminal domain of TFIIEα is believed to interact with TFIIH, because TFIIE takes part in the recruitment of TFIIH to the PIC (Thomas and Chiang, 2006). This interaction enables TFIIEα to regulate the kinase and ATPase activities of TFIIH, so that TFIIH phosphorylates the C-terminal domain of RNAPII and starts the elongation process (Peterson et al., 1991). Moreover, TFIIE supports the helicase activity of TFIIH during DNA melting and promoter clearance (Holstege et al., 1996).

Beyond its role in transcription regulation, recent studies have shown that TFIIE is associated with certain diseases. One study showed that TFIIEα protein expression was decreased in colorectal cancer tissues compared with adjacent non-tumour tissues (Mo and Chae, 2021).
Another study showed that mutations in TFIIE compromised ribosomal biogenesis and translational accuracy, resulting in a loss of protein homeostasis (proteostasis), which could partly explain the clinical phenotype of trichothiodystrophy (TTH) (Phan et al., 2021). A further study showed that patients with TTH and proficient DNA repair carried mutations in the general transcription factor subunit TFIIEβ, which could explain the clinical findings of TTH (Kuschal et al., 2016).

ChIP-seq, or chromatin immunoprecipitation followed by sequencing, is a technique used to identify the DNA regions that interact with specific proteins in a cell. It combines chromatin immunoprecipitation (ChIP) with high-throughput DNA sequencing to map protein-DNA interactions on a genome-wide scale. This allows researchers to identify where proteins such as transcription factors bind to DNA and where modified histones are located. ChIP-seq has become an essential technique for investigating gene regulation and epigenetic mechanisms (Park, 2009). The current work was designed to study the genes occupied by TFIIE and to characterize the cellular pathways associated with TFIIE. ChIP-seq analysis was performed on human K562 cell line data obtained from the ENCODE Project Consortium database, with the aim of investigating TFIIE-associated cellular pathways and genes and shedding light on the role of TFIIE in certain disease mechanisms.

2. Materials and Methods

2.1 ChIP-Seq Analysis

POLR2A ChIP-seq and eGFP-GTF2E2 ChIP-seq data for the human K562 cell line (an immortalized human myelogenous leukemia line) were obtained from the ENCODE Project Consortium database using the Gene Expression Omnibus (GEO) accession numbers GSE91721 and GSE105643, respectively (ENCODE Project Consortium, 2012), each in duplicate.
1 × 10^8 and 2 × 10^7 cells were used for the POLR2A and eGFP-GTF2E2 ChIP-seq datasets, respectively. eGFP-GTF2E2 is a fusion protein in which a gene encoding enhanced green fluorescent protein (eGFP) is linked to the GTF2E2 gene. The eGFP tag allows detection and immunoprecipitation of the GTF2E2 protein. Raw reads were first filtered using trim_galore and fastp and then aligned to the hg38 (UCSC) reference genome using Bowtie2. The MarkDuplicates tool was subsequently used to detect and remove duplicate reads from the aligned data. bigWig files were generated using bamCoverage. Peaks were initially characterized from the merged reads of two biological replicates, and peaks that were not present in either replicate were discarded afterward. Peak calling was conducted using the Model-based Analysis of ChIP-Seq 2 (MACS2) software. Peak annotation was performed using ChIPseeker, and motif analysis was performed using Multiple Expectation maximizations for Motif Elicitation 2 (MEME2). Functional enrichment analysis of peak-associated genes was performed using clusterProfiler. Peaks shared between POLR2A and TFIIE2 were identified using IDR, and ChIPpeakAnno was used to produce Venn diagrams of promoter regions. Heatmaps and profile plots of signal enrichment near transcription start sites (TSS; TSS − 3 kb to TSS + 3 kb) were produced from the output of computeMatrix. FastQC was used for quality control. The inter-sample correlation heat map was generated using the Pearson correlation coefficient.

2.2 Gene Ontology (GO) analysis

Gene Ontology¹ is a public bioinformatics classification database used to unify the representation of gene features across species. It has three main categories: cellular component, molecular function, and biological process. GO enrichment analysis was performed using the clusterProfiler R package, with adjustment for gene length bias (Young et al., 2010).
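To make the enrichment step concrete, the following is a minimal Python sketch of an over-representation test of the kind that enrichment packages such as clusterProfiler apply: a one-sided hypergeometric test per term, followed by Benjamini-Hochberg adjustment of the P-values. It is an illustration under stated assumptions, not the code used in this study. The function names, gene identifiers, and term annotations are hypothetical, and the sketch does not model gene length bias.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n carry the term."""
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enrich(target_genes, background_genes, term_to_genes, alpha=0.05):
    """Return (term, hits, p, padj) for terms with padj < alpha."""
    background = set(background_genes)
    targets = set(target_genes) & background
    rows = []
    for term, genes in term_to_genes.items():
        annotated = set(genes) & background
        hits = targets & annotated
        p = hypergeom_sf(len(hits), len(background), len(annotated), len(targets))
        rows.append((term, len(hits), p))
    padj = benjamini_hochberg([row[2] for row in rows])
    return [(t, k, p, q) for (t, k, p), q in zip(rows, padj) if q < alpha]


# Hypothetical toy data: 20 background genes, two annotation terms.
background = [f"g{i}" for i in range(20)]
terms = {
    "term_A": [f"g{i}" for i in range(0, 5)],
    "term_B": [f"g{i}" for i in range(10, 15)],
}
targets = ["g0", "g1", "g2", "g3", "g10"]  # e.g. genes with a promoter peak

significant = enrich(targets, background, terms)
for term, hits, p, q in significant:
    print(f"{term}: hits={hits} p={p:.4g} padj={q:.4g}")
```

In this toy input only `term_A` passes the 0.05 cutoff on the adjusted P-value. Exact integer binomials keep the sketch dependency-free; for genome-scale gene lists a statistics library would be the more practical choice.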
GO terms with adjusted P-values less than 0.05 were considered significantly enriched.

¹ Gene Ontology (1999). Gene Ontology Resource [online]. Website: http://www.geneontology.org/ [Accessed 26 July 2024].

2.3 Kyoto Encyclopedia of Genes and Genomes (KEGG) Analysis

KEGG² is a database that helps interpret biological systems such as cells, organisms, and ecosystems, using large-scale molecular data from genome sequencing and other high-throughput methods. The clusterProfiler R package was used to analyze the statistical enrichment of genes in KEGG pathways (Kanehisa and Goto, 2000).

² Kyoto Encyclopedia of Genes and Genomes (1995). KEGG Resource [online]. Website: https://www.genome.jp/kegg/ [Accessed 2 August 2024].

3. Results

3.1 ChIP Sequencing Data Quality Control

The quality control of the ChIP-seq data is presented in Table 1. Both the quality 20 (Q20) and quality 30 (Q30) scores of each sample were > 92%. The correlation coefficients between samples were calculated and visualized as a heat map, which shows at a glance the differences between groups and the agreement between replicates. The higher the correlation coefficient between two samples, the more similar their signal patterns. The correlation coefficient matrix is shown in Fig. 1.

Table 1. ChIP-Seq Data Quality

Sample_ID      Clean_reads   Clean_bases   Q20 (%)   Q30 (%)   GC_Content (%)
RNAPII_1       44055194      4446125776    98.6      94.44     44.91
RNAPII_2       57359258      5788144586    98.56     94.28     46.14
RNAPII_input   89767598      9060422178    98.28     93.31     41.06
TFIIE2_1       38627050      1892725450    97.9      94.01     41
TFIIE2_2       32010404      1568509796    98.1      94.72     41.78
TFIIE2_input   22674893      1111069757    97.35     92.47     41.78

3.2 Co-occupancy of RNAPII and TFIIE

TFIIE was not detected in some of the purified RNAPII holoenzyme forms in previous studies. Therefore, co-occupancy of RNAPII and TFIIE around the TSS of genes was investigated in the K562 cell line using ChIP-seq. High-confidence peaks were used in our assays. Interestingly, 97% of genes showed differential occupancy of RNAPII and TFIIE around the TSS (Fig.
2), highlighting the possibility that TFIIE might be required only for a specific set of genes (Supplementary Table 1).

3.3 KEGG analysis

Six significantly enriched KEGG pathways were identified (Fig. 3). These were spliceosome, alcoholism, ATP-dependent chromatin remodeling, necroptosis, and neutrophil extracellular trap formation, together with a pathway associated with immune system disease, systemic lupus erythematosus.

3.4 GO analysis

Sixteen significantly enriched GO terms were identified (Figs. 4, 5 and 6): six related to molecular function, three to biological process, and seven to cellular component. The molecular function category contained nucleosomal DNA binding, pre-mRNA 5’-splice site binding, nucleosome binding, pre-mRNA binding, protein heterodimerization activity, and structural constituent of chromatin. The biological process category contained RNA splicing, ribonucleoprotein complex biogenesis, and mRNA 5’-splice site recognition. The cellular component category contained U1 snRNP, nucleosome, spliceosomal snRNP complex, protein-DNA complex, DNA packaging complex, Sm-like protein family complex, and small nuclear ribonucleoprotein complex.

3.5 Motif Analysis

TFIIE-bound regions were examined for enrichment of sequence motifs, and three significantly enriched motifs were discovered (E-value < 0.05) (Fig. 7).

4. Discussion

Transcription is a highly conserved and tightly controlled process across organisms, both prokaryotic and eukaryotic (Franklin and Vondriska, 2011). Transcription is regulated at many steps, one of which is the initiation phase. The formation of the pre-initiation complex is a critical step for transcription initiation and requires the recruitment of RNAPII, Mediator and six GTFs (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) to the promoter of the gene to be transcribed (Roeder, 2019).
Earlier research revealed that TFIIE is one of the transcription factors recruited to the promoter regions of target genes for transcription (Peterson et al., 1991). One study proposed that the involvement of TFIIE depends on the gene to be transcribed and that the recruitment of TFIIE can be bypassed in yeast (Sakurai and Fukasawa, 1999). In addition, other studies have linked TFIIE to distinct diseases such as colon cancer and TTH (Mo and Chae, 2021; Phan et al., 2021). Accordingly, the current study was designed to investigate the disease-specific role of TFIIE and the pathways associated with TFIIE using ChIP-seq.

TFIIE was not detected in some of the purified RNAPII holoenzyme forms in previous studies (Thomas and Chiang, 2006). Therefore, co-occupancy of RNAPII and TFIIE around the TSSs of genes was examined in the K562 cell line using ChIP-seq data from the ENCODE Project Consortium database. High-confidence peaks were used in our assays. Interestingly, 97% of genes showed differential occupancy of RNAPII and TFIIE around TSSs, again raising the possibility that TFIIE is required only for a certain group of genes rather than for all genes. However, because the formation of the PIC is highly transient, it is advisable to repeat these tests in various cell lines and under various experimental conditions. ChIP-seq provides only a brief snapshot, making it unlikely to capture all components of the machinery associated with DNA.

Our study revealed that, although TFIIE is considered a general transcription factor, it might be associated with particular cellular, molecular and biological processes, and even with particular diseases such as systemic lupus erythematosus. In this regard, the study can help clarify the specific role of TFIIE as distinct from its role as a general transcription factor.
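The co-occupancy comparison discussed above can be illustrated with a short Python sketch. It assumes one plausible definition, which the text does not spell out: a gene counts as occupied by a factor if any peak overlaps its TSS ± 3 kb window, and "differential occupancy" is taken as the fraction of occupied genes that are not shared by both factors. The function names, gene names, and coordinates are hypothetical toy values; the analysis reported here used IDR and ChIPpeakAnno rather than this code.

```python
def genes_with_promoter_peak(tss_by_gene, peaks, flank=3000):
    """Return genes whose TSS +/- flank window overlaps at least one peak.

    tss_by_gene: {gene: (chrom, tss_position)}
    peaks: iterable of (chrom, start, end), treated as half-open intervals
    """
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    bound = set()
    for gene, (chrom, tss) in tss_by_gene.items():
        lo, hi = tss - flank, tss + flank
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < hi and end > lo for start, end in by_chrom.get(chrom, [])):
            bound.add(gene)
    return bound


def occupancy_summary(set_a, set_b):
    """Venn-style counts plus the fraction of occupied genes not shared."""
    union = set_a | set_b
    shared = set_a & set_b
    return {
        "shared": len(shared),
        "only_a": len(set_a - set_b),
        "only_b": len(set_b - set_a),
        "fraction_not_shared": (len(union) - len(shared)) / len(union) if union else 0.0,
    }


# Hypothetical toy annotation and peak sets.
tss = {
    "GENE1": ("chr1", 10000),
    "GENE2": ("chr1", 50000),
    "GENE3": ("chr2", 20000),
    "GENE4": ("chr2", 90000),
}
pol2_peaks = [("chr1", 9500, 10500), ("chr1", 49000, 49500), ("chr2", 19000, 21000)]
tfiie_peaks = [("chr1", 9800, 10200), ("chr2", 95000, 95500)]

pol2_genes = genes_with_promoter_peak(tss, pol2_peaks)
tfiie_genes = genes_with_promoter_peak(tss, tfiie_peaks)
summary = occupancy_summary(pol2_genes, tfiie_genes)
print(summary)
```

Under a different definition, for example requiring reciprocal peak overlap rather than a shared promoter window, the resulting percentages would differ, so the definition should be stated alongside any reported figure.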
In summary, ChIP-seq analysis was used to investigate the genes occupied by TFIIE in order to understand its role in certain cellular pathways and diseases. Even though the net effect of the pathways identified above is not yet known, these investigations into TFIIE-related genes in a distinct group of pathways can help future studies identify TFIIE-associated diseases.

Declarations

Conflicts of Interest: The authors declare no conflicts of interest.

Acknowledgments: I thank Ervin Fodor (University of Oxford) for his critical input and comments.

References

Butler JEF, Kadonaga JT (2002) The RNA polymerase II core promoter: a key component in the regulation of gene expression. Genes Dev 16(20):2583–2592. https://doi.org/10.1101/gad.1026202
ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489(7414):57–74. https://doi.org/10.1038/nature11247
Engel C, Neyer S, Cramer P (2018) Distinct mechanisms of transcription initiation by RNA polymerases I and II. Annu Rev Biophys 47(1):425–446. https://doi.org/10.1146/annurev-biophys-070317-033058
Franklin S, Vondriska TM (2011) Genomes, proteomes, and the central dogma. Circ Cardiovasc Genet 4(5):576. https://doi.org/10.1161/CIRCGENETICS.110.957795
Holstege FC, van der Vliet PC, Timmers HT (1996) Opening of an RNA polymerase II promoter occurs in two distinct steps and requires the basal transcription factors IIE and IIH. EMBO J 15(7):1666–1677. https://doi.org/10.1002/j.1460-2075.1996.tb00512.x
Kanehisa M, Goto S (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res 28(1):27–30.
https://doi.org/10.1093/nar/28.1.27
Kuschal C, Botta E, Orioli D, Digiovanna JJ, Seneca S, Keymolen K, Tamura D, Heller E, Khan SG, Caligiuri G, Lanzafame M, Nardo T, Ricotti R, Peverali FA, Stephens R, Zhao Y, Lehmann AR, Baranello L, Levens D, Stefanini M (2016) GTF2E2 mutations destabilize the general transcription factor complex TFIIE in individuals with DNA repair-proficient trichothiodystrophy. Am J Hum Genet 98(4):627–642. https://doi.org/10.1016/j.ajhg.2016.02.008
Miwa K, Kojima R, Obita T, Ohkuma Y, Tamura Y, Mizuguchi M (2016) Crystal structure of human general transcription factor TFIIE at atomic resolution. J Mol Biol 428(21):4258–4266. https://doi.org/10.1016/j.jmb.2016.09.008
Ohkuma Y, Hashimoto S, Wang CK, Horikoshi M, Roeder RG (1995) Analysis of the role of TFIIE in basal transcription and TFIIH-mediated carboxy-terminal domain phosphorylation through structure-function studies of TFIIE-alpha. Mol Cell Biol 15(9):4856–4866. https://doi.org/10.1128/MCB.15.9.4856
Park PJ (2009) ChIP-seq: advantages and challenges of a maturing technology. Nat Rev Genet 10(10):669–680. https://doi.org/10.1038/nrg2641
Peterson MG, Inostroza J, Maxon ME, Flores O, Admon A, Reinberg D, Tjian R (1991) Structure and functional properties of human general transcription factor IIE. Nature 354(6352):369–373. https://doi.org/10.1038/354369a0
Phan T, Maity P, Ludwig C, Streit L, Michaelis J, Tsesmelis M, Scharffetter-Kochanek K, Iben S (2021) Nucleolar TFIIE plays a role in ribosomal biogenesis and performance. Nucleic Acids Res 49(19):11197–11210. https://doi.org/10.1093/nar/gkab866
Roeder RG (2019) 50+ years of eukaryotic transcription: an expanding universe of factors and mechanisms. Nat Struct Mol Biol 26(9):783–791. https://doi.org/10.1038/s41594-019-0287-x
Sakurai H, Fukasawa T (1999) Activator-specific requirement for the general transcription factor IIE in yeast. Biochem Biophys Res Commun 261(3):734–739.
https://doi.org/10.1006/bbrc.1999.1113
Mo JS, Chae SC (2021) MicroRNA 452 regulates GTF2E1 expression in colorectal cancer cells. J Genet 100(2). https://doi.org/10.1007/s12041-021-01312-3
Young MD, Wakefield MJ, Smyth GK, Oshlack A (2010) Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biol 11(2):R14. https://doi.org/10.1186/gb-2010-11-2-r14

Additional Declarations

The authors declare no competing interests.
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-8740923","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":583026452,"identity":"bb0709c8-3cd3-4dce-b137-ba3e4bc89659","order_by":0,"name":"Serdar Baysal","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABDElEQVRIie2PsUrEMBzGU4RMfYCKQn2EihA8rO2DuKQE4nKCD3BDnbpUdyefQKiLk8NfAtelmDVyNzRLJ4cbe+BgeneO7XUUzG/5lu+XL3+ELJY/iocQdOkAbcMu72CsgkDnvFPS8cp7jcXWHar72b1W67elH0hRA3Vl9JwJszILr/qUoCrPJnnTnBaKB0DPF+y1Sowy5zdpn+Jx7LkgnEIho7gLRsAoTip6Ff+pwYffIOJCliug+IMRqYcVpDA+MitJAVOzgiEias9KUPGDi2No2KOa3kKSM0qUWaEDt/jZ3Pn8guXlgyxfdNtGMZHXul7Nwv6P/XIC20w2TbqvvpnbPRqPKVssFsv/4gdhYnNKMQ+X9QAAAABJRU5ErkJggg==","orcid":"https://orcid.org/0000-0003-3394-2005","institution":"University of Oxford","correspondingAuthor":true,"prefix":"","firstName":"Serdar","middleName":"","lastName":"Baysal","suffix":""}],"badges":[],"createdAt":"2026-01-30 12:18:48","currentVersionCode":1,"declarations":{"humanSubjects":false,"vertebrateSubjects":false,"conflictsOfInterestStatement":false,"humanSubjectEthicalGuidelines":false,"humanSubjectConsent":false,"humanSubjectClinicalTrial":false,"humanSubjectCaseReport":false,"vertebrateSubjectEthicalGuidelines":false},"doi":"10.21203/rs.3.rs-8740923/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8740923/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":101632226,"identity":"24b62461-ec28-4649-b195-0b42262b4bd2","added_by":"auto","created_at":"2026-02-02 
05:45:36","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":47137,"visible":true,"origin":"","legend":"\u003cp\u003eInter-sample correlation heat map. R\u003csup\u003e2\u003c/sup\u003e: Square of Pearson correlation coefficient(R). The closer the correlation coefficient is to 1, the higher similarity the samples have.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-8740923/v1/d8603a4c665d678e60eb46dd.png"},{"id":101752990,"identity":"295780a1-18e6-423b-b60d-dd1787bcdd1b","added_by":"auto","created_at":"2026-02-03 10:38:39","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":43588,"visible":true,"origin":"","legend":"\u003cp\u003eVenn diagram showing the overlap between RNAPII and TFIIE-bound promoters.\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-8740923/v1/8efc5532d5ed4b254b7b3d0b.png"},{"id":101752578,"identity":"ba1e499e-91ab-4e48-9c52-b32451903ec4","added_by":"auto","created_at":"2026-02-03 10:28:16","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":41836,"visible":true,"origin":"","legend":"\u003cp\u003eKEGG Enrichment Analysis Scatter Plot.The vertical axis represents the enriched KEGG pathways. The horizontal axis represents gene ratio of each KEGG pathway. Gene ratio refers to the ratio of the number of \u003cu\u003edifferentially expressed genes\u003c/u\u003e (DEGs) enriched in certain KEGG pathway to the number of annotated genes. The greater the value is, the higher the DEGs enrichment degree is. 
The size of the dots indicates the number of DEGs enriched in certain pathway, and the color of the dots corresponds to the range of the padj value (p.adjust:adjusted p value).\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-8740923/v1/67ed642f199e5aa19b362d53.png"},{"id":101632228,"identity":"c3c0e1a9-94ac-4776-9173-33b9804f7831","added_by":"auto","created_at":"2026-02-02 05:45:36","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":172151,"visible":true,"origin":"","legend":"\u003cp\u003eGO Enrichment Analysis Scatter Plot for biological process.The vertical axis represents the enriched GO pathways. The horizontal axis represents gene ratio of each GO pathway. Gene ratio refers to the ratio of the number of DEGs enriched in certain GO pathway to the number of annotated genes. The greater the value is, the higher the DEGs enrichment degree is. The size of the dots indicates the number of DEGs enriched in certain pathway, and the color of the dots corresponds to the range of the padj value (padj:adjusted p value).\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-8740923/v1/effe55f9393cc9a089caa936.png"},{"id":101754196,"identity":"c6743480-c918-4d8f-a473-a5736ca251c9","added_by":"auto","created_at":"2026-02-03 10:42:01","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":148264,"visible":true,"origin":"","legend":"\u003cp\u003eGO Enrichment Analysis Scatter Plot for cellular component category.The vertical axis represents the enriched GO pathways. The horizontal axis represents gene ratio of each GO pathway. Gene ratio refers to the ratio of the number of DEGs enriched in certain GO pathway to the number of annotated genes. The greater the value is, the higher the DEGs enrichment degree is. 
The size of the dots indicates the number of DEGs enriched in certain pathway, and the color of the dots corresponds to the range of the padj value (padj:adjusted p value).\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-8740923/v1/e10288dbcc11ec90f184d72f.png"},{"id":101632231,"identity":"dec38a54-8b66-43b5-9869-1f5d16ee3267","added_by":"auto","created_at":"2026-02-02 05:45:36","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":146094,"visible":true,"origin":"","legend":"\u003cp\u003eGO Enrichment Analysis Scatter Plot for molecular function category.The vertical axis represents the enriched GO pathways. The horizontal axis represents gene ratio of each GO pathway. Gene ratio refers to the ratio of the number of DEGs enriched in certain GO pathway to the number of annotated genes. The greater the value is, the higher the DEGs enrichment degree is. The size of the dots indicates the number of DEGs enriched in certain pathway, and the color of the dots corresponds to the range of the padj value (padj:adjusted p value).\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-8740923/v1/c45feda0645a294c1fdb11f7.png"},{"id":101632230,"identity":"1dd6fcf1-bf37-4c5f-b921-d89f15628021","added_by":"auto","created_at":"2026-02-02 05:45:36","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":200805,"visible":true,"origin":"","legend":"\u003cp\u003eLogos of the motifs discovered by peak-motifs for the factor TFIIE adapted from the ChIP-seq data and MEME2 was used as enrichment program.\u003c/p\u003e","description":"","filename":"7.png","url":"https://assets-eu.researchsquare.com/files/rs-8740923/v1/03ad8c32631411cc93b86c53.png"},{"id":101755885,"identity":"fc164ae2-ad69-4512-a57d-dff1b4230e1b","added_by":"auto","created_at":"2026-02-03 
10:55:14","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1149823,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8740923/v1/d60f5949-a832-4c91-a155-6d67b6b58d83.pdf"}],"financialInterests":"The authors declare no competing interests.","formattedTitle":"\u003cp\u003e\u003cstrong\u003eChIP-seq analysis reveals genes regulated by TFIIE and\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eassociation of TFIIE with various pathways\u003c/strong\u003e\u003c/p\u003e","fulltext":[{"header":"1.Introduction","content":"\u003cp\u003eTranscription is the process by which RNA polymerase II which has twelve subunits and POLR2A is the largest subunits of RNAPII, uses DNA as a template to produce mRNA, which is highly preserved and controlled process among organisms such as prokaryotes and eukaryotes (Engel et al., \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). During transcription, the formation of the pre-initiation complex is a critical step for transcription initiation which needs recruitment of RNA polymerase II (RNAPII), mediator, and six general transcription factors (GTFIIA, GTFIIB, GTFIID, GTFIIE, GTFIIF, and GTFIIH) to the promoter site of the gene to be transcribed (Butler and Kadonaga, \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2002\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eAmong the GTFs, TFIIE, although not as widely characterized, is critical for transcriptional function. The crystal structure of TFIIE was determined at atomic resolution, consisting of TFIIEα and TFIIEβ subunits. (Miwa et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). 
TFIIEα is a 439 amino acid long polypeptide with a molecular mass of 56 kDa, and TFIIEβ is 291 amino acids long and has a molecular mass of 34 kDa (Peterson et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e1991\u003c/span\u003e). Interactions between the α and the β subunits occur through the N-terminal domain of TFIIEα and between the 193\u0026ndash;238 amino acid region of TFIIE β (Ohkuma et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e1995\u003c/span\u003e). In addition, the C-terminal domain of TFIIEα is believed to interact with TFIIH because TFIIE takes part in the recruitment of TFIIH to the preinitiation complex (PIC) (Thomas and Chiang, 2006). This interaction enables TFIIEα to regulate the kinase and ATPase activities of TFIIH so that it phosphorylates the C\u0026shy;terminal domain of RNAP II and starts the elongation process(Peterson et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e1991\u003c/span\u003e). Moreover, TFIIE also helps TFIIH helicase activity for promoter clearance and DNA melting as well (Holstege et al., \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e1996\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eBeyond its role in transcription regulation, there some recent studies have shown that TFIIE has an association with certain diseases. One study showed that TFIIEα protein expression decreased in colorectal cancer tissues compared with adjacent non-tumour tissues (Mo and Chae, 2021). 
Another study showed that mutations in TFIIE due to compromised ribosomal biogenesis and translational precision resulted in deprivation of protein homeostasis (proteostasis), which could moderately elucidate the clinical phenotype in trichothiodystrophy (TTH) (Phan et al., \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2021\u003c/span\u003e).A further study showed that the patients with TTH had mutations in general transcription factor TFIIEβ with routine DNA repair ,which could elucidate the clinical findings of TTH (Kuschal et al., \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2016\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eChIP-seq, or chromatin immunoprecipitation followed by sequencing, is a technique used to identify the DNA regions that interact with specific proteins in a cell. It combines chromatin immunoprecipitation (ChIP) with high-throughput DNA sequencing to map protein-DNA interactions on a genome-wide scale. This allows researchers to identify where proteins, such as transcription factors or histone modifications, bind to DNA. ChIP-seq has become an essential technique for investigating gene regulation and epigenetic mechanisms (Park et al., 2009). The experiments in the current work were designed to study the genes occupied by TFIIE and characterize the cellular pathways associated with TFIIE. ChIP-seq analysis was performed on human K562 cell line data obtanined from the ENCODE Project Consortium database with the aim of investigating TFIIE-associated cellular pathways and genes to shed light on the role of TFIIE on certain disease mechanims.\u003c/p\u003e"},{"header":"2. 
Materials and Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\n \u003ch2\u003e2.1 ChIP-Seq Analysis\u003c/h2\u003e\n \u003cp\u003eFor the ChIP-seq analysis, POLR2A ChIP-seq and eGFP-GTF2E2 ChIP-seq data from the human K562 cell line (an immortalized human myelogenous leukemia line) were obtained, each in duplicate, from the ENCODE Project Consortium database using the Gene Expression Omnibus (GEO) accession numbers GSE91721 and GSE105643, respectively (The ENCODE Project Consortium, \u003cspan class=\"CitationRef\"\u003e2012\u003c/span\u003e). A total of 1 × 10\u003csup\u003e8\u003c/sup\u003e and 2 × 10\u003csup\u003e7\u003c/sup\u003e cells were used for the POLR2A and eGFP-GTF2E2 ChIP-seq experiments, respectively. eGFP-GTF2E2 is a fusion protein in which the gene encoding enhanced green fluorescent protein (eGFP) is linked to the GTF2E2 gene. The eGFP tag allows the detection and immunoprecipitation of the GTF2E2 protein. Raw reads were first filtered using trim_galore and fastp and then aligned to the hg38 (UCSC) reference genome using Bowtie2. The MarkDuplicates tool was subsequently used to detect and remove duplicate reads from the aligned data. bigWig files were generated using bamCoverage. Peaks were initially called from the merged reads of two biological replicates, and peaks that were not present in either replicate were discarded afterward. Peak calling was conducted using the Model-based Analysis of ChIP-Seq 2 (MACS2) software. Peak annotation was performed using ChIPseeker, and motif analysis was performed using Multiple Expectation maximizations for Motif Elicitation 2 (MEME2). Functional enrichment analysis of the genes associated with peaks was performed using clusterProfiler. Peaks common to POLR2A and TFIIE2 were identified using IDR, and ChIPpeakAnno was used to generate Venn diagrams of promoter regions. 
Heatmaps and profile plots of signal enrichment near transcription start sites (TSSs) (TSS\u0026thinsp;\u0026minus;\u0026thinsp;3 kb to TSS\u0026thinsp;+\u0026thinsp;3 kb) were produced from the output of computeMatrix. FastQC was used for quality control. An inter-sample correlation heat map was generated using the Pearson correlation coefficient.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\n \u003ch2\u003e2.2 Gene Ontology (GO) Analysis\u003c/h2\u003e\n \u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n \u003cp\u003eGene Ontology\u003csup\u003e1\u003c/sup\u003e is a public bioinformatics classification database used to unify the representation of gene attributes across species. It has three main categories: cellular component, molecular function, and biological process. GO enrichment analysis was performed using the clusterProfiler R package, in which gene length bias was adjusted (Young et al., \u003cspan class=\"CitationRef\"\u003e2010\u003c/span\u003e). GO terms with adjusted P-values less than 0.05 were considered significantly enriched.\u003c/p\u003e\n \u003cp\u003e\u003csup\u003e1\u003c/sup\u003eGene Ontology (1999). Gene Ontology Resource [online]. Website: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://www.geneontology.org/\u003c/span\u003e\u003c/span\u003e [Accessed 26 July 2024].\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\n \u003ch2\u003e2.3 Kyoto Encyclopedia of Genes and Genomes (KEGG) Analysis\u003c/h2\u003e\n \u003cp\u003eKEGG\u003csup\u003e2\u003c/sup\u003e is a database that helps interpret biological systems, such as cells, organisms, and ecosystems, using large-scale molecular data from genome sequencing and other high-throughput methods. 
The clusterProfiler R package was used to analyze the statistical enrichment of genes in KEGG pathways (Kanehisa and Goto, 2000).\u003c/p\u003e\n \u003cp\u003e\u003csup\u003e2\u003c/sup\u003eKyoto Encyclopedia of Genes and Genomes (1995). KEGG Resource [online]. Website: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.genome.jp/kegg/\u003c/span\u003e\u003c/span\u003e [Accessed 2 August 2024].\u003c/p\u003e\n\u003c/div\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003e3.1 ChIP Sequencing Data Quality Control\u003c/h2\u003e \u003cp\u003eThe quality control results for the ChIP-seq data are presented in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. Both the quality 20 (Q20) and quality 30 (Q30) scores of each sample were \u0026gt;\u0026thinsp;92%. The correlation coefficients between samples were calculated and visualized as a heat map. This representation shows at a glance the differences between groups and the agreement between replicates. The higher the correlation coefficient between two samples, the more similar their signal patterns. 
The correlation coefficient matrix is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eChIP-Seq Data Quality\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSample_ID\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eClean_reads\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eClean_bases\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eQ20\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eQ30\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eGC_Content\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRNAPII_1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e44055194\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e4446125776\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e98.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e94.44\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e44.91\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRNAPII_2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e57359258\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5788144586\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e98.56\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e94.28\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e46.14\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRNAPII_input\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e89767598\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e9060422178\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e98.28\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e93.31\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e41.06\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTFIIE2_1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e38627050\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" 
char=\".\" colname=\"c3\"\u003e \u003cp\u003e1892725450\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e97.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e94.01\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e41\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTFIIE2_2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e32010404\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1568509796\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e98.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e94.72\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e41.78\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTFIIE2_input\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e22674893\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1111069757\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e97.35\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e92.47\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e41.78\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003e3.2 Co-occupancy of RNAPII and TFIIE\u003c/h2\u003e \u003cp\u003eTFIIE was not detected in some 
of the purified RNAPII holoenzyme forms in previous studies. Therefore, the co-occupancy of RNAPII and TFIIE around the TSSs of genes was investigated in the K562 cell line using ChIP-seq. High-confidence peaks were used in our analyses. Interestingly, 97% of genes showed differential occupancy of RNAPII and TFIIE around the TSS (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e), once again highlighting the possibility that TFIIE might be required for a specific set of genes (Supplementary Table\u0026nbsp;1).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003e3.3 KEGG analysis\u003c/b\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eSix significantly enriched KEGG pathways were identified (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). Five of these pathways were spliceosome, alcoholism, ATP-dependent chromatin remodeling, necroptosis, and neutrophil extracellular trap formation. The sixth, systemic lupus erythematosus, is a pathway associated with immune system disease.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003e3.4 GO analysis\u003c/h2\u003e \u003cp\u003eSixteen significantly enriched GO terms were identified (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, \u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e and \u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e), with six related to molecular functions, three related to biological processes and seven related to cellular components. The molecular function category contains nucleosomal DNA binding, pre-mRNA 5\u0026rsquo;-splice site binding, nucleosome binding, pre-mRNA binding, protein heterodimerization activity, and structural constituent of chromatin. 
The biological process category contains RNA splicing, ribonucleoprotein complex biogenesis, and mRNA 5\u0026rsquo;-splice site recognition. The cellular component category contains U1-snRNP, nucleosome, spliceosomal snRNP complex, protein-DNA complex, DNA packaging complex, Sm-like protein family complex, and small-nuclear ribonucleoprotein complex.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003e3.5 Motif Analysis\u003c/h2\u003e \u003cp\u003eTFIIE-bound regions were examined for enriched sequence motifs, and three motifs showed significant enrichment (E-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05) (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eTranscription is a highly conserved and tightly controlled process across organisms, in both prokaryotes and eukaryotes (Franklin and Vondriska, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2011\u003c/span\u003e). Transcription regulation involves many steps, one of which is the initiation phase. Formation of the pre-initiation complex is a critical step in transcription initiation and requires the recruitment of RNAPII, Mediator and six GTFs (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) to the promoter of the gene to be transcribed (Roeder, 2019).\u003c/p\u003e \u003cp\u003eEarlier research revealed that TFIIE is one of the transcription factors recruited to the promoter region of target genes for transcription (Peterson et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e1991\u003c/span\u003e). One study proposed that the involvement of TFIIE depends on the gene being transcribed and that TFIIE recruitment can be bypassed in yeast (Sakurai and Fukasawa, 1999). 
In addition, other studies have linked TFIIE to distinct diseases such as colorectal cancer and TTH (Mo and Chae, 2021; Phan et al., \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). Accordingly, the current study was designed to investigate the disease-specific role of TFIIE and to characterize the pathways associated with TFIIE using the ChIP-seq method.\u003c/p\u003e \u003cp\u003eTFIIE was not detected in some of the purified RNAPII holoenzyme forms in previous studies (Thomas and Chiang, 2006). Therefore, the co-occupancy of RNAPII and TFIIE around the TSSs of genes was examined in the K562 cell line using ChIP-seq. The ChIP-seq analysis was performed on data from the ENCODE Project Consortium database. High-confidence peaks were used in our analyses. Interestingly, 97% of genes showed differential occupancy of RNAPII and TFIIE around TSSs, once again highlighting the possibility that TFIIE might not be required for all genes, only a certain group of genes. However, it is advisable to repeat the analysis using various cell lines and experimental conditions because the formation of the PIC is highly transient. ChIP-seq provides only a brief snapshot, making it unlikely to capture all components of the machinery associated with DNA.\u003c/p\u003e \u003cp\u003eOur study revealed that TFIIE might be associated with certain groups of cellular, molecular and biological processes, and even with certain diseases such as systemic lupus erythematosus, although TFIIE is considered a general transcription factor. In this regard, this study can help us understand the specific role of TFIIE rather than its general transcription factor role.\u003c/p\u003e \u003cp\u003eIn summary, ChIP-seq analysis was used to investigate the genes occupied by TFIIE to understand its role in certain cellular pathways and diseases. 
Even though the net effect of the above-identified pathways is not yet known, this investigation of TFIIE-associated genes in a distinct group of pathways can help future studies identify TFIIE-associated diseases.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e \u003ch2\u003eConflicts of Interest:\u003c/h2\u003e \u003cp\u003eThe author declares no conflicts of interest.\u003c/p\u003e \u003c/p\u003e\u003ch2\u003eAcknowledgments:\u003c/h2\u003e \u003cp\u003eI thank Ervin Fodor (University of Oxford) for his critical input and comments.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eButler JEF, Kadonaga JT (2002) The RNA polymerase II core promoter: a key component in the regulation of gene expression. Genes Dev 16(20):2583\u0026ndash;2592. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1101/gad.1026202\u003c/span\u003e\u003cspan address=\"10.1101/gad.1026202\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489(7414):57\u0026ndash;74. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/nature11247\u003c/span\u003e\u003cspan address=\"10.1038/nature11247\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEngel C, Neyer S, Cramer P (2018) Distinct mechanisms of transcription initiation by RNA polymerases I and II. Annu Rev Biophys 47(1):425\u0026ndash;446. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1146/annurev-biophys-070317-033058\u003c/span\u003e\u003cspan address=\"10.1146/annurev-biophys-070317-033058\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFranklin S, Vondriska TM (2011) Genomes, proteomes, and the central dogma. Circ Cardiovasc Genet 4(5):576. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1161/CIRCGENETICS.110.957795\u003c/span\u003e\u003cspan address=\"10.1161/CIRCGENETICS.110.957795\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHolstege FC, van der Vliet PC, Timmers HT (1996) Opening of an RNA polymerase II promoter occurs in two distinct steps and requires the basal transcription factors IIE and IIH. EMBO J 15(7):1666\u0026ndash;1677. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1002/j.1460-2075.1996.tb00512.x\u003c/span\u003e\u003cspan address=\"10.1002/j.1460-2075.1996.tb00512.x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKanehisa M, Goto S (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28(1):27\u0026ndash;30. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/nar/28.1.27\u003c/span\u003e\u003cspan address=\"10.1093/nar/28.1.27\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKuschal C, Botta E, Orioli D, Digiovanna JJ, Seneca S, Keymolen K, Tamura D, Heller E, Khan SG, Caligiuri G, Lanzafame M, Nardo T, Ricotti R, Peverali FA, Stephens R, Zhao Y, Lehmann AR, Baranello L, Levens D, Stefanini M (2016) GTF2E2 mutations destabilize the general transcription factor complex TFIIE in individuals with DNA repair-proficient trichothiodystrophy. Am J Hum Genet 98(4):627\u0026ndash;642. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.ajhg.2016.02.008\u003c/span\u003e\u003cspan address=\"10.1016/j.ajhg.2016.02.008\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMiwa K, Kojima R, Obita T, Ohkuma Y, Tamura Y, Mizuguchi M (2016) Crystal structure of human general transcription factor TFIIE at atomic resolution. J Mol Biol 428(21):4258\u0026ndash;4266. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.jmb.2016.09.008\u003c/span\u003e\u003cspan address=\"10.1016/j.jmb.2016.09.008\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOhkuma Y, Hashimoto S, Wang CK, Horikoshi M, Roeder RG (1995) Analysis of the role of TFIIE in basal transcription and TFIIH-mediated carboxy-terminal domain phosphorylation through structure-function studies of TFIIE-alpha. Mol Cell Biol 15(9):4856\u0026ndash;4866. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1128/MCB.15.9.4856\u003c/span\u003e\u003cspan address=\"10.1128/MCB.15.9.4856\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePark PJ (2009) ChIP-seq: advantages and challenges of a maturing technology. Nat Rev Genet 10(10):669\u0026ndash;680. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/nrg2641\u003c/span\u003e\u003cspan address=\"10.1038/nrg2641\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePeterson MG, Inostroza J, Maxon ME, Flores O, Admon A, Reinberg D, Tjian R (1991) Structure and functional properties of human general transcription factor IIE. Nature 354(6352):369\u0026ndash;373. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/354369a0\u003c/span\u003e\u003cspan address=\"10.1038/354369a0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePhan T, Maity P, Ludwig C, Streit L, Michaelis J, Tsesmelis M, Scharffetter-Kochanek K, Iben S (2021) Nucleolar TFIIE plays a role in ribosomal biogenesis and performance. Nucleic Acids Res 49(19):11197\u0026ndash;11210. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/nar/gkab866\u003c/span\u003e\u003cspan address=\"10.1093/nar/gkab866\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRoeder RG (2019) 50\u0026thinsp;+\u0026thinsp;years of eukaryotic transcription: an expanding universe of factors and mechanisms. Nat Struct Mol Biol 26(9):783\u0026ndash;791. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41594-019-0287-x\u003c/span\u003e\u003cspan address=\"10.1038/s41594-019-0287-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSakurai H, Fukasawa T (1999) Activator-specific requirement for the general transcription factor IIE in yeast. Biochem Biophys Res Commun 261(3):734\u0026ndash;739. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1006/bbrc.1999.1113\u003c/span\u003e\u003cspan address=\"10.1006/bbrc.1999.1113\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSu Mo J, Cheon Chae S (2021) MicroRNA 452 regulates GTF2E1 expression in colorectal cancer cells. J Genet 100(2). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s12041-021-01312-3\u003c/span\u003e\u003cspan address=\"10.1007/s12041-021-01312-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYoung MD, Wakefield MJ, Smyth GK, Oshlack A (2010) Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biol 11(2):R14. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/gb-2010-11-2-r14\u003c/span\u003e\u003cspan address=\"10.1186/gb-2010-11-2-r14\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"University of Oxford","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"bioinformatics, ChIP-Seq, human disease, TFIIE, transcription","lastPublishedDoi":"10.21203/rs.3.rs-8740923/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8740923/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003ePrevious studies on transcription factor II E(TFIIE) showed that TFIIE was a general transcription factor and that was involved in the transcription of genes to be transcribed. It is supposed that TFIIE is recruited to promoter sites of genes to be transcribed and plays a role in transcription initiation. The aim of this study was to investigate and identify the disease-specific role of TFIIE rather than its general transcription factor role, and characterize the cellular pathways and genes occupied by TFIIE. In this regards, ChIP-seq analysis was performed. For ChIP-seq data, POLR2A ChIP-seq and eGFP-GTF2E2 ChIP-seq were used on human K562 cell line data obtanined from the ENCODE (Encyclopedia of DNA Elements) Project Consortium database.Six significantly enriched KEGG pathways were identified. These pathways involved splicesomes, alcoholism, ATP-dependent chromatin remodeling, necroptosis, neutrophil extracellular trap formation and systemic lupus erythematosus. Also, sixteen significantly enriched GO terms were identifed, six related to molecular functions, three related to biological processes and seven related to cellular components. 
The molecular function category contains nucleosomal DNA binding, nucleosome binding, pre-mRNA 5\u0026rsquo;-splice site binding, pre-mRNA binding, protein heterodimerization activity, and structural constitute of chromatin. The biological process category contains RNA splicing, ribonucleoprotein complex biogenesis, and mRNA 5\u0026rsquo;-splice site recognition. The cellular component category contains U1-snRNP, nucleosome, spliceosomal snRNP complex, protein-DNA complex, DNA packaging complex, Sm-like protein family complex, and small-nuclear ribonucleoprotein complex. A significant enrichment of 3 motifs was also discovered.The current study revealed TFIIE-associated genes and association of TFIIE with certain groups of cellular pathways.This might shed light on disease specific role of TFIIE and help future studies characterize TFIIE-related human diseases.\u003c/p\u003e","manuscriptTitle":"ChIP-seq analysis reveals genes regulated by TFIIE and\nassociation of TFIIE with various pathways","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-02-02 05:45:20","doi":"10.21203/rs.3.rs-8740923/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"77173005-5cdf-452e-bbb3-9e34341f8a86","owner":[],"postedDate":"February 2nd, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":62028713,"name":"Bioinformatics"}],"tags":[],"updatedAt":"2026-02-02T05:45:20+00:00","versionOfRecord":[],"versionCreatedAt":"2026-02-02 05:45:20","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8740923","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8740923","identity":"rs-8740923","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
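Section 2.1 states that the inter-sample correlation heat map was generated using the Pearson correlation coefficient. A minimal Python sketch of that calculation follows, assuming each sample is summarised as a vector of read counts over the same genomic bins. The function names and the numbers are hypothetical, chosen only for illustration; they are not the ENCODE data and not the author's own code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances, left unnormalised (the 1/n factors cancel).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(samples):
    """Return {sample: {sample: r}} for every pair of samples."""
    names = list(samples)
    return {a: {b: pearson(samples[a], samples[b]) for b in names} for a in names}


# Hypothetical binned read counts (made-up values, not taken from the paper).
coverage = {
    "RNAPII_1": [10, 52, 8, 40, 3],
    "RNAPII_2": [12, 49, 9, 43, 2],
    "TFIIE2_1": [30, 5, 28, 6, 25],
}

matrix = correlation_matrix(coverage)
for name, row in matrix.items():
    print(name, [round(row[other], 3) for other in coverage])
```

In this toy input the two RNAPII vectors rise and fall together and so correlate strongly, which is the pattern a heat map would show for well-matched replicates. In practice the vectors would come from genome-wide binned coverage rather than five values.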