Constrained conditional model for screening marker genes through integrated high-throughput transcriptome big data

Research Article

Xiaobei Zhou, Jing Wan, Na Lv, Ruiyang Ma, Wensu Liu

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-4858125/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1 posted.

Abstract

Currently, a large volume of high-throughput transcriptome data is accessible through various public databases. This creates an urgent need to integrate transcriptome data from numerous independent studies and to extract common characteristics that precisely address specific research questions. To this end, we designed a model framework that leverages integrated transcriptome data with declarative constraints. Here the constraints incorporate prior knowledge (genes in certain biological pathways, functions, or user-specified terms) into the model and drive the model's predictions and decisions to satisfy them. Unlike existing models and methods, this framework employs a tailed non-parametric bootstrapping algorithm to generate millions of samples of p-values from independent analyses, which avoids the usual procedure of integrating data with different distributions. The model was applied in 5 tumor and 5 non-tumor case studies using 81 downloaded transcriptome datasets with 10,647 samples. A high percentage of the selected genes were consistent with published work, co-occurrence results, and gene ontology results. Experimental validation, including transwell invasion and migration assays, confirmed that the identified genes are associated with cancer progression in the prostate, liver, and endometrial cancer case studies. Our model was therefore effective in extracting marker genes under user-specified conditions, and the results are significant for further understanding the specified situations.

Keywords: Constraints, marker genes, bootstrapping, transcriptome data, differential gene

Additional Declarations

No competing interests reported.
Supplementary Files

TableS1.xlsx, TableS2.xlsx, TableS3.xlsx, TableS4.xlsx, TableS5.xlsx, TableS6.xlsx, TableS7.xlsx, TableS8.xlsx, TableS9.xlsx, TableS10.xlsx, TableS11.xlsx, TableS12.xlsx, TableS13.xlsx, TableS14.xlsx, TableS15.xlsx, TableS16.xlsx, TableS17.xlsx, TableS18.xlsx, TableS19.xlsx, TableS20.xlsx, TableS21.xlsx, TableS22.xlsx, TableS23.xlsx, TableS24.xlsx
FigureS1.tif, FigureS2.tif, FigureS4.tif, FigureS5.tif, FigureS6.tif, FigureS7.tif, FigureS8.tif, FigureS9.tif, FigureS10.tif, FigureS11.tif, FigureS12.tif, FigureS13.tif
Author affiliations

Xiaobei Zhou, Jing Wan, Na Lv, Wensu Liu (corresponding author): China Medical University
Ruiyang Ma: The First Hospital of China Medical University

Posted: September 11th, 2024

Figure legends

Figure 1. Blueprint of the CCMSB model
(A) Schematic diagram illustrating the basic principles and ideas behind the CCMSB model.
(B) Flowchart showing the construction and validation procedure of the CCMSB model.

Figure 2. Applying the CCMSB model to identify genes promoting liver cancer, with experimental validation
(A) Pearson's correlation map of expression across the downloaded datasets, with single-cell RNA-seq datasets labeled in red.
(B) Survival information and reported status (with PMIDs shown) of the final filtered genes.
(C-D) Transwell invasion (C) and migration (D) experiments for si-HLTF against normal control (si-NC) in HUH7 cells, shown as bar plots and cell images with statistical significance.
(E-F) Transwell invasion (E) and migration (F) experiments for si-HLTF against normal control (si-NC) in HepG2 cells, shown as bar plots and cell images with statistical significance.

Figure 3. Applying the CCMSB model to identify genes promoting castration resistance in prostate cancer, with experimental validation
(A) Pearson's correlation map of expression across the downloaded datasets, with single-cell RNA-seq datasets labeled in red.
(B) Survival information and reported status (with PMIDs shown) of the final filtered genes.
(C) Transwell invasion experiment for si-ABCC4 against normal control (si-NC) in 22Rv1 cells, shown as bar plots and cell images with statistical significance.
(D) Transwell migration experiment for si-ABCC4 against normal control (si-NC) in 22Rv1 cells, shown as bar plots and cell images with statistical significance.

Figure 4. Applying the CCMSB model to identify genes promoting endometrial cancer and colon cancer
(A) Pearson's correlation map of expression across the downloaded datasets for endometrial cancer, with single-cell RNA-seq datasets labeled in red.
(B) Survival information and reported status (with PMIDs shown) of the final filtered genes for endometrial cancer.
(C) Transwell invasion and migration experiments for si-SKAP1 against normal control (si-NC) in KLE cells, shown as bar plots and cell images with statistical significance.
(D) Pearson's correlation map of expression across the downloaded datasets for colon cancer, with single-cell RNA-seq datasets labeled in red.
(E) Survival information and reported status (with PMIDs shown) of the final filtered genes for colon cancer.
(F) Co-occurrence counts of the filtered genes with important terms involved in colon cancer.

Figure 5. Applying the CCMSB model to identify genes promoting tamoxifen resistance in breast cancer and genes for Alzheimer's disease (AD) with metal chelators
(A) Pearson's correlation map of expression across the downloaded datasets for tamoxifen-resistant breast tumor.
(B) Survival information and reported status (with PMIDs shown) of the final filtered genes for tamoxifen resistance and breast tumor.
(C) Co-occurrence counts of the filtered genes with important terms involved in tamoxifen resistance in breast tumor.
(D) Pearson's correlation map of expression across the downloaded datasets for Alzheimer's disease.
(E) Reported status (with PMIDs shown) of the final filtered genes for Alzheimer's disease and metal chelators.
(F) Co-occurrence counts of the filtered genes with important terms involved in Alzheimer's disease and metal chelators.

Figure 6. Applying the CCMSB model to identify genes promoting immunity in aging and visceral fat in obesity
(A) Pearson's correlation map of expression across the downloaded datasets for aging.
(B) Reported status (with PMIDs shown) of the final filtered genes for aging and immunity.
(C) Co-occurrence counts of the filtered genes with important terms involved in aging and immunity. Here FDis denotes functional disability, ImmunoS denotes immunosenescence, neuroD denotes neurodegeneration, and OS denotes oxidative stress.
(D) Pearson's correlation map of expression across the downloaded datasets for obesity in visceral fat.
(E) Reported status (with PMIDs shown) of the final filtered genes for obesity in visceral fat.
(F) Co-occurrence counts of the filtered genes with important terms involved in obesity.

Figure 7. Applying the CCMSB model to identify genes promoting COVID-19
(A) Pearson's correlation map of expression across the downloaded datasets for COVID-19 in blood samples.
(B) Reported status (with PMIDs shown) of the final filtered genes from blood samples for COVID-19.
(C) Co-occurrence counts of the filtered genes from blood samples with important terms involved in COVID-19.
(D) Pearson's correlation map of expression across the downloaded datasets for COVID-19 in lung samples.
(E) Reported status (with PMIDs shown) of the final filtered genes from lung samples for COVID-19.
(F) Co-occurrence counts of the filtered genes from lung samples with important terms involved in COVID-19.
Distinguishing from existing models or methods, this framework implies tailed non-parametric bootstrapping algorithm to generate millions of samples of p-value of independent analyses to avoid the normal procedure of integrating data with different distribution styles. The model was applied in 5 tumor and 5 non-tumor case studies using 81 downloaded transcriptome datasets with 10,647 samples. High percentage of the selected genes were accordant with published work, co-occurrence results and gene ontology results. Experimental validations including transwell invasion/migration confirmed the identified genes associated with cancer progression in prostate, liver, and endometrial cancer case studies. Therefore our model was effective in extracting marker genes under user-specified conditions and were possessing significance in further understanding specified situations.\u003c/p\u003e","manuscriptTitle":"Constrained conditional model for screening marker genes through integrated high-throughput transcriptome big data","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-09-11 06:58:58","doi":"10.21203/rs.3.rs-4858125/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"7ae964cc-ce72-4d06-8ed1-fa4c04f56336","owner":[],"postedDate":"September 11th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-02-10T11:23:14+00:00","versionOfRecord":[],"versionCreatedAt":"2024-09-11 
06:58:58","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-4858125","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4858125","identity":"rs-4858125","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}


