{"paper_id":"1bc39262-2b41-4fda-ae44-c25a4ea3f4ff","body_text":"Deciphering the ghost proteome in ovarian cancer cells by deep proteogenomic characterization Cardon Tristan, Diego Garcia-del Rio, Mehdi Derhourhi, Amelie Bonnefond, and 6 more This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-3972487/v1 This work is licensed under a CC BY 4.0 License. Journal publication published 30 Sep, 2024 in Cell Death & Disease. Abstract Proteogenomics is becoming a powerful tool in personalized medicine by linking genomics, transcriptomics and mass spectrometry (MS)-based proteomics. 
Due to increasing evidence of alternative open reading frame-encoded proteins (AltProts), proteogenomics has a high potential to unravel the characteristics, variants and expression levels of the alternative proteome, in addition to already annotated proteins (RefProts). To obtain a broader view of the proteome of ovarian cancer cells compared to ovarian epithelial cells, cell-specific total RNA-sequencing profiles and customized protein databases were generated. In total, 128 RefProts and 30 AltProts were identified exclusively in SKOV-3 and PEO-4 cells. Among them, an AltProt variant of IP_715944, translated from DHX8, was found mutated (p.Leu44Pro). We show high variation in protein expression levels of RefProts and AltProts in different subcellular compartments. The presence of 117 RefProt and two AltProt variants was described, along with their possible implications in the different physiological/pathological characteristics. To identify the possible involvement of AltProts in cellular processes, crosslinking-MS (XL-MS) was performed in each cell line to identify AltProt-RefProt interactions. This approach revealed an interaction between POLD3 and the AltProt IP_183088, which, after molecular docking, was placed between the POLD3-POLD2 binding sites, highlighting its possible involvement in DNA replication and repair. INTRODUCTION Historically, protein sequence databases have only considered proteins to originate from the coding regions of mRNA molecules (CDS) ( 1 , 2 ). However, we now know that the sequences of many products of transcript translation are not stored in such databases ( 3 ). Such translated transcripts include small open reading frames (smORFs) ( 4 – 6 ), which translate to short encoding proteins (SEPs) ( 7 , 8 ) with a length of less than 100 amino acids. 
Additionally, alternative proteins (AltProts) ( 9 – 11 ) are translated from alternative ORFs (AltORFs) present in non-coding regions, including the 5' and 3'UTRs, overlapping a CDS in a +1 or +2 reading frame, or present in non-coding RNAs (ncRNAs). In contrast to SEPs, AltProts are not limited to a maximum length of 100 amino acids. Synthesis of such proteins may result from leaky scanning and reinitiation of ribosomes as described by Marilyn Kozak ( 12 , 13 ). However, such underlying mechanisms remain poorly understood and, importantly, they were not considered when the first protein databases were built, which explains the absence of a considerable number of protein sequences from the most widely used protein sequence databases such as Swiss-Prot. Nevertheless, an effort has been made to make such databases more comprehensive, notably by integrating predicted protein sequences (TrEMBL) ( 14 ), which increases the size of the (theoretical) proteome. Yet, the prediction rules used are restrictive and do not consider the concept of AltProts. To tackle this, databases holding predicted sequences for AltProts, such as OpenProt ( 9 , 15 ), have been created. With such databases, AltProts can now be identified from bottom-up proteomic datasets. However, although such databases consider the presence of the \"ghost proteome\", they consider neither mutations nor the transcriptomic expression of samples. To overcome these limitations, OpenCustomDB ( 16 ) is a new tool that uses RNA-seq data to generate sample-specific protein sequence databases incorporating AltProts and their genetic variants. Such a proteogenomic approach, coupled with AltProt research, is therefore expected to provide more comprehensive views of cellular proteomes. AltProts are ubiquitously expressed in cells and can carry out physiological functions ( 17 ). AltProts have been linked to several pathways such as protein synthesis ( 18 – 20 ), DNA repair ( 8 ) and innate immunity ( 17 ). 
AltProts have also been linked to pathologies ( 21 , 22 ) such as cancers (glioblastoma, breast, ovarian and colorectal cancer) ( 23 – 26 ) and amyotrophic lateral sclerosis (Alt-FUS) ( 27 ). Although their identification has been facilitated by specific enrichment and detection strategies ( 19 , 28 – 30 ), the functions of the vast majority of AltProts remain to be elucidated, although targeted approaches have shed light on the function of a few AltProts ( 20 , 29 , 31 – 33 ). Recently, we have demonstrated the effectiveness of a protein crosslinking strategy coupled to mass spectrometry (XL-MS) to annotate AltProt functions. XL-MS enabled us to identify interactions that are very close in space, from 5.3 Å ( 34 ) to 30 Å ( 35 ), and by identifying crosslinked peptides between AltProts and known proteins, it completed our understanding of the function of these new proteins. Ovarian cancer (OvCa) is considered a stealth killer due to its misdiagnosis and extended chemoresistance to treatment. In 2021, OvCa was the 8th most frequently diagnosed cancer and the 8th most common cause of cancer death in women ( 36 ). The high mortality rate of OvCa is related to its late detection. In the initial stages of the pathology, few unspecific symptoms are evident and diagnostic methods are not sufficiently effective ( 37 ). The current standard treatment is based on surgery or chemotherapy. For advanced stage tumours, debulking surgery and subsequent adjuvant chemotherapy are needed (carboplatin combined with paclitaxel is most commonly used). With this combination of treatments, up to 80% of patients will go into remission, but around 65% will relapse. Radical strategies such as oophorectomy and salpingectomy are recommended for avoiding recurrence ( 38 ). The Kyoto Encyclopedia of Genes and Genomes (KEGG) ( 39 ) summarizes the different metabolic pathways involved in cancer. 
Among these, the central carbon metabolism in cancer pathway (hsa05230) summarizes the metabolic changes that take place in cancer cells to facilitate their growth and survival ( 40 ). This pathway involves the conversion of glucose and glutamine into intermediate molecules, which are then used to synthesize the macromolecules needed to replicate the biomass and genome of cancer cells. The Warburg effect ( 41 ), a key feature of this pathway, describes the extensive glucose consumption, high rates of glycolysis, and conversion of a significant portion of glucose into lactic acid even in the presence of sufficient oxygen ( 42 ). More recently, it has been realized that the Warburg effect also encompasses an increased reliance on glutamine. Numerous oncogenes and tumour suppressor genes are clustered along the signalling pathways that regulate c-MYC, HIF-1 and p53 ( 40 ). We hypothesized that molecular characterization of OvCa at the proteomic level might help to improve patient care and treatment. In this context, studying AltProts may shed light on mechanisms that are not yet completely understood but have an impact on OvCa pathology. Therefore, we here describe a proteogenomic approach to characterize the ghost proteome of two OvCa cell lines and an immortalized epithelial ovarian cell line. This approach allowed us to identify differential expression of RefProts, novel isoforms, AltProts and their transcripts. Additionally, the subcellular location, characteristics and interactors of several AltProts were mapped. MATERIAL AND METHODS Cell culture Human PEO-4 ovarian cancer cells were cultured in Roswell Park Memorial Institute (RPMI) 1640 medium (Thermo Fisher Scientific), supplemented with 10% fetal bovine serum (Thermo Fisher Scientific), 2 mM L-glutamine (Thermo Fisher Scientific) and 100 U/mL penicillin-streptomycin (Thermo Fisher Scientific). 
Human SKOV-3 ovarian cancer cells were cultured in McCoy's 5A (modified) medium (Thermo Fisher Scientific), supplemented with 10% fetal bovine serum and 100 U/mL penicillin-streptomycin. Human immortalized ovarian epithelial cells SV-40 (T1074) were cultured in Prigrow I medium (Applied Biological Materials), supplemented with 10% fetal bovine serum and 100 U/mL penicillin-streptomycin. The three cell lines were grown in a humidified air incubator at 37°C under an atmosphere of 5% CO 2 . Aliquots of three million cells were harvested by trypsin-EDTA (0.05%, phenol red) (Thermo Fisher Scientific), centrifuged at 100 x g for 5 min at 20°C and washed three times with DPBS (Thermo Fisher Scientific). Cell line specific database creation Total RNA sequencing (RNA-Seq). RNA was extracted from four replicates of three million cells from each cell line employing the NucleoSpin RNA Mini kit for RNA purification (MACHEREY-NAGEL), following the vendor’s protocol. 1 µg of RNA was utilized for library preparation using RiboNaut rRNA Depletion Kit and Rapid Directional RNAseq Kit 2.0 (PerkinElmer). Nine cycles of PCR were performed during this preparation. Library sequencing was carried out using the NovaSeq6000 sequencing platform (Illumina; SP flow cell) following a 2x75 paired-end mode. Demultiplexing was performed using bcl2fastq v2.20.0.422. Subsequent fastq trimming utilized trimmomatic v0.39 with parameters MINLEN:35 and AVGQUAL:20. The mapping and counting steps were executed using RSEM v1.3.1 along with STAR v2.7.3a, referencing genome version hg38 and GTF from Gencode v39. Differential analysis was conducted through DESeq2 v1.24.0, employing R v3.6.3. Customized protein database generation with OpenCustomDB. 
RNA-Seq reads were aligned to the reference genome GRCh38.p12 using STAR version 2.7.3a with default parameters except for ‘--outSAMprimaryFlag AllBestScore, --outFilterMismatchNmax 5, --alignSJoverhangMin 10, --alignMatesGapMax 200000, --alignIntronMax 200000, --alignSJstitchMismatchNmax 5 -1 5 5, --bamRemoveDuplicatesType UniqueIdenticalNotMulti’. Transcript expression was quantified in transcripts per million (tpm) with Kallisto version 0.46.0 with default parameters. Variant calling files (VCF) were generated from BAM files with FreeBayes version 1.3.1 with the setting “--min-alternate-count” set to 5. SNPs and Indels with FreeBayes quality of less than 20 were filtered out with an internal Python script. Variations were inserted in the corresponding transcripts with the variant annotator OpenVar. Next, the transcripts quantified by Kallisto were arranged in descending order based on their expression level (top 100,000 transcripts). Subsequently, OpenProt-annotated proteins linked to these transcripts were incorporated into the customized database until 100,000 entries (100K DB) were reached, as described by Guilloy et al. ( 16 ). Upon adding a protein variant to the database, the corresponding reference protein without any variation was simultaneously included to account for potential heterozygosity. Chemical protein cross-linking and subcellular fractionation In cellulo chemical cross-linking. The cross-linking methodology was described in Garcia-del Rio et al. ( 17 , 30 ). To prepare a 50 mM stock solution of disuccinimidyl sulfoxide (DSSO, Thermo Fisher Scientific), dry DMSO (Sigma-Aldrich) was used. Three million cells of each cell line were resuspended in 200 µL of DPBS. The crosslinking reaction was carried out with 2 mM of DSSO (final concentration) at 37°C with end-over-end stirring. After one hour, the reaction was quenched by adding 10 µL of 500 mM Tris-HCl pH 8.5 and gently stirring for 30 min. Protein subcellular fractionation. 
The subcellular fractionation methodology was also described in our previous work ( 17 , 30 ). In brief, three replicates of three million cells that underwent crosslinking were pelleted and the supernatant was removed. The Subcellular Protein Fractionation Kit for Cultured Cells (Thermo Fisher Scientific) was used to isolate five distinct protein cell compartments: cytoplasmic, membrane, nuclear, chromatin-bound and cytoskeletal proteins. Each fraction was extracted following the manufacturer’s instructions and stored at -80°C until use. Filter Aided Sample Preparation (FASP) and digestion. Each subcellular fraction was transferred to a 50 kDa molecular weight cut-off Amicon filter (Merck) and concentrated by centrifugation (14,000 x g for 15 min at 4°C). Proteins were denatured by adding 100 µL of a denaturing buffer (8 M urea (Euromedex), 100 mM Tris-HCl (Interchim), pH 8.5). Reduction was performed by adding 100 µL of 100 mM dithiothreitol (VWR Life Science) in the denaturing buffer and incubating at 56°C for 40 min. Alkylation was then done by adding 100 µL of 50 mM iodoacetamide (Sigma-Aldrich) in the denaturing buffer at room temperature (RT) for 30 min in the dark. After alkylation, three washes with 200 µL of 50 mM ammonium bicarbonate buffer were performed. Sequential digestion was performed in each fraction by adding 40 µL of 40 ng/µL trypsin/Lys-C Mix, Mass Spec Grade (Promega) to the Amicon filter and incubating at 37°C overnight, followed by 25 µL of 40 ng/µL chymotrypsin, Sequencing Grade (Promega) at RT for 4 h. The resulting peptides were recovered by adding 50 µL of ammonium bicarbonate buffer and centrifuging for 15 min at 14,000 x g. Finally, this flowthrough was acidified with 0.1% TFA (Sigma-Aldrich) and vacuum dried. Nano LC-MS/MS analysis The peptides of each replicate were suspended in 20 µL of 0.1% TFA and desalted using a ZipTip with C18 resin (Merck), following the manufacturer's instructions. 
Afterwards, the samples were vacuum-dried and resuspended in 20 µL of a solution containing acetonitrile (ACN, Carlo Erba Reagents) and 0.1% formic acid (2:98 v/v, TCI America). Five microliters of the resulting peptide solution were analysed on a nanoAcquity (Waters) coupled to a Q Exactive mass spectrometer (Thermo Fisher Scientific), as described in ( 24 ). Label-free quantification (LFQ) data analysis Processing workflow. The raw data obtained by nanoLC-MS/MS analysis were analysed using Proteome Discoverer V2.5 (Thermo Fisher Scientific). For each subcellular compartment, a separate LFQ analysis was performed. Here, three processing steps (for each cell line’s replicates) were employed using the Minora Feature Detector and three iterative Sequest HT nodes (Fig. 1 A). The detailed parameters of the Sequest HT node are described in ( 30 ). In the first Sequest HT node, the top 100,000 sequences derived from RNA-seq experiments (100K DB) were utilized. Next, a Percolator node with a relaxed 0.05 FDR and strict 0.01 FDR was applied. A spectrum confidence filter was applied before moving on to the next Sequest HT node, discarding any spectra with a confidence rating worse than high. In the second Sequest HT node, the full transcript-derived database (Full DB) from OpenCustomDB was used, minus the sequences contained in the 100K DB. The same parameters were used for a second Percolator node and spectrum confidence filter. Finally, in the third Sequest HT node, OpenProt was used to interrogate the sequences not found in the two previous databases (Fig. 1 B). Consensus workflow. The MSF files of the five subcellular fractions were subjected to independent consensus workflows. At the feature mapper node, chromatographic alignment was performed with a maximum retention time shift of 10 min, a 10 ppm mass tolerance and coarse tuning. Unique and razor peptides were used at the precursor ions quantifier node. 
Protein groups were considered for peptide uniqueness and shared quant results were used. Precursor abundance was based on intensity without any threshold. Total peptide amount was used for normalization mode without scaling mode. All peptides were used for normalization and protein roll-up. Modified peptides (methionine oxidation, N-terminus acetylation and cysteine carbamidomethylation) were excluded for pairwise ratios. At the PSM grouper node, the site probability threshold was set to 75. The strict and relaxed FDRs were set at 0.01 and 0.05, respectively, at the peptide validator node. Validation was based on the q-value, and automatic target/decoy selection was used for PSM level FDR calculation based on score. At the peptides and protein filter node, the peptide confidence was set to medium with a minimum length of six amino acids per peptide. Additionally, a minimum of one peptide was set. A strict (0.01) and a relaxed (0.05) FDR confidence threshold were set at the protein FDR validator. The results were filtered for RefProts, AltProts and novel isoforms ( 9 ). Briefly, a RefProt is a protein matching an NCBI RefSeq, Ensembl or UniProt protein entry. A novel isoform is a protein encoded by the same gene as a RefProt with a significant level of identity (over 80% of protein sequence identity with the RefProt over 50% of the length). An AltProt does not have any significant similarity with a RefProt. Protein identification. The master protein files were uploaded as a text file to Perseus v.1.6.10.43. The abundance matrix was annotated into three categories based on the cell lines used: SKOV-3, PEO-4 and T1074. To count as an identification, a protein needed to be identified in 70% of the replicates from at least one cell line, and the groups were averaged. A numeric Venn diagram was used to identify the unique RefProts, AltProts and novel isoforms in each compartment for each cell line. Statistical analysis workflow. 
The master protein files were uploaded as a text file to Perseus v.1.6.10.43. As a first step, log2 transformation and categorical annotation were performed on the normalized abundance values matrix, with cell lines SKOV-3, PEO-4 and T1074. To consider a valid identification, proteins needed to be identified in 70% of the replicates from each cell line. Moreover, missing values were replaced with low values of the normal distribution. An ANOVA multiple sample test was performed using a Benjamini-Hochberg FDR q-value cutoff of 0.05. Non-significant values were filtered out, and a Z-score processing was applied without grouping. To ensure quality control, a principal component analysis (PCA) was conducted with a Benjamini-Hochberg FDR cutoff of 0.05. Finally, hierarchical clustering employing Pearson correlation was applied to the averaged Z-scores to identify the different protein clusters. Crosslinking data analysis Processing workflow. The RAW data obtained by nano LC-MS/MS analysis were analysed using Proteome Discoverer V2.5 (Thermo Fisher Scientific). The detailed parameters for the Sequest HT and XlinkX nodes are described in ( 24 ). The triple Sequest HT nodes mentioned earlier were utilized. Instead of a percolator, a target decoy PSM validator was used after each Sequest HT node. A concatenated target decoy strategy was employed, with strict (0.01) and relaxed (0.05) FDR targets. Consensus workflow. The resulting crosslinking MSF files were submitted to a consensus workflow of which the parameters are described in detail in ( 24 ). RESULTS For this study, we selected three cell line models to investigate differences in the reference proteome, novel isoforms and the alternative proteome. Two of these cell lines (PEO-4 and SKOV-3 cells) are derived from ascitic fluid from ovarian adenocarcinomas. 
In particular, PEO-4 cells have a high-grade serous histology and were collected after clinical resistance from a patient who previously received cisplatin, 5-fluorouracil and chlorambucil treatment ( 43 ). PEO-4 cells have been xenografted into immunodeficient mice and found to be tumorigenic ( 44 ). SKOV-3 cells are clear cell carcinoma cells and resistant to tumour necrosis factor, diphtheria toxin, cisplatin and adriamycin ( 45 ). According to Hernandez et al. ( 46 ) and Hallas-Potts et al. ( 47 ), PEO-4 cells have a lower tumorigenicity than SKOV-3 cells when injected in nude mice. The T1074 ovarian epithelial cell line was originally derived from normal human ovarian surface epithelial cells and immortalized by SV40 virus. Differential gene expression analysis In order to generate custom databases using OpenCustomDB, RNA-Seq data are required. From these reads, the assessment of differential gene expression can be performed. Mapping the RNA-Seq reads to the genome using RSEM and STAR enabled the identification of 117,636 transcripts expressed in 70% of four replicates between cell lines. Of these, 96,442 transcripts were shared by the three cell lines. Additionally, 1567, 2391, and 1780 transcripts were only identified in T1074, PEO-4 and SKOV-3 cells respectively (Fig. 2 A). Total RNA-seq data analysis showed that 37,197 transcripts were differentially expressed (DESeq2, FDR < 0.05). Hierarchical clustering (Fig. 2 B and Supplemental Table 1) indicated six main transcript clusters: upregulation in PEO-4 (cluster 1, 3117 transcripts), in SKOV-3 (cluster 2, 3220 transcripts), or in both PEO-4 and SKOV-3 (cluster 3, 1138 transcripts); and downregulation in SKOV-3 (cluster 4), in PEO-4 (cluster 5), and in both cancerous cells (cluster 6, 12,129 transcripts). Mapping RNA-Seq reads to the human genome hg38 allowed us to find 29,245 expressed genes among the three cell lines. 
Among these expressed genes, 420, 407 and 540 were identified to be specific for T1074, SKOV-3 and PEO-4 cells respectively (Fig. 3 A). Figure 3 B displays the different categories of annotated genes: the majority were annotated as non-coding (pseudogenes and lncRNAs, 60.9%), while approximately 37% were annotated as coding genes. Hierarchical clustering was performed on the expression values obtained from the DESeq2 workflow. A total of 17,368 genes were identified as significantly differentially expressed between the three cell lines (Fig. 3 C and Supplemental Table 2), and of these, 2142 and 1949 genes were upregulated in PEO-4 and SKOV-3 cells respectively. On the other hand, 3345 and 2692 genes were downregulated in PEO-4 and SKOV-3 cells respectively. Between the two cancerous cell lines, 632 genes were identified as upregulated and 6608 as downregulated. RNA-Seq based databases We used RNA-Seq data from the ovarian epithelial cell line (T1074) and the OvCa cell lines (PEO-4 and SKOV-3) to generate two cell-specific protein databases for each cell line. Figure 4 summarizes the protein types of the sequences stored in these databases. The distribution is similar for the three cell lines used, and the custom 100K DB contained around 15% of wild-type (WT) RefProts, 2% of variant RefProts, 5% of WT novel isoforms, less than 1% of variant novel isoforms, 73% of WT AltProts and 5% of variant AltProts (Fig. 4 ). The OpenCustomDB workflow was also used to generate comprehensive transcript databases (Full DB) without limiting the maximum number of entries to 100,000. These databases included 448,569, 443,177 and 437,568 entries for T1074, PEO-4 and SKOV-3 cells, respectively. For example, for T1074 cells, 68,759 WT RefProts (15.3%), 5366 variant RefProts (1.2%), 43,609 WT novel isoforms (9.7%), 2529 variant novel isoforms (0.6%), 319,612 WT AltProts (71.3%) and 8694 variant AltProts (1.9%) were stored in the database. 
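The composition of the T1074 Full DB can be checked with a few lines of arithmetic. The sketch below only re-uses the six entry counts and the database size reported in the text; the dictionary keys are illustrative labels, not identifiers from OpenCustomDB.

```python
# Entry counts for the T1074 Full DB, as reported in the text.
counts = {
    'WT RefProts': 68759,
    'variant RefProts': 5366,
    'WT novel isoforms': 43609,
    'variant novel isoforms': 2529,
    'WT AltProts': 319612,
    'variant AltProts': 8694,
}

# The six categories should add up to the reported database size.
total = sum(counts.values())

# Share of each category, in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in shares.items():
    print(f'{name}: {pct}%')
```

The six categories add up to the 448,569 entries given for T1074 cells, and the rounded shares agree with the percentages quoted in the text.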
Similar ratios were observed for PEO-4 and SKOV-3 cells (Fig. 4 ). We mapped the transcriptomic origin of the predicted AltProts by extracting information from OpenProt (Fig. 5 ). AltORFs overlapping a CDS in a shifted reading frame, or located in 3’UTRs and ncRNAs, were found to be the main sources of predicted AltProts. Additionally, a comparison was performed between the databases across the three cell lines (see Supplemental Fig. 1). In total, 282,287 AltProts were found to overlap across the three cell lines, and 15,109, 11,026 and 8897 unique AltProts were predicted in T1074, PEO-4 and SKOV-3 cells, respectively. Among the cancerous cell lines, 8055 AltProts were found to overlap. Approximately 39,000 sequences of novel isoforms were predicted to be shared across the three cell lines, with specific novel isoforms also identified in each cell line and in both cancerous cells. Almost 60,000 RefProts were found to overlap across all cell lines, with approximately 6000 being specific for each cell line. The same analysis was performed on the 100K DB, with 52,483 AltProts, 3116 novel isoforms and 10,346 RefProts being predicted to overlap across all three cell lines. A main advantage of these databases is that they contain predicted AltProt variants specific to each sample; for instance, 4321 specific AltProt variants were predicted for PEO-4 cells, 4355 for SKOV-3 cells and 3540 for T1074 cells. This also shows that both cancerous cell lines have an increased number of transcript variants, which may be translated into mutated AltProts. Proteome analysis of subcellular compartments To evaluate the deeper differences in the proteome of these three cell lines, the MS/MS data sets obtained from analysing each subcellular proteome were analysed using Proteome Discoverer V2.5. 
Three different child processing workflows that contained three sequential Sequest HT ( 48 ) nodes were used with the databases as described in the material and methods section. We considered a protein as identified when it was present in at least one subcellular compartment in 70% of the replicates of at least one cell line. Figure 6 A displays the distributions of all identified proteins. In total, 6301 RefProts were identified in T1074 cells, 6268 in PEO-4 cells and 6319 in SKOV-3 cells. Among the identified RefProts, 234 (T1074 cells), 224 (PEO-4 cells) and 233 (SKOV-3 cells) were variants of RefProts. In addition, 137 novel isoforms were identified in T1074 cells, and 136 in PEO-4 and SKOV-3 cells. A total of 8 variants of novel isoforms were annotated in T1074 cells, and 9 in SKOV-3 and PEO-4 cells. Finally, over 500 AltProts were identified in each cell line, with similar numbers identified in SKOV-3 (577), T1074 (556) and PEO-4 cells (549). The number of AltProt variants identified was 12 for PEO-4 cells, and 13 for T1074 and SKOV-3 cells. Additionally, the distribution of WT and variant proteins is shown in Fig. 6 B. Subcellular fractionation was used to link identified AltProts to one or more cellular compartments (Fig. 7 A). The membrane-bound fraction of all three cell lines contained the highest number of identified AltProts. In Figs. 7 B and C, some general descriptions of the identified AltProts are displayed. Here, the majority of the AltProts identified possess a 3’UTR origin. Additionally, the vast majority (80.9%) have a molecular weight of less than 10 kDa. In addition, we identified cell line-specific RefProts, novel isoforms and AltProts. In T1074 cells, nine specific AltProts were identified, including the variant AltProt IP_290059@Asp99fs, which was found in the cytoskeletal fraction. SKOV-3 cells also had nine cell-specific AltProts, but without any variants, and PEO-4 cells had two specific wild-type AltProts identified. 
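The identification criterion used in this section (present in 70% of the replicates of at least one cell line) can be sketched as follows. This is a minimal illustration and not the Perseus implementation: the function names and the example detection pattern are hypothetical, and it assumes that the 70% threshold is applied to the fraction of replicates in which a protein was detected.

```python
def passes(flags, fraction=0.7):
    # flags holds one boolean per replicate: True if the protein was detected.
    return len(flags) > 0 and sum(flags) / len(flags) >= fraction


def is_identified(detected, fraction=0.7):
    # detected maps a cell line name to its per-replicate detection flags.
    # One cell line reaching the threshold is enough.
    return any(passes(flags, fraction) for flags in detected.values())


# Hypothetical detection pattern for a single protein in one compartment.
example = {
    'T1074': [True, True, False],
    'PEO-4': [True, True, True],
    'SKOV-3': [False, False, False],
}
print(is_identified(example))
```

Under this assumption, three replicates require detection in all three (two of three is about 67%), whereas four replicates require detection in at least three.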
The characteristics of the cell line-specific AltProts are described in Supplemental Table 3. Overall, 508 AltProts were shared by all three cell lines, including 11 variants. Among the identifications, 30 AltProts were identified in both cancerous cell lines. The variant IP_715944@Leu44Pro was identified in the cytoskeletal fraction of both cell lines. The WT AltProt IP_715944 is a 4.82 kDa protein composed of 47 amino acids. It is encoded in the DHX8 gene. The variant of this AltProt is the result of a base substitution (c.131T > C) observed in the transcript ENST00000587574, which changed the leucine at position 44 to a proline. To verify the impact of the mutation, the sequence was analysed using protein BLAST ( 49 ), InterProScan ( 50 ) and Phobius ( 51 ). No significant similarity and no change in the predicted domains were identified. Next, we performed a label-free quantitative analysis on the subcellular proteomes (n = 4), which led to the identification of 1,022 RefProts with significantly altered levels (ANOVA, q-value < 0.05) in the cytoplasmic fraction, 995 in the membrane-bound fraction, 561 in the nuclear fraction, 159 in the chromatin fraction and 590 in the cytoskeletal fraction. The RNA-Seq derived databases allowed us to identify and quantify variant proteins, and 88 RefProt variants were found at significantly different levels in the three cell lines. Of these variants, 39 were found in the cytoplasm, 39 in membrane-bound structures, 15 in the nucleus, 6 in the chromatin fraction and 23 in the cytoskeleton. Note that 22 of the 88 RefProt variants were found in more than one cellular fraction. Hierarchical clustering (Supplemental Fig. 2A and Supplemental Table 4) pointed to six main groups of proteins: up-regulation in ( 1 ) PEO-4 cells, ( 2 ) SKOV-3 cells, and ( 3 ) both cancerous cells; and down-regulation in ( 4 ) SKOV-3 cells, ( 5 ) PEO-4 cells, and ( 6 ) both cancerous cells. 
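As a consistency check on the IP_715944 variant described above, the nucleotide change c.131T > C can be related to the protein-level notation p.Leu44Pro. The sketch below assumes that position 131 is counted from the first base of the AltORF start codon; the actual codon is not given in the text, so the code lists every leucine codon for which a T to C change at that codon position yields proline. The helper names are illustrative.

```python
def codon_position(cdna_pos):
    # Map a 1-based coding position to (codon number, position within codon),
    # both 1-based.
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def substitute(codon, pos_in_codon, new_base):
    # Return the codon with the base at the given 1-based position replaced.
    i = pos_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


# Standard genetic code: the six leucine codons and the four proline codons.
LEU_CODONS = ['TTA', 'TTG', 'CTT', 'CTC', 'CTA', 'CTG']
PRO_CODONS = {'CCT', 'CCC', 'CCA', 'CCG'}

codon_number, pos_in_codon = codon_position(131)

# Leucine codons that become a proline codon after T > C at that position.
leu_to_pro = [
    codon for codon in LEU_CODONS
    if substitute(codon, pos_in_codon, 'C') in PRO_CODONS
]

print(codon_number, pos_in_codon, leu_to_pro)
```

Position 131 falls on the second base of codon 44, and a T to C change there converts any CTN leucine codon into a CCN proline codon, which agrees with a leucine-to-proline substitution at position 44.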
Table 1 displays the number of significantly deregulated WT and RefProt variants quantified in the three cell lines.
Table 1. Wild-type and variant RefProts significantly varied (ANOVA, q-value < 0.05). The number of WT and variant RefProts is displayed for the six main clusters identified upon LFQ proteomics.
Cluster | WT RefProts | RefProt variants
Upregulated, PEO-4 cells | 482 | 10
Upregulated, SKOV-3 cells | 383 | 6
Upregulated, both cancerous cells | 666 | 29
Downregulated, PEO-4 cells | 195 | 4
Downregulated, SKOV-3 cells | 328 | 16
Downregulated, both cancerous cells | 1154 | 54
An identical hierarchical clustering was performed on novel isoforms, resulting in the identification of 53 wild-type novel isoforms and three novel isoform variants that were significantly varied (ANOVA, q-value < 0.05) between the three cell lines (Supplemental Fig. 2B and Supplemental Table 5). One of these novel isoform variants, II_587587@Asn359Asp, was found upregulated in both cancerous cell lines in the cytoplasm and membrane-bound fractions. This protein is a novel isoform expressed from the PMPCB gene. A second variant, II_702738@Ala184Thr[Leu79LeuAsn72Asn], was found to be downregulated in SKOV-3 cells in the nuclear fraction. This novel isoform is encoded by the WDR18 gene and possesses a substitution in position 184 and three silent mutations. II_597059@Glu65GlnAsn139AspAla57ValLys122ArgIle6ValGlu80Lys[Val118Val] was identified as upregulated in SKOV-3 cells in the chromatin-bound fraction. This protein is a novel isoform of HLA-H, which possesses seven mutations, one of which is a silent mutation. The same workflow was used to compare the AltProt profiles between the three studied cell lines. In total, 73 AltProts were found at significantly altered levels and 41 of these were upregulated in the ovarian cancer cells, with 12 upregulated only in PEO-4 cells, nine in SKOV-3 cells, and 20 in both. 
Four AltProts were found to be downregulated only in PEO-4 cells or only in SKOV-3 cells, while 36 AltProts were downregulated in both cell lines (Supplemental Tables 6 and 7). Figure 8 shows the distribution of the significantly altered AltProts over the five different subcellular fractions. We found 11 AltProts to be significantly regulated in more than one compartment. IP_067626, IP_070304, IP_108778, IP_147518, IP_178464, IP_213023, IP_246003 and IP_282949 were downregulated in both cancerous cells. Interestingly, IP_582685 (translated from a ncRNA transcript of the pseudogene GDI2P1) was found upregulated in the membrane-bound fraction of both cancerous cells. Moreover, it was also found upregulated in the cytoplasmic and nuclear fractions of SKOV-3 cells. IP_062385 (translated from the 3’UTR of the transcript ENST00000457946.1, encoded by the ZMYM4 gene) was found upregulated in the cytoplasmic fractions of both cancerous cells, while it was downregulated in the cytoskeletal fraction of these cells. A similar observation was made for IP_774693 (translated from an ncRNA of TUBAP2): this AltProt was upregulated in the membrane-bound fractions of the cancerous cells yet downregulated in their cytoplasmic fractions. Note that only two AltProt variants were found at significantly different levels. IP_174777 is a 53-amino acid AltProt encoded by the 3’UTR of the TMEM245 gene. During the creation of our databases, a single base substitution (c.23A>G) in transcript ENST00000374586 led to the prediction of the variant IP_174777@Asn8Ser. This mutant AltProt was identified as significantly downregulated in both cancerous cells compared to the epithelial ovarian cell line. The second AltProt variant identified as downregulated in the cancerous ovarian cell lines was IP_304294@Leu32fs. The WT AltProt, IP_304294, is a 57-amino acid protein encoded by the MTMR1 gene and is translated from the 3’UTR of the transcript ENST00000445323.
A guanine deletion at position 93 results in a reading frame shift at leucine 32. This shortens the protein to 44 amino acids and substitutes the last 13 amino acids. For both proteins, a cytoplasmic domain was predicted by Phobius, and this prediction remained unchanged after the frame shift.

Proteome and transcriptome functional annotation

To integrate and interpret the data obtained from the differentially expressed reference proteome and transcriptome, we used the Database for Annotation, Visualization and Integrated Discovery (DAVID) (52). This online tool allows users to perform GO term enrichment, cluster redundant enriched terms, visualize enriched pathway maps and extract gene functionality and literature. Submitting the RefProts identified as upregulated in cancerous cells to DAVID showed that two major cancer-related KEGG pathways (39) were significantly enriched: central carbon metabolism in cancer (hsa05230; p-value: 1.90E-04) and chemical carcinogenesis - reactive oxygen species (hsa05208; p-value: 5.26E-06). The KEGG pathway proteoglycans in cancer (hsa05205; p-value: 0.026) was significantly enriched among the RefProts downregulated in cancerous cells. Regulated protein clusters in SKOV-3 cells were found significantly enriched for the central carbon metabolism in cancer pathway (p-value: 7.3E-5). In contrast, no significant enrichment was identified in PEO-4 cells. Based on this difference, we mapped the protein and transcript expression profiles onto an adapted central carbon metabolism in cancer pathway map (Fig. 9). The complete list of genes and proteins enriched for this pathway can be found in Supplemental Table 8. The NRAS protein, part of the RAS/RAF/MEK/ERK/c-Myc pathway, was significantly upregulated in SKOV-3 cells (ANOVA q-value: 0.017). On the other hand, its transcript levels were significantly downregulated in PEO-4 cells (ANOVA q-value: 0.0004).
Moreover, for the other two members of the RAS oncogene family, no significant variation was found at the proteome level, whereas at the transcript level HRAS was downregulated in PEO-4 cells (ANOVA q-value: 3.7E-6) and KRAS was upregulated in SKOV-3 cells (ANOVA q-value: 5.58E-5). Other differences were observed for the MEK kinases MAP2K1 and MAP2K2; for instance, MAP2K2 was significantly downregulated in the membrane-bound fraction of both cancerous cells (ANOVA q-value: 0.005) and in the PEO-4 cytoskeletal fraction (ANOVA q-value: 0.028). MAP2K1 was found downregulated in PEO-4 cells (ANOVA q-value: 2.29E-6), while its transcript level was found upregulated in SKOV-3 cells (ANOVA q-value: 1.49E-5). In another part of the central carbon metabolism in cancer pathway, SIRT6 and SIRT3 are considered cancer-associated genes (53–55). Downregulation of SIRT6 has been found to increase ovarian cancer cell growth (55). The transcript levels of SIRT6, a tumour suppressor gene, were found downregulated in PEO-4 cells (ANOVA q-value: 4.65E-6), while the transcript levels of c-Myc, an oncogene, were upregulated in these cells (ANOVA q-value: 3.88E-5). Protein levels of SIRT3, another tumour suppressor gene, were upregulated in both cancerous cells (ANOVA q-value: 0.005), while its transcript levels were found downregulated in PEO-4 cells (ANOVA q-value: 0.0001). The expression of the oncogenic PI3K family was also found significantly regulated among the three cell lines. PIK3R1 was upregulated in the cytoplasmic fraction of both cancerous cells (ANOVA q-value: 0.037), while its transcript was only upregulated in SKOV-3 cells (ANOVA q-value: 2.31E-5). Additionally, the transcripts of PIK3CB (ANOVA q-value: 0.0001) and PIK3R2 (ANOVA q-value: 0.005) were also only upregulated in these cells. In contrast, the PIK3CA (ANOVA q-value: 0.001) and PIK3CD (ANOVA q-value: 0.0001) transcripts were found downregulated in both cancerous cells.
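The q-values quoted throughout this section are p-values adjusted for multiple testing. The text does not state which adjustment was applied, so the sketch below assumes the Benjamini-Hochberg procedure purely for illustration; the function name and the example p-values are hypothetical.

```python
# Minimal sketch of Benjamini-Hochberg adjusted p-values (an assumed procedure;
# the source does not specify how its q-values were computed).

def bh_adjust(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


p_vals = [0.001, 0.008, 0.039, 0.041, 0.20]  # hypothetical ANOVA p-values
q_vals = bh_adjust(p_vals)
significant = [q < 0.05 for q in q_vals]     # same 0.05 threshold as in the text
print(q_vals, significant)
```

In this toy input, two raw p-values below 0.05 no longer pass the threshold after adjustment, which is why the number of significant proteins depends on how many tests are run in each fraction.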
Other oncogenes in the central carbon metabolism in cancer pathway are members of the AKT family. AKT1 protein (ANOVA q-value: 0.0002) and transcript levels (ANOVA q-value: 2E-5) were downregulated in PEO-4 cells. For AKT2 and AKT3, no significant variation in protein expression was found, while their transcript levels were significantly downregulated in both cancerous cells (ANOVA q-value: 0.02 and 3.6E-6, respectively). With our proteogenomic workflow, we could identify a variant form of p53 (ENSP00000269305.8: p.Pro72Arg), an amino acid substitution that stems from the c.215C>G variant in TP53. This p53 variant was significantly downregulated in the cytoplasmic (ANOVA q-value: 0.0036) and cytoskeletal (ANOVA q-value: 0.0096) fractions of both cancerous cells, while its transcript levels were only significantly downregulated in SKOV-3 cells (ANOVA p-value: 1.17E-10). Three other RefProt variants were identified in this pathway. ENSP00000359991.5: p.Thr238Met, a mutant of PGAM1, was downregulated in both cancerous cells (ANOVA q-value: 0.0013), while two mutants of HKDC1 were upregulated in both cancerous cells: ENSP00000346643.5: p.Thr124Ile, p.Asn917Lys, p.Arg827Trp, p.Trp721Arg, [p.Phe601Phe] (ANOVA q-value: 0.008) and ENSP00000346643.5: p.Thr124Ile, p.Asn917Lys, p.Trp721Arg, [p.Phe601Phe] (ANOVA q-value: 0.023).

Crosslinking network analysis

The computational analysis of the crosslinked samples was carried out as described in (30), which allowed us to generate a protein interaction map in Cytoscape (56) (Supplemental Fig. 3). A total of 90 crosslinks were identified (Supplemental Table 9), of which 20 were intra-protein crosslinks; these give no interactome information but might be useful for structural studies. In this protein network (Supplemental Fig. 3), 28 protein-protein interactions (PPIs) were found in PEO-4 cells (marked in purple), 27 in SKOV-3 cells (marked in blue) and 35 in T1074 cells (marked in green).
From these pairs, 12 crosslink interactions were identified in at least two cell lines. Among all the crosslinked pairs, 20 involved AltProts, four were AltProt-AltProt interactions, and 13 were AltProt-RefProt crosslinks. The latter were considered most important for our study as they provide hints to an AltProt’s physiological or pathological involvement. To attribute functions to an AltProt from this set of PPIs, we retrieved the known interactions from the STRING (57, 58), BioGrid (59) and IntAct (60) databases and included the identified crosslinked interactions (Supplemental Fig. 4). Additionally, for the RefProts that did not present a referenced STRING interaction within the crosslinked network, three STRING interactors were added to expand the network. We observed that seven PPIs had already been described (pink lines): B2M-HLA-B, B2M-HLA-A, ITGA5-ITGA1, TUBA1C-TUBB, HIST3H2A-HIST2H3D, PRC1-ORC1 and VP39-VPS13C. Using this network, a molecular function GO term and KEGG pathway enrichment analysis was performed with the ClueGO app (61) for Cytoscape. The interactions between AltProts and RefProts were displayed along with the enriched GO terms (Fig. 10). Four direct AltProt-RefProt-GO-term interactions were detected. The AltProt IP_192190 was crosslinked to KIF13A in PEO-4 cells and linked to the vesicle-mediated transport to the plasma membrane (GO:0098876), Golgi to plasma membrane protein transport (GO:0043001), protein localization to plasma membrane (GO:0072659) and post-Golgi vesicle-mediated transport (GO:0006892). The AltProt IP_136846 was identified as crosslinked to LGALS1 in T1074 cells, which is linked to the GO terms viral entry into host cell (GO:0046718) and biological process involved in interaction with host (GO:0051701).
Similarly, IP_235241, crosslinked to ITGA5 in T1074 cells, was linked to the phagosome KEGG pathway (KEGG:04145) and the GO terms virus receptor activity (GO:0001618), biological process involved in interaction with host (GO:0051701) and viral entry into host cell (GO:0046718). Finally, IP_183088 was crosslinked to POLD3 in T1074 and PEO-4 cells. POLD3 is part of the DNA polymerase involved in the replication and repair of DNA and is linked to the UV-damage excision repair (GO:0070914) and response to UV (GO:0009411) GO terms. Three indirect AltProt-GO-term/KEGG pathway links were identified. IP_292259 was crosslinked to TMEM260 in T1074 cells, and TMEM260 possesses a STRING interaction with TOGARAM, which is linked to the non-membrane-bounded organelle assembly (GO:0140694), spindle assembly (GO:0051225) and microtubule cytoskeleton organization involved in mitosis (GO:1902850). Additionally, TMEM260 interacts with GOLGA7, which is linked to GO terms related to the vesicle-mediated transport to the plasma membrane. In addition, two AltProts were also identified to be related to these GO terms: IP_105326 and IP_118499. The former was crosslinked to VIM in SKOV-3 cells, and VIM was crosslinked to MACF1, which is linked to vesicle-mediated transport GO terms. IP_118499 was found crosslinked to CNNM3 in SKOV-3 cells; CNNM3 possesses a STRING interaction with CCNL2, which was crosslinked to VPS13C, which in turn is linked to vesicle-mediated transport GO terms. To assess the plausibility of the observed interactions, we analysed 3D models of RefProt-AltProt pairs using unguided interaction docking between the two partners (as described in (30)). The structures of the AltProts were predicted using I-TASSER (62), and the interactions were predicted by docking with ClusPro (63). The RefProt, whose structure was predicted by AlphaFold (64), was used as the receptor for the AltProt, which was the smaller structure.
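A distance check of the kind commonly applied to docked models in XL-MS studies can be sketched as follows. The coordinates are invented for illustration, the 30 Å cut-off is the upper bound for DSSO cited in this section (35), and measuring between Cα atoms is an assumption rather than a detail given in the text.

```python
import math

MAX_DSSO_DISTANCE = 30.0  # Å, upper bound cited for DSSO in the text


def crosslink_distances(pairs):
    """Euclidean distance in Å for each pair of 3D coordinates."""
    return [math.dist(a, b) for a, b in pairs]


# Hypothetical C-alpha coordinates (Å) of crosslinked residues in a docked model,
# given as (RefProt residue, AltProt residue).
pairs = [
    ((0.0, 0.0, 0.0), (12.0, 16.0, 0.0)),
    ((5.0, 5.0, 5.0), (5.0, 29.0, 12.0)),
    ((10.0, 0.0, 3.0), (10.0, 0.0, 38.0)),
]

distances = crosslink_distances(pairs)
satisfied = [d <= MAX_DSSO_DISTANCE for d in distances]  # within crosslinker reach?
mean_distance = sum(distances) / len(distances)
print(distances, satisfied, round(mean_distance, 3))
```

A docked pose in which most crosslinks fall within the cut-off is compatible with the XL-MS data; a pose with many violations would argue against that orientation of the two partners.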
By measuring the crosslink distances in the predicted interaction models, we obtained a mean of 23.467 Å (Supplemental Fig. 5), which supports the interactions observed by XL-MS and is consistent with the distances described in the literature for DSSO, ranging from 5.3 (34) to 30 Å (35).

DISCUSSION

Proteogenomics establishes a direct connection between the genome blueprint and the constructed proteome. We utilized this approach to explore potential implications of AltProts in ovarian cancer. We selected the PEO-4 cell line possessing a high-grade serous histology, the SKOV-3 clear cell carcinoma cell line, and the T1074 ovarian epithelial cell line, originally derived from normal human ovarian surface epithelial cells, serving as a non-tumorous control.

The transcriptome as a source of information for the proteomic perspective

The transcriptomic analysis, employing DESeq2 to analyse the RNA-seq data, enabled us to identify clusters of regulated genes in the cancer cell models. Each cell line showed about 500 uniquely expressed genes. Among the 540 genes uniquely expressed in PEO-4 cells, proto-oncogenes SSX1, SSX2 and SSX2B were found, along with an additional 24 genes related to cancer according to the Gene-Disease Associations Dataset (GAD) (65). Among the 406 genes uniquely expressed in SKOV-3 cells, 23 were related to cancer according to GAD. While transcriptomic analysis provided cell specificity information, the strength of this approach lies in the custom creation of cell-specific databases using OpenCustomDB. These databases contain a larger number of AltProt variants due to a high number of predicted AltProts. The ratio of variant RefProts to WT RefProts was greater than the ratio of variant AltProts to WT AltProts, which can be attributed to differences in sequence length. Longer sequences offer more positions at which a mutation or replication error can occur and are therefore more likely to carry at least one variant.
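The length argument can be made concrete with a toy calculation. Assuming, purely for illustration, that every residue has the same small, independent probability p of carrying a variant, the chance that a protein of length L carries at least one variant is 1 - (1 - p)^L. The value of p and the 500-residue RefProt length below are hypothetical; 47 residues is the length of the AltProt IP_715944 reported in the results.

```python
def prob_at_least_one_variant(length, per_residue_rate):
    """P(at least one variant) under independent, uniform per-residue rates."""
    return 1.0 - (1.0 - per_residue_rate) ** length


p = 0.001        # hypothetical per-residue variant probability
alt_len = 47     # length of AltProt IP_715944 (from the text)
ref_len = 500    # hypothetical RefProt length

alt_prob = prob_at_least_one_variant(alt_len, p)
ref_prob = prob_at_least_one_variant(ref_len, p)
print(round(alt_prob, 3), round(ref_prob, 3))
```

Under this simplified model the longer protein is several times more likely to appear in the database as a variant, even though the per-residue rate is identical for both.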
Additionally, predicted AltProts mostly originate from ncRNAs, but mRNA CDS frame shifts and 3’UTRs also contributed significantly to the top 100,000 most abundant transcripts. This suggests a greater potential for ncRNAs to code for AltProts, although there is a larger abundance of mRNAs capable of coding for AltProts. The proteogenomic approach of constructing a custom database, combined with reading frame prediction for AltProt generation, presents analytical challenges. However, our iterative triple SEQUEST HT processing workflow, using the 100,000-abundance cut-off database in the first node, overcomes the FDR limitations of searching a 400,000-sequence database (full database), which may increase the number of false-positive and false-negative identifications (16). To avoid losing possible identifications, such iterative workflows provide a stepwise increase in possible protein identifications by expanding the search space, until the last step with OpenProtDB, where proteins translated from ncRNAs not detected by RNA-Seq can be recovered. Finally, we removed false-positive identifications using Percolator, a semi-supervised machine learning algorithm (66). Percolator effectively estimates the statistical significance of peptide-spectrum matches and assigns confidence scores to identified peptides in a fast and accurate way. It enhances the rate of confident peptide identifications from a collection of tandem mass spectra (67).

A larger view on the proteomic landscape

Subcellular fractionation is a validated approach to decrease sample complexity and to maximize resolution in LC-MS/MS analysis. In our previous works (17, 30), such subcellular fractionation proved beneficial for XL-MS workflows and provided better coverage of the proteome compared to analysing whole cell lysates (68). This enhanced the detection of low-abundance proteins (AltProts and crosslinked proteins).
Furthermore, subcellular fractionation helps to determine the subcellular localization of AltProts and to monitor changes under different cellular conditions (69). For instance, IP_062385 was found to be located in the cytoplasm and upregulated in cancerous cells, while downregulated in their cytoskeleton fractions. This may reflect a functional change linked to cancer, yet targeted studies will be necessary to prove such links between tumour development and AltProts re-localising over different cellular compartments. However, it is important to note that subcellular fractionation based on protein extraction with different detergents can lead to cross-contamination between fractions and inaccuracies in downstream data interpretation. Subcellular fractionation led to the identification of ~6,000 common RefProts among the three cell lines. Over 3% of all identified proteins in each cell line were RefProt variants (Fig. 6B). However, these ~180 RefProt variants require deeper characterization to understand their (pathological) role. Cell line-specific AltProts were also found in all three cell lines; those found in SKOV-3 and PEO-4 cells are of interest as potential new protein markers for OvCa. Among them, IP_715944@Leu44Pro (Fig. 11) caught our attention as it is a variant AltProt not predicted in the T1074 RNA-Seq database. Moreover, six additional AltProts from this group were also not predicted, which highlights the importance of a cell-specific analysis to identify new biomarkers. Based on the LFQ proteome analysis data, AltProts were found to be upregulated in all compartments except the cytoskeleton in PEO-4 and SKOV-3 cells, while downregulation of AltProts was only observed in the membrane-bound and nuclear fractions in PEO-4 cells, and in the nuclear and chromatin fractions in SKOV-3 cells. Such differentially expressed AltProts can be important for distinguishing between cancer cell lines.
When comparing both cancerous cell lines to T1074 cells, significant downregulation of AltProts was observed in all five compartments. AltProts upregulated in both cancerous cells were present in all compartments except the nucleus. These findings provide some insights into the specific expression of AltProts in high-grade serous and non-serous OvCa. Functional domains were predicted for 23 out of 73 AltProts, which can help us understand their potential roles in interactions. Future targeted interactomic approaches such as Virotrap (70), BioID (71) and proximity ligation assays (72) could be used to identify the interaction partners of these AltProts, which may shed light on their involvement in the pathogenic development of OvCa or drug resistance.

Interpretation of the major protein and transcript fluctuations from the three cell lines highlights cancer-related KEGG pathways

NRAS, a member of the RAS oncogene family, is involved in cell signalling, regulation of cell growth, differentiation and angiogenesis. In ovarian clear cell carcinoma, no NRAS mutations were found in our SKOV-3 cell transcriptome data (73). Overexpression of NRAS was shown to increase tumour aggressiveness in mice (74). KRAS, another member of the RAS oncogene family, was found to be upregulated in SKOV-3 cells; its upregulation has also been reported in metastatic lesions in endometrial cancer (75) and is associated with adverse prognosis (76). Downregulation of HRAS has been linked to lower aggressiveness and reduced cell proliferation in certain types of cancer (46, 77, 78). Another branch of the pathway involves MEK (mitogen-activated extracellular signal-regulated kinase), part of a kinase cascade that plays a central role in carcinogenesis and the maintenance of several cancers. We found downregulation of MAP2K1 and MAP2K2 in both cancerous cell lines, as is also evident from data in The Human Protein Atlas (79).
In parallel, related to cancer metabolism, we observed SIRT6 downregulation and c-Myc upregulation in PEO-4 cells. Lower levels of SIRT6 are associated with poorer prognosis and increased tumour aggressiveness (54, 55). SIRT6 also regulates ribosome metabolism by repressing c-Myc activity. As a result, higher levels of c-Myc, resulting from downregulation of SIRT6, promote energy production and biomolecule synthesis for rapid cell proliferation. SIRT3, on the other hand, is described as a tumour suppressor gene in OvCa (80), yet its expression increases in detached cells and tumour cells from malignant ascites, indicating a pro-metastatic role in OvCa (53). Our proteomic data show upregulation of SIRT3 in both cancerous cells, while SIRT3 transcripts are downregulated in PEO-4 cells. Discordance between mRNA and protein levels has been observed in various studies (81–84) and attributed to post-transcriptional regulation, transcript isoform switching and DNA variants (82, 85). We found that PIK3R1 (p85α) was upregulated in the tumoral cells, which corresponds to the overexpression of PIK3R1 identified in an OvCa cohort of 98 patients (86). However, contrary to literature findings (87), transcript levels of PIK3CD were downregulated in both cancerous cell lines. Stronach et al. (88) and Liu et al. (89) have studied the role of the AKT kinase signalling pathway in OvCa cell proliferation, cell cycle regulation and anti-apoptosis. They discovered that SKOV-3 cells rely on AKT1 for cisplatin resistance, while PEO-4 cells depend on AKT3. In line with these studies, in our dataset, both protein and transcript levels of AKT1 were found to be overexpressed in SKOV-3 cells.

On the importance of identifying variants

Among the significantly deregulated RefProts identified in our study, the P53 variant rs1042522 was found downregulated in both cancer cell lines.
The corresponding Pro72Arg substitution in the canonical P53 sequence (UniProtKB: P04637-1) occurs in a proline-rich, intrinsically disordered region (residues 64–92) (90). This region is described as rigid (91), and a substitution of one of the prolines in this region might decrease its stiffness. Moreover, position 72 is part of the binding site of P53 with the oncogenic protein MDM2 (92). Even though there is evidence suggesting an association between this mutation and OvCa risk, a meta-analysis by Schildkraut et al. could not confirm an association with OvCa (93). Additionally, using our proteogenomic approach we were able to confirm the observations of Yaginuma et al. (94) describing SKOV-3 as a null-WT-P53 cell line. HKDC1 variants were found upregulated in both cancerous cells. Three (rs906219, rs1111335 and rs874556) of the four single nucleotide variations (SNVs) are reported as natural variants of HKDC1 (UniProtKB: Q2TB90). The last SNV, rs138235256, is not reported in UniProt and has no known clinical significance so far. Additionally, the variant ENSP00000359991.5@Thr238Met (PGAM1) was identified as downregulated in both cancer cells and results from the rs202055965 SNV (C>T).

XL-MS reveals clues about AltProt functions based on AltProt-RefProt PPIs

IP_183088, a 38-amino acid AltProt, is encoded by MAPK8 and was found to interact with POLD3 in T1074 and PEO-4 cell lines. Figure 12A displays the model of the human polymerase delta holoenzyme complex (PDB: 6s1m). Herein, the four subunits of the complex are shown (POLD1 turquoise, POLD2 green, POLD3 blue and POLD4 yellow). Additionally, the proliferating cell nuclear antigen is displayed in light blue and the AltProt IP_183088 in red, together with its crosslinks. Figure 12B zooms in on the crosslinked region of POLD3-IP_183088, revealing that this interaction occurs in the region where POLD2 and POLD3 interact.
Our transcriptomic data point to POLD3 downregulation in both cancerous cells. This agrees with the findings of Willes et al., who reported that POLD3 downregulation is correlated with a poor cancer outcome (95), and with those of Weberpals et al., who showed that POLD3 is overexpressed in patients with high-grade serous ovarian carcinoma and a good response to carboplatin/paclitaxel (96). On the other hand, the inhibition of the interaction between POLD3 and POLD2 driven by IP_183088 can reflect two effects. (i) An increase in mutagenesis upon reduced activity of the POLD complex, so that errors in DNA replication are more likely to occur and go unrepaired, which can be expected in PEO-4 cells. (ii) A regulatory system of the POLD complex, where the POLD3-IP_183088 interaction in T1074 cells could lead to cell apoptosis; Murga et al. (97) showed that POLD3 stabilizes the POLD complex and that, in its absence, the cell is driven to apoptosis. The difficulty of detecting interactions by XL-MS means that we cannot claim that the observed interactions are cell-type specific, but they do provide information about potential protein functions for unreferenced proteins. The use of this approach for studying AltProts thus makes sense and, in the case of IP_183088, allowed us to hypothesize that it regulates the POLD3-POLD2 interaction and the stability of the POLD complex, and therefore affects the regulation of DNA replication error repair.

To conclude, one main advantage of the databases generated by OpenCustomDB is the possibility of predicting and identifying cell-specific proteins in cell lines and, in the future, in patient samples, resulting in a big step forward towards personalized medicine. Subcellular fractionation allowed us to study differences in the reference, alternative and novel isoform proteomes of OvCa cell lines compared to a non-tumoral ovarian epithelial cell line.
Additionally, it allowed us to identify RefProt variants and understudied AltProts and their variants. The versatility of these databases allowed us to identify AltProt-RefProt PPIs and gave some clues about the functions of AltProts, which, however, need to be validated. In summary, our large-scale characterization study revealed other research targets and demonstrated the complexity of the cell proteome and its largely unmapped ghost proteome.

Declarations

DATA AVAILABILITY

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (98) partner repository with the dataset identifier PXD045689. RNAseq analysis data have been deposited to BioProject (SRA) with dataset identifier PRJNA1041444 and GEO dataset identifier GSE248039.

SUPPLEMENTARY DATA

AUTHOR CONTRIBUTIONS

Diego Fernando Garcia-del Rio: Conceptualization, Formal analysis, Methodology, Validation, Writing & editing - original draft. Mehdi Derhourhi: Formal analysis, Methodology, Writing. Amelie Bonnefond: Methodology, Writing - review & editing. Sébastien Leblanc: Formal analysis, Methodology, Writing - review & editing. Noé Guilloy: Methodology, Writing - review & editing. Xavier Roucou: Methodology, Writing - review & editing. Kris Gevaert: review & editing, Funding. Sven Eyckerman: review & editing, Funding. Michel Salzet: Conceptualization, review & editing, Funding. Tristan Cardon: Conceptualization, Methodology, Validation, Writing & editing - original draft.

FUNDING

This research was supported by funding from I-SITE, Institut National de la Santé et de la Recherche Médicale (Inserm) and Université de Lille and by The Research Foundation - Flanders (FWO), project number G008018N.

CONFLICT OF INTEREST

The authors declare no competing interests.

References

1. The UniProt Consortium (2015) UniProt: a hub for protein information. Nucleic Acids Research, 43, D204–D212.
Breuza,L., Poux,S., Estreicher,A., Famiglietti,M.L., Magrane,M., Tognolli,M., Bridge,A., Baratin,D., Redaschi,N., and UniProt Consortium (2016) The UniProtKB guide to the human proteome. Database (Oxford) , 2016 , bav120. Mouilleron,H., Delcourt,V. and Roucou,X. (2016) Death of a dogma: eukaryotic mRNAs can code for more than one protein. Nucleic Acids Res , 44 , 14–23. Hao,Y., Zhang,L., Niu,Y., Cai,T., Luo,J., He,S., Zhang,B., Zhang,D., Qin,Y., Yang,F., et al. (2018) SmProt: a database of small proteins encoded by annotated coding and non-coding RNA loci. Briefings in Bioinformatics , 19 , 636–643. Galindo,M.I., Pueyo,J.I., Fouix,S., Bishop,S.A. and Couso,J.P. (2007) Peptides Encoded by Short ORFs Control Development and Define a New Eukaryotic Gene Family. PLOS Biology , 5 , e106. Albuquerque,J.P., Tobias-Santos,V., Rodrigues,A.C., Mury,F.B. and Fonseca,R.N. da (2015) small ORFs: A new class of essential genes for development. Genet. Mol. Biol. , 38 , 278–283. Ruiz-Orera,J., Messeguer,X., Subirana,J.A. and Alba,M.M. (2014) Long non-coding RNAs as a source of new peptides. eLife , 3 , e03523. Slavoff,S.A., Heo,J., Budnik,B.A., Hanakahi,L.A. and Saghatelian,A. (2014) A human short open reading frame (sORF)-encoded polypeptide that stimulates DNA end joining. J Biol Chem , 289 , 10950–10957. Brunet,M.A., Brunelle,M., Lucier,J.-F., Delcourt,V., Levesque,M., Grenier,F., Samandi,S., Leblanc,S., Aguilar,J.-D., Dufour,P., et al. (2018) OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes. Nucleic Acids Research , 10.1093/nar/gky936. Cardon,T., Fournier,I. and Salzet,M. (2021) Shedding Light on the Ghost Proteome. Trends in Biochemical Sciences , 46 , 239–250. Brunet,M.A. and Roucou,X. (2019) Mass Spectrometry-Based Proteomics Analyses Using the OpenProt Database to Unveil Novel Proteins Translated from Non-Canonical Open Reading Frames. JoVE (Journal of Visualized Experiments) , 10.3791/59589. Kozak,M. 
(1999) Initiation of translation in prokaryotes and eukaryotes. Gene , 234 , 187–208. Kozak,M. (2006) Rethinking some mechanisms invoked to explain translational regulation in eukaryotes. Gene , 382 , 1–11. Boeckmann,B., Bairoch,A., Apweiler,R., Blatter,M.-C., Estreicher,A., Gasteiger,E., Martin,M.J., Michoud,K., O’Donovan,C., Phan,I., et al. (2003) The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Research , 31 , 365–370. Brunet,M.A., Lucier,J.-F., Levesque,M., Leblanc,S., Jacques,J.-F., Al-Saedi,H.R.H., Guilloy,N., Grenier,F., Avino,M., Fournier,I., et al. (2021) OpenProt 2021: deeper functional annotation of the coding potential of eukaryotic genomes. Nucleic Acids Research , 49 , D380–D388. Guilloy,N., Brunet,M.A., Leblanc,S., Jacques,J.-F., Hardy,M.-P., Ehx,G., Lanoix,J., Thibault,P., Perreault,C. and Roucou,X. (2023) OpenCustomDB: Integration of Unannotated Open Reading Frames and Genetic Variants to Generate More Comprehensive Customized Protein Databases. J. Proteome Res. , 22 , 1492–1500. Garcia-del Rio,D.F., Cardon,T., Eyckerman,S., Fournier,I., Bonnefond,A., Gevaert,K. and Salzet,M. (2023) Employing non-targeted interactomics approach and subcellular fractionation to increase our understanding of the ghost proteome. iScience , 26 . Cao,X., Khitun,A., Harold,C.M., Bryant,C.J., Zheng,S.-J., Baserga,S.J. and Slavoff,S.A. (2022) Nascent alt-protein chemoproteomics reveals a pre-60S assembly checkpoint inhibitor. Nat Chem Biol , 18 , 643–651. Cardon,T., Salzet,M., Franck,J. and Fournier,I. (2019) Nuclei of HeLa cells interactomes unravel a network of ghost proteins involved in proteins translation. Biochimica et Biophysica Acta (BBA) - General Subjects , 1863 , 1458–1470. D’Lima,N.G., Ma,J., Winkler,L., Chu,Q., Loh,K.H., Corpuz,E.O., Budnik,B.A., Lykke-Andersen,J., Saghatelian,A. and Slavoff,S.A. (2017) A human microprotein that interacts with the mRNA decapping complex. Nat Chem Biol , 13 , 174–180. 
Matsumoto,A., Pasut,A., Matsumoto,M., Yamashita,R., Fung,J., Monteleone,E., Saghatelian,A., Nakayama,K.I., Clohessy,J.G. and Pandolfi,P.P. (2017) mTORC1 and muscle regeneration are regulated by the LINC00961-encoded SPAR polypeptide. Nature , 541 , 228–232. Stein,C.S., Jadiya,P., Zhang,X., McLendon,J.M., Abouassaly,G.M., Witmer,N.H., Anderson,E.J., Elrod,J.W. and Boudreau,R.L. (2018) Mitoregulin: A lncRNA-Encoded Microprotein that Supports Mitochondrial Supercomplexes and Respiratory Efficiency. Cell Rep , 23 , 3710-3720.e8. Cardon,T., Ozcan,B., Aboulouard,S., Kobeissy,F., Duhamel,M., Rodet,F., Fournier,I. and Salzet,M. (2020) Epigenetic Studies Revealed a Ghost Proteome in PC1/3 KD Macrophages under Antitumoral Resistance Induced by IL-10. ACS Omega , 10.1021/acsomega.0c02530. Delcourt,V., Franck,J., Leblanc,E., Narducci,F., Robin,Y.-M., Gimeno,J.-P., Quanico,J., Wisztorski,M., Kobeissy,F., Jacques,J.-F., et al. (2017) Combined Mass Spectrometry Imaging and Top-down Microproteomics Reveals Evidence of a Hidden Proteome in Ovarian Cancer. EBioMedicine , 21 , 55–64. Huang,J.-Z., Chen,M., Chen,D., Gao,X.-C., Zhu,S., Huang,H., Hu,M., Zhu,H. and Yan,G.-R. (2017) A Peptide Encoded by a Putative lncRNA HOXB-AS3 Suppresses Colon Cancer Growth. Molecular Cell , 68 , 171-184.e6. Polycarpou-Schwarz,M., Groß,M., Mestdagh,P., Schott,J., Grund,S.E., Hildenbrand,C., Rom,J., Aulmann,S., Sinn,H.-P., Vandesompele,J., et al. (2018) The cancer-associated microprotein CASIMO1 controls cell proliferation and interacts with squalene epoxidase modulating lipid droplet formation. Oncogene , 37 , 4750–4768. Brunet,M.A., Jacques,J.-F., Nassari,S., Tyzack,G.E., McGoldrick,P., Zinman,L., Jean,S., Robertson,J., Patani,R. and Roucou,X. (2020) The FUS gene is dual‐coding with both proteins contributing to FUS‐mediated toxicity. EMBO reports , 10.15252/embr.202050640. Cao,X., Chen,Y., Khitun,A. and Slavoff,S.A. 
(2023) BONCAT-based Profiling of Nascent Small and Alternative Open Reading Frame-encoded Proteins. Bio Protoc , 13 , e4585. Slavoff,S.A., Mitchell,A.J., Schwaid,A.G., Cabili,M.N., Ma,J., Levin,J.Z., Karger,A.D., Budnik,B.A., Rinn,J.L. and Saghatelian,A. (2013) Peptidomic discovery of short open reading frame–encoded peptides in human cells. Nat Chem Biol , 9 , 59–64. Garcia-del Rio,D.F., Fournier,I., Cardon,T. and Salzet,M. (2023) Protocol to identify human subcellular alternative protein interactions using cross-linking mass spectrometry. STAR Protocols , 4 , 102380. Vanderperre,B., Staskevicius,A.B., Tremblay,G., McCoy,M., O’Neill,M.A., Cashman,N.R. and Roucou,X. (2011) An overlapping reading frame in the PRNP gene encodes a novel polypeptide distinct from the prion protein. The FASEB Journal , 25 , 2373–2386. Zhang,Q., Vashisht,A.A., O’Rourke,J., Corbel,S.Y., Moran,R., Romero,A., Miraglia,L., Zhang,J., Durrant,E., Schmedt,C., et al. (2017) The microprotein Minion controls cell fusion and muscle formation. Nat Commun , 8 , 15664. Yosten,G.L.C., Liu,J., Ji,H., Sandberg,K., Speth,R. and Samson,W.K. (2016) A 5′-upstream short open reading frame encoded peptide regulates angiotensin type 1a receptor production and signalling via the β-arrestin pathway. The Journal of Physiology , 594 , 1601–1605. Kao,A., Chiu,C., Vellucci,D., Yang,Y., Patel,V.R., Guan,S., Randall,A., Baldi,P., Rychnovsky,S.D. and Huang,L. (2011) Development of a Novel Cross-linking Strategy for Fast and Accurate Identification of Cross-linked Peptides of Protein Complexes. Mol Cell Proteomics , 10 , M110.002212. Hevler,J.F., Lukassen,M.V., Cabrera-Orefice,A., Arnold,S., Pronker,M.F., Franc,V. and Heck,A.J.R. (2021) Selective cross-linking of coinciding protein assemblies by in-gel cross-linking mass spectrometry. The EMBO Journal , 40 , e106174. Berek,J.S., Renz,M., Kehoe,S., Kumar,L. and Friedlander,M. (2021) Cancer of the ovary, fallopian tube, and peritoneum: 2021 update. 
International Journal of Gynecology & Obstetrics , 155 , 61–85. Wentzensen,N., Poole,E.M., Trabert,B., White,E., Arslan,A.A., Patel,A.V., Setiawan,V.W., Visvanathan,K., Weiderpass,E., Adami,H.-O., et al. (2016) Ovarian Cancer Risk Factors by Histologic Subtype: An Analysis From the Ovarian Cancer Cohort Consortium. J Clin Oncol , 34 , 2888–2898. Stewart,C., Ralyea,C. and Lockwood,S. (2019) Ovarian Cancer: An Integrated Review. Seminars in Oncology Nursing , 35 , 151–156. Kanehisa,M. and Goto,S. (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res , 28 , 27–30. Soga,T. (2013) Cancer metabolism: Key players in metabolic reprogramming. Cancer Science , 104 , 275–281. Warburg,O. (1925) The Metabolism of Carcinoma Cells. The Journal of Cancer Research , 9 , 148–163. Vander Heiden,M.G., Cantley,L.C. and Thompson,C.B. (2009) Understanding the Warburg effect: the metabolic requirements of cell proliferation. Science , 324 , 1029–1033. Wolf,C.R., Hayward,I.P., Lawrie,S.S., Buckton,K., McIntyre,M.A., Adams,D.J., Lewis,A.D., Scott,A.R.R. and Smyth,J.F. (1987) Cellular heterogeneity and drug resistance in two ovarian adenocarcinoma cell lines derived from a single patient. International Journal of Cancer , 39 , 695–702. Langdon,S.P., Lawrie,S.S., Hay,F.G., Hawkes,M.M., McDonald,A., Hayward,I.P., Schol,D.J., Hilgers,J., Leonard,R.C.F. and Smyth,J.F. Characterization and Properties of Nine Human Ovarian Adenocarcinoma Cell Lines. Fogh,J., Fogh,J.M. and Orfeo,T. (1977) One hundred and twenty-seven cultured human tumor cell lines producing tumors in nude mice. J Natl Cancer Inst , 59 , 221–226. Hernandez,L., Kim,M.K., Lyle,L.T., Bunch,K.P., House,C.D., Ning,F., Noonan,A.M. and Annunziata,C.M. (2016) Characterization of ovarian cancer cell lines as in vivo models for preclinical studies. Gynecol Oncol , 142 , 332–340. Hallas-Potts,A., Dawson,J.C. and Herrington,C.S.
(2019) Ovarian cancer cell lines derived from non-serous carcinomas migrate and invade more aggressively than those derived from high-grade serous carcinomas. Sci Rep , 9 , 5515. Tabb,D.L., Eng,J.K. and Yates,J.R. (2001) Protein Identification by SEQUEST. In James,P. (ed), Proteome Research: Mass Spectrometry , Principles and Practice. Springer, Berlin, Heidelberg, pp. 125–142. McGinnis,S. and Madden,T.L. (2004) BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res , 32 , W20-25. Jones,P., Binns,D., Chang,H.-Y., Fraser,M., Li,W., McAnulla,C., McWilliam,H., Maslen,J., Mitchell,A., Nuka,G., et al. (2014) InterProScan 5: genome-scale protein function classification. Bioinformatics , 30 , 1236–1240. Käll,L., Krogh,A. and Sonnhammer,E.L.L. (2004) A Combined Transmembrane Topology and Signal Peptide Prediction Method. Journal of Molecular Biology , 338 , 1027–1036. Sherman,B.T., Hao,M., Qiu,J., Jiao,X., Baseler,M.W., Lane,H.C., Imamichi,T. and Chang,W. (2022) DAVID: a web server for functional enrichment analysis and functional annotation of gene lists (2021 update). Nucleic Acids Research , 50 , W216–W221. Dong,X.-C., Jing,L.-M., Wang,W.-X. and Gao,Y.-X. (2016) Down-regulation of SIRT3 promotes ovarian carcinoma metastasis. Biochem Biophys Res Commun , 475 , 245–250. Sebastián,C., Zwaans,B.M.M., Silberman,D.M., Gymrek,M., Goren,A., Zhong,L., Ram,O., Truelove,J., Guimaraes,A.R., Toiber,D., et al. (2012) The histone deacetylase SIRT6 is a tumor suppressor that controls cancer metabolism. Cell , 151 , 1185–1199. Zhang,J., Yin,X.-J., Xu,C.-J., Ning,Y.-X., Chen,M., Zhang,H., Chen,S.-F. and Yao,L.-Q. (2015) The histone deacetylase SIRT6 inhibits ovarian cancer cell proliferation via down-regulation of Notch 3 expression. Eur Rev Med Pharmacol Sci , 19 , 818–824. Shannon,P., Markiel,A., Ozier,O., Baliga,N.S., Wang,J.T., Ramage,D., Amin,N., Schwikowski,B. and Ideker,T. 
(2003) Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res. , 13 , 2498–2504. Jensen,L.J., Kuhn,M., Stark,M., Chaffron,S., Creevey,C., Muller,J., Doerks,T., Julien,P., Roth,A., Simonovic,M., et al. (2009) STRING 8--a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res , 37 , D412-416. Doncheva,N.T., Morris,J.H., Gorodkin,J. and Jensen,L.J. (2019) Cytoscape StringApp: Network Analysis and Visualization of Proteomics Data. J Proteome Res , 18 , 623–632. Oughtred,R., Rust,J., Chang,C., Breitkreutz,B.-J., Stark,C., Willems,A., Boucher,L., Leung,G., Kolas,N., Zhang,F., et al. (2021) The BioGRID database: A comprehensive biomedical resource of curated protein, genetic, and chemical interactions. Protein Sci , 30 , 187–200. Orchard,S., Ammari,M., Aranda,B., Breuza,L., Briganti,L., Broackes-Carter,F., Campbell,N.H., Chavali,G., Chen,C., del-Toro,N., et al. (2014) The MIntAct project—IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Research , 42 , D358–D363. Bindea,G., Mlecnik,B., Hackl,H., Charoentong,P., Tosolini,M., Kirilovsky,A., Fridman,W.-H., Pagès,F., Trajanoski,Z. and Galon,J. (2009) ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics , 25 , 1091–1093. Yang,J., Yan,R., Roy,A., Xu,D., Poisson,J. and Zhang,Y. (2015) The I-TASSER Suite: protein structure and function prediction. Nat Methods , 12 , 7–8. Kozakov,D., Hall,D.R., Xia,B., Porter,K.A., Padhorny,D., Yueh,C., Beglov,D. and Vajda,S. (2017) The ClusPro web server for protein–protein docking. Nat Protoc , 12 , 255–278. Jumper,J., Evans,R., Pritzel,A., Green,T., Figurnov,M., Ronneberger,O., Tunyasuvunakool,K., Bates,R., Žídek,A., Potapenko,A., et al. (2021) Highly accurate protein structure prediction with AlphaFold. Nature , 596 , 583–589. Becker,K.G., Barnes,K.C., Bright,T.J. and Wang,S.A. 
(2004) The genetic association database. Nat Genet , 36 , 431–432. Käll,L., Canterbury,J.D., Weston,J., Noble,W.S. and MacCoss,M.J. (2007) Semi-supervised learning for peptide identification from shotgun proteomics datasets. Nat Methods , 4 , 923–925. The,M., MacCoss,M.J., Noble,W.S. and Käll,L. (2016) Fast and Accurate Protein False Discovery Rates on Large-Scale Proteomics Data Sets with Percolator 3.0. J. Am. Soc. Mass Spectrom. , 27 , 1719–1727. Paulo,J.A., Gaun,A., Kadiyala,V., Ghoulidi,A., Banks,P.A., Conwell,D.L. and Steen,H. (2013) Subcellular Fractionation Enhances Proteome Coverage of Pancreatic Duct Cells. Biochim Biophys Acta , 1834 , 791–797. Na,Z., Dai,X., Zheng,S.-J., Bryant,C.J., Loh,K.H., Su,H., Luo,Y., Buhagiar,A.F., Cao,X., Baserga,S.J., et al. (2022) Mapping subcellular localizations of unannotated microproteins and alternative proteins with MicroID. Molecular Cell , 82 , 2900-2911.e7. Eyckerman,S., Titeca,K., Van Quickelberghe,E., Cloots,E., Verhee,A., Samyn,N., De Ceuninck,L., Timmerman,E., De Sutter,D., Lievens,S., et al. (2016) Trapping mammalian protein complexes in viral particles. Nat Commun , 7 , 11416. Roux,K.J., Kim,D.I., Burke,B. and May,D.G. (2018) BioID: A Screen for Protein-Protein Interactions. Curr Protoc Protein Sci , 91 , 19.23.1-19.23.15. Alam,M.S. (2018) Proximity Ligation Assay (PLA). Curr Protoc Immunol , 123 , e58. Therachiyil,L., Anand,A., Azmi,A., Bhat,A., Korashy,H.M. and Uddin,S. (2022) Role of RAS signaling in ovarian cancer. F1000Res , 11 , 1253. Zheng,Z.-Y., Elsarraj,H., Lei,J.T., Hong,Y., Anurag,M., Feng,L., Kennedy,H., Shen,Y., Lo,F., Zhao,Z., et al. (2022) Elevated NRAS expression during DCIS is a potential driver for progression to basal-like properties and local invasiveness. Breast Cancer Research , 24 , 68. Birkeland,E., Wik,E., Mjøs,S., Hoivik,E.A., Trovik,J., Werner,H.M.J., Kusonmano,K., Petersen,K., Raeder,M.B., Holst,F., et al. 
(2012) KRAS gene amplification and overexpression but not mutation associates with aggressive and metastatic endometrial cancer. Br J Cancer , 107 , 1997–2004. Zhou,J.-D., Yao,D.-M., Li,X.-X., Zhang,T.-J., Zhang,W., Ma,J.-C., Guo,H., Deng,Z.-Q., Lin,J. and Qian,J. (2017) KRAS overexpression independent of RAS mutations confers an adverse prognosis in cytogenetically normal acute myeloid leukemia. Oncotarget , 8 , 66087–66097. Jung,J., Cho,K.-J., Naji,A.K., Clemons,K.N., Wong,C.O., Villanueva,M., Gregory,S., Karagas,N.E., Tan,L., Liang,H., et al. (2019) HRAS-driven cancer cells are vulnerable to TRPML1 inhibition. EMBO reports , 20 , e46685. Miglietta,G., Gouda,A.S., Cogoi,S., Pedersen,E.B. and Xodo,L.E. (2015) Nucleic Acid Targeted Therapy: G4 Oligonucleotides Downregulate HRAS in Bladder Cancer Cells through a Decoy Mechanism. ACS Med. Chem. Lett. , 6 , 1179–1183. (2020) The Human Protein Atlas: A 20-year journey into the body. Science | AAAS , 19 November 2020. Ouyang,S., Zhang,Q., Lou,L., Zhu,K., Li,Z., Liu,P. and Zhang,X. (2022) The Double-Edged Sword of SIRT3 in Cancer and Its Therapeutic Applications. Frontiers in Pharmacology , 13 . Chen,G., Gharib,T.G., Huang,C.-C., Taylor,J.M.G., Misek,D.E., Kardia,S.L.R., Giordano,T.J., Iannettoni,M.D., Orringer,M.B., Hanash,S.M., et al. (2002) Discordant Protein and mRNA Expression in Lung Adenocarcinomas. Molecular & Cellular Proteomics , 1 , 304–313. Bauernfeind,A.L. and Babbitt,C.C. (2017) The predictive nature of transcript expression levels on protein expression in adult human brain. BMC Genomics , 18 , 322. Perl,K., Ushakov,K., Pozniak,Y., Yizhar-Barnea,O., Bhonker,Y., Shivatzki,S., Geiger,T., Avraham,K.B. and Shamir,R. (2017) Reduced changes in protein compared to mRNA levels across non-proliferating tissues. BMC Genomics , 18 , 305. Fukao,Y. (2015) Discordance between protein and transcript levels detected by selected reaction monitoring. Plant Signal Behav , 10 , e1017697. Brion,C., Lutz,S.M. and Albert,F.W.
(2020) Simultaneous quantification of mRNA and protein in single cells reveals post-transcriptional effects of genetic variation. eLife , 9 , e60645. De Marco,C., Rinaldo,N., Bruni,P., Malzoni,C., Zullo,F., Fabiani,F., Losito,S., Scrima,M., Marino,F.Z., Franco,R., et al. (2013) Multiple genetic alterations within the PI3K pathway are responsible for AKT activation in patients with ovarian carcinoma. PLoS One , 8 , e55362. Wang,G., Yang,X., Li,C., Cao,X., Luo,X. and Hu,J. (2014) PIK3R3 Induces Epithelial-to-Mesenchymal Transition and Promotes Metastasis in Colorectal Cancer. Molecular Cancer Therapeutics , 13 , 1837–1847. Stronach,E.A., Chen,M., Maginn,E.N., Agarwal,R., Mills,G.B., Wasan,H. and Gabra,H. (2011) DNA-PK mediates AKT activation and apoptosis inhibition in clinically acquired platinum resistance. Neoplasia , 13 , 1069–1080. Liu,Q., Turner,K.M., Alfred Yung,W.K., Chen,K. and Zhang,W. (2014) Role of AKT signaling in DNA repair and clinical response to cancer therapy. Neuro Oncol , 16 , 1313–1323. Arlt,C., Ihling,C.H. and Sinz,A. (2015) Structure of full-length p53 tumor suppressor probed by chemical cross-linking and mass spectrometry. PROTEOMICS , 15 , 2746–2755. Wells,M., Tidow,H., Rutherford,T.J., Markwick,P., Jensen,M.R., Mylonas,E., Svergun,D.I., Blackledge,M. and Fersht,A.R. (2008) Structure of tumor suppressor p53 and its intrinsically disordered N-terminal transactivation domain. Proceedings of the National Academy of Sciences , 105 , 5762–5767. Hoyos,D., Greenbaum,B. and Levine,A.J. (2022) The genotypes and phenotypes of missense mutations in the proline domain of the p53 protein. Cell Death Differ , 29 , 938–945. Schildkraut,J.M., Goode,E.L., Clyde,M.A., Iversen,E.S., Moorman,P.G., Berchuck,A., Marks,J.R., Lissowska,J., Brinton,L., Peplonska,B., et al. (2009) Single Nucleotide Polymorphisms in the TP53 Region and Susceptibility to Invasive Epithelial Ovarian Cancer. Cancer Research , 69 , 2349–2357. Yaginuma,Y. and Westphal,H. 
(1992) Abnormal structure and expression of the p53 gene in human ovarian carcinoma cell lines. Cancer Res , 52 , 4196–4199. Willis,S., Villalobos,V.M., Gevaert,O., Abramovitz,M., Williams,C., Sikic,B.I. and Leyland-Jones,B. (2016) Single Gene Prognostic Biomarkers in Ovarian Cancer: A Meta-Analysis. PLoS One , 11 , e0149183. Weberpals,J.I., Pugh,T.J., Marco‐Casanova,P., Goss,G.D., Andrews Wright,N., Rath,P., Torchia,J., Fortuna,A., Jones,G.N., Roudier,M.P., et al. (2021) Tumor genomic, transcriptomic, and immune profiling characterizes differential response to first‐line platinum chemotherapy in high grade serous ovarian cancer. Cancer Med , 10 , 3045–3058. Murga,M., Lecona,E., Kamileri,I., Díaz,M., Lugli,N., Sotiriou,S.K., Anton,M.E., Méndez,J., Halazonetis,T.D. and Fernandez-Capetillo,O. (2016) POLD3 Is Haploinsufficient for DNA Replication in Mice. Molecular Cell , 63 , 877–883. Perez-Riverol,Y., Bai,J., Bandla,C., García-Seisdedos,D., Hewapathirana,S., Kamatchinathan,S., Kundu,D.J., Prakash,A., Frericks-Zipper,A., Eisenacher,M., et al. (2022) The PRIDE database resources in 2022: a hub for mass spectrometry-based proteomics evidences. Nucleic Acids Research , 50 , D543–D552. Additional Declarations There is no duality of interest Supplementary Files SupplementalTables.xlsx Supplementalfigures.docx
{\"props\":{\"pageProps\":{\"initialData\":{\"identity\":\"rs-3972487\",\"acceptedTermsAndConditions\":true,\"allowDirectSubmit\":false,\"archivedVersions\":[],\"articleType\":\"Article\",\"associatedPublications\":[],\"authors\":[{\"id\":275431352,\"identity\":\"afcc5c4e-07f0-4329-a92e-7fb0e30e8ad0\",\"order_by\":0,\"name\":\"Cardon Tristan\",\"email\":\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABEklEQVRIie3PMUsDMRTA8RcCd0s065NqP0Ogi4Viv8odwnV1dHA4OEgX0dV+i27qlhJwutr1oA4V4eYUQRxETK5wOuQ6O+Q/PRJ+vAQgFPqXkbwdEWAEQO20EafH9mrjA8wR9UuyHUkEMjsJP7H9IXo3JdBNxj1dvJurF+DT5Wu1fVil9zETkFwg40Aj49tymEpUTzVgORkMZ+U6fSwcsQ87yim98xFGJKhIg4As6h3IdTrX0eTTEaG49v6FkcKob0t47cizI0mzZawo7SA5LqQl2GxRllDVEAEdpCQSlzeaYVXT4UyeD1qC2k/i6/jNXH7oPr/NSLWVZyfz1SIH8zXq82nhJe02z9leEAqFQqF9/QBVL1gHjloB0AAAAABJRU5ErkJggg==\",\"orcid\":\"https://orcid.org/0000-0003-1751-0528\",\"institution\":\"Unversité de Lille, Inserm, CHU Lille\",\"correspondingAuthor\":true,\"prefix\":\"\",\"firstName\":\"Cardon\",\"middleName\":\"\",\"lastName\":\"Tristan\",\"suffix\":\"\"},{\"id\":275431353,\"identity\":\"14a02f28-95c2-441d-b25c-ea6260d55c39\",\"order_by\":1,\"name\":\"Diego Garcia-del Rio\",\"email\":\"\",\"orcid\":\"\",\"institution\":\"Unversité de Lille, Inserm, CHU
Lille\",\"correspondingAuthor\":false,\"prefix\":\"\",\"firstName\":\"Diego\",\"middleName\":\"Garcia-del\",\"lastName\":\"Rio\",\"suffix\":\"\"},{\"id\":275431354,\"identity\":\"39f8d473-9a0b-41fb-b810-f1b64c980468\",\"order_by\":2,\"name\":\"Mehdi Derhourhi\",\"email\":\"\",\"orcid\":\"\",\"institution\":\"Université de Lille, Inserm/CNRS , Pasteur Institute of Lille\",\"correspondingAuthor\":false,\"prefix\":\"\",\"firstName\":\"Mehdi\",\"middleName\":\"\",\"lastName\":\"Derhourhi\",\"suffix\":\"\"},{\"id\":275431355,\"identity\":\"c30b8f32-e289-4a6f-9b1f-3856f516ef62\",\"order_by\":3,\"name\":\"Amelie Bonnefond\",\"email\":\"\",\"orcid\":\"\",\"institution\":\"Université de Lille\",\"correspondingAuthor\":false,\"prefix\":\"\",\"firstName\":\"Amelie\",\"middleName\":\"\",\"lastName\":\"Bonnefond\",\"suffix\":\"\"},{\"id\":275431356,\"identity\":\"71ff7630-648b-4ae7-9808-a54d16cf000c\",\"order_by\":4,\"name\":\"Sebastien Leblanc\",\"email\":\"\",\"orcid\":\"\",\"institution\":\"Université de Sherbrooke\",\"correspondingAuthor\":false,\"prefix\":\"\",\"firstName\":\"Sebastien\",\"middleName\":\"\",\"lastName\":\"Leblanc\",\"suffix\":\"\"},{\"id\":275431357,\"identity\":\"b4437f6c-0107-4f83-90c8-405350ba67ac\",\"order_by\":5,\"name\":\"Noe Guilloy\",\"email\":\"\",\"orcid\":\"\",\"institution\":\"Université de Sherbrooke\",\"correspondingAuthor\":false,\"prefix\":\"\",\"firstName\":\"Noe\",\"middleName\":\"\",\"lastName\":\"Guilloy\",\"suffix\":\"\"},{\"id\":275431358,\"identity\":\"0d84c99c-6173-4b61-be26-05fc35d07887\",\"order_by\":6,\"name\":\"Xavier Roucou\",\"email\":\"\",\"orcid\":\"https://orcid.org/0000-0001-9370-5584\",\"institution\":\"Université de Sherbrooke\",\"correspondingAuthor\":false,\"prefix\":\"\",\"firstName\":\"Xavier\",\"middleName\":\"\",\"lastName\":\"Roucou\",\"suffix\":\"\"},{\"id\":275431359,\"identity\":\"90c35e91-676f-4336-9d63-6ec0b6a671a0\",\"order_by\":7,\"name\":\"Sven 
Eyckerman\",\"email\":\"\",\"orcid\":\"\",\"institution\":\"VIB Center for Medical Biotechnology, Ghent University\",\"correspondingAuthor\":false,\"prefix\":\"\",\"firstName\":\"Sven\",\"middleName\":\"\",\"lastName\":\"Eyckerman\",\"suffix\":\"\"},{\"id\":275431360,\"identity\":\"a5c08175-740b-474f-9cdd-ce7a0742060f\",\"order_by\":8,\"name\":\"Kris Gevaert\",\"email\":\"\",\"orcid\":\"\",\"institution\":\"\",\"correspondingAuthor\":false,\"prefix\":\"\",\"firstName\":\"Kris\",\"middleName\":\"\",\"lastName\":\"Gevaert\",\"suffix\":\"\"},{\"id\":275431361,\"identity\":\"98e9e2db-2c1c-478f-9a72-20e84d1be46e\",\"order_by\":9,\"name\":\"Michel Salzet\",\"email\":\"\",\"orcid\":\"https://orcid.org/0000-0003-4318-0817\",\"institution\":\"Unversité de Lille, Inserm, CHU Lille\",\"correspondingAuthor\":false,\"prefix\":\"\",\"firstName\":\"Michel\",\"middleName\":\"\",\"lastName\":\"Salzet\",\"suffix\":\"\"}],\"badges\":[],\"createdAt\":\"2024-02-20 10:25:31\",\"currentVersionCode\":1,\"declarations\":\"\",\"doi\":\"10.21203/rs.3.rs-3972487/v1\",\"doiUrl\":\"https://doi.org/10.21203/rs.3.rs-3972487/v1\",\"draftVersion\":[],\"editorialEvents\":[{\"content\":\"https://doi.org/10.1038/s41419-024-07046-1\",\"type\":\"published\",\"date\":\"2024-09-30T04:00:00+00:00\"}],\"editorialNote\":\"\",\"failedWorkflow\":false,\"files\":[{\"id\":54387825,\"identity\":\"a2d0af01-1844-41e2-b84f-e579a8db00fc\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:24:55\",\"extension\":\"png\",\"order_by\":1,\"title\":\"Figure 1\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":2686902,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eLFQ analysis workflow. (A) Illustration of the Proteome Discoverer analysis steps used. Each child processing step corresponds to the interrogation using the cell-specific database. 
(B) Workflow nodes present in each processing child step.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig1.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/9543da51ca00eeb13c215d3f.png\"},{\"id\":54387827,\"identity\":\"09f5a30a-756d-4017-8983-768ddf989f64\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:24:56\",\"extension\":\"png\",\"order_by\":2,\"title\":\"Figure 2\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":2276649,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eDESeq2 transcripts analysis. (A) Venn diagram displaying the number of exclusive and shared transcripts between the three cell lines. (B) Hierarchical clustering heatmap showing the different transcript clusters that can be observed among the three cell lines. Z-score range from -1.3509 (green) to 1.3523 (red).\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig2.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/30f36fed944ba091d8c9d505.png\"},{\"id\":54387834,\"identity\":\"7af0f240-68a8-4667-8075-59203e01cca1\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:24:56\",\"extension\":\"png\",\"order_by\":3,\"title\":\"Figure 3\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":1653884,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eDESeq2 gene analysis. (A) Venn diagram displaying the number of exclusive and shared genes between the three cell lines. (B) Pie chart displaying the ratios of the different types of RNAs sequenced. (C) Hierarchical clustering heatmap showing the different gene clusters that can be observed among the three cell lines. 
Z-score range from -1.351 (green) to 1.3496 (red).\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig3.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/71ba8f3b88d21feaa3fca90e.png\"},{\"id\":54387833,\"identity\":\"5e901e65-19fb-4316-9c31-7326b49ad855\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:24:56\",\"extension\":\"png\",\"order_by\":4,\"title\":\"Figure 4\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":1082734,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eWT and variant proteins predicted by OpenCustomDB. For each cell line and database, the fractions of AltProts, RefProts, novel isoforms and their variants are displayed.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig4.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/91f760fe197cfe035ef8c8f5.png\"},{\"id\":54388456,\"identity\":\"8b95e462-128a-4a93-9a3c-d98bffb3e915\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:32:56\",\"extension\":\"png\",\"order_by\":5,\"title\":\"Figure 5\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":969583,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eTypes of AltProts predicted by OpenCustomDB. The percentages of ncRNA, CDS frameshifts, 3’ and 5’UTR derived AltProts are displayed for each database and cell line.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig5.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/a66ce67c0335cea8161ace07.png\"},{\"id\":54387832,\"identity\":\"1945f359-a1d2-4d10-a36a-6a7dcf5909c2\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:24:56\",\"extension\":\"png\",\"order_by\":6,\"title\":\"Figure 6\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":2996838,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eAnalysis of the identified proteins. 
(A) Venn diagrams displaying the number of exclusive and shared proteins identified between the three cell lines. (B) Bar plot displaying the fractions of WT and variant RefProts, novel isoforms and AltProts identified in each cell line.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig6.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/b0146e217b85962d106d266f.png\"},{\"id\":54387838,\"identity\":\"2eac08ed-e749-4571-91df-2591db8827ae\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:24:57\",\"extension\":\"png\",\"order_by\":7,\"title\":\"Figure 7\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":2735254,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eSubcellular compartment distribution and characteristics of identified AltProts. (A) Venn diagram displaying the distribution of AltProts identified in the different subcellular fractions. (B) RNA origin and (C) molecular weight distribution of the identified AltProts.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig7.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/6d9870ac4e979a45a80b9a8a.png\"},{\"id\":54387835,\"identity\":\"5a922d22-e5aa-4637-85d3-5831ca89f5bb\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:24:56\",\"extension\":\"png\",\"order_by\":8,\"title\":\"Figure 8\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":1783241,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eAltProts with significantly changed levels exclusively in one of two cancerous cell lines or common in both (ANOVA, FDR \\u0026lt;0.05). 
For each cell line, the subcellular compartment, the AltProts upregulated (red) and downregulated (green) are shown.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig8.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/9c34f768d9669b31278672e2.png\"},{\"id\":54387831,\"identity\":\"24ece3cc-cf04-4100-af75-ea7a7f19efd3\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:24:56\",\"extension\":\"png\",\"order_by\":9,\"title\":\"Figure 9\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":2186951,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eRefProts and genes significantly varied (ANOVA, FDR \\u0026lt;0.05) in the central carbon metabolism in the cancer pathway. (A) Central carbon metabolism in cancer, up and downregulation in both cancerous cells. (B) Central carbon metabolism in cancer, up and downregulation only in SKOV-3. (C) Central carbon metabolism in cancer, up and downregulation only in PEO-4.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig9.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/db16ff57181e48d703b02529.png\"},{\"id\":54388457,\"identity\":\"4e094585-2f8e-420b-808f-bc53b3df6fe7\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:32:57\",\"extension\":\"png\",\"order_by\":10,\"title\":\"Figure 10\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":1729861,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eGO molecular function enrichment network generated with ClueGO in Cytoscape. GO enrichment was generated from the accession numbers of Supplemental figure 4. AltProts are marked in orange and RefProts in blue. Enriched GO terms are displayed as hexagons. 
KEGG pathways are displayed as octagons and crosslinks are marked in blue (SKOV-3 cells), purple (PEO-4 cells) and green (T1074 cells) dashed lines.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig10.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/bcbd3234ea8348cb36607ccc.png\"},{\"id\":54388455,\"identity\":\"9c15f877-c1c9-4cda-abea-2795dad915aa\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:32:56\",\"extension\":\"png\",\"order_by\":11,\"title\":\"Figure 11\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":547713,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eSynthesis of AltProts from the DHX8 gene (A) List of transcripts referenced in Ensembl. (B) Zoom on DHX8-204 described to translate to “K7EJH9”, a predicted protein from TrEMBL without the 5’UTR part or methionine as the first amino acid, when IP_715944 is described from the overlap between the CDS and the 3’UTR. As a result, the mutation is only observable by the proteogenomic construction, as it would be considered a silent mutation due to its position in the UTR part of K7EJH9.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig11.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/98b5aee217492351fad0c916.png\"},{\"id\":54387837,\"identity\":\"edefe0a6-525d-4e6a-af88-0f51874acf01\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:24:56\",\"extension\":\"png\",\"order_by\":12,\"title\":\"Figure 12\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":2479312,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eIP_183088 (AltMAPK8) predicted models docked to the human polymerase delta holoenzyme complex. (A) The interaction of IP_183088 and the full POLD complex is shown. The distance between the two lysines involved in the crosslink is 24.59 Å. (B) Zoom of the interaction of IP_183088 and POLD3. 
The surface representation shows the possible placement of IP_183088 at the interaction site of POLD3 and POLD2.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"Fig12.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/28008f27dff2753351604d7e.png\"},{\"id\":65671327,\"identity\":\"f0f1c326-fdf9-405b-9ebe-a72fc320ca4a\",\"added_by\":\"auto\",\"created_at\":\"2024-10-01 07:10:32\",\"extension\":\"pdf\",\"order_by\":0,\"title\":\"\",\"display\":\"\",\"copyAsset\":false,\"role\":\"manuscript-pdf\",\"size\":23696559,\"visible\":true,\"origin\":\"\",\"legend\":\"\",\"description\":\"\",\"filename\":\"manuscript.pdf\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/7a0a6369-0794-4cd0-bf31-e9db47b9b1c0.pdf\"},{\"id\":54387826,\"identity\":\"8e36818d-c1de-4ee1-b137-1e37cb51d7d8\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:24:55\",\"extension\":\"xlsx\",\"order_by\":1,\"title\":\"\",\"display\":\"\",\"copyAsset\":false,\"role\":\"supplement\",\"size\":7511812,\"visible\":true,\"origin\":\"\",\"legend\":\"\",\"description\":\"\",\"filename\":\"SupplementalTables.xlsx\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/c92128284bb977027a6869fc.xlsx\"},{\"id\":54387828,\"identity\":\"f94cf099-dc08-4798-a307-b268ed89aeb4\",\"added_by\":\"auto\",\"created_at\":\"2024-04-09 18:24:56\",\"extension\":\"docx\",\"order_by\":2,\"title\":\"\",\"display\":\"\",\"copyAsset\":false,\"role\":\"supplement\",\"size\":2961681,\"visible\":true,\"origin\":\"\",\"legend\":\"\",\"description\":\"\",\"filename\":\"Supplementalfigures.docx\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3972487/v1/605924c98016b4bd1a26950a.docx\"}],\"financialInterests\":\"There is no duality of interest\",\"formattedTitle\":\"Deciphering the ghost proteome in ovarian cancer cells by deep proteogenomic characterization\",\"fulltext\":[{\"header\":\"INTRODUCTION\",\"content\":\"\\u003cp\\u003eHistorically, protein sequence databases 
have only considered proteins to originate from the coding regions of mRNA molecules (CDS) (\\u003cspan citationid=\\\"CR1\\\" class=\\\"CitationRef\\\"\\u003e1\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR2\\\" class=\\\"CitationRef\\\"\\u003e2\\u003c/span\\u003e). However, we now know that the sequences of many products of transcript translation are not stored in such databases (\\u003cspan citationid=\\\"CR3\\\" class=\\\"CitationRef\\\"\\u003e3\\u003c/span\\u003e). Such translated transcripts include small open reading frames (smORFs) (\\u003cspan additionalcitationids=\\\"CR5\\\" citationid=\\\"CR4\\\" class=\\\"CitationRef\\\"\\u003e4\\u003c/span\\u003e\\u0026ndash;\\u003cspan citationid=\\\"CR6\\\" class=\\\"CitationRef\\\"\\u003e6\\u003c/span\\u003e), which translate to short encoding proteins (SEPs) (\\u003cspan citationid=\\\"CR7\\\" class=\\\"CitationRef\\\"\\u003e7\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR8\\\" class=\\\"CitationRef\\\"\\u003e8\\u003c/span\\u003e) with a length of less than 100 amino acids. Additionally, alternative proteins (AltProts) (\\u003cspan additionalcitationids=\\\"CR10\\\" citationid=\\\"CR9\\\" class=\\\"CitationRef\\\"\\u003e9\\u003c/span\\u003e\\u0026ndash;\\u003cspan citationid=\\\"CR11\\\" class=\\\"CitationRef\\\"\\u003e11\\u003c/span\\u003e) are translated from alternative ORFs (AltORFs) present in non-coding regions, including the 5' and 3'UTRs, overlapping a CDS in a\\u0026thinsp;+\\u0026thinsp;1 or +\\u0026thinsp;2 reading frame, or present in non-coding RNAs (ncRNAs). In contrast to SEPs, AltProts are not limited to a maximum length of 100 amino acids. Synthesis of such proteins may result from leaky scanning and reinitiation of ribosomes as described by Marilyn Kozak (\\u003cspan citationid=\\\"CR12\\\" class=\\\"CitationRef\\\"\\u003e12\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR13\\\" class=\\\"CitationRef\\\"\\u003e13\\u003c/span\\u003e). 
However, such underlying mechanisms remain poorly understood and, importantly, they were not considered when the first protein databases were built, which explains the absence of many protein sequences from the most commonly used protein sequence databases such as Swiss-Prot. Nevertheless, an effort has been made to make such databases more comprehensive, notably by integrating predicted protein sequences (TrEMBL) (\\u003cspan citationid=\\\"CR14\\\" class=\\\"CitationRef\\\"\\u003e14\\u003c/span\\u003e), which increase the size of the (theoretical) proteome. Yet, the prediction rules used are restrictive and do not consider the concept of AltProts. To tackle this, databases holding predicted sequences for AltProts, such as OpenProt (\\u003cspan citationid=\\\"CR9\\\" class=\\\"CitationRef\\\"\\u003e9\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR15\\\" class=\\\"CitationRef\\\"\\u003e15\\u003c/span\\u003e), have been created. With such databases, AltProts can now be identified from bottom-up proteomic datasets. However, although such databases consider the presence of the \\\"ghost proteome\\\", they consider neither mutations nor the transcriptomic expression of samples. To overcome these limitations, OpenCustomDB (\\u003cspan citationid=\\\"CR16\\\" class=\\\"CitationRef\\\"\\u003e16\\u003c/span\\u003e) is a new tool that uses RNA-seq data to generate sample-specific protein sequence databases incorporating AltProts and their genetic variants. Such a proteogenomic approach, coupled with AltProt research, is therefore expected to provide more comprehensive views on cellular proteomes.\\u003c/p\\u003e \\u003cp\\u003eAltProts are ubiquitously expressed in cells and can carry physiological functions (\\u003cspan citationid=\\\"CR17\\\" class=\\\"CitationRef\\\"\\u003e17\\u003c/span\\u003e). 
Several AltProts have been linked to pathways such as protein synthesis (\\u003cspan additionalcitationids=\\\"CR19\\\" citationid=\\\"CR18\\\" class=\\\"CitationRef\\\"\\u003e18\\u003c/span\\u003e\\u0026ndash;\\u003cspan citationid=\\\"CR20\\\" class=\\\"CitationRef\\\"\\u003e20\\u003c/span\\u003e), DNA repair (\\u003cspan citationid=\\\"CR8\\\" class=\\\"CitationRef\\\"\\u003e8\\u003c/span\\u003e) and innate immunity (\\u003cspan citationid=\\\"CR17\\\" class=\\\"CitationRef\\\"\\u003e17\\u003c/span\\u003e). AltProts have also been linked to pathologies (\\u003cspan citationid=\\\"CR21\\\" class=\\\"CitationRef\\\"\\u003e21\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR22\\\" class=\\\"CitationRef\\\"\\u003e22\\u003c/span\\u003e) such as cancers (glioblastoma, breast, ovarian and colorectal cancer) (\\u003cspan additionalcitationids=\\\"CR24 CR25\\\" citationid=\\\"CR23\\\" class=\\\"CitationRef\\\"\\u003e23\\u003c/span\\u003e\\u0026ndash;\\u003cspan citationid=\\\"CR26\\\" class=\\\"CitationRef\\\"\\u003e26\\u003c/span\\u003e) and amyotrophic lateral sclerosis (Alt-FUS) (\\u003cspan citationid=\\\"CR27\\\" class=\\\"CitationRef\\\"\\u003e27\\u003c/span\\u003e). 
Although their identification has been facilitated by specific enrichment and detection strategies (\\u003cspan citationid=\\\"CR19\\\" class=\\\"CitationRef\\\"\\u003e19\\u003c/span\\u003e, \\u003cspan additionalcitationids=\\\"CR29\\\" citationid=\\\"CR28\\\" class=\\\"CitationRef\\\"\\u003e28\\u003c/span\\u003e\\u0026ndash;\\u003cspan citationid=\\\"CR30\\\" class=\\\"CitationRef\\\"\\u003e30\\u003c/span\\u003e), the functions of the vast majority of AltProts remain to be elucidated; targeted approaches have nevertheless shed light on the function of a few AltProts (\\u003cspan citationid=\\\"CR20\\\" class=\\\"CitationRef\\\"\\u003e20\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR29\\\" class=\\\"CitationRef\\\"\\u003e29\\u003c/span\\u003e, \\u003cspan additionalcitationids=\\\"CR32\\\" citationid=\\\"CR31\\\" class=\\\"CitationRef\\\"\\u003e31\\u003c/span\\u003e\\u0026ndash;\\u003cspan citationid=\\\"CR33\\\" class=\\\"CitationRef\\\"\\u003e33\\u003c/span\\u003e). Recently, we have demonstrated the effectiveness of a protein crosslink strategy coupled to mass spectrometry (XL-MS) to annotate AltProt functions. XL-MS enabled us to identify interactions that are very close in space, from 5.3 \\u0026Aring; (\\u003cspan citationid=\\\"CR34\\\" class=\\\"CitationRef\\\"\\u003e34\\u003c/span\\u003e) to 30 \\u0026Aring; (\\u003cspan citationid=\\\"CR35\\\" class=\\\"CitationRef\\\"\\u003e35\\u003c/span\\u003e), and by identifying crosslinked peptides between AltProts and known proteins, it improved our understanding of the function of these new proteins.\\u003c/p\\u003e \\u003cp\\u003eOvarian cancer (OvCa) is considered a stealth killer due to its misdiagnosis and extended chemoresistance to treatment. In 2021, OvCa was the 8th most frequently diagnosed cancer and the 8th most frequent cause of cancer death in women (\\u003cspan citationid=\\\"CR36\\\" class=\\\"CitationRef\\\"\\u003e36\\u003c/span\\u003e). The high mortality rate of OvCa is related to its late detection. 
In the initial stages of the pathology, only a few nonspecific symptoms are evident and diagnostic methods are not sufficiently effective (\\u003cspan citationid=\\\"CR37\\\" class=\\\"CitationRef\\\"\\u003e37\\u003c/span\\u003e). The current standard treatment is based on surgery or chemotherapy. For advanced stage tumours, debulking surgery and subsequent adjuvant chemotherapy are needed (carboplatin combined with paclitaxel is most commonly used). With this combination of treatments, up to 80% of patients will go into remission, but around 65% will relapse. Radical strategies such as oophorectomy and salpingectomy are recommended for avoiding recurrence (\\u003cspan citationid=\\\"CR38\\\" class=\\\"CitationRef\\\"\\u003e38\\u003c/span\\u003e).\\u003c/p\\u003e \\u003cp\\u003eThe Kyoto Encyclopedia of Genes and Genomes (KEGG) (\\u003cspan citationid=\\\"CR39\\\" class=\\\"CitationRef\\\"\\u003e39\\u003c/span\\u003e) summarizes the different metabolic pathways involved in cancer. Among them, central carbon metabolism in cancer (hsa05230) summarizes the metabolic changes that take place in cancer cells to facilitate their growth and survival (\\u003cspan citationid=\\\"CR40\\\" class=\\\"CitationRef\\\"\\u003e40\\u003c/span\\u003e). This pathway involves the conversion of glucose and glutamine into intermediate molecules, which are then used to synthesize the necessary macromolecules for the replication of cancer cell biomass and genome. The Warburg effect (\\u003cspan citationid=\\\"CR41\\\" class=\\\"CitationRef\\\"\\u003e41\\u003c/span\\u003e), a key feature of this pathway, is characterized by the heightened utilization of glucose and glutamine by cancer cells. This phenomenon describes the extensive glucose consumption, high rates of glycolysis, and conversion of a significant portion of glucose into lactic acid even in the presence of sufficient oxygen (\\u003cspan citationid=\\\"CR42\\\" class=\\\"CitationRef\\\"\\u003e42\\u003c/span\\u003e). 
More recently, it has been realized that the Warburg effect also encompasses an increased reliance on glutamine. Numerous other oncogenes and tumour suppressor genes cluster along the signalling pathways that regulate c-MYC, HIF-1 and p53 (\\u003cspan citationid=\\\"CR40\\\" class=\\\"CitationRef\\\"\\u003e40\\u003c/span\\u003e).\\u003c/p\\u003e \\u003cp\\u003eWe hypothesized that molecular characterization of OvCa at the proteomic level might help to improve patient care and treatment. In this context, studying AltProts may shed light on mechanisms that are not yet completely understood but have an impact on OvCa pathology. Therefore, we here describe a proteogenomic approach to characterize the ghost proteome of two OvCa cell lines and an immortalized epithelial ovarian cell line. This approach allowed us to identify differential expression of RefProts, novel isoforms, AltProts and their transcripts. Additionally, the subcellular location, characteristics and interactors of several AltProts were mapped.\\u003c/p\\u003e\"},{\"header\":\"MATERIAL AND METHODS\",\"content\":\"\\u003cdiv id=\\\"Sec3\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eCell culture\\u003c/h2\\u003e \\u003cp\\u003eHuman PEO-4 ovarian cancer cells were cultured in Roswell Park Memorial Institute (RPMI) 1640 medium (Thermo Fisher Scientific), supplemented with 10% fetal bovine serum (Thermo Fisher Scientific), 2 mM L-glutamine (Thermo Fisher Scientific) and 100 U/mL penicillin-streptomycin (Thermo Fisher Scientific). Human SKOV-3 ovarian cancer cells were cultured in McCoy's 5A (modified) medium (Thermo Fisher Scientific), supplemented with 10% fetal bovine serum and 100 U/mL penicillin-streptomycin. Human immortalized ovarian epithelial cells SV-40 (T1074) were cultured in Prigrow I medium (Applied Biological Materials), supplemented with 10% fetal bovine serum and 100 U/mL penicillin-streptomycin. 
The three cell lines were grown in a humidified air incubator at 37\\u0026deg;C under an atmosphere of 5% CO\\u003csub\\u003e2\\u003c/sub\\u003e. Aliquots of three million cells were harvested by trypsin-EDTA (0.05%, phenol red) (Thermo Fisher Scientific), centrifuged at 100 x g for 5 min at 20\\u0026deg;C and washed three times with DPBS (Thermo Fisher Scientific).\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec4\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eCell line specific database creation\\u003c/h2\\u003e \\u003cp\\u003e \\u003cem\\u003eTotal RNA sequencing (RNA-Seq).\\u003c/em\\u003e RNA was extracted from four replicates of three million cells from each cell line employing the NucleoSpin RNA Mini kit for RNA purification (MACHEREY-NAGEL), following the vendor\\u0026rsquo;s protocol. 1 \\u0026micro;g of RNA was utilized for library preparation using RiboNaut rRNA Depletion Kit and Rapid Directional RNAseq Kit 2.0 (PerkinElmer). Nine cycles of PCR were performed during this preparation. Library sequencing was carried out using the NovaSeq6000 sequencing platform (Illumina; SP flow cell) following a 2x75 paired-end mode. Demultiplexing was performed using bcl2fastq v2.20.0.422. Subsequent fastq trimming utilized trimmomatic v0.39 with parameters MINLEN:35 and AVGQUAL:20. The mapping and counting steps were executed using RSEM v1.3.1 along with STAR v2.7.3a, referencing genome version hg38 and GTF from Gencode v39. 
Differential analysis was conducted through DESeq2 v1.24.0, employing R v3.6.3.\\u003c/p\\u003e \\u003cp\\u003e \\u003cem\\u003eCustomized protein database generation with OpenCustomDB.\\u003c/em\\u003e RNA-Seq reads were aligned to the reference genome GRCh38.p12 using STAR version 2.7.3a with default parameters except for \\u0026lsquo;\\u0026ndash;outSAMprimaryFlag: AllBestScore,\\u0026ndash;outFilterMismatchNmax: 5, \\u0026ndash;alignSJoverhangMin 10, \\u0026ndash;alignMatesGapMax 200 000, \\u0026ndash;alignIntronMax 200 000, \\u0026ndash;alignSJstitchMismatchNmax \\u0026ldquo;5\\u0026thinsp;\\u0026minus;\\u0026thinsp;1 5 5\\u0026rdquo;,\\u0026ndash;bamRemoveDuplicatesType UniqueIdenticalNotMulti\\u0026rsquo;. Transcript expression was quantified in transcripts per million (tpm) with Kallisto version 0.46.0 with default parameters. Variant calling files (VCF) were generated from BAM files with FreeBayes version 1.3.1 with the setting \\u0026ldquo;\\u0026ndash;min-alternate-count\\u0026rdquo; set to 5. SNPs and Indels with FreeBayes quality of less than 20 were filtered out with an internal Python script. Variations were inserted in the corresponding transcripts with the variant annotator OpenVar. Next, the transcripts quantified by Kallisto were arranged in descending order based on their expression level (top 100,000 transcripts). Subsequently, OpenProt-annotated proteins linked to these transcripts were incorporated into the customized database until 100,000 entries (100K DB) were reached, as described by Guilloy \\u003cem\\u003eet al\\u003c/em\\u003e. (\\u003cspan citationid=\\\"CR16\\\" class=\\\"CitationRef\\\"\\u003e16\\u003c/span\\u003e). 
Upon adding a protein variant to the database, the corresponding reference protein without any variation was simultaneously included to account for potential heterozygosity.\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec5\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eChemical protein cross-linking and subcellular fractionation\\u003c/h2\\u003e \\u003cp\\u003e \\u003cem\\u003eIn cellulo chemical cross-linking.\\u003c/em\\u003e The cross-linking methodology was described in Garcia-del Rio \\u003cem\\u003eet al.\\u003c/em\\u003e (\\u003cspan citationid=\\\"CR17\\\" class=\\\"CitationRef\\\"\\u003e17\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR30\\\" class=\\\"CitationRef\\\"\\u003e30\\u003c/span\\u003e). To prepare a 50 mM stock solution of disuccinimidyl sulfoxide (DSSO, Thermo Fisher Scientific), dry DMSO (Sigma-Aldrich) was used. Three million cells of each cell line were resuspended in 200 \\u0026micro;L of DPBS. The crosslinking reaction was carried out with 2 mM of DSSO (final concentration) at 37\\u0026deg;C with end-over-end stirring. After one hour, the reaction was quenched by adding 10 \\u0026micro;L of 500 mM Tris-HCl pH 8.5 and gently stirring for 30 min.\\u003c/p\\u003e \\u003cp\\u003e \\u003cem\\u003eProtein subcellular fractionation.\\u003c/em\\u003e The subcellular fractionation methodology was also described in our previous work (\\u003cspan citationid=\\\"CR17\\\" class=\\\"CitationRef\\\"\\u003e17\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR30\\\" class=\\\"CitationRef\\\"\\u003e30\\u003c/span\\u003e). In brief, three replicates of three million cells that underwent crosslinking were pelleted and the supernatant was removed. The Subcellular Protein Fractionation Kit for Cultured Cells (Thermo Fisher Scientific) was used to isolate five distinct protein cell compartments: cytoplasmic, membrane, nuclear, chromatin-bound and cytoskeletal proteins. 
Each fraction was extracted following the manufacturer\\u0026rsquo;s instructions and stored at -80\\u0026deg;C until use.\\u003c/p\\u003e \\u003cp\\u003e \\u003cem\\u003eFilter Aided Sample Preparation (FASP) and digestion.\\u003c/em\\u003e Each subcellular fraction was transferred to a 50 kDa molecular weight cut-off Amicon filter (Merck) and concentrated by centrifugation (14,000 x g for 15 min at 4\\u0026deg;C). Proteins were denatured by adding 100 \\u0026micro;L of a denaturing buffer (8 M urea (Euromedex), 100 mM Tris-HCl (Interchim), pH 8.5). Reduction was performed by adding 100 \\u0026micro;L of 100 mM dithiothreitol (VWR Life Science) in the denaturing buffer and incubating at 56\\u0026deg;C for 40 min. Alkylation was then done by adding 100 \\u0026micro;L of 50 mM iodoacetamide (Sigma-Aldrich) in the denaturing buffer at room temperature (RT) for 30 min in the dark. After alkylation, three washes with 200 \\u0026micro;L of 50 mM ammonium bicarbonate buffer were performed. Sequential digestion was performed in each fraction by adding 40 \\u0026micro;L of 40 ng/\\u0026micro;L trypsin/Lys-C Mix, Mass Spec Grade (Promega) to the Amicon filter and incubating at 37\\u0026deg;C overnight, followed by 25 \\u0026micro;L of 40 ng/\\u0026micro;L chymotrypsin, Sequencing Grade (Promega) at room temperature for 4 h. The resulting peptides were then recovered by adding 50 \\u0026micro;L of ammonium bicarbonate buffer and centrifuging for 15 min at 14,000 x g. Finally, this flowthrough was acidified with 0.1% TFA (Sigma-Aldrich) and vacuum dried.\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec6\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eNano LC-MS/MS analysis\\u003c/h2\\u003e \\u003cp\\u003eThe peptides of each replicate were suspended in 20 \\u0026micro;L of 0.1% TFA and desalted using a ZipTip with C18 resin (Merck), following the manufacturer's instructions. 
Afterwards, the samples were vacuum-dried and resuspended in 20 \\u0026micro;L of a solution containing acetonitrile (ACN, Carlo Erba Reagents) and 0.1% formic acid (2:98 v/v, TCI America). Five microliters of the resulting peptide solution were analysed on a nanoAcquity (Waters) coupled to a Q Exactive mass spectrometer (Thermo Fisher Scientific), as described in (\\u003cspan citationid=\\\"CR24\\\" class=\\\"CitationRef\\\"\\u003e24\\u003c/span\\u003e).\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec7\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eLabel-free quantification (LFQ) data analysis\\u003c/h2\\u003e \\u003cp\\u003e \\u003cem\\u003eProcessing workflow.\\u003c/em\\u003e The raw data obtained by nanoLC-MS/MS analysis were analysed using Proteome Discoverer V2.5 (Thermo Fisher Scientific). For each subcellular compartment, a different LFQ analysis was performed. Here, three processing steps (for each cell line\\u0026rsquo;s replicates) were employed using the Minora Feature Detector and three iterative Sequest HT nodes (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig1\\\" class=\\\"InternalRef\\\"\\u003e1\\u003c/span\\u003eA). The detailed parameters of the Sequest HT node are described in (\\u003cspan citationid=\\\"CR30\\\" class=\\\"CitationRef\\\"\\u003e30\\u003c/span\\u003e). In the first Sequest HT node, the top 100,000 sequences derived from RNA-seq experiments (100K DB) were utilized. Next, a Percolator node with a relaxed 0.05 FDR and strict 0.01 FDR was applied. A spectrum confidence filter was applied before moving on to the next Sequest HT node, discarding any spectra with a confidence rating worse than high. In the second Sequest HT node, the full transcript-derived database (Full DB) from OpenCustomDB was used, minus the sequences contained in the 100K DB. The same parameters were used for a second Percolator node and spectrum confidence filter. 
Finally, in the third Sequest HT node, OpenProt was used to interrogate the sequences not found in the two previous databases (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig1\\\" class=\\\"InternalRef\\\"\\u003e1\\u003c/span\\u003eB).\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003e \\u003cem\\u003eConsensus workflow.\\u003c/em\\u003e The five different subcellular fractionation MSF files were subjected to independent consensus workflows. At the feature mapper node, chromatographic alignment was performed with a maximum retention time shift of 10 min, a 10 ppm mass tolerance and coarse tuning. Unique and razor peptides were used at the precursor ions quantifier node. Protein groups were considered for peptide uniqueness and shared quant results were used. Precursor abundance was based on intensity without any threshold. Total peptide amount was used for normalization mode without scaling mode. All peptides were used for normalization and protein roll-up. Modified peptides (methionine oxidation, N-terminus acetylation and cysteine carbamidomethylation) were excluded for pairwise ratios. At the PSM grouper node, the site probability threshold was set to 75. The strict and relaxed FDRs were set at 0.01 and 0.05, respectively, at the peptide validator node. Validation was based on the q-value, and automatic target/decoy selection was used for PSM level FDR calculation based on score. At the peptides and protein filter node, the peptide confidence was set to medium with six amino acids per peptide. Additionally, a minimum of one peptide was set. A strict (0.01) and relaxed (0.05) FDR confidence threshold were set at the protein FDR validator. The results were filtered for RefProts, AltProts and novel isoforms (\\u003cspan citationid=\\\"CR9\\\" class=\\\"CitationRef\\\"\\u003e9\\u003c/span\\u003e). Briefly, a RefProt is a protein matching a NCBI RefSeq, Ensembl or UniProt protein entry. 
A novel isoform is a protein encoded by the same gene as a RefProt with a significant level of identity (over 80% of protein sequence identity with the RefProt over 50% of the length). An AltProt does not have any significant similarity with a RefProt.\\u003c/p\\u003e \\u003cp\\u003e \\u003cem\\u003eProtein identification.\\u003c/em\\u003e The master protein files were uploaded as a text file to Perseus v.1.6.10.43. The abundance matrix was annotated into three categories based on the cell lines used: SKOV-3, PEO-4 and T1074. Next, to count as an identification, a protein needed to be identified in 70% of the replicates from at least one cell line, and the groups were averaged. A numeric Venn diagram was used to identify the unique RefProts, AltProts and novel isoforms in each compartment for each cell line.\\u003c/p\\u003e \\u003cp\\u003e \\u003cem\\u003eStatistical analysis workflow.\\u003c/em\\u003e The master protein files were uploaded as a text file to Perseus v.1.6.10.43. As a first step, log2 transformation and categorical annotation were performed on the normalized abundance values matrix, with cell lines SKOV-3, PEO-4 and T1074. To be considered a valid identification, a protein needed to be identified in 70% of the replicates from each cell line. Moreover, missing values were replaced with low values of the normal distribution. An ANOVA multiple sample test was performed using a Benjamini-Hochberg FDR q-value cutoff of 0.05. Non-significant values were filtered out, and a Z-score processing was applied without grouping. To ensure quality control, a principal component analysis (PCA) was conducted with a Benjamini-Hochberg FDR cutoff of 0.05. 
Finally, hierarchical clustering employing Pearson correlation was applied to the averaged Z-scores to identify the different protein clusters.\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec8\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eCrosslinking data analysis\\u003c/h2\\u003e \\u003cp\\u003e \\u003cem\\u003eProcessing workflow.\\u003c/em\\u003e The RAW data obtained by nano LC-MS/MS analysis were analysed using Proteome Discoverer V2.5 (Thermo Fisher Scientific). The detailed parameters for the Sequest HT and XlinkX nodes are described in (\\u003cspan citationid=\\\"CR24\\\" class=\\\"CitationRef\\\"\\u003e24\\u003c/span\\u003e). The triple Sequest HT nodes mentioned earlier were utilized. Instead of a percolator, a target decoy PSM validator was used after each Sequest HT node. A concatenated target decoy strategy was employed, with strict (0.01) and relaxed (0.05) FDR targets.\\u003c/p\\u003e \\u003cp\\u003e \\u003cem\\u003eConsensus workflow.\\u003c/em\\u003e The resulting crosslinking MSF files were submitted to a consensus workflow of which the parameters are described in detail in (\\u003cspan citationid=\\\"CR24\\\" class=\\\"CitationRef\\\"\\u003e24\\u003c/span\\u003e).\\u003c/p\\u003e \\u003c/div\\u003e\"},{\"header\":\"RESULTS\",\"content\":\"\\u003cp\\u003eFor this study, we selected three cell line models to investigate differences in the reference proteome, novel isoforms and the alternative proteome. Two of these cell lines (PEO-4 and SKOV-3 cells) are derived from ascitic fluid from ovarian adenocarcinomas. Particularly, PEO-4 cells have a high-grade serous histology and were collected after clinical resistance from a patient who previously received cisplatin, 5-fluorouracil and chlorambucil treatment (\\u003cspan citationid=\\\"CR43\\\" class=\\\"CitationRef\\\"\\u003e43\\u003c/span\\u003e). 
PEO-4 cells have been xenografted into immunodeficient mice and found to be tumorigenic (\\u003cspan citationid=\\\"CR44\\\" class=\\\"CitationRef\\\"\\u003e44\\u003c/span\\u003e). SKOV-3 cells are clear cell carcinoma cells and resistant to tumour necrosis factor, diphtheria toxin, cisplatin and adriamycin (\\u003cspan citationid=\\\"CR45\\\" class=\\\"CitationRef\\\"\\u003e45\\u003c/span\\u003e). According to Hernandez \\u003cem\\u003eet al.\\u003c/em\\u003e (\\u003cspan citationid=\\\"CR46\\\" class=\\\"CitationRef\\\"\\u003e46\\u003c/span\\u003e) and Hallas-Potts \\u003cem\\u003eet al.\\u003c/em\\u003e (\\u003cspan citationid=\\\"CR47\\\" class=\\\"CitationRef\\\"\\u003e47\\u003c/span\\u003e), PEO-4 cells have a lower tumorigenicity than SKOV-3 cells when injected in nude mice. The T1074 ovarian epithelial cell line was originally derived from normal human ovarian surface epithelial cells and immortalized by SV40 virus.\\u003c/p\\u003e \\u003cdiv id=\\\"Sec10\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eDifferential gene expression analysis\\u003c/h2\\u003e \\u003cp\\u003eRNA-Seq data are required to generate custom databases using OpenCustomDB. The same reads can also be used to assess differential gene expression. Mapping the RNA-Seq reads to the genome using RSEM and STAR enabled the identification of 117,636 transcripts expressed in 70% of four replicates between cell lines. Of these, 96,442 transcripts were shared by the three cell lines. Additionally, 1567, 2391, and 1780 transcripts were only identified in T1074, PEO-4 and SKOV-3 cells respectively (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig2\\\" class=\\\"InternalRef\\\"\\u003e2\\u003c/span\\u003eA). Total RNA-seq data analysis showed that 37,197 transcripts were differentially expressed (DESeq2, FDR\\u0026thinsp;\\u0026lt;\\u0026thinsp;0.05). 
Hierarchical clustering (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig2\\\" class=\\\"InternalRef\\\"\\u003e2\\u003c/span\\u003eB and Supplemental Table\\u0026nbsp;1) indicated six main transcript clusters: upregulation in PEO-4 (cluster 1, 3117 transcripts), in SKOV-3 (cluster 2, 3220 transcripts), or in both PEO-4 and SKOV-3 (cluster 3, 1138 transcripts); and downregulation in SKOV-3 (cluster 4), in PEO-4 (cluster 5), and in both cancerous cells (cluster 6, 12,129 transcripts).\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003eMapping RNA-Seq reads to the human genome hg38 allowed us to find 29,245 expressed genes among the three cell lines. Among these expressed genes, 420, 407 and 540 were identified to be specific for T1074, SKOV-3 and PEO-4 cells respectively (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig3\\\" class=\\\"InternalRef\\\"\\u003e3\\u003c/span\\u003eA). Figure\\u0026nbsp;\\u003cspan refid=\\\"Fig3\\\" class=\\\"InternalRef\\\"\\u003e3\\u003c/span\\u003eB displays the different categories of annotated genes: the largest category was non-coding genes (pseudogenes and lncRNAs, 60.9%), while approximately 37% of the genes were annotated as coding genes. Hierarchical clustering was performed on the expression values obtained from the DESeq2 workflow. A total of 17,368 genes were identified as significantly differentially expressed between the three cell lines (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig3\\\" class=\\\"InternalRef\\\"\\u003e3\\u003c/span\\u003eC and Supplemental Table\\u0026nbsp;2), and of these, 2142 and 1949 genes were upregulated in PEO-4 and SKOV-3 cells respectively. On the other hand, 3345 and 2692 genes were downregulated in PEO-4 and SKOV-3 cells respectively. 
Between the two cancerous cell lines, 632 genes were identified as upregulated and 6608 as downregulated.\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec11\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eRNA-Seq based databases\\u003c/h2\\u003e \\u003cp\\u003eWe used RNA-Seq data from the ovarian epithelial cell (T1074) and the OvCa cell lines (PEO-4 and SKOV-3) to generate two cell-specific protein databases for each cell line. Figure\\u0026nbsp;\\u003cspan refid=\\\"Fig4\\\" class=\\\"InternalRef\\\"\\u003e4\\u003c/span\\u003e summarizes the protein types of the sequences stored in these databases. The distribution is similar for the three cell lines used and the custom 100K DB contained around 15% of wild-type (WT) RefProts, 2% of variant RefProts, 5% of WT novel isoforms, less than 1% of variant novel isoforms, 73% of WT AltProts and 5% of variant AltProts (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig4\\\" class=\\\"InternalRef\\\"\\u003e4\\u003c/span\\u003e).\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003eThe OpenCustomDB workflow was used to generate comprehensive transcript databases (Full DB) without limiting the maximum number of entries to 100,000. These databases included 448,569, 443,177 and 437,568 entries for T1074, PEO-4 and SKOV-3 cells, respectively. For example, for T1074 cells, 68,759 WT RefProts (15.33%), 5366 variant RefProts (1.2%), 43,609 WT novel isoforms (9.7%), 2529 variant isoforms (0.6%), 319,612 WT AltProts (71.3%) and 8694 variant AltProts (1.9%) were stored in the database. Similar ratios were observed for PEO-4 and SKOV-3 cells (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig4\\\" class=\\\"InternalRef\\\"\\u003e4\\u003c/span\\u003e).\\u003c/p\\u003e \\u003cp\\u003eOf the AltProts predicted, we mapped their transcriptomic origin by extracting information from OpenProt (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig5\\\" class=\\\"InternalRef\\\"\\u003e5\\u003c/span\\u003e). 
AltORFs overlapping a CDS in a shifted reading frame, or in 3\\u0026rsquo;UTRs and ncRNA were found to be the main sources of predicted AltProts.\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003eAdditionally, a comparison was performed between the databases across the three cell lines (see Supplemental Fig.\\u0026nbsp;1). In total, 282,287 AltProts were found to overlap across the three cell lines, and 15,109, 11,026 and 8897 unique AltProts were predicted in T1074, PEO-4 and SKOV-3 cells, respectively. Among the cancerous cell lines, 8055 AltProts were found to overlap. Approximately 39,000 sequences of novel isoforms were predicted to be shared across the three cell lines, with specific novel isoforms also identified in each cell line and in both cancerous cells. Almost 60,000 RefProts were found to overlap across all cell lines, with approximately 6000 being specific for each cell line. The same analysis was performed on the 100K DB, with 52,483 AltProts, 3116 novel isoforms and 10,346 RefProts being predicted to overlap across all three cell lines. A main advantage of these databases is that they contain predicted AltProt variants specific to each sample; for instance, 4321 specific AltProt variants were predicted for PEO-4 cells, 4355 for SKOV-3 cells and 3540 for T1074 cells. This also shows that both cancerous cells have an increased number of transcript variants, which may be translated into mutated AltProts.\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec12\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eProteome analysis of subcellular compartments\\u003c/h2\\u003e \\u003cp\\u003eTo evaluate the deeper differences in the proteome of these three cell lines, the MS/MS data sets obtained from analysing each subcellular proteome of the three cell lines were analysed using Proteome Discoverer V2.5. 
Three different child processing workflows that contained three sequential Sequest HT (\\u003cspan citationid=\\\"CR48\\\" class=\\\"CitationRef\\\"\\u003e48\\u003c/span\\u003e) nodes were used with the databases as described in the \\u003cspan refid=\\\"Sec2\\\" class=\\\"InternalRef\\\"\\u003ematerial and methods\\u003c/span\\u003e section. We considered a protein as identified when it was present in at least one subcellular compartment in 70% of the replicates of at least one cell line. Figure\\u0026nbsp;\\u003cspan refid=\\\"Fig6\\\" class=\\\"InternalRef\\\"\\u003e6\\u003c/span\\u003eA displays the distributions of all identified proteins. 6301 RefProts were identified in T1074 cells, 6268 in PEO-4 cells and 6319 in SKOV-3 cells. Among the identified RefProts, 234 (T1074 cells), 224 (PEO-4 cells) and 233 (SKOV-3 cells) were variants of RefProts. In addition, 137 novel isoforms were identified in T1074 cells, and 136 in PEO-4 and SKOV-3 cells. A total of 8 variants of novel isoforms were annotated in T1074 cells, and 9 in SKOV-3 and PEO-4 cells. Finally, over 500 AltProts were identified in each cell line with similar numbers of AltProts identified in SKOV-3 cells (577), T1074 (556) and PEO-4 cells (549). The number of AltProt variants identified was 12 for PEO-4 cells, and 13 for T1074 and SKOV-3 cells. Additionally, the distribution of WT and variant proteins is shown in Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig6\\\" class=\\\"InternalRef\\\"\\u003e6\\u003c/span\\u003eB.\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003eSubcellular fractionation was used to link (a) cellular compartment(s) to identified AltProts (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig7\\\" class=\\\"InternalRef\\\"\\u003e7\\u003c/span\\u003eA). The membrane-bound fraction of all three cell lines contained the highest number of identified AltProts. 
Figs.\\u0026nbsp;\\u003cspan refid=\\\"Fig7\\\" class=\\\"InternalRef\\\"\\u003e7\\u003c/span\\u003eB and C display general characteristics of the identified AltProts. The majority of the AltProts identified originate from a 3\\u0026rsquo;UTR. Additionally, the vast majority (80.9%) have a molecular weight of less than 10 kDa.\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003eIn addition, we identified cell line-specific RefProts, novel isoforms and AltProts. In T1074 cells, nine specific AltProts were identified, including the variant AltProt IP_290059@Asp99fs, which was found in the cytoskeletal fraction. SKOV-3 cells also had nine cell-specific AltProts, but without any variants, and PEO-4 cells had two specific wild-type AltProts identified. The characteristics of the cell line-specific AltProts are described in Supplemental Table\\u0026nbsp;3. Overall, 508 AltProts shared by all three cell lines were identified, including 11 variants.\\u003c/p\\u003e \\u003cp\\u003eAmong the identifications, 30 AltProts were identified in both cancerous cell lines. The variant IP_715944@Leu44Pro was identified in the cytoskeletal fraction of both cell lines. The WT AltProt IP_715944 is a 4.82 kDa protein composed of 47 amino acids. It is encoded in the \\u003cem\\u003eDHX8\\u003c/em\\u003e gene. The variant of this AltProt is the result of a base substitution (c.131T\\u0026thinsp;\\u0026gt;\\u0026thinsp;C) observed in the transcript ENST00000587574, which changed the leucine at position 44 to a proline. To verify the impact of the mutation, the sequence was analysed using protein BLAST (\\u003cspan citationid=\\\"CR49\\\" class=\\\"CitationRef\\\"\\u003e49\\u003c/span\\u003e), InterProScan (\\u003cspan citationid=\\\"CR50\\\" class=\\\"CitationRef\\\"\\u003e50\\u003c/span\\u003e) and Phobius (\\u003cspan citationid=\\\"CR51\\\" class=\\\"CitationRef\\\"\\u003e51\\u003c/span\\u003e).
Neither a significant sequence similarity nor any change in the predicted domains was identified.\\u003c/p\\u003e \\u003cp\\u003eNext, we performed a label-free quantitative analysis on the subcellular proteomes (n\\u0026thinsp;=\\u0026thinsp;4), which led to the identification of 1,022 RefProts with significantly altered levels (ANOVA, q-value\\u0026thinsp;\\u0026lt;\\u0026thinsp;0.05) in the cytoplasmic fraction, 995 in the membrane-bound fraction, 561 in the nuclear fraction, 159 in the chromatin fraction and 590 in the cytoskeletal fraction. The RNA-Seq-derived databases allowed us to identify and quantify variant proteins, and 88 RefProt variants were found at significantly different levels in the three cell lines. Of these variants, 39 were found in the cytoplasm, 39 in membrane-bound structures, 15 in the nucleus, 6 in the chromatin fraction and 23 in the cytoskeleton. Note that 22 of the 88 RefProt variants were found in more than one cellular fraction.\\u003c/p\\u003e \\u003cp\\u003eHierarchical clustering (Supplemental Fig.\\u0026nbsp;2A and Supplemental Table\\u0026nbsp;4) pointed to six main groups of proteins: up-regulation in (\\u003cspan citationid=\\\"CR1\\\" class=\\\"CitationRef\\\"\\u003e1\\u003c/span\\u003e) PEO-4 cells, (\\u003cspan citationid=\\\"CR2\\\" class=\\\"CitationRef\\\"\\u003e2\\u003c/span\\u003e) SKOV-3 cells, and (\\u003cspan citationid=\\\"CR3\\\" class=\\\"CitationRef\\\"\\u003e3\\u003c/span\\u003e) in both cancerous cells; and down-regulation in (\\u003cspan citationid=\\\"CR4\\\" class=\\\"CitationRef\\\"\\u003e4\\u003c/span\\u003e) SKOV-3 cells, (\\u003cspan citationid=\\\"CR5\\\" class=\\\"CitationRef\\\"\\u003e5\\u003c/span\\u003e) PEO-4 cells, and (\\u003cspan citationid=\\\"CR6\\\" class=\\\"CitationRef\\\"\\u003e6\\u003c/span\\u003e) in both cancerous cells.
Table\\u0026nbsp;\\u003cspan refid=\\\"Tab1\\\" class=\\\"InternalRef\\\"\\u003e1\\u003c/span\\u003e displays the number of significantly deregulated WT and RefProt variants quantified in the three cell lines.\\u003c/p\\u003e \\u003cp\\u003e \\u003cdiv class=\\\"gridtable\\\"\\u003e\\u003ctable float=\\\"Yes\\\" id=\\\"Tab1\\\" border=\\\"1\\\"\\u003e \\u003ccaption language=\\\"En\\\"\\u003e \\u003cdiv class=\\\"CaptionNumber\\\"\\u003eTable 1\\u003c/div\\u003e \\u003cdiv class=\\\"CaptionContent\\\"\\u003e \\u003cp\\u003eWild-type and variant RefProts significantly varied (ANOVA, q-value\\u0026thinsp;\\u0026lt;\\u0026thinsp;0.05). The number of WT and variant RefProts is displayed for the six main clusters identified upon LFQ proteomics.\\u003c/p\\u003e \\u003c/div\\u003e \\u003c/caption\\u003e \\u003ccolgroup cols=\\\"4\\\"\\u003e \\u003cdiv align=\\\"left\\\" class=\\\"colspec\\\" colname=\\\"c1\\\" colnum=\\\"1\\\"\\u003e\\u003c/div\\u003e \\u003cdiv align=\\\"left\\\" class=\\\"colspec\\\" colname=\\\"c2\\\" colnum=\\\"2\\\"\\u003e\\u003c/div\\u003e \\u003cdiv align=\\\"char\\\" char=\\\".\\\" class=\\\"colspec\\\" colname=\\\"c3\\\" colnum=\\\"3\\\"\\u003e\\u003c/div\\u003e \\u003cdiv align=\\\"char\\\" char=\\\".\\\" class=\\\"colspec\\\" colname=\\\"c4\\\" colnum=\\\"4\\\"\\u003e\\u003c/div\\u003e \\u003cthead\\u003e \\u003ctr\\u003e \\u003cth align=\\\"left\\\" colspan=\\\"2\\\" nameend=\\\"c2\\\" namest=\\\"c1\\\"\\u003e \\u003cp\\u003eCluster\\u003c/p\\u003e \\u003c/th\\u003e \\u003cth align=\\\"left\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003eWT RefProts\\u003c/p\\u003e \\u003c/th\\u003e \\u003cth align=\\\"left\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003eRefProt variants\\u003c/p\\u003e \\u003c/th\\u003e \\u003c/tr\\u003e \\u003c/thead\\u003e \\u003ctbody\\u003e \\u003ctr\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c1\\\" morerows=\\\"2\\\" rowspan=\\\"3\\\"\\u003e \\u003cp\\u003eUpregulated\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd 
align=\\\"left\\\" colname=\\\"c2\\\"\\u003e \\u003cp\\u003ePEO-4 cells\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"char\\\" char=\\\".\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003e482\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"char\\\" char=\\\".\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003e10\\u003c/p\\u003e \\u003c/td\\u003e \\u003c/tr\\u003e \\u003ctr\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c2\\\"\\u003e \\u003cp\\u003eSKOV-3 cells\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"char\\\" char=\\\".\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003e383\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"char\\\" char=\\\".\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003e6\\u003c/p\\u003e \\u003c/td\\u003e \\u003c/tr\\u003e \\u003ctr\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c2\\\"\\u003e \\u003cp\\u003eBoth cancerous cells\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"char\\\" char=\\\".\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003e666\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"char\\\" char=\\\".\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003e29\\u003c/p\\u003e \\u003c/td\\u003e \\u003c/tr\\u003e \\u003ctr\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c1\\\" morerows=\\\"2\\\" rowspan=\\\"3\\\"\\u003e \\u003cp\\u003eDownregulated\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c2\\\"\\u003e \\u003cp\\u003ePEO-4 cells\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"char\\\" char=\\\".\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003e195\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"char\\\" char=\\\".\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003e4\\u003c/p\\u003e \\u003c/td\\u003e \\u003c/tr\\u003e \\u003ctr\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c2\\\"\\u003e \\u003cp\\u003eSKOV-3 cells\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"char\\\" char=\\\".\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003e328\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd 
align=\\\"char\\\" char=\\\".\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003e16\\u003c/p\\u003e \\u003c/td\\u003e \\u003c/tr\\u003e \\u003ctr\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c2\\\"\\u003e \\u003cp\\u003eBoth cancerous cells\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"char\\\" char=\\\".\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003e1154\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"char\\\" char=\\\".\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003e54\\u003c/p\\u003e \\u003c/td\\u003e \\u003c/tr\\u003e \\u003c/tbody\\u003e \\u003c/colgroup\\u003e \\u003c/table\\u003e\\u003c/div\\u003e \\u003c/p\\u003e \\u003cp\\u003eAn identical hierarchical clustering was performed on novel isoforms, resulting in the identification of 53 wild-type novel isoforms and three novel isoform variants that were significantly varied (ANOVA, q-value\\u0026thinsp;\\u0026lt;\\u0026thinsp;0.05) between the three cell lines (Supplemental Fig.\\u0026nbsp;2B and Supplemental Table\\u0026nbsp;5). One of these novel isoform variants, II_587587@Asn359Asp, was found to be upregulated in both cancerous cell lines in the cytoplasmic and membrane-bound fractions. This protein is a novel isoform expressed from the \\u003cem\\u003ePMPCB\\u003c/em\\u003e gene. A second variant, II_702738@Ala184Thr[Leu79LeuAsn72Asn], was found to be downregulated in SKOV-3 cells in the nuclear fraction. This novel isoform is encoded by the \\u003cem\\u003eWDR18\\u003c/em\\u003e gene and possesses a substitution at position 184 and two silent mutations. II_597059@Glu65GlnAsn139AspAla57ValLys122ArgIle6ValGlu80Lys[Val118Val] was identified as upregulated in SKOV-3 cells in the chromatin-bound fraction. This protein is a novel isoform of HLA-H, which possesses seven mutations, one of which is a silent mutation.\\u003c/p\\u003e \\u003cp\\u003eThe same workflow was used to compare the AltProt profiles between the three studied cell lines.
In total, 73 AltProts were found at significantly altered levels and 41 of these were upregulated in the ovarian cancer cells, with 12 upregulated only in PEO-4 cells, nine in SKOV-3 cells, and 20 in both. Four AltProts were found to be downregulated only in PEO-4 cells or only in SKOV-3 cells, while 36 AltProts were downregulated in both cell lines (Supplemental Tables 6 and 7). Figure\\u0026nbsp;\\u003cspan refid=\\\"Fig8\\\" class=\\\"InternalRef\\\"\\u003e8\\u003c/span\\u003e shows the distribution of the significantly altered AltProts over the five different subcellular fractions. We found 11 AltProts to be significantly regulated in more than one compartment. IP_067626, IP_070304, IP_108778, IP_147518, IP_178464, IP_213023, IP_246003 and IP_282949 were downregulated in both cancerous cells. Interestingly, IP_582685 (translated from a ncRNA transcript of the pseudogene \\u003cem\\u003eGDI2P1\\u003c/em\\u003e) was found to be upregulated in the membrane-bound fraction of both cancerous cells. Moreover, it was also found to be upregulated in the cytoplasmic and nuclear fractions of SKOV-3 cells. IP_062385 (translated from the 3\\u0026rsquo;UTR of the transcript ENST00000457946.1, encoded by the \\u003cem\\u003eZMYM4\\u003c/em\\u003e gene) was found to be upregulated in the cytoplasmic fractions of both cancerous cells, while it was downregulated in the cytoskeletal fraction of these cells. A similar observation was made for IP_774693 (translated from an ncRNA of \\u003cem\\u003eTUBAP2\\u003c/em\\u003e): this AltProt was upregulated in the membrane-bound fractions of the cancerous cells yet downregulated in their cytoplasmic fractions.\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003eNote that only two AltProt variants were found at significantly different levels. IP_174777 is a 53-amino acid AltProt encoded in the 3\\u0026rsquo;UTR of the \\u003cem\\u003eTMEM245\\u003c/em\\u003e gene.
During the creation of our databases, a single base substitution (23A\\u0026thinsp;\\u0026gt;\\u0026thinsp;G) in transcript ENST00000374586 led to the prediction of the variant IP_174777@Asn8Ser. This mutant AltProt was identified as significantly downregulated in both cancerous cells, compared to the epithelial ovarian cell line. The second AltProt variant identified as downregulated in the cancerous ovarian cell lines was IP_304294@Leu32fs. The WT AltProt, IP_304294, is a 57-amino acid protein encoded by the \\u003cem\\u003eMTMR1\\u003c/em\\u003e gene and is translated from the 3\\u0026rsquo;UTR of the transcript ENST00000445323. A guanine deletion at position 93 results in a reading frame shift at leucine 32. This shortens the protein to 44 amino acids and substitutes the last 13 amino acids. For both proteins, a cytoplasmic domain was predicted by Phobius, and this prediction remained unchanged in the variant forms.\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec13\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eProteome and transcriptome functional annotation\\u003c/h2\\u003e \\u003cp\\u003eTo integrate and interpret the data obtained from the differentially expressed reference proteome and transcriptome, we used the Database for Annotation, Visualization and Integrated Discovery (DAVID) (\\u003cspan citationid=\\\"CR52\\\" class=\\\"CitationRef\\\"\\u003e52\\u003c/span\\u003e).
This online tool allows users to perform GO term enrichment, cluster redundant enriched terms, visualize enriched pathway maps and extract gene functionality and literature.\\u003c/p\\u003e \\u003cp\\u003eSubmitting the RefProts identified as upregulated in cancerous cells to DAVID showed that two major cancer-related KEGG pathways (\\u003cspan citationid=\\\"CR39\\\" class=\\\"CitationRef\\\"\\u003e39\\u003c/span\\u003e) were significantly enriched: central carbon metabolism in cancer (hsa05230; p-value: 1.90E-04) and chemical carcinogenesis - reactive oxygen species (hsa05208; p-value: 5.26E-06). The KEGG pathway proteoglycans in cancer (hsa05205; p-value: 0.026) was significantly enriched among the downregulated cancer RefProts.\\u003c/p\\u003e \\u003cp\\u003eRegulated protein clusters in SKOV-3 cells were found to be significantly enriched for the central carbon metabolism in cancer pathway (p-value: 7.3E-5). In contrast, no significant enrichment was identified in PEO-4 cells. Based on this difference, we mapped the protein and transcript expression profiles onto an adapted map of the central carbon metabolism in cancer pathway (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig9\\\" class=\\\"InternalRef\\\"\\u003e9\\u003c/span\\u003e). The complete list of genes and proteins enriched for this pathway can be found in Supplemental Table\\u0026nbsp;8. The NRAS protein of the RAS/RAF/MEK/ERK/c-Myc pathway was significantly upregulated in SKOV-3 cells (ANOVA q-value: 0.017). On the other hand, its transcript levels were significantly downregulated in PEO-4 cells (ANOVA q-value: 0.0004). Moreover, for the other two members of the oncogene RAS family, no significant variation was found at the proteome level, whereas at the transcript level, \\u003cem\\u003eHRAS\\u003c/em\\u003e was downregulated in PEO-4 cells (ANOVA q-value: 3.7E-6) and \\u003cem\\u003eKRAS\\u003c/em\\u003e upregulated in SKOV-3 cells (ANOVA q-value: 5.58E-5).
Other differences were observed for the MEK kinases MAP2K1 and MAP2K2; for instance, MAP2K2 was significantly downregulated in the membrane-bound fraction of both cancerous cells (ANOVA q-value: 0.005) and downregulated in the PEO-4 cytoskeletal fraction (ANOVA q-value: 0.028). MAP2K1 was found to be downregulated in PEO-4 cells (ANOVA q-value: 2.29E-6), while its transcript level was found to be upregulated in SKOV-3 cells (ANOVA q-value: 1.49E-5).\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003eIn another part of the central carbon metabolism in cancer pathway, SIRT6 and SIRT3 are considered cancer-associated genes (\\u003cspan additionalcitationids=\\\"CR54\\\" citationid=\\\"CR53\\\" class=\\\"CitationRef\\\"\\u003e53\\u003c/span\\u003e\\u0026ndash;\\u003cspan citationid=\\\"CR55\\\" class=\\\"CitationRef\\\"\\u003e55\\u003c/span\\u003e). Downregulation of SIRT6 has been found to increase ovarian cancer cell growth (\\u003cspan citationid=\\\"CR55\\\" class=\\\"CitationRef\\\"\\u003e55\\u003c/span\\u003e). The transcript levels of SIRT6, a tumour suppressor gene, were found to be downregulated in PEO-4 cells (ANOVA q-value: 4.65E-6), while the transcript levels of c-Myc, an oncogene, were upregulated in these cells (ANOVA q-value: 3.88E-5). Protein levels of SIRT3, another tumour suppressor gene, were upregulated in both cancerous cells (ANOVA q-value: 0.005), while its transcript levels were found to be downregulated in PEO-4 cells (ANOVA q-value: 0.0001). The expression of the oncogenic PI3K family was also found to be significantly regulated among the three cell lines. PIK3R1 was upregulated in the cytoplasmic fraction of both cancerous cells (ANOVA q-value: 0.037), while its transcript was only upregulated in SKOV-3 cells (ANOVA q-value: 2.31E-5). Additionally, the transcripts of \\u003cem\\u003ePIK3CB\\u003c/em\\u003e (ANOVA q-value: 0.0001) and \\u003cem\\u003ePIK3R2\\u003c/em\\u003e (ANOVA q-value: 0.005) were also only upregulated in these cells.
On the contrary, the \\u003cem\\u003ePIK3CA\\u003c/em\\u003e (ANOVA q-value: 0.001) and \\u003cem\\u003ePIK3CD\\u003c/em\\u003e (ANOVA q-value: 0.0001) transcripts were found downregulated in both cancerous cells.\\u003c/p\\u003e \\u003cp\\u003eOther oncogenes in the central carbon metabolism cancer pathway are members of the \\u003cem\\u003eAKT\\u003c/em\\u003e family. AKT1 protein (ANOVA q-value: 0.0002) and transcript levels (ANOVA q-value: 2E-5) were downregulated in PEO-4 cells. For AKT2 and AKT3, no significant variation in protein expression was found, while their transcript levels were significantly downregulated in both cancerous cells (ANOVA q-value: 0.02 and 3.6E-6).\\u003c/p\\u003e \\u003cp\\u003eWith our proteogenomic workflow, we could identify a variant form of p53 (ENSP00000269305.8: p.Pro72Arg), an amino acid substitution that stems from the c.215C\\u0026thinsp;\\u0026gt;\\u0026thinsp;G variant in \\u003cem\\u003eTP53\\u003c/em\\u003e. This p53 mutant was significantly downregulated in both cancerous cells\\u0026rsquo; cytoplasmic (ANOVA q-value: 0.0036) and cytoskeletal (ANOVA q-value: 0.0096) fractions, while its transcript levels were only significantly downregulated in SKOV-3 cells (ANOVA p-value: 1.17E-10). Three other RefProt variants were identified in this pathway. 
ENSP00000359991.5: p.Thr238Met, a mutant of PGAM1, was downregulated in both cancerous cells (ANOVA q-value: 0.0013), while two mutants of HKDC1 were upregulated in both cancerous cells: ENSP00000346643.5: p.Thr124Ile, p.Asn917Lys, p.Arg827Trp, p.Trp721Arg, [p.Phe601Phe] (ANOVA q-value: 0.008) and ENSP00000346643.5: p.Thr124Ile, p.Asn917Lys, p.Trp721Arg, [p.Phe601Phe] (ANOVA q-value: 0.023).\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec14\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eCrosslinking network analysis\\u003c/h2\\u003e \\u003cp\\u003eThe computational analysis of the crosslinked samples was carried out as described in (\\u003cspan citationid=\\\"CR30\\\" class=\\\"CitationRef\\\"\\u003e30\\u003c/span\\u003e), which allowed us to generate a protein interaction map in Cytoscape (\\u003cspan citationid=\\\"CR56\\\" class=\\\"CitationRef\\\"\\u003e56\\u003c/span\\u003e) (Supplemental Fig.\\u0026nbsp;3). A total of 90 crosslinks were identified (Supplemental Table 9). Among them, 20 were intra-crosslinks, which do not give interactome information but might be useful for structural studies. In this protein network (Supplemental Fig.\\u0026nbsp;3), 28 protein-protein interactions (PPIs) were found in PEO-4 cells (marked in purple), 27 in SKOV-3 cells (marked in blue) and 35 in T1074 cells (marked in green). From these pairs, 12 crosslink interactions were identified in at least two cell lines. Among all the crosslinked pairs, 20 involved AltProts, four crosslinks were AltProt-AltProt interactions, and 13 AltProt-RefProt crosslinks were identified.
The latter were considered most important for our study as they provide hints to an AltProt\\u0026rsquo;s physiological or pathological involvement.\\u003c/p\\u003e \\u003cp\\u003eTo attribute functions to an AltProt from this set of PPIs, we retrieved the known interactions from the STRING (\\u003cspan citationid=\\\"CR57\\\" class=\\\"CitationRef\\\"\\u003e57\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR58\\\" class=\\\"CitationRef\\\"\\u003e58\\u003c/span\\u003e), BioGrid (\\u003cspan citationid=\\\"CR59\\\" class=\\\"CitationRef\\\"\\u003e59\\u003c/span\\u003e) and IntAct (\\u003cspan citationid=\\\"CR60\\\" class=\\\"CitationRef\\\"\\u003e60\\u003c/span\\u003e) databases and included the identified crosslinked interactions (Supplemental Fig.\\u0026nbsp;4). Additionally, for the RefProts that did not present a referenced STRING interaction within the crosslinked network, three STRING interactors were added to expand the network. We observed that seven PPIs had already been described (pink lines): B2M-HLA-B, B2M-HLA-A, ITGA5-ITGA1, TUBA1C-TUBB, HIST3H2A-HIST2H3D, PRC1-ORC1 and VP39-VPS13C. Using this network, a molecular function GO term and KEGG pathway enrichment analysis was performed with the ClueGO App (\\u003cspan citationid=\\\"CR61\\\" class=\\\"CitationRef\\\"\\u003e61\\u003c/span\\u003e) from Cytoscape. The interactions between AltProts and RefProts were displayed along with the enriched GO terms (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig10\\\" class=\\\"InternalRef\\\"\\u003e10\\u003c/span\\u003e). Four direct AltProt-RefProt-GO-term interactions were detected. The AltProt IP_192190 was crosslinked to KIF13A in PEO-4 cells and linked to the vesicle-mediated transport of plasma membrane (GO:0098876), Golgi to plasma membrane protein transport (GO:0043001), protein localization to plasma membrane (GO:0072659) and post-Golgi vesicle mediated transport (GO:0006892).
The AltProt IP_136846 was identified as crosslinked to LGALS1 in T1074 cells, which is linked to the GO terms viral entry into host cell (GO:0046718) and biological process involved in interaction with host (GO:0051701). Similarly, IP_235241, crosslinked to ITGA5 in T1074 cells, was linked to the phagosome KEGG pathway (KEGG:04145) and the GO terms virus receptor activity (GO:0001618), biological process involved in interaction with host (GO:0051701) and viral entry into host cell (GO:0046718). Finally, IP_183088 was crosslinked to POLD3 in T1074 and PEO-4 cells. POLD3 is part of the DNA polymerase involved in the replication and repair of DNA and is linked to the UV-damage excision repair (GO:0070914) and response to UV (GO:0009411) GO terms.\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003eThree indirect links between AltProts and GO terms or KEGG pathways were identified. IP_292259 was crosslinked to TMEM260 in T1074 cells, and TMEM260 possesses a STRING interaction with TOGARAM, which is linked to the non-membrane-bounded organelle assembly (GO:0140694), spindle assembly (GO:0051225) and microtubule cytoskeleton organization involved in mitosis (GO:1902850). Additionally, TMEM260 interacts with GOLGA7, which is linked to GO terms related to the vesicle-mediated transport to the plasma membrane. In addition, two AltProts were also identified to be related to these GO terms: IP_105326 and IP_118499. The former was crosslinked to VIM in SKOV-3 cells, and VIM was crosslinked to MACF1, which is linked to vesicle-mediated transport GO terms.
IP_118499 was found to be crosslinked to CNNM3 in SKOV-3 cells; CNNM3 possesses a STRING interaction with CCNL2, which was crosslinked to VPS13C, which in turn is linked to vesicle-mediated transport GO terms.\\u003c/p\\u003e \\u003cp\\u003eTo assess the plausibility of the observed interactions, we analysed 3D models of RefProt-AltProt pairs using unguided interaction docking between the two partners (as described in (\\u003cspan citationid=\\\"CR30\\\" class=\\\"CitationRef\\\"\\u003e30\\u003c/span\\u003e)). The structures of the AltProts were predicted using I-Tasser (\\u003cspan citationid=\\\"CR62\\\" class=\\\"CitationRef\\\"\\u003e62\\u003c/span\\u003e), while those of the interactors were predicted using ClusPro (\\u003cspan citationid=\\\"CR63\\\" class=\\\"CitationRef\\\"\\u003e63\\u003c/span\\u003e). The RefProt, for which the structure was predicted by AlphaFold (\\u003cspan citationid=\\\"CR64\\\" class=\\\"CitationRef\\\"\\u003e64\\u003c/span\\u003e), was used as a receptor of the AltProt, which was smaller in structure. By measuring the distance of the predicted interactions, we confirmed the observed interactions from XL-MS with a mean of 23.467 \\u0026Aring; (Supplemental Fig.\\u0026nbsp;5), which is consistent with the distances described in the literature for DSSO, ranging from 5.3 (\\u003cspan citationid=\\\"CR34\\\" class=\\\"CitationRef\\\"\\u003e34\\u003c/span\\u003e) to 30 \\u0026Aring; (\\u003cspan citationid=\\\"CR35\\\" class=\\\"CitationRef\\\"\\u003e35\\u003c/span\\u003e).\\u003c/p\\u003e \\u003c/div\\u003e\"},{\"header\":\"DISCUSSION\",\"content\":\"\\u003cp\\u003eProteogenomics establishes a direct connection between the genome blueprint and the constructed proteome. We utilized this approach to explore potential implications of AltProts in ovarian cancer.
We selected the PEO-4 cell line possessing a high-grade serous histology, the SKOV-3 clear cell carcinoma cell line, and the T1074 ovarian epithelial cell line, originally derived from normal human ovarian surface epithelial cells, serving as a non-tumorous control.\\u003c/p\\u003e \\u003cdiv id=\\\"Sec16\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eThe transcriptome as a source of information for the proteomic perspective\\u003c/h2\\u003e \\u003cp\\u003eThe transcriptomic analysis employing DESeq2 to analyse the RNA-seq data enabled us to identify clusters of regulated genes in the cancer cell models. Each cell line showed about 500 uniquely expressed genes. Among the 540 genes uniquely expressed in PEO-4 cells, proto-oncogenes SSX1, SSX2 and SSX2B were found, along with an additional 24 genes related to cancer according to the Gene-Disease Associations Dataset (GAD) (\\u003cspan citationid=\\\"CR65\\\" class=\\\"CitationRef\\\"\\u003e65\\u003c/span\\u003e). Among the 406 genes uniquely expressed in SKOV-3 cells, 23 were related to cancer according to GAD. While transcriptomic analysis provided cell specificity information, the strength of this approach lies in the custom creation of cell-specific databases using OpenCustomDB. These databases contain a larger number of AltProt variants due to a high number of predicted AltProts. The ratio of variant RefProts to WT RefProts was greater than the ratio of variant AltProts to WT AltProts, which can be attributed to differences in sequence length: longer sequences offer more positions at which mutations and replication errors can occur. Additionally, predicted AltProts mostly originate from ncRNAs, but mRNA CDS frame shifts and 3'UTRs also contributed significantly to the top 100,000 most abundant transcripts.
This suggests a greater potential for ncRNAs to code for AltProts, although there is a larger abundance of mRNAs capable of coding for AltProts.\\u003c/p\\u003e \\u003cp\\u003eThe proteogenomic approach of constructing a custom database, combined with reading frame prediction for AltProt generation, presents analytical challenges. However, our iterative triple SEQUEST HT processing workflow using the 100,000-abundance cut-off database in the first node overcomes the FDR limitations of a 400,000-sequence database (full database) search, which may increase the number of false-positive and false-negative identifications (\\u003cspan citationid=\\\"CR16\\\" class=\\\"CitationRef\\\"\\u003e16\\u003c/span\\u003e). To avoid losing possible identifications, such iterative workflows provide a stepwise increase in possible protein identifications by expanding the search space, until the last step with OpenProtDB, where proteins translated from ncRNAs not detected by RNA-Seq can be recovered. Finally, we removed false-positive identifications using Percolator, a semi-supervised machine learning algorithm (\\u003cspan citationid=\\\"CR66\\\" class=\\\"CitationRef\\\"\\u003e66\\u003c/span\\u003e). Percolator effectively estimates the statistical significance of peptide-spectrum matches and assigns confidence scores to identified peptides in a fast and accurate way. It enhances the rate of confident peptide identifications from a collection of tandem mass spectra (\\u003cspan citationid=\\\"CR67\\\" class=\\\"CitationRef\\\"\\u003e67\\u003c/span\\u003e).\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec17\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eA larger view on the proteomic landscape\\u003c/h2\\u003e \\u003cp\\u003eSubcellular fractionation is a validated approach to decrease sample complexity and to maximize resolution in LC-MS/MS analysis.
In our previous works (\\u003cspan citationid=\\\"CR17\\\" class=\\\"CitationRef\\\"\\u003e17\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR30\\\" class=\\\"CitationRef\\\"\\u003e30\\u003c/span\\u003e), such subcellular fractionation proved beneficial for XL-MS workflows and provided better coverage of the proteome compared to analysing whole cell lysates (\\u003cspan citationid=\\\"CR68\\\" class=\\\"CitationRef\\\"\\u003e68\\u003c/span\\u003e). This enhanced the detection of low-abundance proteins (AltProts and crosslinked proteins). Furthermore, subcellular fractionation helps to determine the subcellular localization of AltProts and to monitor changes under different cellular conditions (\\u003cspan citationid=\\\"CR69\\\" class=\\\"CitationRef\\\"\\u003e69\\u003c/span\\u003e). For instance, IP_062385 was found to be upregulated in the cytoplasmic fractions of the cancerous cells, while it was downregulated in their cytoskeletal fractions. This may reflect a functional change linked to cancer, yet targeted studies will be necessary to prove such links between tumour development and AltProts re-localising over different cellular compartments. However, it is important to note that subcellular fractionation based on protein extraction with different detergents can lead to potential cross-contamination and inaccuracies in downstream data interpretation.\\u003c/p\\u003e \\u003cp\\u003eSubcellular fractionation led to the identification of ~6,000 common RefProts among the three cell lines. Over 3% of all identified proteins in each cell line were RefProt variants (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig6\\\" class=\\\"InternalRef\\\"\\u003e6\\u003c/span\\u003eB). However, these ~180 RefProt variants require deeper characterization to understand their (pathological) role. Cell line-specific AltProts were also found in all three cell lines; those in SKOV-3 and PEO-4 cells are of interest as potential new protein markers for OvCa.
Among them, IP_715944@Leu44Pro (Fig.\\u0026nbsp;\\u003cspan refid=\\\"Fig11\\\" class=\\\"InternalRef\\\"\\u003e11\\u003c/span\\u003e) caught our attention as it is a variant AltProt not predicted in the T1074 RNA-Seq database. Moreover, six additional AltProts from this group were also not predicted, which highlights the importance of a cell-specific analysis to identify new biomarkers.\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003eBased on the LFQ proteome analysis data, AltProts were found to be upregulated in all compartments except the cytoskeleton in PEO-4 and SKOV-3 cells, while downregulation of AltProts was only observed in the membrane-bound and nuclear fractions in PEO-4 cells, and in the nuclear and chromatin fractions in SKOV-3 cells. Such differentially expressed AltProts can be important for distinguishing between cancer cell lines. When comparing both cancerous cell lines to T1074 cells, significant downregulation of AltProts was observed in all five compartments. AltProts upregulated in both cancerous cells were present in all compartments except the nucleus. These findings provide some insights into the specific expression of AltProts in high grade serous and non-serous OvCa. Functional domains were predicted for 23 out of 73 AltProts, which can help us understand their potential roles in interactions. 
Future targeted interactomic approaches such as Virotrap (\\u003cspan citationid=\\\"CR70\\\" class=\\\"CitationRef\\\"\\u003e70\\u003c/span\\u003e), BioID (\\u003cspan citationid=\\\"CR71\\\" class=\\\"CitationRef\\\"\\u003e71\\u003c/span\\u003e) and proximity ligation assays (\\u003cspan citationid=\\\"CR72\\\" class=\\\"CitationRef\\\"\\u003e72\\u003c/span\\u003e) could be used to identify the interaction partners of these AltProts, which may shed light on their involvement in the pathogenic development of OvCa or drug resistance.\\u003c/p\\u003e \\u003cp\\u003e \\u003cb\\u003eInterpretation of the major protein and transcript fluctuations in the three cell lines highlights cancer-related KEGG pathways\\u003c/b\\u003e \\u003c/p\\u003e \\u003cp\\u003eNRAS, a member of the RAS oncogene family, is involved in cell signalling, regulation of cell growth, differentiation and angiogenesis. In ovarian clear cell carcinoma, no NRAS mutations were found in our SKOV-3 cell transcriptome data (\\u003cspan citationid=\\\"CR73\\\" class=\\\"CitationRef\\\"\\u003e73\\u003c/span\\u003e). Overexpression of NRAS was shown to increase tumor aggressiveness in mice (\\u003cspan citationid=\\\"CR74\\\" class=\\\"CitationRef\\\"\\u003e74\\u003c/span\\u003e). KRAS, another member of the RAS oncogene family, was found to be upregulated in SKOV-3 cells, as has been reported for metastatic lesions in endometrial cancer (\\u003cspan citationid=\\\"CR75\\\" class=\\\"CitationRef\\\"\\u003e75\\u003c/span\\u003e); such upregulation is associated with adverse prognosis (\\u003cspan citationid=\\\"CR76\\\" class=\\\"CitationRef\\\"\\u003e76\\u003c/span\\u003e). 
Downregulation of HRAS has been linked to lower aggressiveness and reduced cell proliferation in certain types of cancer (\\u003cspan citationid=\\\"CR46\\\" class=\\\"CitationRef\\\"\\u003e46\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR77\\\" class=\\\"CitationRef\\\"\\u003e77\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR78\\\" class=\\\"CitationRef\\\"\\u003e78\\u003c/span\\u003e). Another branch of the pathway involves MEK (mitogen-activated extracellular signal-regulated kinase), part of a kinase cascade that plays a central role in carcinogenesis and the maintenance of several cancers. We found downregulation of MAP2K1 and MAP2K2 in both cancerous cell lines, as is also evident from data in The Human Protein Atlas (\\u003cspan citationid=\\\"CR79\\\" class=\\\"CitationRef\\\"\\u003e79\\u003c/span\\u003e). In parallel, with regard to cancer metabolism, we observed \\u003cem\\u003eSIRT6\\u003c/em\\u003e downregulation and \\u003cem\\u003ec-Myc\\u003c/em\\u003e upregulation in PEO-4 cells. Lower levels of \\u003cem\\u003eSIRT6\\u003c/em\\u003e are associated with poorer prognosis and increased tumour aggressiveness (\\u003cspan citationid=\\\"CR54\\\" class=\\\"CitationRef\\\"\\u003e54\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR55\\\" class=\\\"CitationRef\\\"\\u003e55\\u003c/span\\u003e). \\u003cem\\u003eSIRT6\\u003c/em\\u003e also regulates ribosome metabolism by repressing \\u003cem\\u003ec-Myc\\u003c/em\\u003e activity. As a result, higher levels of \\u003cem\\u003ec-Myc\\u003c/em\\u003e, resulting from downregulation of \\u003cem\\u003eSIRT6\\u003c/em\\u003e, promote energy production and biomolecule synthesis for rapid cell proliferation. 
On the other hand, \\u003cem\\u003eSIRT3\\u003c/em\\u003e is described as a tumor suppressor gene in OvCa (\\u003cspan citationid=\\\"CR80\\\" class=\\\"CitationRef\\\"\\u003e80\\u003c/span\\u003e) and its expression increases in detached cells and tumor cells from malignant ascites, indicating its pro-metastatic role in OvCa (\\u003cspan citationid=\\\"CR53\\\" class=\\\"CitationRef\\\"\\u003e53\\u003c/span\\u003e). Our proteomic data show upregulation of SIRT3 in both cancerous cells, while \\u003cem\\u003eSIRT3\\u003c/em\\u003e transcripts are downregulated in PEO-4 cells. Discordance between mRNA and protein levels has been observed in various studies (\\u003cspan additionalcitationids=\\\"CR82 CR83\\\" citationid=\\\"CR81\\\" class=\\\"CitationRef\\\"\\u003e81\\u003c/span\\u003e–\\u003cspan citationid=\\\"CR84\\\" class=\\\"CitationRef\\\"\\u003e84\\u003c/span\\u003e), attributed to post-transcriptional regulation, transcript isoform switching and DNA variants (\\u003cspan citationid=\\\"CR82\\\" class=\\\"CitationRef\\\"\\u003e82\\u003c/span\\u003e, \\u003cspan citationid=\\\"CR85\\\" class=\\\"CitationRef\\\"\\u003e85\\u003c/span\\u003e). We found that PIK3R1 (p85α) was upregulated in the tumoral cells, which also corresponds to the identified overexpression of PIK3R1 in an OvCa cohort of 98 patients (\\u003cspan citationid=\\\"CR86\\\" class=\\\"CitationRef\\\"\\u003e86\\u003c/span\\u003e). However, contrary to literature findings (\\u003cspan citationid=\\\"CR87\\\" class=\\\"CitationRef\\\"\\u003e87\\u003c/span\\u003e), transcript levels of PIK3CD were downregulated in both cancerous cell lines. Stronach \\u003cem\\u003eet al\\u003c/em\\u003e. (\\u003cspan citationid=\\\"CR88\\\" class=\\\"CitationRef\\\"\\u003e88\\u003c/span\\u003e) and Liu \\u003cem\\u003eet al\\u003c/em\\u003e. 
(\\u003cspan citationid=\\\"CR89\\\" class=\\\"CitationRef\\\"\\u003e89\\u003c/span\\u003e) have studied the role of the AKT kinase signalling pathway in OvCa cell proliferation, cell cycle regulation and anti-apoptosis. They discovered that SKOV-3 cells rely on AKT1 for cisplatin resistance, while PEO-4 cells depend on AKT3. In line with these studies, AKT1 was found to be overexpressed at both the protein and transcript levels in SKOV-3 cells in our dataset.\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec18\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eOn the importance of identifying variants\\u003c/h2\\u003e \\u003cp\\u003eAmong the significantly deregulated RefProts identified in our study, P53 rs1042522 was found to be downregulated in both cancer cell lines. The corresponding Pro72Arg substitution in the canonical P53 sequence (UniProtKB: P04637-1) occurs in a proline-rich, intrinsically disordered region (residues 64–92) (\\u003cspan citationid=\\\"CR90\\\" class=\\\"CitationRef\\\"\\u003e90\\u003c/span\\u003e). This region is described as rigid (\\u003cspan citationid=\\\"CR91\\\" class=\\\"CitationRef\\\"\\u003e91\\u003c/span\\u003e) and a substitution of one of the prolines in this region might decrease its stiffness. Moreover, position 72 is part of the binding site of P53 with the oncogenic protein MDM2 (\\u003cspan citationid=\\\"CR92\\\" class=\\\"CitationRef\\\"\\u003e92\\u003c/span\\u003e). Although some evidence suggests an association between this mutation and OvCa risk, a meta-analysis by Schildkraut \\u003cem\\u003eet al.\\u003c/em\\u003e could not confirm such an association (\\u003cspan citationid=\\\"CR93\\\" class=\\\"CitationRef\\\"\\u003e93\\u003c/span\\u003e). 
Additionally, using our proteogenomic approach we were able to confirm the observations of Yaginuma \\u003cem\\u003eet al.\\u003c/em\\u003e (\\u003cspan citationid=\\\"CR94\\\" class=\\\"CitationRef\\\"\\u003e94\\u003c/span\\u003e) describing SKOV-3 as a null-WT-P53 cell line.\\u003c/p\\u003e \\u003cp\\u003eHKDC1 variants were found to be upregulated in both cancerous cell lines. Three (rs906219, rs1111335 and rs874556) of the four single nucleotide variations (SNVs) are reported as natural variants of HKDC1 (UniProtKB: Q2TB90). The fourth SNV, rs138235256, is not reported in UniProt and has no known clinical significance so far. Additionally, the variant ENSP00000359991.5@Thr238Met (PGAM1) was identified as downregulated in both cancer cell lines and results from the SNV rs202055965 (C \\u0026gt; T).\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec19\\\" class=\\\"Section2\\\"\\u003e \\u003ch2\\u003eXL-MS reveals clues about AltProt functions based on AltProt-RefProt PPIs\\u003c/h2\\u003e \\u003cp\\u003eIP_183088, a 38-amino acid AltProt, is encoded by \\u003cem\\u003eMAPK8\\u003c/em\\u003e and was found to interact with POLD3 in T1074 and PEO-4 cell lines. Figure\\u0026nbsp;\\u003cspan refid=\\\"Fig12\\\" class=\\\"InternalRef\\\"\\u003e12\\u003c/span\\u003eA displays the model of the human polymerase delta holoenzyme complex (PDB: 6s1m). The four subunits of the complex are shown (POLD1 turquoise, POLD2 green, POLD3 blue and POLD4 yellow); additionally, the proliferating cell nuclear antigen is displayed in light blue and the AltProt IP_183088 in red, together with its crosslinks. Figure\\u0026nbsp;\\u003cspan refid=\\\"Fig12\\\" class=\\\"InternalRef\\\"\\u003e12\\u003c/span\\u003eB zooms in on the crosslinked region of POLD3-IP_183088, revealing that this interaction occurs in the region where POLD2 and POLD3 interact. Our transcriptomic data point to POLD3 downregulation in both cancerous cell lines. 
This is consistent with the findings of Willes \\u003cem\\u003eet al\\u003c/em\\u003e., who reported that POLD3 downregulation correlates with a poor cancer outcome (\\u003cspan citationid=\\\"CR95\\\" class=\\\"CitationRef\\\"\\u003e95\\u003c/span\\u003e), and with those of Weberpals \\u003cem\\u003eet al.\\u003c/em\\u003e, who showed that \\u003cem\\u003ePOLD3\\u003c/em\\u003e is overexpressed in patients with high grade serous ovarian carcinoma and with a good response to carboplatin/paclitaxel (\\u003cspan citationid=\\\"CR96\\\" class=\\\"CitationRef\\\"\\u003e96\\u003c/span\\u003e). On the other hand, the inhibition of the interaction between POLD3 and POLD2 driven by IP_183088 could have two effects: (i) increased mutagenesis upon reduced activity of the POLD complex, so that errors in DNA replication are more likely to occur and go unrepaired, which can be expected in PEO-4 cells; or (ii) a regulatory system of the POLD complex, in which the POLD3-IP_183088 interaction in T1074 cells could lead to cell apoptosis, since Murga \\u003cem\\u003eet al.\\u003c/em\\u003e (\\u003cspan citationid=\\\"CR97\\\" class=\\\"CitationRef\\\"\\u003e97\\u003c/span\\u003e) showed that POLD3 stabilizes the POLD complex and that, in its absence, the cell is driven to apoptosis. Because interactions are difficult to detect by XL-MS, we cannot claim that the observed interactions are cell-type specific, but they do provide information about the potential functions of unreferenced proteins. 
The use of this approach for studying AltProts is therefore justified and, in the case of IP_183088, allowed us to hypothesize a regulatory role in the POLD3-POLD2 interaction and in the stability of the POLD complex, and therefore an effect on the regulation of DNA replication error repair.\\u003c/p\\u003e \\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003eTo conclude, one main advantage of the databases generated by OpenCustomDB is the possibility of predicting and identifying cell-specific proteins in cell lines and, in the future, in patient samples, which represents a major step towards personalized medicine. Subcellular fractionation allowed us to study differences in the reference, alternative and novel isoform proteomes of OvCa cell lines compared to a non-tumoral ovarian epithelial cell line. Additionally, it allowed us to identify RefProt variants and understudied AltProts and their variants. The versatility of these databases allowed us to identify AltProt-RefProt PPIs and gave some clues about the function of AltProts, which, however, need to be validated. 
In summary, our large-scale characterization study revealed other research targets and demonstrated the complexity of the cell proteome and its largely unmapped ghost proteome.\\u003c/p\\u003e \\u003c/div\\u003e \\u003cdiv id=\\\"Sec21\\\" class=\\\"Section2\\\"\\u003e \\u003cdiv id=\\\"Sec22\\\" class=\\\"Section3\\\"\\u003e \\u003cdiv id=\\\"Sec23\\\" class=\\\"Section4\\\"\\u003e \\u003c/div\\u003e \\u003c/div\\u003e \\u003c/div\\u003e\"},{\"header\":\"Declarations\",\"content\":\"\\u003cp\\u003e\\u003cstrong\\u003eDATA AVAILABILITY\\u003c/strong\\u003e\\u003c/p\\u003e\\n\\u003cp\\u003e\\u003cem\\u003e\\u0026quot;The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE\\u003c/em\\u003e(98)\\u003cem\\u003e\\u0026nbsp;partner repository with the dataset identifier PXD045689\\u0026quot;.\\u003c/em\\u003e\\u003c/p\\u003e\\n\\u003cp\\u003e\\u003cem\\u003eReviewer account details:\\u003c/em\\u003e\\u003c/p\\u003e\\n\\u003cp\\u003e\\u003cem\\u003eUsername: reviewer_pxd045689@ebi.ac.uk\\u003c/em\\u003e\\u003c/p\\u003e\\n\\u003cp\\u003e\\u003cem\\u003ePassword: nI04XiIQ\\u003c/em\\u003e\\u003c/p\\u003e\\n\\u003cp\\u003e\\u003cem\\u003eRNAseq analysis data have been deposited to BioProject (SRA) with dataset identifier: PRJNA1041444 and GEO dataset identifier: GSE248039,\\u0026nbsp;\\u003c/em\\u003e\\u003c/p\\u003e\\n\\u003cp\\u003e\\u003cstrong\\u003e\\u0026nbsp;\\u003c/strong\\u003e\\u003cstrong\\u003eSUPPLEMENTARY DATA\\u003c/strong\\u003e\\u003c/p\\u003e\\n\\u003cp\\u003e\\u003cstrong\\u003e\\u0026nbsp;\\u003c/strong\\u003e\\u003cstrong\\u003eAUTHOR CONTRIBUTIONS\\u003c/strong\\u003e\\u003c/p\\u003e\\n\\u003cp\\u003eDiego Fernando Garcia-del Rio: Conceptualization, Formal analysis, Methodology, Validation, Writing \\u0026amp; editing\\u0026mdash;original draft. Mehdi Derhourhi: Formal analysis, Methodology, Writing. Amelie Bonnefond: Methodology, Writing\\u0026mdash;review \\u0026amp; editing. 
S\\u0026eacute;bastien Leblanc: Formal analysis, Methodology, Writing\\u0026mdash;review \\u0026amp; editing. No\\u0026eacute; Guilloy:\\u0026nbsp;Methodology, Writing\\u0026mdash;review \\u0026amp; editing. Xavier Roucou: Methodology, Writing\\u0026mdash;review \\u0026amp; editing. Kris Gevaert: review \\u0026amp; editing, Funding. Sven Eyckerman: review \\u0026amp; editing, Funding. Michel Salzet: Conceptualization, review \\u0026amp; editing, Funding. Tristan Cardon: Conceptualization, Methodology, Validation, Writing \\u0026amp; editing\\u0026mdash;original draft.\\u003c/p\\u003e\\n\\u003cp\\u003e\\u003cstrong\\u003e\\u0026nbsp;\\u003c/strong\\u003e\\u003cstrong\\u003eFUNDING\\u003c/strong\\u003e\\u003c/p\\u003e\\n\\u003cp\\u003eThis research was supported by funding from I-SITE, Institut National de la Sant\\u0026eacute; et de la Recherche M\\u0026eacute;dicale (Inserm) and Universit\\u0026eacute; de Lille and by The Research Foundation - Flanders (FWO), project number G008018N.\\u003c/p\\u003e\\n\\u003cp\\u003e\\u003cstrong\\u003e\\u0026nbsp;\\u003c/strong\\u003e\\u003cstrong\\u003eCONFLICT OF INTEREST\\u003c/strong\\u003e\\u003c/p\\u003e\\n\\u003cp\\u003eThe authors declare no competing interests.\\u003c/p\\u003e\"},{\"header\":\"References\",\"content\":\"\\u003col\\u003e\\n\\u003cli\\u003eThe UniProt Consortium (2015) UniProt: a hub for protein information. \\u003cem\\u003eNucleic Acids Research\\u003c/em\\u003e, \\u003cstrong\\u003e43\\u003c/strong\\u003e, D204\\u0026ndash;D212.\\u003c/li\\u003e\\n\\u003cli\\u003eBreuza,L., Poux,S., Estreicher,A., Famiglietti,M.L., Magrane,M., Tognolli,M., Bridge,A., Baratin,D., Redaschi,N., and UniProt Consortium (2016) The UniProtKB guide to the human proteome. \\u003cem\\u003eDatabase (Oxford)\\u003c/em\\u003e, \\u003cstrong\\u003e2016\\u003c/strong\\u003e, bav120.\\u003c/li\\u003e\\n\\u003cli\\u003eMouilleron,H., Delcourt,V. and Roucou,X. (2016) Death of a dogma: eukaryotic mRNAs can code for more than one protein. 
\\u003cem\\u003eNucleic Acids Res\\u003c/em\\u003e, \\u003cstrong\\u003e44\\u003c/strong\\u003e, 14\\u0026ndash;23.\\u003c/li\\u003e\\n\\u003cli\\u003eHao,Y., Zhang,L., Niu,Y., Cai,T., Luo,J., He,S., Zhang,B., Zhang,D., Qin,Y., Yang,F., \\u003cem\\u003eet al.\\u003c/em\\u003e (2018) SmProt: a database of small proteins encoded by annotated coding and non-coding RNA loci. \\u003cem\\u003eBriefings in Bioinformatics\\u003c/em\\u003e, \\u003cstrong\\u003e19\\u003c/strong\\u003e, 636\\u0026ndash;643.\\u003c/li\\u003e\\n\\u003cli\\u003eGalindo,M.I., Pueyo,J.I., Fouix,S., Bishop,S.A. and Couso,J.P. (2007) Peptides Encoded by Short ORFs Control Development and Define a New Eukaryotic Gene Family. \\u003cem\\u003ePLOS Biology\\u003c/em\\u003e, \\u003cstrong\\u003e5\\u003c/strong\\u003e, e106.\\u003c/li\\u003e\\n\\u003cli\\u003eAlbuquerque,J.P., Tobias-Santos,V., Rodrigues,A.C., Mury,F.B. and Fonseca,R.N. da (2015) small ORFs: A new class of essential genes for development. \\u003cem\\u003eGenet. Mol. Biol.\\u003c/em\\u003e, \\u003cstrong\\u003e38\\u003c/strong\\u003e, 278\\u0026ndash;283.\\u003c/li\\u003e\\n\\u003cli\\u003eRuiz-Orera,J., Messeguer,X., Subirana,J.A. and Alba,M.M. (2014) Long non-coding RNAs as a source of new peptides. \\u003cem\\u003eeLife\\u003c/em\\u003e, \\u003cstrong\\u003e3\\u003c/strong\\u003e, e03523.\\u003c/li\\u003e\\n\\u003cli\\u003eSlavoff,S.A., Heo,J., Budnik,B.A., Hanakahi,L.A. and Saghatelian,A. (2014) A human short open reading frame (sORF)-encoded polypeptide that stimulates DNA end joining. \\u003cem\\u003eJ Biol Chem\\u003c/em\\u003e, \\u003cstrong\\u003e289\\u003c/strong\\u003e, 10950\\u0026ndash;10957.\\u003c/li\\u003e\\n\\u003cli\\u003eBrunet,M.A., Brunelle,M., Lucier,J.-F., Delcourt,V., Levesque,M., Grenier,F., Samandi,S., Leblanc,S., Aguilar,J.-D., Dufour,P., \\u003cem\\u003eet al.\\u003c/em\\u003e (2018) OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes. 
\\u003cem\\u003eNucleic Acids Research\\u003c/em\\u003e, 10.1093/nar/gky936.\\u003c/li\\u003e\\n\\u003cli\\u003eCardon,T., Fournier,I. and Salzet,M. (2021) Shedding Light on the Ghost Proteome. \\u003cem\\u003eTrends in Biochemical Sciences\\u003c/em\\u003e, \\u003cstrong\\u003e46\\u003c/strong\\u003e, 239\\u0026ndash;250.\\u003c/li\\u003e\\n\\u003cli\\u003eBrunet,M.A. and Roucou,X. (2019) Mass Spectrometry-Based Proteomics Analyses Using the OpenProt Database to Unveil Novel Proteins Translated from Non-Canonical Open Reading Frames. \\u003cem\\u003eJoVE (Journal of Visualized Experiments)\\u003c/em\\u003e, 10.3791/59589.\\u003c/li\\u003e\\n\\u003cli\\u003eKozak,M. (1999) Initiation of translation in prokaryotes and eukaryotes. \\u003cem\\u003eGene\\u003c/em\\u003e, \\u003cstrong\\u003e234\\u003c/strong\\u003e, 187\\u0026ndash;208.\\u003c/li\\u003e\\n\\u003cli\\u003eKozak,M. (2006) Rethinking some mechanisms invoked to explain translational regulation in eukaryotes. \\u003cem\\u003eGene\\u003c/em\\u003e, \\u003cstrong\\u003e382\\u003c/strong\\u003e, 1\\u0026ndash;11.\\u003c/li\\u003e\\n\\u003cli\\u003eBoeckmann,B., Bairoch,A., Apweiler,R., Blatter,M.-C., Estreicher,A., Gasteiger,E., Martin,M.J., Michoud,K., O\\u0026rsquo;Donovan,C., Phan,I., \\u003cem\\u003eet al.\\u003c/em\\u003e (2003) The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. \\u003cem\\u003eNucleic Acids Research\\u003c/em\\u003e, \\u003cstrong\\u003e31\\u003c/strong\\u003e, 365\\u0026ndash;370.\\u003c/li\\u003e\\n\\u003cli\\u003eBrunet,M.A., Lucier,J.-F., Levesque,M., Leblanc,S., Jacques,J.-F., Al-Saedi,H.R.H., Guilloy,N., Grenier,F., Avino,M., Fournier,I., \\u003cem\\u003eet al.\\u003c/em\\u003e (2021) OpenProt 2021: deeper functional annotation of the coding potential of eukaryotic genomes. 
\\u003cem\\u003eNucleic Acids Research\\u003c/em\\u003e, \\u003cstrong\\u003e49\\u003c/strong\\u003e, D380\\u0026ndash;D388.\\u003c/li\\u003e\\n\\u003cli\\u003eGuilloy,N., Brunet,M.A., Leblanc,S., Jacques,J.-F., Hardy,M.-P., Ehx,G., Lanoix,J., Thibault,P., Perreault,C. and Roucou,X. (2023) OpenCustomDB: Integration of Unannotated Open Reading Frames and Genetic Variants to Generate More Comprehensive Customized Protein Databases. \\u003cem\\u003eJ. Proteome Res.\\u003c/em\\u003e, \\u003cstrong\\u003e22\\u003c/strong\\u003e, 1492\\u0026ndash;1500.\\u003c/li\\u003e\\n\\u003cli\\u003eGarcia-del Rio,D.F., Cardon,T., Eyckerman,S., Fournier,I., Bonnefond,A., Gevaert,K. and Salzet,M. (2023) Employing non-targeted interactomics approach and subcellular fractionation to increase our understanding of the ghost proteome. \\u003cem\\u003eiScience\\u003c/em\\u003e, \\u003cstrong\\u003e26\\u003c/strong\\u003e.\\u003c/li\\u003e\\n\\u003cli\\u003eCao,X., Khitun,A., Harold,C.M., Bryant,C.J., Zheng,S.-J., Baserga,S.J. and Slavoff,S.A. (2022) Nascent alt-protein chemoproteomics reveals a pre-60S assembly checkpoint inhibitor. \\u003cem\\u003eNat Chem Biol\\u003c/em\\u003e, \\u003cstrong\\u003e18\\u003c/strong\\u003e, 643\\u0026ndash;651.\\u003c/li\\u003e\\n\\u003cli\\u003eCardon,T., Salzet,M., Franck,J. and Fournier,I. (2019) Nuclei of HeLa cells interactomes unravel a network of ghost proteins involved in proteins translation. \\u003cem\\u003eBiochimica et Biophysica Acta (BBA) - General Subjects\\u003c/em\\u003e, \\u003cstrong\\u003e1863\\u003c/strong\\u003e, 1458\\u0026ndash;1470.\\u003c/li\\u003e\\n\\u003cli\\u003eD\\u0026rsquo;Lima,N.G., Ma,J., Winkler,L., Chu,Q., Loh,K.H., Corpuz,E.O., Budnik,B.A., Lykke-Andersen,J., Saghatelian,A. and Slavoff,S.A. (2017) A human microprotein that interacts with the mRNA decapping complex. 
\\u003cem\\u003eNat Chem Biol\\u003c/em\\u003e, \\u003cstrong\\u003e13\\u003c/strong\\u003e, 174\\u0026ndash;180.\\u003c/li\\u003e\\n\\u003cli\\u003eMatsumoto,A., Pasut,A., Matsumoto,M., Yamashita,R., Fung,J., Monteleone,E., Saghatelian,A., Nakayama,K.I., Clohessy,J.G. and Pandolfi,P.P. (2017) mTORC1 and muscle regeneration are regulated by the LINC00961-encoded SPAR polypeptide. \\u003cem\\u003eNature\\u003c/em\\u003e, \\u003cstrong\\u003e541\\u003c/strong\\u003e, 228\\u0026ndash;232.\\u003c/li\\u003e\\n\\u003cli\\u003eStein,C.S., Jadiya,P., Zhang,X., McLendon,J.M., Abouassaly,G.M., Witmer,N.H., Anderson,E.J., Elrod,J.W. and Boudreau,R.L. (2018) Mitoregulin: A lncRNA-Encoded Microprotein that Supports Mitochondrial Supercomplexes and Respiratory Efficiency. \\u003cem\\u003eCell Rep\\u003c/em\\u003e, \\u003cstrong\\u003e23\\u003c/strong\\u003e, 3710-3720.e8.\\u003c/li\\u003e\\n\\u003cli\\u003eCardon,T., Ozcan,B., Aboulouard,S., Kobeissy,F., Duhamel,M., Rodet,F., Fournier,I. and Salzet,M. (2020) Epigenetic Studies Revealed a Ghost Proteome in PC1/3 KD Macrophages under Antitumoral Resistance Induced by IL-10. \\u003cem\\u003eACS Omega\\u003c/em\\u003e, 10.1021/acsomega.0c02530.\\u003c/li\\u003e\\n\\u003cli\\u003eDelcourt,V., Franck,J., Leblanc,E., Narducci,F., Robin,Y.-M., Gimeno,J.-P., Quanico,J., Wisztorski,M., Kobeissy,F., Jacques,J.-F., \\u003cem\\u003eet al.\\u003c/em\\u003e (2017) Combined Mass Spectrometry Imaging and Top-down Microproteomics Reveals Evidence of a Hidden Proteome in Ovarian Cancer. \\u003cem\\u003eEBioMedicine\\u003c/em\\u003e, \\u003cstrong\\u003e21\\u003c/strong\\u003e, 55\\u0026ndash;64.\\u003c/li\\u003e\\n\\u003cli\\u003eHuang,J.-Z., Chen,M., Chen,D., Gao,X.-C., Zhu,S., Huang,H., Hu,M., Zhu,H. and Yan,G.-R. (2017) A Peptide Encoded by a Putative lncRNA HOXB-AS3 Suppresses Colon Cancer Growth. 
\\u003cem\\u003eMolecular Cell\\u003c/em\\u003e, \\u003cstrong\\u003e68\\u003c/strong\\u003e, 171-184.e6.\\u003c/li\\u003e\\n\\u003cli\\u003ePolycarpou-Schwarz,M., Gro\\u0026szlig;,M., Mestdagh,P., Schott,J., Grund,S.E., Hildenbrand,C., Rom,J., Aulmann,S., Sinn,H.-P., Vandesompele,J., \\u003cem\\u003eet al.\\u003c/em\\u003e (2018) The cancer-associated microprotein CASIMO1 controls cell proliferation and interacts with squalene epoxidase modulating lipid droplet formation. \\u003cem\\u003eOncogene\\u003c/em\\u003e, \\u003cstrong\\u003e37\\u003c/strong\\u003e, 4750\\u0026ndash;4768.\\u003c/li\\u003e\\n\\u003cli\\u003eBrunet,M.A., Jacques,J.-F., Nassari,S., Tyzack,G.E., McGoldrick,P., Zinman,L., Jean,S., Robertson,J., Patani,R. and Roucou,X. (2020) The FUS gene is dual‐coding with both proteins contributing to FUS‐mediated toxicity. \\u003cem\\u003eEMBO reports\\u003c/em\\u003e, 10.15252/embr.202050640.\\u003c/li\\u003e\\n\\u003cli\\u003eCao,X., Chen,Y., Khitun,A. and Slavoff,S.A. (2023) BONCAT-based Profiling of Nascent Small and Alternative Open Reading Frame-encoded Proteins. \\u003cem\\u003eBio Protoc\\u003c/em\\u003e, \\u003cstrong\\u003e13\\u003c/strong\\u003e, e4585.\\u003c/li\\u003e\\n\\u003cli\\u003eSlavoff,S.A., Mitchell,A.J., Schwaid,A.G., Cabili,M.N., Ma,J., Levin,J.Z., Karger,A.D., Budnik,B.A., Rinn,J.L. and Saghatelian,A. (2013) Peptidomic discovery of short open reading frame\\u0026ndash;encoded peptides in human cells. \\u003cem\\u003eNat Chem Biol\\u003c/em\\u003e, \\u003cstrong\\u003e9\\u003c/strong\\u003e, 59\\u0026ndash;64.\\u003c/li\\u003e\\n\\u003cli\\u003eGarcia-del Rio,D.F., Fournier,I., Cardon,T. and Salzet,M. (2023) Protocol to identify human subcellular alternative protein interactions using cross-linking mass spectrometry. 
\\u003cem\\u003eSTAR Protocols\\u003c/em\\u003e, \\u003cstrong\\u003e4\\u003c/strong\\u003e, 102380.\\u003c/li\\u003e\\n\\u003cli\\u003eVanderperre,B., Staskevicius,A.B., Tremblay,G., McCoy,M., O\\u0026rsquo;Neill,M.A., Cashman,N.R. and Roucou,X. (2011) An overlapping reading frame in the PRNP gene encodes a novel polypeptide distinct from the prion protein. \\u003cem\\u003eThe FASEB Journal\\u003c/em\\u003e, \\u003cstrong\\u003e25\\u003c/strong\\u003e, 2373\\u0026ndash;2386.\\u003c/li\\u003e\\n\\u003cli\\u003eZhang,Q., Vashisht,A.A., O\\u0026rsquo;Rourke,J., Corbel,S.Y., Moran,R., Romero,A., Miraglia,L., Zhang,J., Durrant,E., Schmedt,C., \\u003cem\\u003eet al.\\u003c/em\\u003e (2017) The microprotein Minion controls cell fusion and muscle formation. \\u003cem\\u003eNat Commun\\u003c/em\\u003e, \\u003cstrong\\u003e8\\u003c/strong\\u003e, 15664.\\u003c/li\\u003e\\n\\u003cli\\u003eYosten,G.L.C., Liu,J., Ji,H., Sandberg,K., Speth,R. and Samson,W.K. (2016) A 5\\u0026prime;-upstream short open reading frame encoded peptide regulates angiotensin type 1a receptor production and signalling via the \\u0026beta;-arrestin pathway. \\u003cem\\u003eThe Journal of Physiology\\u003c/em\\u003e, \\u003cstrong\\u003e594\\u003c/strong\\u003e, 1601\\u0026ndash;1605.\\u003c/li\\u003e\\n\\u003cli\\u003eKao,A., Chiu,C., Vellucci,D., Yang,Y., Patel,V.R., Guan,S., Randall,A., Baldi,P., Rychnovsky,S.D. and Huang,L. (2011) Development of a Novel Cross-linking Strategy for Fast and Accurate Identification of Cross-linked Peptides of Protein Complexes. \\u003cem\\u003eMol Cell Proteomics\\u003c/em\\u003e, \\u003cstrong\\u003e10\\u003c/strong\\u003e, M110.002212.\\u003c/li\\u003e\\n\\u003cli\\u003eHevler,J.F., Lukassen,M.V., Cabrera-Orefice,A., Arnold,S., Pronker,M.F., Franc,V. and Heck,A.J.R. (2021) Selective cross-linking of coinciding protein assemblies by in-gel cross-linking mass spectrometry. 
\\u003cem\\u003eThe EMBO Journal\\u003c/em\\u003e, \\u003cstrong\\u003e40\\u003c/strong\\u003e, e106174.\\u003c/li\\u003e\\n\\u003cli\\u003eBerek,J.S., Renz,M., Kehoe,S., Kumar,L. and Friedlander,M. (2021) Cancer of the ovary, fallopian tube, and peritoneum: 2021 update. \\u003cem\\u003eInternational Journal of Gynecology \\u0026amp; Obstetrics\\u003c/em\\u003e, \\u003cstrong\\u003e155\\u003c/strong\\u003e, 61\\u0026ndash;85.\\u003c/li\\u003e\\n\\u003cli\\u003eWentzensen,N., Poole,E.M., Trabert,B., White,E., Arslan,A.A., Patel,A.V., Setiawan,V.W., Visvanathan,K., Weiderpass,E., Adami,H.-O., \\u003cem\\u003eet al.\\u003c/em\\u003e (2016) Ovarian Cancer Risk Factors by Histologic Subtype: An Analysis From the Ovarian Cancer Cohort Consortium. \\u003cem\\u003eJ Clin Oncol\\u003c/em\\u003e, \\u003cstrong\\u003e34\\u003c/strong\\u003e, 2888\\u0026ndash;2898.\\u003c/li\\u003e\\n\\u003cli\\u003eStewart,C., Ralyea,C. and Lockwood,S. (2019) Ovarian Cancer: An Integrated Review. \\u003cem\\u003eSeminars in Oncology Nursing\\u003c/em\\u003e, \\u003cstrong\\u003e35\\u003c/strong\\u003e, 151\\u0026ndash;156.\\u003c/li\\u003e\\n\\u003cli\\u003eKanehisa,M. and Goto,S. (2000) KEGG: kyoto encyclopedia of genes and genomes. \\u003cem\\u003eNucleic Acids Res\\u003c/em\\u003e, \\u003cstrong\\u003e28\\u003c/strong\\u003e, 27\\u0026ndash;30.\\u003c/li\\u003e\\n\\u003cli\\u003eSoga,T. (2013) Cancer metabolism: Key players in metabolic reprogramming. \\u003cem\\u003eCancer Science\\u003c/em\\u003e, \\u003cstrong\\u003e104\\u003c/strong\\u003e, 275\\u0026ndash;281.\\u003c/li\\u003e\\n\\u003cli\\u003eWarburg,O. (1925) The Metabolism of Carcinoma Cells. \\u003cem\\u003eThe Journal of Cancer Research\\u003c/em\\u003e, \\u003cstrong\\u003e9\\u003c/strong\\u003e, 148\\u0026ndash;163.\\u003c/li\\u003e\\n\\u003cli\\u003eVander Heiden,M.G., Cantley,L.C. and Thompson,C.B. (2009) Understanding the Warburg effect: the metabolic requirements of cell proliferation. 
\\u003cem\\u003eScience\\u003c/em\\u003e, \\u003cstrong\\u003e324\\u003c/strong\\u003e, 1029\\u0026ndash;1033.\\u003c/li\\u003e\\n\\u003cli\\u003eWolf,C.R., Hayward,I.P., Lawrie,S.S., Buckton,K., McIntyre,M.A., Adams,D.J., Lewis,A.D., Scott,A.R.R. and Smyth,J.F. (1987) Cellular heterogeneity and drug resistance in two ovarian adenocarcinoma cell lines derived from a single patient. \\u003cem\\u003eInternational Journal of Cancer\\u003c/em\\u003e, \\u003cstrong\\u003e39\\u003c/strong\\u003e, 695\\u0026ndash;702.\\u003c/li\\u003e\\n\\u003cli\\u003eLangdon,S.P., Lawrie,S.S., Hay,F.G., Hawkes,M.M., McDonald,A., Hayward,I.P., Schol,D.J., Hilgers,J., Leonard,R.C.F. and Smyth,J.F. Characterization and Properties of Nine Human Ovarian Adenocarcinoma Cell Lines.\\u003c/li\\u003e\\n\\u003cli\\u003eFogh,J., Fogh,J.M. and Orfeo,T. (1977) One hundred and twenty-seven cultured human tumor cell lines producing tumors in nude mice. \\u003cem\\u003eJ Natl Cancer Inst\\u003c/em\\u003e, \\u003cstrong\\u003e59\\u003c/strong\\u003e, 221\\u0026ndash;226.\\u003c/li\\u003e\\n\\u003cli\\u003eHernandez,L., Kim,M.K., Lyle,L.T., Bunch,K.P., House,C.D., Ning,F., Noonan,A.M. and Annunziata,C.M. (2016) Characterization of ovarian cancer cell lines as in vivo models for preclinical studies. \\u003cem\\u003eGynecol Oncol\\u003c/em\\u003e, \\u003cstrong\\u003e142\\u003c/strong\\u003e, 332\\u0026ndash;340.\\u003c/li\\u003e\\n\\u003cli\\u003eHallas-Potts,A., Dawson,J.C. and Herrington,C.S. (2019) Ovarian cancer cell lines derived from non-serous carcinomas migrate and invade more aggressively than those derived from high-grade serous carcinomas. \\u003cem\\u003eSci Rep\\u003c/em\\u003e, \\u003cstrong\\u003e9\\u003c/strong\\u003e, 5515.\\u003c/li\\u003e\\n\\u003cli\\u003eTabb,D.L., Eng,J.K. and Yates,J.R. (2001) Protein Identification by SEQUEST. In James,P. (ed), \\u003cem\\u003eProteome Research: Mass Spectrometry\\u003c/em\\u003e, Principles and Practice. Springer, Berlin, Heidelberg, pp. 
125\\u0026ndash;142.\\u003c/li\\u003e\\n\\u003cli\\u003eMcGinnis,S. and Madden,T.L. (2004) BLAST: at the core of a powerful and diverse set of sequence analysis tools. \\u003cem\\u003eNucleic Acids Res\\u003c/em\\u003e, \\u003cstrong\\u003e32\\u003c/strong\\u003e, W20-25.\\u003c/li\\u003e\\n\\u003cli\\u003eJones,P., Binns,D., Chang,H.-Y., Fraser,M., Li,W., McAnulla,C., McWilliam,H., Maslen,J., Mitchell,A., Nuka,G., \\u003cem\\u003eet al.\\u003c/em\\u003e (2014) InterProScan 5: genome-scale protein function classification. \\u003cem\\u003eBioinformatics\\u003c/em\\u003e, \\u003cstrong\\u003e30\\u003c/strong\\u003e, 1236\\u0026ndash;1240.\\u003c/li\\u003e\\n\\u003cli\\u003eK\\u0026auml;ll,L., Krogh,A. and Sonnhammer,E.L.L. (2004) A Combined Transmembrane Topology and Signal Peptide Prediction Method. \\u003cem\\u003eJournal of Molecular Biology\\u003c/em\\u003e, \\u003cstrong\\u003e338\\u003c/strong\\u003e, 1027\\u0026ndash;1036.\\u003c/li\\u003e\\n\\u003cli\\u003eSherman,B.T., Hao,M., Qiu,J., Jiao,X., Baseler,M.W., Lane,H.C., Imamichi,T. and Chang,W. (2022) DAVID: a web server for functional enrichment analysis and functional annotation of gene lists (2021 update). \\u003cem\\u003eNucleic Acids Research\\u003c/em\\u003e, \\u003cstrong\\u003e50\\u003c/strong\\u003e, W216\\u0026ndash;W221.\\u003c/li\\u003e\\n\\u003cli\\u003eDong,X.-C., Jing,L.-M., Wang,W.-X. and Gao,Y.-X. (2016) Down-regulation of SIRT3 promotes ovarian carcinoma metastasis. \\u003cem\\u003eBiochem Biophys Res Commun\\u003c/em\\u003e, \\u003cstrong\\u003e475\\u003c/strong\\u003e, 245\\u0026ndash;250.\\u003c/li\\u003e\\n\\u003cli\\u003eSebasti\\u0026aacute;n,C., Zwaans,B.M.M., Silberman,D.M., Gymrek,M., Goren,A., Zhong,L., Ram,O., Truelove,J., Guimaraes,A.R., Toiber,D., \\u003cem\\u003eet al.\\u003c/em\\u003e (2012) The histone deacetylase SIRT6 is a tumor suppressor that controls cancer metabolism. 
\\u003cem\\u003eCell\\u003c/em\\u003e, \\u003cstrong\\u003e151\\u003c/strong\\u003e, 1185\\u0026ndash;1199.\\u003c/li\\u003e\\n\\u003cli\\u003eZhang,J., Yin,X.-J., Xu,C.-J., Ning,Y.-X., Chen,M., Zhang,H., Chen,S.-F. and Yao,L.-Q. (2015) The histone deacetylase SIRT6 inhibits ovarian cancer cell proliferation via down-regulation of Notch 3 expression. \\u003cem\\u003eEur Rev Med Pharmacol Sci\\u003c/em\\u003e, \\u003cstrong\\u003e19\\u003c/strong\\u003e, 818\\u0026ndash;824.\\u003c/li\\u003e\\n\\u003cli\\u003eShannon,P., Markiel,A., Ozier,O., Baliga,N.S., Wang,J.T., Ramage,D., Amin,N., Schwikowski,B. and Ideker,T. (2003) Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. \\u003cem\\u003eGenome Res.\\u003c/em\\u003e, \\u003cstrong\\u003e13\\u003c/strong\\u003e, 2498\\u0026ndash;2504.\\u003c/li\\u003e\\n\\u003cli\\u003eJensen,L.J., Kuhn,M., Stark,M., Chaffron,S., Creevey,C., Muller,J., Doerks,T., Julien,P., Roth,A., Simonovic,M., \\u003cem\\u003eet al.\\u003c/em\\u003e (2009) STRING 8\\u0026mdash;a global view on proteins and their functional interactions in 630 organisms. \\u003cem\\u003eNucleic Acids Res\\u003c/em\\u003e, \\u003cstrong\\u003e37\\u003c/strong\\u003e, D412\\u0026ndash;D416.\\u003c/li\\u003e\\n\\u003cli\\u003eDoncheva,N.T., Morris,J.H., Gorodkin,J. and Jensen,L.J. (2019) Cytoscape StringApp: Network Analysis and Visualization of Proteomics Data. \\u003cem\\u003eJ Proteome Res\\u003c/em\\u003e, \\u003cstrong\\u003e18\\u003c/strong\\u003e, 623\\u0026ndash;632.\\u003c/li\\u003e\\n\\u003cli\\u003eOughtred,R., Rust,J., Chang,C., Breitkreutz,B.-J., Stark,C., Willems,A., Boucher,L., Leung,G., Kolas,N., Zhang,F., \\u003cem\\u003eet al.\\u003c/em\\u003e (2021) The BioGRID database: A comprehensive biomedical resource of curated protein, genetic, and chemical interactions. 
\\u003cem\\u003eProtein Sci\\u003c/em\\u003e, \\u003cstrong\\u003e30\\u003c/strong\\u003e, 187\\u0026ndash;200.\\u003c/li\\u003e\\n\\u003cli\\u003eOrchard,S., Ammari,M., Aranda,B., Breuza,L., Briganti,L., Broackes-Carter,F., Campbell,N.H., Chavali,G., Chen,C., del-Toro,N., \\u003cem\\u003eet al.\\u003c/em\\u003e (2014) The MIntAct project\\u0026mdash;IntAct as a common curation platform for 11 molecular interaction databases. \\u003cem\\u003eNucleic Acids Research\\u003c/em\\u003e, \\u003cstrong\\u003e42\\u003c/strong\\u003e, D358\\u0026ndash;D363.\\u003c/li\\u003e\\n\\u003cli\\u003eBindea,G., Mlecnik,B., Hackl,H., Charoentong,P., Tosolini,M., Kirilovsky,A., Fridman,W.-H., Pag\\u0026egrave;s,F., Trajanoski,Z. and Galon,J. (2009) ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. \\u003cem\\u003eBioinformatics\\u003c/em\\u003e, \\u003cstrong\\u003e25\\u003c/strong\\u003e, 1091\\u0026ndash;1093.\\u003c/li\\u003e\\n\\u003cli\\u003eYang,J., Yan,R., Roy,A., Xu,D., Poisson,J. and Zhang,Y. (2015) The I-TASSER Suite: protein structure and function prediction. \\u003cem\\u003eNat Methods\\u003c/em\\u003e, \\u003cstrong\\u003e12\\u003c/strong\\u003e, 7\\u0026ndash;8.\\u003c/li\\u003e\\n\\u003cli\\u003eKozakov,D., Hall,D.R., Xia,B., Porter,K.A., Padhorny,D., Yueh,C., Beglov,D. and Vajda,S. (2017) The ClusPro web server for protein\\u0026ndash;protein docking. \\u003cem\\u003eNat Protoc\\u003c/em\\u003e, \\u003cstrong\\u003e12\\u003c/strong\\u003e, 255\\u0026ndash;278.\\u003c/li\\u003e\\n\\u003cli\\u003eJumper,J., Evans,R., Pritzel,A., Green,T., Figurnov,M., Ronneberger,O., Tunyasuvunakool,K., Bates,R., Ž\\u0026iacute;dek,A., Potapenko,A., \\u003cem\\u003eet al.\\u003c/em\\u003e (2021) Highly accurate protein structure prediction with AlphaFold. 
\\u003cem\\u003eNature\\u003c/em\\u003e, \\u003cstrong\\u003e596\\u003c/strong\\u003e, 583\\u0026ndash;589.\\u003c/li\\u003e\\n\\u003cli\\u003eBecker,K.G., Barnes,K.C., Bright,T.J. and Wang,S.A. (2004) The genetic association database. \\u003cem\\u003eNat Genet\\u003c/em\\u003e, \\u003cstrong\\u003e36\\u003c/strong\\u003e, 431\\u0026ndash;432.\\u003c/li\\u003e\\n\\u003cli\\u003eK\\u0026auml;ll,L., Canterbury,J.D., Weston,J., Noble,W.S. and MacCoss,M.J. (2007) Semi-supervised learning for peptide identification from shotgun proteomics datasets. \\u003cem\\u003eNat Methods\\u003c/em\\u003e, \\u003cstrong\\u003e4\\u003c/strong\\u003e, 923\\u0026ndash;925.\\u003c/li\\u003e\\n\\u003cli\\u003eThe,M., MacCoss,M.J., Noble,W.S. and K\\u0026auml;ll,L. (2016) Fast and Accurate Protein False Discovery Rates on Large-Scale Proteomics Data Sets with Percolator 3.0. \\u003cem\\u003eJ. Am. Soc. Mass Spectrom.\\u003c/em\\u003e, \\u003cstrong\\u003e27\\u003c/strong\\u003e, 1719\\u0026ndash;1727.\\u003c/li\\u003e\\n\\u003cli\\u003ePaulo,J.A., Gaun,A., Kadiyala,V., Ghoulidi,A., Banks,P.A., Conwell,D.L. and Steen,H. (2013) Subcellular Fractionation Enhances Proteome Coverage of Pancreatic Duct Cells. \\u003cem\\u003eBiochim Biophys Acta\\u003c/em\\u003e, \\u003cstrong\\u003e1834\\u003c/strong\\u003e, 791\\u0026ndash;797.\\u003c/li\\u003e\\n\\u003cli\\u003eNa,Z., Dai,X., Zheng,S.-J., Bryant,C.J., Loh,K.H., Su,H., Luo,Y., Buhagiar,A.F., Cao,X., Baserga,S.J., \\u003cem\\u003eet al.\\u003c/em\\u003e (2022) Mapping subcellular localizations of unannotated microproteins and alternative proteins with MicroID. \\u003cem\\u003eMolecular Cell\\u003c/em\\u003e, \\u003cstrong\\u003e82\\u003c/strong\\u003e, 2900-2911.e7.\\u003c/li\\u003e\\n\\u003cli\\u003eEyckerman,S., Titeca,K., Van Quickelberghe,E., Cloots,E., Verhee,A., Samyn,N., De Ceuninck,L., Timmerman,E., De Sutter,D., Lievens,S., \\u003cem\\u003eet al.\\u003c/em\\u003e (2016) Trapping mammalian protein complexes in viral particles. 
\\u003cem\\u003eNat Commun\\u003c/em\\u003e, \\u003cstrong\\u003e7\\u003c/strong\\u003e, 11416.\\u003c/li\\u003e\\n\\u003cli\\u003eRoux,K.J., Kim,D.I., Burke,B. and May,D.G. (2018) BioID: A Screen for Protein-Protein Interactions. \\u003cem\\u003eCurr Protoc Protein Sci\\u003c/em\\u003e, \\u003cstrong\\u003e91\\u003c/strong\\u003e, 19.23.1-19.23.15.\\u003c/li\\u003e\\n\\u003cli\\u003eAlam,M.S. (2018) Proximity Ligation Assay (PLA). \\u003cem\\u003eCurr Protoc Immunol\\u003c/em\\u003e, \\u003cstrong\\u003e123\\u003c/strong\\u003e, e58.\\u003c/li\\u003e\\n\\u003cli\\u003eTherachiyil,L., Anand,A., Azmi,A., Bhat,A., Korashy,H.M. and Uddin,S. (2022) Role of RAS signaling in ovarian cancer. \\u003cem\\u003eF1000Res\\u003c/em\\u003e, \\u003cstrong\\u003e11\\u003c/strong\\u003e, 1253.\\u003c/li\\u003e\\n\\u003cli\\u003eZheng,Z.-Y., Elsarraj,H., Lei,J.T., Hong,Y., Anurag,M., Feng,L., Kennedy,H., Shen,Y., Lo,F., Zhao,Z., \\u003cem\\u003eet al.\\u003c/em\\u003e (2022) Elevated NRAS expression during DCIS is a potential driver for progression to basal-like properties and local invasiveness. \\u003cem\\u003eBreast Cancer Research\\u003c/em\\u003e, \\u003cstrong\\u003e24\\u003c/strong\\u003e, 68.\\u003c/li\\u003e\\n\\u003cli\\u003eBirkeland,E., Wik,E., Mj\\u0026oslash;s,S., Hoivik,E.A., Trovik,J., Werner,H.M.J., Kusonmano,K., Petersen,K., Raeder,M.B., Holst,F., \\u003cem\\u003eet al.\\u003c/em\\u003e (2012) KRAS gene amplification and overexpression but not mutation associates with aggressive and metastatic endometrial cancer. \\u003cem\\u003eBr J Cancer\\u003c/em\\u003e, \\u003cstrong\\u003e107\\u003c/strong\\u003e, 1997\\u0026ndash;2004.\\u003c/li\\u003e\\n\\u003cli\\u003eZhou,J.-D., Yao,D.-M., Li,X.-X., Zhang,T.-J., Zhang,W., Ma,J.-C., Guo,H., Deng,Z.-Q., Lin,J. and Qian,J. (2017) KRAS overexpression independent of RAS mutations confers an adverse prognosis in cytogenetically normal acute myeloid leukemia. 
\\u003cem\\u003eOncotarget\\u003c/em\\u003e, \\u003cstrong\\u003e8\\u003c/strong\\u003e, 66087\\u0026ndash;66097.\\u003c/li\\u003e\\n\\u003cli\\u003eJung,J., Cho,K.-J., Naji,A.K., Clemons,K.N., Wong,C.O., Villanueva,M., Gregory,S., Karagas,N.E., Tan,L., Liang,H., \\u003cem\\u003eet al.\\u003c/em\\u003e (2019) HRAS-driven cancer cells are vulnerable to TRPML1 inhibition. \\u003cem\\u003eEMBO reports\\u003c/em\\u003e, \\u003cstrong\\u003e20\\u003c/strong\\u003e, e46685.\\u003c/li\\u003e\\n\\u003cli\\u003eMiglietta,G., Gouda,A.S., Cogoi,S., Pedersen,E.B. and Xodo,L.E. (2015) Nucleic Acid Targeted Therapy: G4 Oligonucleotides Downregulate HRAS in Bladder Cancer Cells through a Decoy Mechanism. \\u003cem\\u003eACS Med. Chem. Lett.\\u003c/em\\u003e, \\u003cstrong\\u003e6\\u003c/strong\\u003e, 1179\\u0026ndash;1183.\\u003c/li\\u003e\\n\\u003cli\\u003e(2020) The Human Protein Atlas: A 20-year journey into the body. \\u003cem\\u003eScience | AAAS\\u003c/em\\u003e, 19 November 2020.\\u003c/li\\u003e\\n\\u003cli\\u003eOuyang,S., Zhang,Q., Lou,L., Zhu,K., Li,Z., Liu,P. and Zhang,X. (2022) The Double-Edged Sword of SIRT3 in Cancer and Its Therapeutic Applications. \\u003cem\\u003eFrontiers in Pharmacology\\u003c/em\\u003e, \\u003cstrong\\u003e13\\u003c/strong\\u003e.\\u003c/li\\u003e\\n\\u003cli\\u003eChen,G., Gharib,T.G., Huang,C.-C., Taylor,J.M.G., Misek,D.E., Kardia,S.L.R., Giordano,T.J., Iannettoni,M.D., Orringer,M.B., Hanash,S.M., \\u003cem\\u003eet al.\\u003c/em\\u003e (2002) Discordant Protein and mRNA Expression in Lung Adenocarcinomas. \\u003cem\\u003eMolecular \\u0026amp; Cellular Proteomics\\u003c/em\\u003e, \\u003cstrong\\u003e1\\u003c/strong\\u003e, 304\\u0026ndash;313.\\u003c/li\\u003e\\n\\u003cli\\u003eBauernfeind,A.L. and Babbitt,C.C. (2017) The predictive nature of transcript expression levels on protein expression in adult human brain. 
\\u003cem\\u003eBMC Genomics\\u003c/em\\u003e, \\u003cstrong\\u003e18\\u003c/strong\\u003e, 322.\\u003c/li\\u003e\\n\\u003cli\\u003ePerl,K., Ushakov,K., Pozniak,Y., Yizhar-Barnea,O., Bhonker,Y., Shivatzki,S., Geiger,T., Avraham,K.B. and Shamir,R. (2017) Reduced changes in protein compared to mRNA levels across non-proliferating tissues. \\u003cem\\u003eBMC Genomics\\u003c/em\\u003e, \\u003cstrong\\u003e18\\u003c/strong\\u003e, 305.\\u003c/li\\u003e\\n\\u003cli\\u003eFukao,Y. (2015) Discordance between protein and transcript levels detected by selected reaction monitoring. \\u003cem\\u003ePlant Signal Behav\\u003c/em\\u003e, \\u003cstrong\\u003e10\\u003c/strong\\u003e, e1017697.\\u003c/li\\u003e\\n\\u003cli\\u003eBrion,C., Lutz,S.M. and Albert,F.W. (2020) Simultaneous quantification of mRNA and protein in single cells reveals post-transcriptional effects of genetic variation. \\u003cem\\u003eeLife\\u003c/em\\u003e, \\u003cstrong\\u003e9\\u003c/strong\\u003e, e60645.\\u003c/li\\u003e\\n\\u003cli\\u003eDe Marco,C., Rinaldo,N., Bruni,P., Malzoni,C., Zullo,F., Fabiani,F., Losito,S., Scrima,M., Marino,F.Z., Franco,R., \\u003cem\\u003eet al.\\u003c/em\\u003e (2013) Multiple genetic alterations within the PI3K pathway are responsible for AKT activation in patients with ovarian carcinoma. \\u003cem\\u003ePLoS One\\u003c/em\\u003e, \\u003cstrong\\u003e8\\u003c/strong\\u003e, e55362.\\u003c/li\\u003e\\n\\u003cli\\u003eWang,G., Yang,X., Li,C., Cao,X., Luo,X. and Hu,J. (2014) PIK3R3 Induces Epithelial-to-Mesenchymal Transition and Promotes Metastasis in Colorectal Cancer. \\u003cem\\u003eMolecular Cancer Therapeutics\\u003c/em\\u003e, \\u003cstrong\\u003e13\\u003c/strong\\u003e, 1837\\u0026ndash;1847.\\u003c/li\\u003e\\n\\u003cli\\u003eStronach,E.A., Chen,M., Maginn,E.N., Agarwal,R., Mills,G.B., Wasan,H. and Gabra,H. (2011) DNA-PK mediates AKT activation and apoptosis inhibition in clinically acquired platinum resistance. 
\\u003cem\\u003eNeoplasia\\u003c/em\\u003e, \\u003cstrong\\u003e13\\u003c/strong\\u003e, 1069\\u0026ndash;1080.\\u003c/li\\u003e\\n\\u003cli\\u003eLiu,Q., Turner,K.M., Alfred Yung,W.K., Chen,K. and Zhang,W. (2014) Role of AKT signaling in DNA repair and clinical response to cancer therapy. \\u003cem\\u003eNeuro Oncol\\u003c/em\\u003e, \\u003cstrong\\u003e16\\u003c/strong\\u003e, 1313\\u0026ndash;1323.\\u003c/li\\u003e\\n\\u003cli\\u003eArlt,C., Ihling,C.H. and Sinz,A. (2015) Structure of full-length p53 tumor suppressor probed by chemical cross-linking and mass spectrometry. \\u003cem\\u003ePROTEOMICS\\u003c/em\\u003e, \\u003cstrong\\u003e15\\u003c/strong\\u003e, 2746\\u0026ndash;2755.\\u003c/li\\u003e\\n\\u003cli\\u003eWells,M., Tidow,H., Rutherford,T.J., Markwick,P., Jensen,M.R., Mylonas,E., Svergun,D.I., Blackledge,M. and Fersht,A.R. (2008) Structure of tumor suppressor p53 and its intrinsically disordered N-terminal transactivation domain. \\u003cem\\u003eProceedings of the National Academy of Sciences\\u003c/em\\u003e, \\u003cstrong\\u003e105\\u003c/strong\\u003e, 5762\\u0026ndash;5767.\\u003c/li\\u003e\\n\\u003cli\\u003eHoyos,D., Greenbaum,B. and Levine,A.J. (2022) The genotypes and phenotypes of missense mutations in the proline domain of the p53 protein. \\u003cem\\u003eCell Death Differ\\u003c/em\\u003e, \\u003cstrong\\u003e29\\u003c/strong\\u003e, 938\\u0026ndash;945.\\u003c/li\\u003e\\n\\u003cli\\u003eSchildkraut,J.M., Goode,E.L., Clyde,M.A., Iversen,E.S., Moorman,P.G., Berchuck,A., Marks,J.R., Lissowska,J., Brinton,L., Peplonska,B., \\u003cem\\u003eet al.\\u003c/em\\u003e (2009) Single Nucleotide Polymorphisms in the TP53 Region and Susceptibility to Invasive Epithelial Ovarian Cancer. \\u003cem\\u003eCancer Research\\u003c/em\\u003e, \\u003cstrong\\u003e69\\u003c/strong\\u003e, 2349\\u0026ndash;2357.\\u003c/li\\u003e\\n\\u003cli\\u003eYaginuma,Y. and Westphal,H. 
(1992) Abnormal structure and expression of the p53 gene in human ovarian carcinoma cell lines. \\u003cem\\u003eCancer Res\\u003c/em\\u003e, \\u003cstrong\\u003e52\\u003c/strong\\u003e, 4196\\u0026ndash;4199.\\u003c/li\\u003e\\n\\u003cli\\u003eWillis,S., Villalobos,V.M., Gevaert,O., Abramovitz,M., Williams,C., Sikic,B.I. and Leyland-Jones,B. (2016) Single Gene Prognostic Biomarkers in Ovarian Cancer: A Meta-Analysis. \\u003cem\\u003ePLoS One\\u003c/em\\u003e, \\u003cstrong\\u003e11\\u003c/strong\\u003e, e0149183.\\u003c/li\\u003e\\n\\u003cli\\u003eWeberpals,J.I., Pugh,T.J., Marco‐Casanova,P., Goss,G.D., Andrews Wright,N., Rath,P., Torchia,J., Fortuna,A., Jones,G.N., Roudier,M.P., \\u003cem\\u003eet al.\\u003c/em\\u003e (2021) Tumor genomic, transcriptomic, and immune profiling characterizes differential response to first‐line platinum chemotherapy in high grade serous ovarian cancer. \\u003cem\\u003eCancer Med\\u003c/em\\u003e, \\u003cstrong\\u003e10\\u003c/strong\\u003e, 3045\\u0026ndash;3058.\\u003c/li\\u003e\\n\\u003cli\\u003eMurga,M., Lecona,E., Kamileri,I., D\\u0026iacute;az,M., Lugli,N., Sotiriou,S.K., Anton,M.E., M\\u0026eacute;ndez,J., Halazonetis,T.D. and Fernandez-Capetillo,O. (2016) POLD3 Is Haploinsufficient for DNA Replication in Mice. \\u003cem\\u003eMolecular Cell\\u003c/em\\u003e, \\u003cstrong\\u003e63\\u003c/strong\\u003e, 877\\u0026ndash;883.\\u003c/li\\u003e\\n\\u003cli\\u003ePerez-Riverol,Y., Bai,J., Bandla,C., Garc\\u0026iacute;a-Seisdedos,D., Hewapathirana,S., Kamatchinathan,S., Kundu,D.J., Prakash,A., Frericks-Zipper,A., Eisenacher,M., \\u003cem\\u003eet al.\\u003c/em\\u003e (2022) The PRIDE database resources in 2022: a hub for mass spectrometry-based proteomics evidences. 
\\u003cem\\u003eNucleic Acids Research\\u003c/em\\u003e, \\u003cstrong\\u003e50\\u003c/strong\\u003e, D543\\u0026ndash;D552.\\u003c/li\\u003e\\n\\u003c/ol\\u003e\"}],\"fulltextSource\":\"\",\"fullText\":\"\",\"funders\":[],\"hasAdminPriorityOnWorkflow\":false,\"hasManuscriptDocX\":true,\"hasOptedInToPreprint\":true,\"hasPassedJournalQc\":\"\",\"hasAnyPriority\":false,\"hideJournal\":false,\"highlight\":\"\",\"institution\":\"\",\"isAcceptedByJournal\":true,\"isAuthorSuppliedPdf\":false,\"isDeskRejected\":\"\",\"isHiddenFromSearch\":false,\"isInQc\":false,\"isInWorkflow\":false,\"isPdf\":false,\"isPdfUpToDate\":true,\"isWithdrawnOrRetracted\":false,\"journal\":{\"display\":true,\"email\":\"info@researchsquare.com\",\"identity\":\"cell-death-and-disease\",\"isNatureJournal\":false,\"hasQc\":false,\"allowDirectSubmit\":false,\"externalIdentity\":\"cddis\",\"sideBox\":\"Learn more about [Cell Death \\u0026 Disease](http://www.nature.com/cddis/)\",\"snPcode\":\"41419\",\"submissionUrl\":\"https://mts-cddis.nature.com/cgi-bin/main.plex\",\"title\":\"Cell Death \\u0026 Disease\",\"twitterHandle\":\"\",\"acdcEnabled\":true,\"dfaEnabled\":true,\"editorialSystem\":\"ejp\",\"reportingPortfolio\":\"Nature AJ\",\"inReviewEnabled\":true,\"inReviewRevisionsEnabled\":true},\"keywords\":\"\",\"lastPublishedDoi\":\"10.21203/rs.3.rs-3972487/v1\",\"lastPublishedDoiUrl\":\"https://doi.org/10.21203/rs.3.rs-3972487/v1\",\"license\":{\"name\":\"CC BY 4.0\",\"url\":\"https://creativecommons.org/licenses/by/4.0/\"},\"manuscriptAbstract\":\"\\u003cp\\u003eProteogenomics is becoming a powerful tool in personalized medicine by linking genomics, transcriptomics and mass spectrometry (MS)-based proteomics. Due to increasing evidence of alternative open reading frame-encoded proteins (AltProts), proteogenomics has a high potential to unravel the characteristics, variants and expression levels of the alternative proteome, in addition to already annotated proteins (RefProts). 
To obtain a broader view of the proteome of ovarian cancer cells compared to ovarian epithelial cells, cell-specific total RNA-sequencing profiles and customized protein databases were generated. In total, 128 RefProts and 30 AltProts were identified exclusively in SKOV-3 and PEO-4 cells. Among them, a variant of the AltProt IP_715944, translated from \\u003cem\\u003eDHX8\\u003c/em\\u003e, carried the mutation p.Leu44Pro. We show high variation in protein expression levels of RefProts and AltProts in different subcellular compartments. We describe 117 RefProt and two AltProt variants, along with their possible implications for different physiological and pathological characteristics. To identify the possible involvement of AltProts in cellular processes, crosslinking-MS (XL-MS) was performed in each cell line to identify AltProt-RefProt interactions. This approach revealed an interaction between POLD3 and the AltProt IP_183088, which molecular docking placed between the POLD3-POLD2 binding sites, suggesting a possible involvement in DNA replication and repair.\\u003c/p\\u003e\",\"manuscriptTitle\":\"Deciphering the ghost proteome in ovarian cancer cells by deep proteogenomic characterization\",\"msid\":\"\",\"msnumber\":\"\",\"nonDraftVersions\":[{\"code\":1,\"date\":\"2024-04-09 18:24:50\",\"doi\":\"10.21203/rs.3.rs-3972487/v1\",\"editorialEvents\":[{\"type\":\"communityComments\",\"content\":0}],\"status\":\"published\",\"journal\":{\"display\":true,\"email\":\"info@researchsquare.com\",\"identity\":\"cell-death-and-disease\",\"isNatureJournal\":false,\"hasQc\":false,\"allowDirectSubmit\":false,\"externalIdentity\":\"cddis\",\"sideBox\":\"Learn more about [Cell Death \\u0026 Disease](http://www.nature.com/cddis/)\",\"snPcode\":\"41419\",\"submissionUrl\":\"https://mts-cddis.nature.com/cgi-bin/main.plex\",\"title\":\"Cell Death \\u0026 
Disease\",\"twitterHandle\":\"\",\"acdcEnabled\":true,\"dfaEnabled\":true,\"editorialSystem\":\"ejp\",\"reportingPortfolio\":\"Nature AJ\",\"inReviewEnabled\":true,\"inReviewRevisionsEnabled\":true}}],\"origin\":\"\",\"ownerIdentity\":\"14cc9573-fa44-44b4-bd20-e2c609025dca\",\"owner\":[],\"postedDate\":\"April 9th, 2024\",\"published\":true,\"recentEditorialEvents\":[],\"rejectedJournal\":[],\"revision\":\"\",\"amendment\":\"\",\"status\":\"published-in-journal\",\"subjectAreas\":[],\"tags\":[],\"updatedAt\":\"2024-10-01T07:09:45+00:00\",\"versionOfRecord\":{\"articleIdentity\":\"rs-3972487\",\"link\":\"https://doi.org/10.1038/s41419-024-07046-1\",\"journal\":{\"identity\":\"cell-death-and-disease\",\"isVorOnly\":false,\"title\":\"Cell Death \\u0026 Disease\"},\"publishedOn\":\"2024-09-30 04:00:00\",\"publishedOnDateReadable\":\"September 30th, 2024\"},\"versionCreatedAt\":\"2024-04-09 18:24:50\",\"video\":\"\",\"vorDoi\":\"10.1038/s41419-024-07046-1\",\"vorDoiUrl\":\"https://doi.org/10.1038/s41419-024-07046-1\",\"workflowStages\":[]},\"version\":\"v1\",\"identity\":\"rs-3972487\",\"journalConfig\":\"researchsquare\"},\"__N_SSP\":true},\"page\":\"/article/[identity]/[[...version]]\",\"query\":{\"redirect\":\"/article/rs-3972487\",\"identity\":\"rs-3972487\",\"version\":[\"v1\"]},\"buildId\":\"qtupq5eGEP_6zYnWcrvyt\",\"isFallback\":false,\"isExperimentalCompile\":false,\"dynamicIds\":[84888],\"gssp\":true,\"scriptLoader\":[]}","source_license":"CC-BY-4.0","license_restricted":false}