{"paper_id":"c700bca0-a65d-4b22-917e-970d6745cb8f","body_text":"Subcellular proteomics of Paradiplonema papillatum reveals digestive capacity of the cell membrane and the plasticity of peroxisomes across euglenozoans\n\nMichael Hammond1,2,3,*, Orsola Iorillo1,2, Drahomíra Faktorová1,2, Michaela Svobodová1, Bungo Akiyoshi4, Tim Licknack3, Yu-Ping Poh3, Julius Lukeš1,2 and Jeremy G. Wideman3,*\n\n1Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice (Budweis), Czech Republic\n2Faculty of Science, University of South Bohemia, České Budějovice (Budweis), Czech Republic\n3Center for Mechanisms of Evolution, Biodesign Institute, School of Life Sciences, Arizona State University, Tempe, Arizona, USA\n4Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK\n\n* Corresponding authors\n\nKeywords\nDiplonemids, subcellular proteomics, cell membrane, metabolism\n\nAbstract\nDiplonemids are among the most diverse and abundant protists in the deep ocean, have extremely complex and ancient cellular systems, and exhibit unique metabolic capacities. Despite this, we know very little about this major group of eukaryotes. To establish a model organism for comprehensive investigation, we performed subcellular proteomics on Paradiplonema papillatum and localized 4,870 proteins to 22 cellular compartments. We additionally confirmed the predicted location of several proteins by epitope tagging and fluorescence microscopy. To probe the metabolic capacities of P. papillatum, we explored the proteins predicted to the cell membrane compartment in our subcellular proteomics dataset. Our data revealed an accumulation of many carbohydrate active enzymes (CAZymes). 
Our 30 \npredictions suggest that these CAZymes are exposed to extracellular space, supporting 31 \nproposals that diplonemids may specialize in breaking down carbohydrates in plant and algal 32 \ncell walls. Further exploration of carbohydrate metabolism revealed an evolutionary 33 \ndivergence in the function of glycosomes (modified peroxisomes) in diplonemids versus 34 \nkinetoplastids. Our subcellular proteome provides a resource for future investigations into the 35 \nunique cell biology of diplonemids.                       36 \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted July 20, 2025. ; https://doi.org/10.1101/2025.07.16.665091doi: bioRxiv preprint \n\n2 \n \n 37 \nIntroduction 38 \nDiplonemids are unicellular, heterotrophic eukaryotes, which constitute one of the most 39 \nabundant and species-rich protist groups within the world’s oceans (Flegontova, et al. 2016; 40 \nGawryluk, et al. 2016). In addition, recent investigations show a comprehensive distribution 41 \nof diplonemids in freshwater environments (Mukherjee, et al. 2020), as well as in all pelagic 42 \nzones of the ocean (Obiol, et al. 2020; Lax, et al. 2024). Global metabarcoding estimates > 43 \n67,000 species of diplonemids worldwide, and therefore, they are presumed to be key 44 \necological players in all marine ecosystems (Tashyreva, et al. 2022).  45 \nDespite their importance, our knowledge of diplonemid nutrition strategies, ecological roles 46 \nas well as their molecular and cellular biology remains limited. Beyond general heterotrophy 47 \n(Prokopchuk, et al. 2022), investigating their lifestyles and specific feeding modes remains 48 \nchallenging, partly due to the difficulty in observing diplonemid behavior in nature. 
By 49 \ncontrast, the relative ease by which diplonemids can be established in stable axenic cultures 50 \n(typically in protein-rich media) is promising, and makes them amenable to an expanding 51 \nrange of genomic, transcriptomic and proteomic experiments (Škodová-Sveráková, et al. 52 \n2021; Valach, Moreira, et al. 2023). Such techniques are necessary to further characterize 53 \ndiplonemids’ cellular and ecological functions. 54 \nA high-quality nuclear genome is available for the diplonemid Paradiplonema papillatum 55 \n(formerly Diplonema) (Valach, Moreira, et al. 2023), with two recent assemblies now 56 \navailable for Diplonema japonicum (Tashyreva, Faktorová, Stříbrná, et al. 2025) and 57 \nRhynchopus euleeides (Tashyreva, Faktorová, Horák, et al. 2025), in addition to several 58 \npreviously existing transcriptomes (Tashyreva, et al. 2022). However, P . papillatum remains 59 \nthe only genetically tractable diplonemid, enabling functional investigations by gene deletion 60 \n(Faktorová, Kaur, et al. 2020), endogenous tagging of proteins (Akiyoshi, et al. 2025), and 61 \nimmunoprecipitation (Valach, Benz, et al. 2023). Such tractability has allowed the 62 \ninvestigation of P . papillatum respiratory complexes (Valach, et al. 2018), mitochondrial 63 \nribosomes (Valach, Benz, et al. 2023), and kinetochores (Benz, et al. 2024). Diplonemids 64 \nretain many genes that can be traced to the last eukaryotic common ancestor (LECA), 65 \nincluding rare, restricted homologs referred to as jotnarlogs (Záhonová, et al. 2025). Thus, 66 \ndiplonemids may prove particularly informative for understanding the complexities of the 67 \nancestral eukaryote (Richards, et al. 2024). 68 \nAmong the many protein-coding genes predicted from its genome, an unexpected finding in 69 \nP . 
papillatum was the identification of several hundred carbohydrate active enzymes 70 \n(CAZymes), with the capacity to digest pectin, cellulose, and -1,3 glycans among other 71 \ncarbohydrates (Valach, Moreira, et al. 2023). This expanded CAZyme repertoire is 72 \nparticularly prominent compared to their relatives, Euglenida and Kinetoplastea (Valach, 73 \nMoreira, et al. 2023). Such presence implies a proclivity of P . papillatum (and potentially 74 \nother diplonemids) towards digestion of cell wall components of plants and algae. However, 75 \nit is unclear how these organisms can specifically digest the cell walls of photosynthetic 76 \neukaryotes. Osmotrophy has been proposed (Prokopchuk, et al. 2022), through secreting 77 \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted July 20, 2025. ; https://doi.org/10.1101/2025.07.16.665091doi: bioRxiv preprint \n\n3 \n \nenzymes to their exterior, as well as phagotrophy, internally ingesting components of their 78 \nprey. Though P . papillatum is a tractable species, tagging and visualizing hundreds of 79 \nCAZymes to determine their localization is unrealistic. We therefore sought to perform 80 \nsubcellular proteomics to localize CAZymes to various intracellular compartments. 81 \nHere, we use a subcellular proteomics workflow similar to localization of organelle proteins 82 \nby isotope tagging via differential ultracentrifugation (LOPIT-DC) (Geladaki, et al. 2019), to 83 \nproduce the first subcellular proteome of a diplonemid. With our data, we classified 4,870 84 \nproteins to 22 cellular compartments in P . papillatum. We validated several predicted 85 \nlocations by epitope and fluorescent tagging. 
Our subcellular proteome provided a clearly resolved cluster of cell membrane proteins enriched with secreted CAZymes. We suggest these enzymes can actively degrade plant and algal cell walls, initially at the cell’s exterior. We also show an ability for internal carbohydrate processing with various secreted CAZymes distributed to the lysosomal compartments, and expand on traditional carbohydrate metabolism across glycosomes and the cytoplasm, demonstrating that its compartmentalization has diverged from that of the sister clade Kinetoplastea (Opperdoes and Michels 1993). Finally, we reveal an extensive mitochondrial capacity for varied amino acid digestion, foregrounding the metabolic versatility of this model diplonemid. Our localization of thousands of P. papillatum proteins provides a repository of information that will extend our knowledge of diplonemids, facilitating an exploration of their unusual cell biology and function.\n\nResults and Discussion\nSubcellular proteomics allows predictive clustering of P. papillatum proteins into 22 distinct compartments\nTo obtain a subcellular map of P. papillatum, we used a modified workflow adapted from a LOPIT-DC protocol described previously (Geladaki, et al. 2019). Briefly, cells were grown axenically in ‘Diplo’ media (sea water supplemented with 10% Fetal Bovine Serum and 1 g tryptone). Approximately 9.9 × 10^8 cells per sample were collected and lysed in detergent-free lysis buffer in a nitrogen cavitator (250 psi for 10 min). Cell lysates underwent differential centrifugation resulting in 11 distinct fractions, including initial unlysed cells. We performed western blot analysis using antibodies against ATP synthase subunit β from Trypanosoma brucei (Šubrtová, et al. 2015), mammalian Grp75 (Joseph, et al. 2013) and Grp78 (Chou, et al. 
2020) to ensure fractional proteomic profiles were distinct (Suppl. Fig. 1).\nLabel-free quantification (LFQ) analysis was followed by peptide data analysis in ProteomeDiscoverer (Orsburn 2021) and with R, primarily via the pRoloc package (Breckels, et al. 2016). Data were quantified against the nuclear and mitochondrial genomes of P. papillatum (Valach, Moreira, et al. 2023). After quality control, 4,870 unique proteins were detected in this dataset. Following normalization, proteins lacking peptide coverage in one or more fractions underwent imputation via ‘neighbor averaging’ (1,285 proteins) as well as ‘zero’ methods (2,073 proteins).\nTo predict cellular localization for the P. papillatum subcellular proteome, we manually curated a set of 368 proteins constituting markers with canonical localizations (e.g. mitochondrion, flagellum, cytosol), specific functions (e.g. membrane trafficking compartments) or those with inferred localization data, corresponding to a total of 22 distinct subcellular compartments or protein complexes (Table S1). Using a median SVM cutoff (Table S2), we predicted sub-localization of 2,435 proteins (Fig. 1A,B), with the remainder additionally classified to these compartments with lower confidence (Table S3; Suppl. Fig. 2). To further corroborate our designated clusters, we mapped predicted target signals and protein features onto the t-SNE distributions (Fig. 1C). Mitochondrial target peptides (mTP, predicted via TargetP2.0) (Armenteros, et al. 
2019) are abundant across the three 127 \nmitochondrial clusters—matrix, protein complexes and membrane-enriched. Signal peptides, 128 \npredicted via SignalP6.0 (Teufel, et al. 2022) show enrichment across soluble lysosome, cell 129 \nmembrane, endoplasmic reticulum (ER)/Golgi clusters, as well as endocytic and 130 \nmultivesicular membrane trafficking compartments. Finally, transmembrane domains (TMD), 131 \npredicted via DeepTMHMM (Hallgren, et al. 2022) correlate to the various membrane-132 \nenriched clusters of the diplonemid cell.     133 \nNext, we highlighted proteins that exhibit differences in abundance when P. papillatum was 134 \ngrown in different media: ‘Diplo’ versus ‘Hemi’ media (sea water supplemented with 10 ml 135 \ninactivated horse serum and 1 ml/L LB medium), and oxygen abundant versus depleted 136 \nconditions (Fig. 1D) (Škodová-Sveráková, et al. 2021). Cells grown in nutrient-rich ‘Diplo’ 137 \nmedium show enrichment for proteins predicted to the cytosolic ribosome and cell membrane 138 \nclusters, including sodium/potassium exchangers and sterol transporters. The nutrient-poorer 139 \n‘Hemi’ medium showed notable enrichment across multiple clusters, including the 140 \nproteasome, cytosol, soluble lysosome and mitochondrial regions (Fig. 1D). Equally, aerobic 141 \nconditions resulted in the enrichment of several hypothetical cell membrane components, 142 \nsubunits of mitochondrial complex IV, as well as various soluble lysosomal proteases. By 143 \ncontrast, anaerobic conditions induced enrichment across clusters of the cytosol, cytosolic 144 \nribosomes, mitochondrial matrix, and translation initiation factors 2 and 3 (Fig. 1D).       
145 \n     146 \nEndogenous tagging confirms subcellular localizations inferred from proteomic data 147 \nTo validate designated clusters, we successfully performed endogenous tagging with either 148 \nV5 or YFP epitopes on 12 proteins predicted or classified to various cell compartments, 149 \nwhich typically lack both annotation and homologs outside diplonemids (Fig. 2). Such 150 \nproteins were ultimately located to the flagella (Fig. 2A), cytoplasm (B,C), mitochondrion 151 \n(D,E), ER/Golgi (F), nucleus (G,H,I), nucleolus (J), endocytic membrane trafficking (K) and, 152 \nfinally, the cell membrane (L), encompassing nine defined clusters in total (Table S4).  153 \nMitochondrial proteins DIPPA_24150 and DIPPA_15120 co-localize with the organellar 154 \nDNA within this reticulated mitochondrion (Suppl. Fig. 3) at the cell periphery (Figs. 2D and 155 \nE). In turn, the tagged ER/Golgi candidate DIPPA_04811 shows a signal surrounding the 156 \nnuclear DNA, while also branching and extending into the cell posterior (Fig. 2F). Next, we 157 \nvalidated four proteins assigned to the nucleus, which show different sub-localizations by 158 \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted July 20, 2025. ; https://doi.org/10.1101/2025.07.16.665091doi: bioRxiv preprint \n\n5 \n \nimmunofluorescence analysis (IFA) within this compartment (Figs. 2G-J). The first nuclear 159 \ncandidate (DIPPA_16310) has a patchy distribution on the outermost periphery of the nuclear 160 \nDNA (Fig. 2G). Unlike the novel ER/Golgi protein (Fig. 2F), this candidate does not extend 161 \nbeyond the nucleus, and hence, likely constitutes a novel nuclear membrane component. A 162 \nsecond nuclear candidate (DIPPA_32825) co-localizes with the chromatin signal of the 163 \nnucleus (Fig. 
2H), similar to the general nuclear signal of a third selected nuclear protein 164 \n(DIPPA_24937) (Fig. 2I). The last nuclear candidate (DIPPA_00315) displays a confined 165 \ndistribution within the nucleus, corresponding to the nucleolus (Fig. 2J). Similarly, this 166 \nprotein’s uncharacterized homolog within the kinetoplastid T. brucei (Tb927.3.2750) also 167 \ndisplays a nucleolus-like signal when tagged via green fluorescent protein (Billington, et al. 168 \n2023).  169 \nOne protein (DIPPA_21158) classified to the ‘endocytic membrane trafficking’ compartment, 170 \nseemingly exhibits dual localization, with an ER-like pattern similar to DIPPA_04811 (Fig. 171 \n2F), while also showing enrichment towards and encompassing the cell cytopharynx (Fig. 172 \n2K). Finally, a protein predicted to the cell membrane cluster (DIPPA_16504) (Fig. 2), shows 173 \na signal enriched across the cell outline, excepting the apical papilla (Fig. 2L). This protein 174 \npossesses a signal peptide and a TMD, both of which are enriched for proteins predicted to 175 \nthe cell membrane (Fig. 1C). This cell membrane cluster also exhibits an accumulation of 176 \npredicted signal peptides in tandem with glycosylphosphatidylinositol (GPI)-attachment 177 \ndomains (Suppl. Fig. 4), further supporting the validity of this newly defined cluster.  178 \n 179 \nSecreted CAZymes localize to the cell membrane and lysosomes  180 \nCarbohydrate-Active Enzymes (CAZymes) are particularly abundant in P . papillatum, 181 \nsuggesting complex digestive capabilities against plant and algal cell wall carbohydrates 182 \n(Valach, Moreira, et al. 2023). Through our subcellular dataset, we show a notable proportion 183 \nof CAZymes enriched with signal peptides localized with high confidence to the cell 184 \nmembrane and the lysosome (Fig. 3). 
Schematic diagrams of these cell membrane CAZymes 185 \nshow the presence of a C-terminal TMD and/or GPI anchor sites, preceded by the catalytic 186 \ndomains of associated enzymes. This topology indicates that the CAZyme domains are 187 \nexposed to extracellular space and thus expected to digest external carbohydrate substrates 188 \n(Fig. 3A). Enzymatic domains present include pectin esterase, pectin lyase and glycosyl 189 \nhydrolases, from which we construct a digestion pathway on the cell membrane to externally 190 \ndegrade methylated pectin to galacturonic acid monomers (Fig. 3A). Some CAZymes of the 191 \ncell membrane lack the predicted TMDs or GPI anchors, such as glycosyl hydrolase, which 192 \ndegrades hemicellulose to glucose, xylose and galactose. It remains a possibility that such 193 \nCAZymes are released into the extracellular space or simply lack identifiable motifs for cell 194 \nanchorage. 195 \nCandidate sugar transporters, recently identified through genome analysis(Valach, Moreira, et 196 \nal. 2023), were not localized to the cell membrane cluster, rather being assigned to the 197 \nER/Golgi and glycosome compartments (Table S3). Thus, we propose that instead of being 198 \npassaged directly to the cytoplasm across the cell membrane, digested or partially digested 199 \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted July 20, 2025. ; https://doi.org/10.1101/2025.07.16.665091doi: bioRxiv preprint \n\n6 \n \ncarbohydrate substrates are engulfed through the cytopharynx, leading to trafficking through 200 \nthe endocytic vesicles, which have been observed prominently budding off from this 201 \ndistinctive structure in diplonemids (Tashyreva, et al. 2023). 
Within the endocytic membrane 202 \ntrafficking cluster of this dataset, we also identified one secretory CAZyme (Fig. 3B). 203 \nEndocytosed contents are typically passaged to the lysosomal compartments, for which we 204 \nalso define a corresponding cluster of soluble proteins containing numerous signal peptide-205 \nbearing CAZymes, with the ability to digest various forms of pectin and other polysaccharide 206 \nchains, such as sucrose and glycosides (Fig. 3C). We additionally predict one sugar 207 \ntransporter (DIPPA_16016.mRNA.1) to the multivesicular membrane trafficking body, 208 \nenriched for V-type ATPases and other membranous components of the lysosome, suggesting 209 \neventual saccharide transport from these organelles to the cytosol and possibly other 210 \ncompartments.            211 \nGiven that these analyzed cells were grown in the protein-rich ‘Diplo’ medium, we did not 212 \nnecessarily expect an abundance of CAZymes in our extractions. Nonetheless, we detected a 213 \ntotal of 94 different enzymes across our subcellular dataset, 55 of which were not recorded in 214 \nprevious studies (Table S5) (Škodová-Sveráková, et al. 2021; Valach, Moreira, et al. 2023). 215 \nThe proteomic presence of these enzymes in a mostly carbohydrate-depleted medium 216 \nsuggests that most CAZymes are permanently expressed regardless of substrate availability. 217 \nWe further note that in previous cultivation studies, the lysosomal CAZymes identified in this 218 \nstudy showed conditional enrichment, while the newly identified CAZymes of the cell 219 \nmembrane do not change in the face of different conditions or media (Fig. 1D) (Škodová-220 \nSveráková, et al. 2021). Such constitutive presence supports suggestions recently made for 221 \nplants and algae being the primary food source of P . 
papillatum in nature, potentially making use of both carbohydrates on the external cell walls, as well as the internal proteinaceous energy sources (Valach, Moreira, et al. 2023).\nThe soluble lysosome contains a chitinase (Fig. 3C), while in the endocytic trafficking compartment we documented a complementary glucuromannan-digesting GH92, which together suggest a proclivity for fungal cell wall digestion (Fig. 3B). The only observation of P. papillatum in natura comes from its initial isolation from drifting eelgrass (Porter 1973), a plant that is known to harbor various fungal cohabitants on its surface (Newell 1981). The enzymatic sub-localization documented here appears consistent with such varied sources of prey. Interestingly, a single CAZyme member, xylan-α-glucuronidase, is predicted with high confidence to the cytosol, despite the presence of an N-terminal signal sequence. This enzyme is additionally predicted to have been acquired via horizontal gene transfer from a bacterial endosymbiont. Diplonemids have shown a propensity for acquiring such endosymbionts (George, et al. 2022; Tashyreva, Votýpka, et al. 2025), although none is present in extant P. papillatum.\n\nSubcellular distribution of glycolysis/gluconeogenesis enzymes reveals novel glycosomal insights for P. papillatum\nDiplonemids and their sister lineage, the mostly parasitic kinetoplastids, are categorized as glycomonads due to their shared compartmentalization of part of their glycolytic pathways in specialized peroxisomes called glycosomes (Michels, et al. 2006), yet the extent to which these organelles retain the same function in both lineages is unclear. Kinetoplastids localize the first seven steps of glycolysis to glycosomes (Opperdoes and Michels 1993), while in diplonemids peroxisomal targeting signals (PTS) are predicted for enzymes of six steps, suggesting a similar metabolic arrangement, although only five enzymes have been experimentally confirmed to colocalize with known peroxisomal proteins (Makiuchi, et al. 2011; Morales, et al. 2016). Here, we use our subcellular proteomics dataset to partly confirm and expand on previous analyses (Fig. 4). As our glycosome cluster showed fractional similarity with other organelle clusters, we only confirmed two enzymatic steps to this organelle, one of which (step III) represents a newly described designation in P. papillatum (Fig. 4A). However, we confirmed the cytosolic localization of four enzymes, glyceraldehyde 3-phosphate dehydrogenase (GAPDH, step VI), phosphoglycerate mutase (PGAM, step VIII), enolase (step IX) and pyruvate kinase (step Xa) (Morales, et al. 2016).\nCertain enzymes were not detected in previous investigations; thus, it came as a surprise that we detected in the glycosome both phosphofructokinase (PFK) and fructose 1,6-bisphosphatase (FBP), which typically participate in glycolysis (IIIa) and gluconeogenesis (IIIb), respectively (Fig. 4A; Suppl. Fig. 5). Their localization demonstrates a capacity for this organelle to mediate both directions of this pathway (Fig. 4A). The genome of P. 
papillatum encodes two PFKs (Morales, et al. 2016), with PFK1 (DIPPA_21987) being a PPi-dependent variant horizontally acquired from a bacterium (Škodová-Sveráková, et al. 2021), which is typically able to function in an ATP-poor environment. PFK1 also shows the potential to engage in gluconeogenesis (Škodová-Sveráková, et al. 2021), which along with FBP further supports the capacity of P. papillatum ‘glycosomes’ to perform steps of gluconeogenesis. We further note the prediction of a TMD in PFK1, the presence of which represents an unusual feature for enzymes of this pathway (Fig. 4A), though not without precedent (Jirsová, et al. 2025). We propose that the N-terminal TMD allows insertion of the enzyme from within the glycosome, exposing its enzymatic domains to the organellar lumen (Suppl. Fig. 6A). While previous transcriptome analysis recorded an additional PTS1-lacking PFK with presumable cytosolic residence (Škodová-Sveráková, et al. 2021), a survey of the now complete genome confirmed only the presence of PFKs furnished with PTS (Valach, Moreira, et al. 2023).\nOne copy (DIPPA_70192) of fructose-bisphosphate aldolase (FBA, step IV) was previously localized to the glycosomes and indeed shows a corresponding fractional pattern in our dataset (Fig. 4A; Suppl. Fig. 5). A second FBA (DIPPA_30805), also bearing a PTS2 motif, displays a more subdued profile with less similarity to the cytosolic and glycosomal fractional profiles (Suppl. Fig. 6B). We interpret this as co-localization in both compartments, a phenomenon described for several peroxisomal proteins across eukaryotes (Freitag, et al. 2018). Moreover, while in ‘Hemi’ media, the PTS1-bearing copies of G6P and TIM have been localized to the glycosomes (Morales, et al. 
2016), in our dataset they occupy an ambiguous position that similarly implies a dual localization, which contrasts with their confidently placed cytosolic counterparts which lack a PTS (Fig. 4A; Suppl. Fig. 6C,D).\nWe additionally localized multiple copies of PTS-lacking GAPDH distributed to either the cell membrane or the mitochondrion, as described elsewhere (Bártulos, et al. 2018). We demonstrate a convincing cytosolic localization for four additional paralogues of enzymes lacking a PTS, namely glucose 6-phosphate isomerase (G6P, step II), triosephosphate isomerase (TIM, step V), phosphoglycerate kinase (PGK, step VII) and PGAM (Fig. 4A). While such cytosolic localizations may facilitate the glycolytic processing from glyceraldehyde-3-phosphate to pyruvate, they alternatively reveal the potential for partial cytosolic gluconeogenesis initiated by processing oxaloacetate to phosphoenolpyruvate via cytosol-localized PEP carboxykinase (Fig. 4A). Proteomic enrichment of this enzyme, along with enzymes of other reversible steps of glycolysis, was reported in cells grown in the glucose-depleted ‘Hemi’ medium (Škodová-Sveráková, et al. 2021), reflecting an increased use of gluconeogenesis under such conditions (Table S6).\nMetabolically adjacent to glycolysis and gluconeogenesis is the pentose phosphate pathway (PPP), which facilitates the interconversion of simple carbohydrates of different sizes (Fig. 4B). 
In kinetoplastids, several PPP enzymes possess a PTS, producing a glycosomal or dual glycosomal and cytosolic localization (Kovárová and Barrett 2016). However, P. papillatum encodes only a single PTS1-possessing PPP enzyme, phosphogluconolactonase (step II) (Fig. 4B). Despite its targeting signal, our dataset suggests the enzyme localizes to the cytoplasm (Suppl. Fig. 6E), demonstrating that, similar to the localizations of certain proteins in T. brucei (Güther, et al. 2014), the presence of a PTS does not guarantee peroxisomal targeting (Fig. 4B,C). While ribulose-5-phosphate 3-epimerase (step IV of PPP) and transaldolase (step VI of PPP) are classified to the soluble lysosome, considering the fractional similarity of this cluster to that of the cytosol, we regard them as cytosolic (Fig. 4A-C; Suppl. Fig. 5). By contrast, a copy of glucose-6-phosphate dehydrogenase (step I of PPP) shows a fractionation profile consistent with that of the endocytic membrane trafficking cluster, which warrants future investigation (Fig. 4B,C; Suppl. Fig. 6F).\nIn summary, our localization of individual steps for glycolysis/gluconeogenesis supports the hypothesis that diplonemids separated from kinetoplastids prior to the complete transfer of the first seven steps of these pathways into the glycosomes (Morales, et al. 2016), leaving step VI in the cytosol for diplonemids. Moreover, in these flagellates the cell membrane-embedded versions of GAPDH underline a compartmentalization of this enzyme distinct from that of its homologs in kinetoplastids (Moloney, et al. 2023). A further distinction is represented by PEP carboxykinase, which in P. papillatum remains cytosolic and lacks a PTS, with its glycosomal compartmentalization evolved in the kinetoplastid clade only secondarily. 
316 \nThe presence of a PTS1 in just a single PPP enzyme likely signifies a remnant of the ancestral 317 \ntrend in glycomonads towards compartmentalization of this pathway which, unlike in 318 \nkinetoplastids, did not continue to progress in diplonemids. 319 \n 320 \nVersatile amino acid digestive capabilities within the mitochondrion     321 \nGluconeogenesis in P . papillatum is presumably supplied by substrates from the amino acid 322 \n(AA) catabolism (Morales, et al. 2016), which we endeavored to resolve with our subcellular 323 \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted July 20, 2025. ; https://doi.org/10.1101/2025.07.16.665091doi: bioRxiv preprint \n\n9 \n \ndataset. Accordingly, we show this protist’s capacity to digest a broad range of AA’s, 324 \nprimarily within the mitochondrion, reminiscent of similar capabilities demonstrated in the 325 \nmitochondrion of its fellow euglenozoan, Euglena gracilis (Hammond, et al. 2020) (Suppl. 326 \nFig. 7). Ultimately, the mitochondrion of P . papillatum appears capable of metabolizing 327 \narginine, aspartate, histidine, glutamate, glycine, isoleucine, leucine, proline, serine, threonine 328 \nand valine into metabolites that can directly feed into the tricarboxylic acid (TCA) cycle, as 329 \nwell as glutamine and cysteine with initial cytosolic processing (Suppl. Fig. 7). We 330 \nadditionally reveal the ability of diplonemids to process the fatty acid propanoate to 331 \npropanoyl-CoA, which allows incorporation into AA intermediate-processing pathways 332 \nwithin the mitochondrion, representing a functional distinction from kinetoplastids (Suppl. 333 \nFig. 7). 
\n\nConclusions\nPrevious work has demonstrated the global distribution, relative importance, abundance, and diversity of marine diplonemids (Tashyreva, et al. 2022), underscoring the value in clarifying their ecological roles and biology. Only recently has P. papillatum emerged as a genetically tractable species (Faktorová, Kaur, et al. 2020), opening the entire clade to inquiry via cellular and molecular methods. Our subcellular proteomics dataset is complementary to these efforts and provides a pathway towards hypothesis-driven research, thereby accelerating our understanding of these ecologically and evolutionarily important protists (Valach, Benz, et al. 2023; Benz, et al. 2024; Akiyoshi, et al. 2025; Záhonová, et al. 2025). In total, our data enabled us to localize thousands of proteins to 22 distinct subcellular compartments in P. papillatum. The confidence of our data is strengthened by the endogenous tagging of selected proteins.\nFrom this wealth of data, we focused specifically on the confidently predicted cluster of cell membrane proteins. In this cluster, we identified an expanded family of CAZymes, supporting recent predictions that P. papillatum primarily preys on plants and algae by degrading their cell walls. CAZymes were also localized to the lysosome, further suggesting active ingestion of complex carbohydrates. The detection of CAZymes in cells supplied with protein-rich, carbohydrate-limited media raises an intriguing question for future analysis: why are CAZymes expressed in the absence of carbohydrates? We speculate that they are produced in anticipation of interacting with these substrates, and hope that in natura studies may now be used to definitively clarify the ecological role for this and other diplonemids. 
\nIn conclusion, we have sub-localized thousands of proteins in a model species representing a major protist group. Given the scarcity of marine protists that are genetically tractable and can be investigated with relative ease (Faktorová, Nisbet, et al. 2020), our data provide a novel and rich resource to explore diplonemids’ unique cell biology and to map ancestral traits in this free-living heterotrophic flagellate.\n\nMaterials and Methods\nKey Resource Table\nREAGENT or RESOURCE | SOURCE | IDENTIFIER\nAntibodies\nRabbit anti-ATP Synthase-β | Zíková et al. (Šubrtová, et al. 2015) |\nMouse anti-GRP 75 | Enzo | Cat# SPS-826D, RRID:AB_2120451 (https://www.antibodyregistry.org/AB_2120451)\nGRP 78 | Novus | Cat# NB100-56413, RRID:AB_838320 (https://www.antibodyregistry.org/AB_838320)\nGoat anti-Rabbit IgG (H+L) Secondary Antibody, HRP | Invitrogen | Catalog# 31460 (https://www.thermofisher.com/antibody/product/Goat-anti-Rabbit-IgG-H-L-Secondary-Antibody-Polyclonal/31460)\nGoat anti-Mouse IgG (H+L) Secondary Antibody, HRP | Invitrogen | Catalog# 31430 (https://www.thermofisher.com/antibody/product/Goat-anti-Mouse-IgG-H-L-Secondary-Antibody-Polyclonal/31430)\nMouse anti-V5 Monoclonal Antibody (2F11F7) | Invitrogen | Catalog# 37-7500-A555, RRID:AB_2610631 (https://www.antibodyregistry.org/AB_2610631)\nRabbit anti-V5 Polyclonal Antibody | Sigma-Aldrich | Catalog# V8137, RRID:AB_261889 (https://www.antibodyregistry.org/AB_261889)\nGoat anti-Mouse IgG (H+L) Cross-Adsorbed Secondary Antibody, Alexa Fluor™ 555 | Invitrogen | Catalog# A-21422 (https://www.thermofisher.com/antibody/product/Goat-anti-Mouse-IgG-H-L-Cross-Adsorbed-Secondary-Antibody-Polyclonal/A-21422)\nGoat anti-Rabbit IgG (H+L) Cross-Adsorbed Secondary Antibody, Alexa Fluor™ 488 | Invitrogen | Catalog# A-11008 (https://www.thermofisher.com/antibody/product/Goat-anti-Rabbit-IgG-H-L-Cross-Adsorbed-Secondary-Antibody-Polyclonal/A-11008)\nChemicals, peptides, and recombinant proteins\nCritical commercial assays\nPierce™ Dilution-Free™ Rapid Gold BCA Protein Assay | Thermo Scientific | Catalog# A55860\nPierce™ Quantitative Peptide Assays & Standards | Thermo Scientific | Catalog# 23290\nDeposited data\nRaw peptide data | PRIDE | XXXX\nFor protein predictions and annotations see Table S3\nExperimental models: Cell lines\nParadiplonema papillatum | Porter (Porter 1973) | ATCC50162\nExperimental models: Organisms/strains\nFor cell lines generated for this study see Table S4\nOligonucleotides\nFor primers used in this study see Suppl. File 2.\nRecombinant DNA\npBA3294 vector | Akiyoshi et al. (Akiyoshi, et al. 2025) |\npDP011 vector | GenBank | OQ547858\nSoftware and algorithms\nMicrosoft Excel | Microsoft | https://www.microsoft.com\nR and RStudio | RStudio | https://posit.co/download/rstudio-desktop/\npRoloc | Crook et al. (Crook, et al. 2019) |\nFiji (ImageJ) | Fiji | https://fiji.sc/\nSignalP 6.0 | Teufel et al. (Teufel, et al. 2022) | https://services.healthtech.dtu.dk/services/SignalP-6.0/\nTargetP 2.0 | Armenteros et al. (Armenteros, et al. 2019) | https://services.healthtech.dtu.dk/services/TargetP-2.0/\nDeepTMHMM | Hallgren et al. (Hallgren, et al. 2022) | https://services.healthtech.dtu.dk/services/DeepTMHMM-1.0/\nDeepLoc 2.1 | Odum et al. (Odum, et al. 2024) | https://services.healthtech.dtu.dk/services/DeepLoc-2.1/\nNetGPI 1.1 | Gíslason et al. (Gíslason, et al. 2021) | https://services.healthtech.dtu.dk/services/NetGPI-1.1/\nGhostKOALA | Kanehisa et al. (Kanehisa, et al. 2016) | https://www.kegg.jp/ghostkoala/\n\nResource availability\nFurther information and requests for reagents should be directed to and will be fulfilled by the lead contact, Michael Hammond (michael.hammond@paru.cas.cz).\nMaterials availability\nVectors and novel cell lines generated for this study are available from the lead contact upon request.\nExperimental model and study participant details\nP. papillatum (ATCC50162) served as the cell line for both proteomic analysis and cell line generation. 
\nStrain, culture conditions and preparation for lysis\nParadiplonema papillatum (ATCC50162) cells were cultivated axenically in Diplo media (36 g/L sea salts [Sigma], with 1 g tryptone and 10 ml fetal bovine serum [Sigma], 0.22 µm filter sterilized) at 22°C. Cultures were harvested at ~2.5 × 10^6 cells/ml in a combined volume of 750 ml per sample and processed in triplicate. Cells were concentrated by centrifugation (900 x g for 10 min) and pellets were resuspended in 6 ml detergent-free lysis buffer (0.25 M sucrose, 10 mM HEPES, pH 7.4, 2 mM EDTA, 2 mM Mg(OAc)2) with Halt™ Protease and Phosphatase Inhibitor Cocktail, pre-chilled to 4°C.\nCell lysis and fractionation\nThe cell suspension was lysed by nitrogen cavitation at 250 psi for 10 min (Parr 4639, Parr Instrument Co.). The lysate was gently released from the chamber to minimize foaming, and the collected sample underwent differential centrifugation following a previously established protocol (Geladaki, et al. 2019). Briefly, the lysate was centrifuged at successively higher speeds (Table 1); after each spin the supernatant was transferred to fresh 2 ml centrifuge tubes and subjected to the next centrifugation step, while the pellet was stored at -80°C, as was the supernatant fraction from the final spin. Pelleted cell lysate was additionally collected and stored for proteomic analysis.\nTable 1: Fractionation protocol for centrifugation as used in this study, adapted from the LOPIT-DC protocol (Geladaki, et al. 2019). 
\nFraction | Centrifuge speed (x g) | Spin time (min)\nCell Lysate | 200 | 5\n1 | 1,000 | 10\n2 | 3,000 | 10\n3 | 5,000 | 10\n4 | 9,000 | 15\n5 | 12,000 | 15\n6 | 15,000 | 15\n7 | 30,000 | 20\n8 | 79,000 | 43\n9 | 120,000 | 45\nSupernatant | NA | NA\n\nFractional assessment\nTo assess the distribution and enrichment of proteins across P. papillatum fractions, immunoblotting was performed using antibodies against ATP synthase subunit β (kindly provided by A. Zíková) (Šubrtová, et al. 2015), Grp75 (Enzo) (Joseph, et al. 2013) and Grp78 (Novus) (Chou, et al. 2020). Pellets were resuspended in 2x Laemmli buffer (0.125 M Tris-HCl, pH 6.8, 4% SDS, 20% glycerol, 0.004% bromophenol blue) without DTT. Fractional samples were quantified using the Pierce™ BCA Protein Assay (Thermo Fisher Sci.). 10 µg of protein was loaded onto an SDS-PAGE gel (Invitrogen Bolt Bis-Tris Plus Mini Protein Gels, 4-12%, 1.0 mm, WedgeWell™ format) along with a protein marker (Amersham™ ECL™ Rainbow™ Marker - Full Range). The gel was run for 1 hour at 130 V, briefly washed in 1x PBS buffer, and transferred onto a methanol-activated PVDF membrane (iBlot™ 2 Transfer Stacks, PVDF, Invitrogen) using the iBlot 2 Western Blot Transfer Device (Invitrogen). The membrane was blocked in 5% non-fat dry milk in 1x PBS buffer for 1 hour at room temperature, followed by incubation with the relevant antibodies diluted (1:10000 for ATP-β, and 1:1000 for Grp75 and Grp78) in blocking solution (5% non-fat dry milk in 1x PBS buffer). Blots were incubated at room temperature for 1 hour, then overnight at 4°C. 
The following day, blots were washed three times for 10 min each in 1x PBS, probed with HRP-linked secondary antibodies (31460/31430, Invitrogen) diluted 1:1000 in blocking solution for 1 hour at room temperature, and rinsed again three times for 10 min each in 1x PBS-T. Detection was performed using the Pierce ECL Western Blotting Substrate (Thermo Fisher Sci.), and imaging was conducted with the Azure 600 (Azure Biosystems).\nSample preparation and LC-MS Analysis\nNative protein pellets obtained from differential centrifugation were digested and desalted following the protocol for the S-Trap Micro Column (ProtiFi, USA). Protein concentration was quantified using the BCA assay (Thermo Fisher Sci.), while peptide concentration was measured using a fluorometric kit (Thermo Fisher Sci.).\nLiquid-chromatography tandem mass spectrometry\nLC-MS/MS analyses were performed at the Biosciences Mass Spectrometry Core Facility, Arizona State University. Data-dependent mass spectra were collected in positive mode using an Orbitrap Fusion Lumos mass spectrometer coupled with an UltiMate 3000 UHPLC (Thermo Fisher Sci.). Peptides were fractionated on an Easy-Spray LC column (50 cm × 75 μm ID, PepMap C18, 2 μm, 100 Å) with an upstream trap column. Each sample was analyzed in technical triplicate. LC-MS settings were: electrospray potential 1.6 kV, ion transfer tube temperature 300°C, and the “Universal” peptide analysis method. Full MS scans (375–1500 m/z) were acquired at a resolution of 120,000 with 3 s cycles. The RF lens was set to 30%, AGC to “Standard,” and monoisotopic peak determination included charge states 2–7. Dynamic exclusion was 60 s with a 10 ppm mass tolerance. 
MS/MS spectra were acquired in centroid mode with a quadrupole isolation window of 1.6 m/z and CID energy of 35%. Peptides were eluted over a 240-min gradient at 0.25 µL/min using 2–80% acetonitrile/water: 0–3 min (2%), 3–75 min (2–15%), 75–180 min (15–30%), 180–220 min (30–35%), 220–225 min (35–80%), 225–240 min (80–85%).\nLC-MS/MS analysis of the digested peptides was performed on an EASY-nLC 1200 (Thermo Fisher Sci.) coupled to an Orbitrap Eclipse Tribrid mass spectrometer (Thermo Fisher Sci.). Peptides were separated on an Aurora UHPLC column (25 cm × 75 µm, 1.6 µm C18, AUR2-25075C18A, Ion Opticks) with a flow rate of 0.35 µL/min for a total duration of 135 min and ionized at 1.6 kV in the positive ion mode. The gradient was composed of 2% solvent B (5 min), 2–6% B (7.5 min), 6–25% B (82.5 min), 25–40% B (30 min), 40–98% B (1 min) and 98% B (15 min); solvent A: 2% ACN and 0.2% FA in water; solvent B: 80% ACN and 0.2% FA. MS1 scans were acquired at a resolution of 120,000 from 350 to 1,600 m/z, with an AGC target of 1e6 and a maximum injection time of 50 ms. MS2 scans were acquired in the ion trap using the fast scan rate on precursors with 2–7 charge states and quadrupole isolation mode (isolation window: 0.7 m/z) with higher-energy collisional dissociation (HCD, 30%) as the activation type. Dynamic exclusion was set to 30 s. The temperature of the ion transfer tube was 300°C and the S-lens RF level was set to 30.\nRaw data processing and quantification\nThe LFQ analysis was performed using Proteome Discoverer 2.4 (Thermo Fisher Sci.) 
based on a composite database comprising the predicted P. papillatum proteome and mitochondrial ORFs. Raw files were searched with SequestHT using trypsin as the enzyme, allowing up to three missed cleavages. Peptide length was set to 6–144 amino acids, with precursor ion mass tolerance at 20 ppm, fragment mass tolerance at 0.5 Da, and a minimum of one peptide identified. Carbamidomethyl (C) was a fixed modification, while Acetyl (N-terminus), Met-loss (N-terminus), and oxidation of Met were dynamic modifications. A target/decoy strategy was used, with a 1.0% FDR calculated using Percolator. Data were imported into Proteome Discoverer 2.4, and features were detected using the Minora Feature Detector algorithm. The area-under-the-curve for aligned ion chromatograms was calculated to determine relative abundances. The RAW data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier XXXXXX.\nProteins and their corresponding LFQ abundance values were imported into the R programming language and converted into an MSnSet object using the Bioconductor packages MSnbase (v 2.24.2) and pRoloc (v 1.38.2) (Crook, et al. 2019). The data were examined and proteins with low confidence (PSM < 3 and without unique peptides) were filtered out. Technical triplicates were averaged to generate a 33-dimensional dataset of relative protein abundance. The dataset was split into its respective experiments (i.e., columns 1-11, 12-22, 23-33) to perform hybrid imputation and sum-normalization across rows.\nMissing data were imputed first by nearest-neighbor averaging, after which zeros were imputed for all remaining empty cells. Principal component analysis and t-distributed Stochastic Neighbor Embedding (t-SNE) were applied for dimensional reduction and data visualization. 
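The per-experiment processing described above (nearest-neighbor imputation, zero-filling of what remains, then row-wise sum-normalization) can be illustrated with a minimal Python sketch. This is not the R/pRoloc workflow used in the study: the function names, the choice of k, and the distance measure (root-mean-square difference over shared observed fractions) are assumptions made purely for illustration.

```python
import numpy as np

def knn_impute(block, k=3):
    # Fill NaNs in each row with the mean of that column over the k nearest
    # rows; distance is computed only over columns observed in both rows.
    filled = block.copy()
    for i in range(block.shape[0]):
        missing = np.isnan(block[i])
        if not missing.any() or missing.all():
            continue  # nothing to fill, or no observed values to compare on
        dists = []
        for j in range(block.shape[0]):
            if j == i:
                continue
            shared = ~missing & ~np.isnan(block[j])
            if shared.sum() == 0:
                continue
            d = np.sqrt(np.mean((block[i, shared] - block[j, shared]) ** 2))
            dists.append((d, j))
        dists.sort()
        neighbours = [j for _, j in dists[:k]]
        for col in np.where(missing)[0]:
            vals = [block[j, col] for j in neighbours
                    if not np.isnan(block[j, col])]
            if vals:
                filled[i, col] = np.mean(vals)
    return filled

def hybrid_impute_and_normalize(block, k=3):
    # Step 1: nearest-neighbor averaging; step 2: zeros for remaining gaps.
    filled = np.nan_to_num(knn_impute(block, k), nan=0.0)
    # Step 3: sum-normalize each row so a protein's profile sums to 1.
    row_sums = filled.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # leave all-zero rows as zeros
    return filled / row_sums

def process_dataset(data, n_fractions=11, k=3):
    # Split the columns into experiments (e.g. 1-11, 12-22, 23-33), process
    # each experiment independently, and join the results back together.
    blocks = [data[:, s:s + n_fractions]
              for s in range(0, data.shape[1], n_fractions)]
    return np.hstack([hybrid_impute_and_normalize(b, k) for b in blocks])
```

Processing each experiment separately keeps the normalization from mixing abundance scales between replicate fractionations, so each protein contributes one profile summing to 1 per experiment.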
\nSupervised and unsupervised classification\nA total of 268 manually curated marker proteins (Table S1) were used as the training set for a support vector machine (SVM) model with the ‘svmOptimization’ and ‘svmClassification’ functions in the pRoloc package. Initially, 100 rounds of five-fold cross-validation were performed to optimize the SVM parameters based on the marker protein abundance profiles. The optimal parameters for the SVM classifier were then applied to all proteins in the dataset, yielding for each protein an SVM score ranging from 0 to 1, with 1 being the score of marker proteins. The SVM classifier was then applied to unlabeled data (i.e., non-marker proteins) with corresponding weights applied to each marker class. Each protein was thus classified to one compartment, and any protein whose classification fell below the global median SVM score was reset to ‘unknown’, while the other half of the dataset was considered “predicted” to its corresponding compartment owing to its higher SVM scores (Table S3).\nUnsupervised clustering was performed using the K-means (KM) algorithm implemented in the MLearn function from the MLInterfaces package (version 1.78.0) in RStudio. KM generates k random centroids and iteratively reassigns data points such that every data point belongs to one of the k clusters and the within-cluster distance to each centroid is minimized. K-means clusters were generated with 22 clusters, corresponding to the number of marker groups (Table S3). 
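The median cut-off applied to the SVM classifications can be sketched in a few lines of Python. This is an illustration only, not the pRoloc code used in the study; the function name is hypothetical, and it assumes that the global median is taken over the scores of non-marker proteins and that marker proteins always keep their curated label.

```python
import numpy as np

def threshold_by_global_median(labels, scores, is_marker):
    # labels: predicted compartment per protein; scores: SVM score (0-1);
    # is_marker: True for curated markers, which keep their label.
    labels = np.asarray(labels, dtype=object)
    scores = np.asarray(scores, dtype=float)
    is_marker = np.asarray(is_marker, dtype=bool)
    # Assumption: the cut-off is the median score of non-marker proteins.
    cutoff = np.median(scores[~is_marker])
    final = labels.copy()
    # Classifications scoring below the cut-off are reset to 'unknown'.
    final[(~is_marker) & (scores < cutoff)] = 'unknown'
    return final, cutoff
```

With this rule roughly half of the non-marker proteins retain a predicted compartment by construction, so the cut-off reflects the relative ranking of scores within the dataset and not an absolute confidence level.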
\nTargeting signal prediction, annotation, and conditional enrichment analysis\nThe P. papillatum protein database was annotated via BLAST search against the CDS of the parasitic kinetoplastid Trypanosoma brucei 927 (v66) and the free-living Bodo saltans (v66) (https://tritrypdb.org/tritrypdb/app), as well as baker’s yeast Saccharomyces cerevisiae (559292) (https://blast.ncbi.nlm.nih.gov/Blast.cgi), with an E-value threshold of 1e-5. Metabolic pathway analysis was also performed via GhostKOALA (Kanehisa, et al. 2016).\nSignalP version 6.0 was used for the prediction of signal peptides, using a confidence threshold of >0.9 (Fig. 1C) (Teufel, et al. 2022), with NetGPI 1.1 additionally used on this subset to determine proteins that also possessed predicted C-terminal GPI anchors (Gíslason, et al. 2021) (Table S3). TargetP 2.0 was used for prediction of mitochondrial target peptides (Armenteros, et al. 2019), with DeepTMHMM (Hallgren, et al. 2022) used for predictions of TMDs (Fig. 1C) (Table S3). Peroxisomal target signal prediction was conducted using a custom regex script designed by Prof. Fred Opperdoes against a broad range of AA combinations, with PTS1 determined by the pattern [SAGCNP][RHKSNQ][LIVFAMY]$ and PTS2 by ^M.{1,10}[RK][LVI].....[HQ][ILA] (Table S3); hits were then manually inspected for specific enzymes of relevance (Table S5-7). DeepLoc 2.1 was additionally used to assess protein localization predictions and membranous status (Odum, et al. 2024) (Table S3).\nProtein enrichment data for media and conditional cultivation (Škodová-Sveráková, et al. 2021) were displayed across the dataset, including proteins that displayed enrichment status of any capacity (Fig. 1D) (Table S3).\n\nEndogenous tagging and P. 
papillatum microscopy\nCell lines carrying endogenous C-terminal tags on 12 proteins from the supervised protein clusters were generated to verify predictions (Table S4).\nProteins DIPPA_11651.mRNA.1, DIPPA_15120.mRNA.1, DIPPA_04811.mRNA.1, DIPPA_32825.mRNA.1 and DIPPA_00315.mRNA.1 were tagged with yellow fluorescent protein, using vector pBA3294 (Akiyoshi, et al. 2025). The PacI and AscI restriction sites of pBA3294 were used to insert two ~2 kb homology arms that were amplified from genomic DNA by PCR using KOD One polymerase (Merck). Primer sequences are provided in Suppl. File 2. The first fragment corresponds to the region downstream of the gene ORF (starting just after its stop codon), flanked by PacI and NotI restriction sites, while the second fragment corresponds to the 2 kb region starting 2 kb upstream of the stop codon and ending just before it, flanked by NotI and AscI sites. After cutting the fragments with the respective restriction enzymes, the two DNA fragments were ligated into pBA3294 that had been cut with PacI and AscI. Plasmids were validated by nanopore whole plasmid sequencing (Plasmidsaurus). Tagging constructs were linearized by NotI, transfected into P. papillatum cells by electroporation, and selected by the addition of 75 µg/mL G418.\nCells were pelleted by centrifugation at 1300 x g for 5 min and fixed with 4% formaldehyde solution diluted in PBS for 5 min. 
Cells were washed twice with 1 mL PBS, resuspended in a small volume of DABCO mounting media (1% w/v 1,4-diazabicyclo[2.2.2]octane, 90% glycerol, 50 mM sodium phosphate pH 8.0) with 100 ng/mL DAPI, and mounted onto glass slides. Images were captured on an Axioimager.Z2 microscope (Zeiss) equipped with ZEN software, using a Hamamatsu ORCA-Flash4.0 camera with a 63x objective lens (1.40 NA). Typically, 25 z sections spaced 0.24 μm apart were collected.\nProteins DIPPA_07493.mRNA.1, DIPPA_20982.mRNA.1, DIPPA_24150.mRNA.1, DIPPA_16310.mRNA.1, DIPPA_24837.mRNA.1, DIPPA_21158.mRNA.1 and DIPPA_16504.mRNA.1 were tagged with a 3xV5 epitope, using vector pDP011 (GenBank OQ547858) (Faktorová, et al. 2023) (Table S4). A fusion PCR strategy using Q5 High-Fidelity DNA Polymerase (New England Biolabs, M0491S) was used to design and obtain the above DNA constructs, as described previously (Kaur, et al. 2018). Primers used and product sizes are listed in Suppl. File 2. 1-5 µg of gel-purified and ethanol-precipitated DNA constructs were electroporated into P. papillatum cells at 5 × 10^7 cells/ml as described elsewhere (Kaur, et al. 2018; Faktorová, Kaur, et al. 2020). 24 h after electroporation, transfected cells underwent selection in a 24-well plate at 27°C, under increasing concentrations of hygromycin (100-225 µg/mL). After 3 weeks, transfectants were selected and expanded into a volume of 10 ml before downstream analyses.\nTo address the subcellular localization of the tagged proteins, an immunofluorescence assay was performed as described previously (Faktorová, et al. 2023). Briefly, 20 to 30 ml of a log phase culture was harvested by centrifugation at 1,700 x g for 10 min, resuspended in 500 μl of 4% paraformaldehyde (dissolved in sea water), and fixed for 15 min on Superfrost plus slides (Thermo Fisher Sci.) at room temperature. 
After removing the fixative with 1x PBS, cells were permeabilized in ice-cold methanol for 10 min and rinsed with 1x PBS. From this point on, the slides were kept in a humid chamber. Next, the slides were blocked in 5.5% (w/v) fetal bovine serum in PBS-T for 45 min at room temperature, and the blocking solution was removed by washing the cells two times with 1x PBS. The slides were incubated with either mouse anti-V5 or rabbit anti-V5 primary antibody diluted (1:500; Thermo Fisher Sci.) in 3% (w/v) bovine serum albumin (Sigma), at 4°C overnight, covered with parafilm. Afterwards, the primary antibody was removed by washing the slides three times with PBS-T and twice with 1x PBS. AlexaFluor555-labelled goat anti-mouse (1:1000; Invitrogen) or AlexaFluor488-labelled goat anti-rabbit (1:1000; Invitrogen) secondary antibody was added and incubated at room temperature for 1 hour in the dark, covered with parafilm. After that, the slides were rinsed three times with PBS-T and twice with 1x PBS. All slides were coated with ProLong Gold Antifade Mountant with DNA Stain DAPI (Life Technol.) and mounted. Samples were imaged with an Olympus BX63 automated fluorescence microscope equipped with an Olympus DP74 digital camera. Pictures were acquired with the cellSens Dimension software (Olympus) and processed through the ImageJ software.\n\nAcknowledgements\nWe thank A. Zíková (Biology Centre) for the anti-ATP synthase subunit β antibodies. 
This work is supported by the National Science Foundation BII: Mechanisms of Cellular Evolution DBI-2119963 (to J.W.), the Czech Grant Agency grants 23-06479X and 25-15298S (to J.L.) and a Wellcome Discovery Award 227243/Z/23/Z (to B.A.).\n\nAuthor contributions\nConceptualization, M.H., J.L. and J.G.W.; Methodology, M.H., D.F., B.A. and J.G.W.; Software, M.H., Y.P. and T.L.; Validation, M.H.; Formal Analysis, M.H.; Investigation, M.H., O.I., D.F., M.S. and B.A.; Data Curation, M.H. and Y.P.; Writing – Original Draft, M.H., O.I., D.F., M.S., B.A., J.L. and J.G.W.; Writing – Review & Editing, M.H., J.L. and J.G.W.; Visualization, M.H.; Supervision, M.H., D.F., J.L. and J.G.W.; Project Administration, M.H. and J.G.W.; Funding Acquisition, M.H., B.A., J.L. and J.G.W.\n\nDeclaration of interests\nThe authors declare no competing interests.\n\nReference list\nAkiyoshi B, Faktorová D, Lukeš J. 2025. Discovery of unique mitotic mechanisms in Paradiplonema papillatum. bioRxiv:2025.2003.2021.644664.\nArmenteros J, Salvatore M, Emanuelsson O, Winther O, von Heijne G, Elofsson A, Nielsen H. 2019. Detecting sequence signals in targeting peptides using deep learning. Life Sci Alliance 2.\nBenz C, Raas MWD, Tripathi P, Faktorová D, Tromer EC, Akiyoshi B, Lukeš J. 2024. On the possibility of yet a third kinetochore system in the protist phylum Euglenozoa. mBio 15:e02936-02924.\nBillington K, Halliday C, Madden R, Dyer P, Barker A, Moreira-Leite F, Carrington M, Vaughan S, Hertz-Fowler C, Dean S, et al. 2023. 
Genome-wide subcellular protein map for the flagellate parasite Trypanosoma brucei. Nature Microbiology 8:533-547.\nBreckels L, Holden S, Wojnar D, Mulvey C, Christoforou A, Groen A, Trotter M, Kohlbacher O, Lilley K, Gatto L. 2016. Learning from heterogeneous data sources: An application in spatial proteomics. PLoS Comput Biol 12.\nBártulos C, Rogers M, Williams T, Gentekaki E, Brinkmann H, Cerff R, Liaud M, Hehl A, Yarlett N, Gruber A, et al. 2018. Mitochondrial glycolysis in a major lineage of Eukaryotes. Genome Biol and Evol 10:2310-2325.\nChou C, Yang R, Chan L, Li C, Sun L, Lee H, Lee P, Sher Y, Ying H, Hung M. 2020. The stabilization of PD-L1 by the endoplasmic reticulum stress protein GRP78 in triple-negative breast cancer. Am J Cancer Res 10:2621-2634.\nCrook OM, Breckels LM, Lilley KS, Kirk PDW, Gatto L. 2019. A Bioconductor workflow for the Bayesian analysis of spatial proteomics. F1000Res 8:446.\nFaktorová D, Kaur B, Valach M, Graf L, Benz C, Burger G, Lukeš J. 2020. Targeted integration by homologous recombination enables in situ tagging and replacement of genes in the marine microeukaryote Diplonema papillatum. Environ Microbiol 22:3660-3670.\nFaktorová D, Nisbet R, Robledo J, Casacuberta E, Sudek L, Allen A, Ares M, Aresté C, Balestreri C, Barbrook A, et al. 2020. Genetic tool development in marine protists: emerging model organisms for experimental cell biology. Nature Methods 17:481-494.\nFaktorová D, Záhonová K, Benz C, Dacks J, Field M, Lukeš J. 2023. Functional differentiation of Sec13 paralogues in the euglenozoan protists. Open Biol 13:220364.\nFlegontova O, Flegontov P, Malviya S, Audic S, Wincker P, de Vargas C, Bowler C, Lukeš J, Horák A. 2016. Extreme diversity of diplonemid eukaryotes in the ocean. Curr Biol 26:3060-3065.\nFreitag J, Stehlik T, Stiebler AC, Bölker M. 2018. 
The obvious and the hidden: Prediction and function of fungal peroxisomal matrix proteins. Subcell Biochem 89:139-155.\nGawryluk RMR, Del Campo J, Okamoto N, Strassert JFH, Lukeš J, Richards TA, Worden AZ, Santoro AE, Keeling PJ. 2016. Morphological identification and single-cell genomics of marine diplonemids. Curr Biol 26:3053-3059.\nGeladaki A, Britovšek N, Breckels L, Smith T, Vennard O, Mulvey C, Crook O, Gatto L, Lilley K. 2019. Combining LOPIT with differential ultracentrifugation for high-resolution spatial proteomics. Nat Commun 10.\nGeorge EE, Tashyreva D, Kwong WK, Okamoto N, Horák A, Husnik F, Lukeš J, Keeling PJ. 2022. Gene transfer agents in bacterial endosymbionts of microbial eukaryotes. Genome Biol Evol 14,7.\nGíslason M, Nielsen H, Armenteros J, Johansen A. 2021. Prediction of GPI-anchored proteins with pointer neural networks. Curr Res in Biotech 3:6-13.\nGüther M, Urbaniak M, Tavendale A, Prescott A, Ferguson M. 2014. High-confidence glycosome proteome for procyclic form Trypanosoma brucei by epitope-tag organelle enrichment and SILAC proteomics. Journal of Proteome Res 13:2796-2806.\nHallgren J, Tsirigos K, Pederson M, Armenteros J, Marcatili P, Nielsen H, Krogh A, Winther O. 2022. DeepTMHMM predicts alpha and beta transmembrane proteins using deep neural networks. In.\nHammond M, Nenarokova A, Butenko A, Zoltner M, Dobáková E, Field M, Lukeš J. 2020. A uniquely complex mitochondrial proteome from Euglena gracilis. Mol Biol and Evol 37:2173-2191.\nJirsová D, Licknack TJ, Poh Y-P, Qiu Y, Quan N, Chou T-F, Karr T, Lynch M, Wideman JG. 2025. Subcellular proteomics of Paramecium tetraurelia reveals mosaic localization of glycolysis and gluconeogenesis. bioRxiv:2025.2004.2024.650466. 
\nJoseph A, Adhihetty P, Wawrzyniak N, Wohlgemuth S, Picca A, Kujoth G, Prolla T, Leeuwenburgh C. 2013. Dysregulation of mitochondrial quality control processes contribute to sarcopenia in a mouse model of premature aging. PLoS One 8.\nKanehisa M, Sato Y, Morishima K. 2016. BlastKOALA and GhostKOALA: KEGG Tools for functional characterization of genome and metagenome sequences. J Mol Biol 428:726-731.\nKaur B, Valach M, Peña-Diaz P, Moreira S, Keeling P, Burger G, Lukeš J, Faktorová D. 2018. Transformation of Diplonema papillatum, the type species of the highly diverse and abundant marine microeukaryotes Diplonemida (Euglenozoa). Environ Microbiol 20:1030-1040.\nKovárová J, Barrett M. 2016. The pentose phosphate pathway in parasitic trypanosomatids. Trends Parasitol 32:622-634.\nLax G, Okamoto N, Keeling PJ. 2024. Phylogenomic position of eupelagonemids, abundant, and diverse deep-ocean heterotrophs. ISME J 18.\nMakiuchi T, Annoura T, Hashimoto M, Hashimoto T, Aoki T, Nara T. 2011. Compartmentalization of a glycolytic enzyme in Diplonema, a non-kinetoplastid Euglenozoan. Protist 162:482-489.\nMichels P, Bringaud F, Herman M, Hannaert V. 2006. Metabolic functions of glycosomes in trypanosomatids. Biochim Biophys Acta-Mol Cell Res 1763:1463-1477.\nMoloney N, Barylyuk K, Tromer E, Crook O, Breckels L, Lilley K, Waller R, MacGregor P. 2023. Mapping diversity in African trypanosomes using high resolution spatial proteomics. Nat Commun 14. 
\nMorales J, Hashimoto M, Williams T, Hirawake-Mogi H, Makiuchi T, Tsubouchi A, Kaga N, Taka H, Fujimura T, Koike M, et al. 2016. Differential remodelling of peroxisome function underpins the environmental and metabolic adaptability of diplonemids and kinetoplastids. Proc R Soc B-Biolog Sci 283.\nMukherjee I, Salcher MM, Andrei A, Kavagutti VS, Shabarova T, Grujčić V, Haber M, Layoun P, Hodoki Y, Nakano SI, et al. 2020. A freshwater radiation of diplonemids. Environ Microbiol 22:4658-4668.\nNewell S. 1981. Fungi and bacteria in or on leaves of Eelgrass (Zostera marina L.) from Chesapeake Bay. Appl Environ Microbiol 41:1219-1224.\nObiol A, Giner CR, Sánchez P, Duarte CM, Acinas SG, Massana R. 2020. A metagenomic assessment of microbial eukaryotic diversity in the global ocean. Mol Ecol Resour 20.\nOdum M, Teufel F, Thumuluri V, Armenteros J, Johansen A, Winther O, Nielsen H. 2024. DeepLoc 2.1: multi-label membrane protein type prediction using protein language models. Nucleic Acids Res 52:W215-W220.\nOpperdoes FR, Michels PA. 1993. The glycosomes of the Kinetoplastida. Biochimie 75:231-234.\nOrsburn B. 2021. Proteome discoverer-A community enhanced data processing suite for protein informatics. Proteomes 9.\nPorter D. 1973. Isonema papillatum sp. n., a new colorless marine flagellate: A light- and electronmicroscopic study. J Protozool 20:351-356.\nProkopchuk G, Korytár T, Juricová V, Majstorovic J, Horák A, Šimek K, Lukeš J. 2022. Trophic flexibility of marine diplonemids-switching from osmotrophy to bacterivory. ISME J 16:1409-1419.\nRichards T, Eme L, Archibald J, Leonard G, Coelho S, de Mendoza A, Dessimoz C, Dolezal P, Fritz-Laylin L, Gabaldon T, et al. 2024. Reconstructing the last common ancestor of all eukaryotes. PLoS Biol 22.
\nTashyreva D, Faktorová D, Horák A, Lukeš J, Archibald J, Oatley G, Sinclair E, Santos C, Paulini M, Aunin E, et al. 2025. The genome sequences of the diplonemid protist Rhynchopus euleeides YPF1915 and its bacterial endosymbiont Candidatus Syngnamydia salmonis (Chlamydiota). Wellcome Open Res 10.\nTashyreva D, Faktorová D, Stříbrná E, Horák A, Lukeš J, Archibald JM, Oatley G, Sinclair E, Aunin E, Gettle N, et al. 2025. The genome sequences of the diplonemid protist Diplonema japonicum YFP1604 and its bacterial endosymbiont Ca. Cytomitobacter primus and Ca. Nesciobacter abundans. 10.\nTashyreva D, Simpson A, Prokopchuk G, Škodová-Sveráková I, Butenko A, Hammond M, George E, Flegontova O, Záhonová K, Faktorová D, et al. 2022. Diplonemids-a review on \"new\" flagellates on the oceanic block. Protist 173:125868.\nTashyreva D, Týč J, Horák A, Lukeš J. 2023. Ultrastructure and 3D reconstruction of a diplonemid protist (Diplonemea) and its novel membranous organelle. mBio 14:e01921-01923.\nTashyreva D, Votýpka J, Yabuki A, Horák A, Lukeš J. 2025. Description of new diplonemids (Diplonemea, Euglenozoa) and their endosymbionts: Charting the morphological diversity of these poorly known heterotrophic flagellates. Protist 177.\nTeufel F, Armenteros J, Johansen A, Gíslason M, Pihl S, Tsirigos K, Winther O, Brunak S, von Heijne G, Nielsen H. 2022. SignalP 6.0 predicts all five types of signal peptides using protein language models. Nat Biotechnol 40:1023-1025.
\nValach M, Benz C, Aguilar L, Gahura O, Faktorová D, Zíková A, Oeffinger M, Burger G, Gray M, Lukeš J. 2023. Miniature RNAs are embedded in an exceptionally protein-rich mitoribosome via an elaborate assembly pathway. Nucleic Acids Res 51:6443-6460.\nValach M, Léveillé-Kunst A, Gray MW, Burger G. 2018. Respiratory chain Complex I of unparalleled divergence in diplonemids. J Biol Chem 293:16043-16056.\nValach M, Moreira S, Petitjean C, Benz C, Butenko A, Flegontova O, Nenarokova A, Prokopchuk G, Batstone T, Lapébie P, et al. 2023. Recent expansion of metabolic versatility in Diplonema papillatum, the model species of a highly speciose group of marine eukaryotes. BMC Biol 21.\nZáhonová K, Lukeš J, Dacks JB. 2025. Diplonemid protists possess exotic endomembrane machinery, impacting models of membrane trafficking in modern and ancient eukaryotes. Curr Biol 35:1508-1520.e1502.\nŠkodová-Sveráková I, Záhonová K, Juricová V, Danchenko M, Moos M, Baráth P, Prokopchuk G, Butenko A, Lukáčová V, Kohútová L, et al. 2021. Highly flexible metabolism of the marine euglenozoan protist Diplonema papillatum. BMC Biol 19:251.\nŠubrtová K, Panicucci B, Zíková A. 2015. ATPaseTb2, a unique membrane-bound FoF1-ATPase component, is essential in bloodstream and dyskinetoplastic trypanosomes. PLoS Pathog 11:e1004660.\n\nFigure Legends\n\nFig.
1: Clustered protein predictions of Paradiplonema papillatum align with predicted protein features and clarify conditional enrichment trends. A Neighbor-average imputed t-SNE of the dataset displaying clustered predictions for 2,797 proteins across 22 cell compartments. Predictions were generated via support vector modelling conducted on fractional profiles of marker proteins, applied to the remaining dataset. B Selected fractional abundances of marker proteins across one replicate of this experiment, representing distinct profiles that facilitate predictive clustering (SUP, Supernatant). C Software prediction for protein features of signal peptides, transmembrane domains and mitochondrial target peptides across the dataset, demonstrating accumulation across certain defined compartments. D Proteins determined to be enriched in varying nutrient media (Diplo or Hemi) or cultivation conditions (aerobic or anaerobic) from a conditional study of P. papillatum (Škodová-Sveráková, et al. 2021). Additional information for all proteins is available in Tables S1 and S3.\n\nFig. 2: Endogenous tagging of novel proteins confirms supervised cluster predictions. Tagged proteins highlighted (black) among relevant predicted clusters, resolved on neighbor-averaged imputed t-SNE.
Individual cell lines were generated via endogenous tagging and imaged through fluorescence microscopy for comparison with the compartment to which the relevant protein was predicted. Merged microscopy images show protein signal (green) together with nuclear and mitochondrial DNA (blue). All imaged cells are oriented with their apical regions facing right and posterior facing left. Cell membrane outlines are traced for all images except for L, which shows only a trace of the papilla, which lacks signal. Scale bar represents 5 µm. Proteins A, E, G and J are resolved in zero and neighbor-averaged imputed t-SNE in Suppl. Fig. 3, for which separate channels of each cell line are also shown. Further information on cell lines is available in Table S4.\n\nFig. 3: Secreted Carbohydrate Active Enzymes (CAZymes) primarily localized on cell membrane and lysosomes. Distribution of signal peptide-enriched CAZymes, which are predicted with high confidence on neighbor-average imputed t-SNE, corresponding to highlighted cluster predictions of the cell membrane (A), endocytic membrane trafficking (B), lysosome (C) and cytosol (D). Proteins of the cell membrane (A) have schematic representations showing software predictions for signal peptides, transmembrane domains (TMD) and/or GPI attachment sites, which demonstrate extracellular exposure of CAZyme domains in accordance with conventional membrane topology. Bordered outlines indicate separate enzymatic reactions for CAZymes, with carbohydrate substrates and products in black.
Further information on CAZymes of P. papillatum is available in Table S5.\n\nFig. 4: Metabolic reconstruction of glycolysis/gluconeogenesis and pentose phosphate pathway demonstrates altered glucose metabolism in P. papillatum. Localization of relevant enzymes across glycolysis/gluconeogenesis (A) and pentose phosphate pathway (B), resolved on neighbor-average imputed t-SNE (C) with relevant localization clusters highlighted. Peroxisomal target sequences (PTS), mitochondrial target peptides (mTP) and transmembrane domains (TMDs) are indicated. Proteins previously localized via anti-sera immunolocalizations are indicated with *, and metabolite shunts between the two pathways are indicated with dotted arrows. Split coloring of proteins represents their manual designations to the cytosol (24,25,38) or indicates the possibility of glycosomal dual localizations between the cytosol and glycosomes (1,2,5,9,12,20), based on inspection of fractionation profiles (Suppl. Fig. 6) and targeting signals. Protein numbers highlighted in white represent those only resolved on zero and neighbor-average imputed t-SNE (Suppl. Fig. 5). Further information is available in Table S6.\n\nSupplementary Figures, Files and Tables\n\nSuppl. Fig. 1: Immunoblot analysis used to resolve fractional distribution across triplicate samples. 10 µg of protein was loaded for each fraction generated via differential centrifugation in addition to the initial cell lysate (CL). ATP synthase-β antibody used at 1:10,000 (A), Grp75 antibody used at 1:1,000 (B) and Grp78 antibody (C), which displays non-specific signal. Unlysed cells (UC), Supernatant (S). Marker band molecular weights (kDa) are indicated in dark grey on the leftmost lane of blots.\n\nSuppl. Fig. 2: Neighbor-averaged and zero imputed t-SNE of clustered protein predictions, protein features and conditional enrichment of dataset. A Full dataset displaying clustered predictions for 4,780 proteins across 22 cell compartments. Predictions were generated via support vector modelling conducted on fractional profiles of marker proteins, applied to the remaining dataset. B Selected fractional abundances of marker proteins across one replicate of this experiment, representing distinct profiles that facilitate predictive clustering (SUP, Supernatant). C Software prediction for protein features of signal peptides, transmembrane domains and mitochondrial target peptides across the dataset, demonstrating accumulation across certain defined compartments. D Proteins determined to be enriched in varying nutrient media (Diplo or Hemi) or cultivation conditions (aerobic or anaerobic) from a conditional study of P.
papillatum (Škodová-Sveráková, et al. 2021). Additional information for all proteins is available in Tables S1 and S3.\n\nSuppl. Fig. 3: Neighbor-averaged and zero imputed t-SNE of endogenously tagged cell lines. Tagged proteins highlighted (black) among relevant predicted clusters, resolved on neighbor-averaged and zero imputed t-SNE. Individual cell lines were generated via endogenous tagging and imaged through fluorescence microscopy for comparison with the compartment to which the relevant protein was predicted. In descending order, panels depict phase contrast, epitope signal (green), and nuclear and mitochondrial DNA (blue), with merges below additionally displaying cell membrane outline traces for all images except L, which shows only a trace of the papilla, which lacks epitope signal. All imaged cells are oriented with their apical regions facing right and posterior facing left. Scale bar represents 5 µm. Further information on cell lines is available in Table S4.\n\nSuppl. Fig. 4: Cell membrane cluster shows enrichment of proteins possessing both predicted signal peptides (SP) and glycosylphosphatidylinositol (GPI) anchors.
t-SNE imputed via neighbor-averaging (A) as well as zeroed dataset (B). Signal peptides were predicted via SignalP 6.0 with a confidence threshold greater than 0.9, in tandem with NetGPI 1.1 used for GPI predictions. Further information is available in Table S3.\n\nSuppl. Fig. 5: Metabolic reconstruction of glycolysis/gluconeogenesis and pentose phosphate pathway on neighbor-averaged and zero imputed t-SNE. Localization of relevant enzymes across glycolysis/gluconeogenesis (A) and pentose phosphate pathway (B), resolved on neighbor-average and zero imputed t-SNE (C) with relevant localization clusters highlighted. Peroxisomal target sequences (PTS), mitochondrial target peptides (mTP) and transmembrane domains (TMDs) are indicated. Proteins previously localized via anti-sera immunolocalizations are indicated with *, and metabolite shunts between the two pathways are indicated with dotted arrows. Split coloring of proteins represents their manual designations to the cytosol (24,25,38) or indicates the possibility of glycosomal dual localizations between the cytosol and glycosomes (1,2,5,9,12,20), based on inspection of fractionation profiles (Suppl. Fig. 6) and targeting signals. Further information is available in Table S6.\n\nSuppl. Fig. 6: Fractional and schematic analysis of specific enzymes mediating carbohydrate metabolism. Schematic depiction of DIPPA_21987, phosphofructokinase 1, showing phosphofructokinase (PFK) domains, transmembrane domain (TMD) and Peroxisomal Target Signal along with fractional analysis (A), along with fractional profiles of relevant enzymes across glycolysis/gluconeogenesis (B) compared to marker proteins of the cytosol, glycosomes and endocytic membrane trafficking.\n\nSuppl. Fig. 7: Metabolic reconstruction of Amino Acid (AA) breakdown for incorporation in the TCA cycle, localized across cell compartments. AAs and metabolites of the TCA cycle are indicated in bold. Propanoate metabolism, which involves intermediates of certain AA digestion, is also depicted. Split coloring indicates manual annotation for specific enzymes based on certain target peptides or candidate function, on top, versus contrasting predictions below (e.g. Enzyme 2: proline dehydrogenase, which we designate to the mitochondrion, despite low confidence predictions to the nucleus). Further information is available in Table S7.\nSuppl. File 1: Tables S1-7.\nSuppl. File 2: Primer sequences used for endogenous tagging of P. papillatum.","source_license":"CC-BY-4.0","license_restricted":false}