Subcellular proteomics of Paradiplonema papillatum reveals digestive capacity of the cell membrane and the plasticity of peroxisomes across euglenozoans

preprint OA: gold CC-BY-4.0

Keywords

Diplonemids, subcellular proteomics, cell membrane, metabolism

Abstract

Diplonemids are among the most diverse and abundant protists in the deep ocean, have extremely complex and ancient cellular systems, and exhibit unique metabolic capacities. Despite this, we know very little about this major group of eukaryotes. To establish a model organism for comprehensive investigation, we performed subcellular proteomics on Paradiplonema papillatum and localized 4,870 proteins to 22 cellular compartments. We additionally confirmed the predicted location of several proteins by epitope tagging and fluorescence microscopy. To probe the metabolic capacities of P. papillatum, we explored the proteins predicted to the cell membrane compartment in our subcellular proteomics dataset. Our data revealed an accumulation of many carbohydrate active enzymes (CAZymes). Our predictions suggest that these CAZymes are exposed to extracellular space, supporting proposals that diplonemids may specialize in breaking down carbohydrates in plant and algal cell walls. Further exploration of carbohydrate metabolism revealed an evolutionary divergence in the function of glycosomes (modified peroxisomes) in diplonemids versus kinetoplastids. Our subcellular proteome provides a resource for future investigations into the unique cell biology of diplonemids.

bioRxiv preprint doi: https://doi.org/10.1101/2025.07.16.665091; this version posted July 20, 2025. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Introduction

Diplonemids are unicellular, heterotrophic eukaryotes that constitute one of the most abundant and species-rich protist groups in the world’s oceans (Flegontova, et al. 2016; Gawryluk, et al. 2016). In addition, recent investigations show a broad distribution of diplonemids in freshwater environments (Mukherjee, et al. 2020), as well as in all pelagic zones of the ocean (Obiol, et al. 2020; Lax, et al. 2024). Global metabarcoding estimates more than 67,000 species of diplonemids worldwide, and they are therefore presumed to be key ecological players in all marine ecosystems (Tashyreva, et al. 2022).

Despite their importance, our knowledge of diplonemid nutrition strategies and ecological roles, as well as of their molecular and cellular biology, remains limited. Beyond general heterotrophy (Prokopchuk, et al. 2022), investigating their lifestyles and specific feeding modes remains challenging, partly due to the difficulty of observing diplonemid behavior in nature. By contrast, the relative ease with which diplonemids can be established in stable axenic cultures (typically in protein-rich media) is promising and makes them amenable to an expanding range of genomic, transcriptomic and proteomic experiments (Škodová-Sveráková, et al. 2021; Valach, Moreira, et al. 2023). Such techniques are necessary to further characterize diplonemids’ cellular and ecological functions.

A high-quality nuclear genome is available for the diplonemid Paradiplonema papillatum (formerly Diplonema) (Valach, Moreira, et al. 2023), and two recent assemblies are now available for Diplonema japonicum (Tashyreva, Faktorová, Stříbrná, et al. 2025) and Rhynchopus euleeides (Tashyreva, Faktorová, Horák, et al. 2025), in addition to several previously existing transcriptomes (Tashyreva, et al. 2022). However, P. papillatum remains the only genetically tractable diplonemid, enabling functional investigations by gene deletion (Faktorová, Kaur, et al. 2020), endogenous tagging of proteins (Akiyoshi, et al. 2025), and immunoprecipitation (Valach, Benz, et al. 2023). Such tractability has allowed the investigation of P. papillatum respiratory complexes (Valach, et al. 2018), mitochondrial ribosomes (Valach, Benz, et al. 2023), and kinetochores (Benz, et al. 2024). Diplonemids retain many genes that can be traced to the last eukaryotic common ancestor (LECA), including rare, restricted homologs referred to as jotnarlogs (Záhonová, et al. 2025). Thus, diplonemids may prove particularly informative for understanding the complexities of the ancestral eukaryote (Richards, et al. 2024).

Among the many protein-coding genes predicted from its genome, an unexpected finding in P. papillatum was the identification of several hundred carbohydrate active enzymes (CAZymes), with the capacity to digest pectin, cellulose, and β-1,3 glycans among other carbohydrates (Valach, Moreira, et al. 2023). This expanded CAZyme repertoire is particularly prominent compared to that of their relatives, Euglenida and Kinetoplastea (Valach, Moreira, et al. 2023). Its presence implies a proclivity of P. papillatum (and potentially other diplonemids) towards digestion of cell wall components of plants and algae. However, it is unclear how these organisms can specifically digest the cell walls of photosynthetic eukaryotes. Both osmotrophy, through secretion of enzymes to the cell exterior, and phagotrophy, through internal ingestion of prey components, have been proposed (Prokopchuk, et al. 2022). Though P. papillatum is a tractable species, tagging and visualizing hundreds of CAZymes to determine their localization is unrealistic. We therefore sought to perform subcellular proteomics to localize CAZymes to various intracellular compartments.

Here, we use a subcellular proteomics workflow similar to localization of organelle proteins by isotope tagging after differential ultracentrifugation (LOPIT-DC) (Geladaki, et al. 2019) to produce the first subcellular proteome of a diplonemid. With our data, we classified 4,870 proteins to 22 cellular compartments in P. papillatum. We validated several predicted locations by epitope and fluorescent tagging. Our subcellular proteome provided a clearly resolved cluster of cell membrane proteins enriched with secreted CAZymes. We suggest these enzymes can actively degrade plant and algal cell walls, initially at the cell’s exterior. We also show an ability for internal carbohydrate processing, with various secreted CAZymes distributed to the lysosomal compartments, and expand on traditional carbohydrate metabolism across glycosomes and the cytoplasm, demonstrating their diverged compartmentalization from that of the sister clade Kinetoplastea (Opperdoes and Michels 1993). Finally, we reveal an extensive mitochondrial capacity for varied amino acid digestion, foregrounding the metabolic versatility of this model diplonemid. Our localization of thousands of P. papillatum proteins provides a repository of information that will extend our knowledge of diplonemids, facilitating an exploration of their unusual cell biology and function.

Results and Discussion

Subcellular proteomics allows predictive clustering of P. papillatum proteins into 22 distinct compartments

To obtain a subcellular map of P. papillatum, we used a modified workflow adapted from a LOPIT-DC protocol described previously (Geladaki, et al. 2019). Briefly, cells were grown axenically in ‘Diplo’ media (sea water supplemented with 10% fetal bovine serum and 1 g tryptone). Approximately 9.9 × 10^8 cells per sample were collected and lysed in detergent-free lysis buffer in a nitrogen cavitator (250 psi for 10 min). Cell lysates underwent differential centrifugation, resulting in 11 distinct fractions, including initial unlysed cells. We performed western blot analysis with antibodies against ATP synthase subunit β from Trypanosoma brucei (Šubrtová, et al. 2015), mammalian Grp75 (Joseph, et al. 2013) and Grp78 (Chou, et al. 2020) to ensure that fractional proteomic profiles were distinct (Suppl. Fig. 1).

Label-free quantification (LFQ) analysis was followed by peptide data analysis in ProteomeDiscoverer (Orsburn 2021) and with R, primarily via the pRoloc package (Breckels, et al. 2016). Data were quantified against the nuclear and mitochondrial genomes of P. papillatum (Valach, Moreira, et al. 2023). After quality control, 4,870 unique proteins were detected in this dataset. Following normalization, proteins lacking peptide coverage in all fractions underwent imputation via ‘neighbor averaging’ (1,285 proteins) as well as ‘zero’ methods (2,073 proteins).

To predict cellular localization for the P. papillatum subcellular proteome, we manually curated a set of 368 marker proteins with canonical localizations (e.g. mitochondrion, flagellum, cytosol), specific functions (e.g. membrane trafficking compartments) or inferred localization data, corresponding to a total of 22 distinct subcellular compartments or protein complexes (Table S1). Using a median support vector machine (SVM) score cutoff (Table S2), we predicted the sub-localization of 2,435 proteins (Fig. 1A,B), with the remainder additionally classified to these compartments with lower confidence (Table S3; Suppl. Fig. 2). To further corroborate our designated clusters, we mapped predicted target signals and protein features onto the t-SNE distributions (Fig. 1C). Mitochondrial target peptides (mTP), predicted via TargetP2.0 (Armenteros, et al. 2019), are abundant across the three mitochondrial clusters: matrix, protein complexes and membrane-enriched. Signal peptides, predicted via SignalP6.0 (Teufel, et al. 2022), show enrichment across the soluble lysosome, cell membrane and endoplasmic reticulum (ER)/Golgi clusters, as well as the endocytic and multivesicular membrane trafficking compartments. Finally, transmembrane domains (TMD), predicted via DeepTMHMM (Hallgren, et al. 2022), correlate with the various membrane-enriched clusters of the diplonemid cell.

Next, we highlighted proteins that exhibit differences in abundance when P. papillatum was grown in different media, ‘Diplo’ versus ‘Hemi’ (sea water supplemented with 10 ml inactivated horse serum and 1 ml/L LB medium), and under oxygen-abundant versus oxygen-depleted conditions (Fig. 1D) (Škodová-Sveráková, et al. 2021). Cells grown in nutrient-rich ‘Diplo’ medium show enrichment for proteins predicted to the cytosolic ribosome and cell membrane clusters, including sodium/potassium exchangers and sterol transporters.
The nutrient-poorer ‘Hemi’ medium showed notable enrichment across multiple clusters, including the proteasome, cytosol, soluble lysosome and mitochondrial regions (Fig. 1D). Equally, aerobic conditions resulted in the enrichment of several hypothetical cell membrane components, subunits of mitochondrial complex IV, as well as various soluble lysosomal proteases. By contrast, anaerobic conditions induced enrichment across clusters of the cytosol, cytosolic ribosomes, mitochondrial matrix, and translation initiation factors 2 and 3 (Fig. 1D).

Endogenous tagging confirms subcellular localizations inferred from proteomic data

To validate designated clusters, we performed endogenous tagging with either V5 or YFP tags on 12 proteins predicted or classified to various cell compartments; these proteins typically lack both annotation and homologs outside diplonemids (Fig. 2). They were ultimately localized to the flagella (Fig. 2A), cytoplasm (B,C), mitochondrion (D,E), ER/Golgi (F), nucleus (G,H,I), nucleolus (J), endocytic membrane trafficking (K) and, finally, the cell membrane (L), encompassing nine defined clusters in total (Table S4). Mitochondrial proteins DIPPA_24150 and DIPPA_15120 co-localize with the organellar DNA within the reticulated mitochondrion (Suppl. Fig. 3) at the cell periphery (Figs. 2D and E). In turn, the tagged ER/Golgi candidate DIPPA_04811 shows a signal surrounding the nuclear DNA, while also branching and extending into the cell posterior (Fig. 2F). Next, we validated four proteins assigned to the nucleus, which show different sub-localizations within this compartment by immunofluorescence analysis (IFA) (Figs. 2G-J). The first nuclear candidate (DIPPA_16310) has a patchy distribution on the outermost periphery of the nuclear DNA (Fig. 2G). Unlike the novel ER/Golgi protein (Fig. 2F), this candidate does not extend beyond the nucleus, and hence likely constitutes a novel nuclear membrane component. A second nuclear candidate (DIPPA_32825) co-localizes with the chromatin signal of the nucleus (Fig. 2H), similar to the general nuclear signal of a third selected nuclear protein (DIPPA_24937) (Fig. 2I). The last nuclear candidate (DIPPA_00315) displays a confined distribution within the nucleus, corresponding to the nucleolus (Fig. 2J). Similarly, this protein’s uncharacterized homolog in the kinetoplastid T. brucei (Tb927.3.2750) also displays a nucleolus-like signal when tagged with green fluorescent protein (Billington, et al. 2023).

One protein (DIPPA_21158) classified to the ‘endocytic membrane trafficking’ compartment seemingly exhibits dual localization, with an ER-like pattern similar to DIPPA_04811 (Fig. 2F), while also showing enrichment towards and encompassing the cell cytopharynx (Fig. 2K). Finally, a protein predicted to the cell membrane cluster (DIPPA_16504) (Fig. 2) shows a signal enriched across the cell outline, excepting the apical papilla (Fig. 2L). This protein possesses a signal peptide and a TMD, both of which are enriched among proteins predicted to the cell membrane (Fig. 1C). This cell membrane cluster also exhibits an accumulation of predicted signal peptides in tandem with glycosylphosphatidylinositol (GPI)-attachment domains (Suppl. Fig. 4), further supporting the validity of this newly defined cluster.
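The median score cutoff used above to separate high-confidence compartment predictions from lower-confidence ones can be illustrated with a small sketch. The analysis itself was run in R with pRoloc; the Python below is not that pipeline, only an illustration of the thresholding idea. It assumes that the cutoff is the median classifier score within each predicted compartment and that scores at or above the cutoff are retained. The function names, protein identifiers and scores are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_cutoffs(predictions):
    """Return the median classifier score for each predicted compartment.

    predictions: iterable of (protein_id, compartment, score) tuples.
    """
    scores_by_class = defaultdict(list)
    for _protein, compartment, score in predictions:
        scores_by_class[compartment].append(score)
    return {c: median(s) for c, s in scores_by_class.items()}


def apply_cutoffs(predictions, cutoffs, unknown="unknown"):
    """Keep a predicted compartment only if its score reaches the class cutoff."""
    assigned = {}
    for protein, compartment, score in predictions:
        # Assumption: scores equal to the cutoff count as high confidence.
        keep = score >= cutoffs[compartment]
        assigned[protein] = compartment if keep else unknown
    return assigned


# Hypothetical proteins and scores, for illustration only.
toy = [
    ("prot_A", "cell membrane", 0.95),
    ("prot_B", "cell membrane", 0.60),
    ("prot_C", "cell membrane", 0.40),
    ("prot_D", "glycosome", 0.80),
    ("prot_E", "glycosome", 0.30),
]
cutoffs = median_cutoffs(toy)
high_confidence = apply_cutoffs(toy, cutoffs)
```

A per-compartment cutoff, rather than one global threshold, keeps tightly resolved clusters from dominating the high-confidence set at the expense of clusters whose scores are lower overall.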
Secreted CAZymes localize to the cell membrane and lysosomes

Carbohydrate-Active Enzymes (CAZymes) are particularly abundant in P. papillatum, suggesting complex digestive capabilities against plant and algal cell wall carbohydrates (Valach, Moreira, et al. 2023). Through our subcellular dataset, we show that a notable proportion of signal peptide-bearing CAZymes localize with high confidence to the cell membrane and the lysosome (Fig. 3). Schematic diagrams of these cell membrane CAZymes show the presence of a C-terminal TMD and/or GPI anchor sites, preceded by the catalytic domains of associated enzymes. This topology indicates that the CAZyme domains are exposed to extracellular space and thus expected to digest external carbohydrate substrates (Fig. 3A). Enzymatic domains present include pectin esterase, pectin lyase and glycosyl hydrolases, from which we construct a digestion pathway on the cell membrane to externally degrade methylated pectin to galacturonic acid monomers (Fig. 3A). Some CAZymes of the cell membrane lack the predicted TMDs or GPI anchors, such as glycosyl hydrolase, which degrades hemicellulose to glucose, xylose and galactose. It remains possible that such CAZymes are released into the extracellular space or simply lack identifiable motifs for cell anchorage.

Candidate sugar transporters, recently identified through genome analysis (Valach, Moreira, et al. 2023), were not localized to the cell membrane cluster, rather being assigned to the ER/Golgi and glycosome compartments (Table S3). Thus, we propose that instead of being passaged directly to the cytoplasm across the cell membrane, digested or partially digested carbohydrate substrates are engulfed through the cytopharynx, leading to trafficking through the endocytic vesicles, which have been observed prominently budding off from this distinctive structure in diplonemids (Tashyreva, et al. 2023). Within the endocytic membrane trafficking cluster of this dataset, we also identified one secretory CAZyme (Fig. 3B). Endocytosed contents are typically passaged to the lysosomal compartments, for which we also define a corresponding cluster of soluble proteins containing numerous signal peptide-bearing CAZymes, with the ability to digest various forms of pectin and other polysaccharide chains, such as sucrose and glycosides (Fig. 3C). We additionally predict one sugar transporter (DIPPA_16016.mRNA.1) to the multivesicular membrane trafficking body, enriched for V-type ATPases and other membranous components of the lysosome, suggesting eventual saccharide transport from these organelles to the cytosol and possibly other compartments.

Given that these analyzed cells were grown in the protein-rich ‘Diplo’ medium, we did not necessarily expect an abundance of CAZymes in our extractions. Nonetheless, we detected a total of 94 different enzymes across our subcellular dataset, 55 of which were not recorded in previous studies (Table S5) (Škodová-Sveráková, et al. 2021; Valach, Moreira, et al. 2023). The proteomic presence of these enzymes in a mostly carbohydrate-depleted medium suggests that most CAZymes are constitutively expressed regardless of substrate availability.
We further note that in previous cultivation studies, the lysosomal CAZymes identified in this study showed conditional enrichment, while the newly identified CAZymes of the cell membrane do not change in abundance across different conditions or media (Fig. 1D) (Škodová-Sveráková, et al. 2021). Such constitutive presence supports recent suggestions that plants and algae are the primary food source of P. papillatum in nature, potentially making use of both carbohydrates on the external cell walls, as well as the internal proteinaceous energy sources (Valach, Moreira, et al. 2023).

The soluble lysosome contains a chitinase (Fig. 3C), while in the endocytic trafficking compartment we documented a complementary glucuromannan-digesting GH92, which together suggest a proclivity for fungal cell wall digestion (Fig. 3B). The only observation of P. papillatum behavior in natura comes from its initial isolation from drifting eelgrass (Porter 1973), a plant that is known to harbor various fungal cohabitants on its surface (Newell 1981). This documented enzymatic sub-localization appears consistent with such a supposition of varied sources of prey. Interestingly, a single CAZyme member, xylan-α-glucuronidase, is predicted with high confidence to the cytosol, despite the presence of an N-terminal signal sequence. This enzyme is additionally predicted to have been acquired via horizontal gene transfer from a bacterial endosymbiont; diplonemids have shown a propensity for acquiring such endosymbionts (George, et al. 2022; Tashyreva, Votýpka, et al. 2025), although none is present in extant P. papillatum.

Subcellular distribution of glycolysis/gluconeogenesis enzymes reveals novel glycosomal insights for P. papillatum

Diplonemids and their sister lineage, the mostly parasitic kinetoplastids, are categorized as glycomonads due to their shared compartmentalization of part of their glycolytic pathways in specialized peroxisomes called glycosomes (Michels, et al. 2006), yet the extent to which these organelles retain the same function in both lineages is unclear. Kinetoplastids localize the first seven steps of glycolysis to glycosomes (Opperdoes and Michels 1993), while in diplonemids peroxisomal targeting signals (PTS) are predicted for six enzymatic steps, suggesting a similar metabolic arrangement, although only five enzymes have been experimentally confirmed to colocalize with known peroxisomal proteins (Makiuchi, et al. 2011; Morales, et al. 2016). Here, we use our subcellular proteomics dataset to partly confirm and expand on previous analyses (Fig. 4). As our glycosome cluster showed fractional similarity with other organelle clusters, we only confirmed two enzymatic steps to this organelle, one of which (step III) represents a newly described designation in P. papillatum (Fig. 4A). However, we confirmed the cytosolic localization of four enzymes: glyceraldehyde 3-phosphate dehydrogenase (GAPDH, step VI), phosphoglycerate mutase (PGAM, step VIII), enolase (step IX) and pyruvate kinase (step Xa) (Morales, et al. 2016).
Certain enzymes were not detected in previous investigations; thus, it came as a surprise that we detected in the glycosome both phosphofructokinase (PFK) and fructose 1,6-bisphosphatase (FBP), which typically participate in glycolysis (IIIa) and gluconeogenesis (IIIb), respectively (Fig. 4A; Suppl. Fig. 5). Their localization demonstrates a capacity for this organelle to mediate both directions of this pathway (Fig. 4A). The genome of P. papillatum encodes two PFKs (Morales, et al. 2016), with PFK1 (DIPPA_21987) being a PPi-dependent variant horizontally acquired from a bacterium (Škodová-Sveráková, et al. 2021), which is typically able to function in an ATP-poor environment. PFK1 also shows the potential to engage in gluconeogenesis (Škodová-Sveráková, et al. 2021), which along with FBP further supports the capacity of P. papillatum ‘glycosomes’ to perform steps of gluconeogenesis. We further note the prediction of a TMD in PFK1, an unusual feature for enzymes of this pathway (Fig. 4A), though not without precedent (Jirsová, et al. 2025). We propose that the N-terminal TMD allows insertion of the enzyme from within the glycosome, exposing its enzymatic domains to the organellar lumen (Suppl. Fig. 6A). While previous transcriptome analysis recorded an additional PTS1-lacking PFK with presumable cytosolic residence (Škodová-Sveráková, et al. 2021), a survey of the now complete genome confirmed only the presence of PFKs furnished with PTS (Valach, Moreira, et al. 2023).

One copy (DIPPA_70192) of fructose-bisphosphate aldolase (FBA, step IV) was previously localized to the glycosomes and indeed shows a corresponding fractional pattern in our dataset (Fig. 4A; Suppl. Fig. 5). A second FBA (DIPPA_30805), also bearing a PTS2 motif, displays a more subdued profile with less similarity to the cytosolic and glycosomal fractional profiles (Suppl. Fig. 6B). We interpret this as co-localization in both compartments, a phenomenon described for several peroxisomal proteins across eukaryotes (Freitag, et al. 2018). Moreover, while in ‘Hemi’ media the PTS1-bearing copies of G6P and TIM have been localized to the glycosomes (Morales, et al. 2016), in our dataset they occupy an ambiguous position that similarly implies a dual localization, which contrasts with their confidently placed cytosolic counterparts that lack a PTS (Fig. 4A; Suppl. Fig. 6C,D).

We additionally localized multiple copies of PTS-lacking GAPDH distributed to either the cell membrane or the mitochondrion, as described elsewhere (Bártulos, et al. 2018). We demonstrate a convincing cytosolic localization for four additional paralogues of enzymes lacking a PTS, namely glucose 6-phosphate isomerase (G6P, step II), triosephosphate isomerase (TIM, step V), phosphoglycerate kinase (PGK, step VII) and PGAM (Fig. 4A). While such cytosolic localizations may facilitate the glycolytic processing from glyceraldehyde-3-phosphate to pyruvate, they alternatively reveal the potential for partial cytosolic gluconeogenesis initiated by processing oxaloacetate to phosphoenolpyruvate via cytosol-localized PEP carboxykinase (Fig. 4A). Along with other reversible steps of glycolysis, proteomic enrichment was reported for this enzyme from cells grown in the glucose-depleted ‘Hemi’ medium (Škodová-Sveráková, et al. 2021), reflecting an increased use of gluconeogenesis under such conditions (Table S6).
Metabolically adjacent to glycolysis and gluconeogenesis is the pentose phosphate pathway (PPP), which facilitates the interconversion of simple carbohydrates of different sizes (Fig. 4B). In kinetoplastids, several PPP enzymes possess a PTS, producing a glycosomal or dual glycosomal and cytosolic localization (Kovárová and Barrett 2016). However, P. papillatum encodes only a single PTS1-possessing enzyme, phosphogluconolactonase (step II) (Fig. 4B). Despite its targeting signal, our dataset suggests the enzyme localizes to the cytoplasm (Suppl. Fig. 6E), demonstrating that, similar to the localizations of certain proteins in T. brucei (Güther, et al. 2014), the presence of a PTS does not guarantee peroxisomal targeting (Fig. 4B,C). While ribulose-5-phosphate 3-epimerase (step IV of PPP) and transaldolase (step VI of PPP) are classified to the soluble lysosome, considering the fractional similarity of this cluster to that of the cytosol, we regard them as cytosolic (Fig. 4A-C; Suppl. Fig. 5). By contrast, a copy of glucose-6-phosphate dehydrogenase (step I of PPP) shows a fractionation profile consistent with that of the endocytic membrane trafficking cluster, which warrants future investigation (Fig. 4B,C; Suppl. Fig. 6F).

In summary, our localization of individual steps of glycolysis/gluconeogenesis supports the hypothesis that diplonemids separated from kinetoplastids prior to the complete transfer of the first seven steps of these pathways into the glycosomes (Morales, et al. 2016), leaving step VI in the cytosol for diplonemids. Moreover, in these flagellates the cell membrane-embedded versions of GAPDH underline a compartmentalization of this enzyme distinct from that of its homologs in kinetoplastids (Moloney, et al. 2023). A further distinction is represented by PEP carboxykinase, which in P. papillatum remains cytosolic and lacks a PTS, with its glycosomal compartmentalization having evolved only secondarily in the kinetoplastid clade. The presence of a PTS1 in just a single PPP enzyme likely signifies a remnant of the ancestral trend in glycomonads towards compartmentalization of this pathway which, unlike in kinetoplastids, did not continue to progress in diplonemids.

Versatile amino acid digestive capabilities within the mitochondrion

Gluconeogenesis in P. papillatum is presumably supplied by substrates from amino acid (AA) catabolism (Morales, et al. 2016), which we endeavored to resolve with our subcellular dataset. Accordingly, we show this protist’s capacity to digest a broad range of AAs, primarily within the mitochondrion, reminiscent of similar capabilities demonstrated in the mitochondrion of its fellow euglenozoan, Euglena gracilis (Hammond, et al. 2020) (Suppl. Fig. 7). Ultimately, the mitochondrion of P. papillatum appears capable of metabolizing arginine, aspartate, histidine, glutamate, glycine, isoleucine, leucine, proline, serine, threonine and valine into metabolites that can directly feed into the tricarboxylic acid (TCA) cycle, as well as glutamine and cysteine with initial cytosolic processing (Suppl. Fig. 7). We additionally reveal the ability of diplonemids to process the fatty acid propanoate to propanoyl-CoA, which allows incorporation into AA intermediate-processing pathways within the mitochondrion, representing a functional distinction from kinetoplastids (Suppl. Fig. 7).

Conclusions

336 Previous work has demonstrated the global distribution, relative importance, abundance, and 337 diversity of marine diplonemids (Tashyreva, et al. 2022), underscoring the value in clarifying 338 their ecological roles and biology. Only recently has P. papillatum emerged as a genetically 339 tractable species (Faktorová, Kaur, et al. 2020), opening the entire clade to inquiry via 340 cellular and molecular methods. Our subcellular proteomics dataset is complementary to 341 these efforts and provides a pathway towards hypothesis-driven research, thereby accelerating 342 our understanding of these ecologically and evolutionary important protists (Valach, Benz, et 343 al. 2023; Benz, et al. 2024; Akiyoshi, et al. 2025; Záhonová, et al. 2025). In total, our data 344 enabled us to localize thousands of proteins to 22 distinct subcellular compartments in P . 345 papillatum. The confidence of our data is strengthened by the endogenous tagging of selected 346 proteins. 347 From this wealth of data, we focused specifically on the confidently predicted cluster of cell 348 membrane proteins. In this cluster, we identified an expanded family of CAZymes, 349 supporting recent predictions that P . papillatum primarily preys on plant and algae via 350 degrading their cell walls. CAZymes were also localized to the lysosome, further suggesting 351 active ingestion of complex carbohydrates. The fact that we supplied P . papillatum with 352 protein-rich, carbohydrate-limited media represents an intriguing question for future analysis: 353 why are CAZymes expressed in the absence of carbohydrates? We speculate that they are 354 produced in anticipation of interacting with these substrates, and hope that in natura studies 355 may now be used to definitively clarify the ecological role for this and other diplonemids. 356 In conclusion, we have sub-localized thousands of proteins in a model species representing a 357 major protist group. 
Given the scarcity of available marine protists that are genetically tractable and can be investigated with relative ease (Faktorová, Nisbet, et al. 2020), our data provide a novel and rich resource to explore diplonemids' unique cell biology and to map ancestral traits in this free-living heterotrophic flagellate.

Materials and methods

Key Resource Table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Antibodies
Rabbit anti-ATP Synthase-β | Zíková et al. (Šubrtová, et al. 2015) |
Mouse anti-GRP 75 | Enzo | Cat# SPS-826D, RRID:AB_2120451 (https://www.antibodyregistry.org/AB_2120451)
GRP 78 | Novus | Cat# NB100-56413, RRID:AB_838320 (https://www.antibodyregistry.org/AB_838320)
Goat anti-Rabbit IgG (H+L) Secondary Antibody, HRP | Invitrogen | Catalog# 31460 (https://www.thermofisher.com/antibody/product/Goat-anti-Rabbit-IgG-H-L-Secondary-Antibody-Polyclonal/31460)
Goat anti-Mouse IgG (H+L) Secondary Antibody, HRP | Invitrogen | Catalog# 31430 (https://www.thermofisher.com/antibody/product/Goat-anti-Mouse-IgG-H-L-Secondary-Antibody-Polyclonal/31430)
Mouse anti-V5 Monoclonal Antibody (2F11F7) | Invitrogen | Catalog# 37-7500-A555, RRID:AB_2610631 (https://www.antibodyregistry.org/AB_2610631)
Rabbit anti-V5 Polyclonal Antibody | Sigma-Aldrich | Catalog# V8137, RRID:AB_261889 (https://www.antibodyregistry.org/AB_261889)
Goat anti-Mouse IgG (H+L) Cross-Adsorbed Secondary Antibody, Alexa Fluor™ 555 | Invitrogen | Catalog# A-21422 (https://www.thermofisher.com/antibody/product/Goat-anti-Mouse-IgG-H-L-Cross-Adsorbed-Secondary-Antibody-Polyclonal/A-21422)
Goat anti-Rabbit IgG (H+L) Cross-Adsorbed Secondary Antibody, Alexa Fluor™ 488 | Invitrogen | Catalog# A-11008 (https://www.thermofisher.com/antibody/product/Goat-anti-Rabbit-IgG-H-L-Cross-Adsorbed-Secondary-Antibody-Polyclonal/A-11008)

Chemicals, peptides, and recombinant proteins

Critical commercial assays
Pierce™ Dilution-Free™ Rapid Gold BCA Protein Assay | Thermo Scientific | Catalog# A55860
Pierce™ Quantitative Peptide Assays & Standards | Thermo Scientific | Catalog# 23290

Deposited data
Raw peptide data | PRIDE | XXXX
For protein predictions and annotations see Table S3

Experimental models: Cell lines
Paradiplonema papillatum | Porter (Porter 1973) | ATCC50162

Experimental models: Organisms/strains
For cell lines generated for this study see Table S4

Oligonucleotides
For primers used in this study see Suppl. File 2.

Recombinant DNA
pBA3294 vector | Akiyoshi et al. (Akiyoshi, et al. 2025) |
pDP011 vector | GenBank | OQ547858

Software and algorithms
Microsoft Excel | Microsoft | https://www.microsoft.com
R and RStudio | RStudio | https://posit.co/download/rstudio-desktop/
pRoloc | Crook et al. (Crook, et al. 2019) |
Fiji (ImageJ) | Fiji | https://fiji.sc/
SignalP 6.0 | Teufel et al. (Teufel, et al. 2022) | https://services.healthtech.dtu.dk/services/SignalP-6.0/
TargetP 2.0 | Armenteros et al. (Armenteros, et al. 2019) | https://services.healthtech.dtu.dk/services/TargetP-2.0/
DeepTMHMM | Hallgren et al. (Hallgren, et al. 2022) | https://services.healthtech.dtu.dk/services/DeepTMHMM-1.0/
DeepLoc 2.1 | Odum et al. (Odum, et al. 2024) | https://services.healthtech.dtu.dk/services/DeepLoc-2.1/
NetGPI 1.1 | Gíslason et al. (Gíslason, et al. 2021) | https://services.healthtech.dtu.dk/services/NetGPI-1.1/
GhostKOALA | Kanehisa et al. (Kanehisa, et al. 2016) | https://www.kegg.jp/ghostkoala/

Resource availability

Further information and requests for reagents should be directed to and will be fulfilled by the lead contact, Michael Hammond ([email protected]).

Materials availability

Vectors and novel cell lines generated for this study are available from the lead contact upon request.

Experimental model and study participant details

P. papillatum (ATCC50162) served as the cell line for both proteomic analysis and cell line generation.

Strain, culture conditions and preparation for lysis

Paradiplonema papillatum (ATCC50162) cells were cultivated axenically in Diplo media (36 g/L sea salts [Sigma], with 1 g Tryptone and 10 ml Fetal Bovine Serum [Sigma], 0.22 µm filter sterilized) at 22°C. Cultures were harvested at ~2.5 × 10^6 cells/ml in a combined volume of 750 ml per sample and processed in triplicate. Cell cultures were concentrated by centrifugation (900 × g for 10 min) and pellets were resuspended in 6 ml detergent-free lysis buffer (0.25 M sucrose, 10 mM HEPES, pH 7.4, 2 mM EDTA, 2 mM Mg(OAc)2) with Halt™ Protease and Phosphatase Inhibitor Cocktail, pre-chilled to 4°C.

Cell lysis and fractionation

The cell suspension underwent lysis via nitrogen cavitation at 250 psi for 10 min (Parr 4639, Parr Instrument Co.). Cell lysate was gently released from the chamber to minimize foaming, and the collected sample underwent differential centrifugation following a previously established protocol (Geladaki, et al. 2019). Briefly, cell lysate was centrifuged at the speeds listed in Table 1, and after each spin the supernatant was transferred to fresh 2 ml centrifuge tubes and subjected to the next centrifugation step. Pellets from each spin were stored at -80°C after collection, as was the supernatant fraction from the final spin. Pelleted cell lysate was additionally collected and stored for proteomic analysis.

Table 1: Fractional protocol for centrifugation as used in this study, adapted from the LOPIT-DC protocol (Geladaki, et al. 2019).
Fraction | Centrifuge speed (x g) | Spin time (min)
Cell lysate | 200 | 5
1 | 1,000 | 10
2 | 3,000 | 10
3 | 5,000 | 10
4 | 9,000 | 15
5 | 12,000 | 15
6 | 15,000 | 15
7 | 30,000 | 20
8 | 79,000 | 43
9 | 120,000 | 45
Supernatant | NA | NA

Fractional assessment

To assess the distribution and enrichment of proteins across P. papillatum fractions, immunoblotting was performed using antibodies against ATP synthase subunit β (kindly provided by A. Zíková) (Šubrtová, et al. 2015), Grp75 (Enzo) (Joseph, et al. 2013) and Grp78 (Novus) (Chou, et al. 2020). Pellets were resuspended in 2x Laemmli buffer (0.125 M Tris-HCl, pH 6.8, 4% SDS, 20% glycerol, 0.004% bromophenol blue) without DTT. Fractional samples were quantified using the Pierce™ BCA Protein Assay (Thermo Fisher Sci.).

A total of 10 µg of protein was loaded onto an SDS-PAGE gel (Invitrogen Bolt Bis-Tris Plus Mini Protein Gels, 4-12%, 1.0 mm, WedgeWell™ format) along with a protein marker (Amersham™ ECL™ Rainbow™ Marker - Full Range). The gel was run for 1 hour at 130 V, briefly washed in 1x PBS buffer, and transferred onto a methanol-activated PVDF membrane (iBlot™ 2 Transfer Stacks, PVDF, Invitrogen) using the iBlot 2 Western Blot Transfer Device (Invitrogen). The membrane was blocked in 5% non-fat dry milk in 1x PBS buffer for 1 hour at room temperature, followed by incubation with the relevant antibodies diluted (1:10000 for ATP-β, and 1:1000 for Grp75 and Grp78) in blocking solution (5% non-fat dry milk in 1x PBS buffer). Blots were incubated at room temperature for 1 hour, then overnight at 4°C.
The following day, blots were washed three times for 10 min each in 1x PBS, probed with HRP-linked secondary antibodies (31460/31430, Invitrogen) diluted 1:1000 in blocking solution for 1 hour at room temperature, and rinsed again three times for 10 min each in 1x PBS-T. Detection was performed using the Pierce ECL Western Blotting Substrate (Thermo Fisher Sci.), and imaging was conducted with the Azure 600 (Azure Biosystems).

Sample preparation and LC-MS analysis

Native protein pellets obtained from differential centrifugation were digested and desalted following the protocol for the S-Trap Micro Column (ProtiFi, USA). Protein concentration was quantified using the BCA assay (Thermo Fisher Sci.), while peptide concentration was measured using a fluorometric kit (Thermo Fisher Sci.).

Liquid-chromatography tandem mass spectrometry

LC-MS/MS analyses were performed at the Biosciences Mass Spectrometry Core Facility, Arizona State University. Data-dependent mass spectra were collected in positive mode using an Orbitrap Fusion Lumos mass spectrometer coupled with an UltiMate 3000 UHPLC (Thermo Fisher Sci.). Peptides were fractionated on an Easy-Spray LC column (50 cm × 75 μm ID, PepMap C18, 2 μm, 100 Å) with an upstream trap column. Each sample was analyzed in technical triplicate. LC-MS settings: electrospray potential 1.6 kV, ion transfer tube temperature 300°C, and the "Universal" peptide analysis method. Full MS scans (375–1500 m/z) were acquired at a resolution of 120,000 with 3 s cycles. The RF lens was set to 30%, AGC to "Standard," and monoisotopic peak determination included charge states 2–7. Dynamic exclusion was 60 s with a 10 ppm mass tolerance.
MS/MS spectra were acquired in centroid mode with a quadrupole isolation window of 1.6 m/z and CID energy of 35%. Peptides were eluted over a 240-min gradient at 0.25 µL/min using 2–80% acetonitrile/water: 0–3 min (2%), 3–75 min (2–15%), 75–180 min (15–30%), 180–220 min (30–35%), 220–225 min (35–80%), 225–240 min (80–85%).

LC-MS/MS analysis of the digested peptides was performed on an EASY-nLC 1200 (Thermo Fisher Sci.) coupled to an Orbitrap Eclipse Tribrid mass spectrometer (Thermo Fisher Sci.). Peptides were separated on an Aurora UHPLC column (25 cm × 75 µm, 1.6 µm C18, AUR2-25075C18A, Ion Opticks) at a flow rate of 0.35 µL/min for a total duration of 135 min and ionized at 1.6 kV in positive ion mode. The gradient was composed of 2% solvent B (5 min), 2–6% B (7.5 min), 6–25% B (82.5 min), 25–40% B (30 min), 40–98% B (1 min) and 98% B (15 min); solvent A: 2% ACN and 0.2% FA in water; solvent B: 80% ACN and 0.2% FA. MS1 scans were acquired at a resolution of 120,000 from 350 to 1,600 m/z, with an AGC target of 1e6 and a maximum injection time of 50 ms. MS2 scans were acquired in the ion trap using the fast scan rate on precursors with 2–7 charge states and quadrupole isolation mode (isolation window: 0.7 m/z) with higher-energy collisional dissociation (HCD, 30%) as the activation type. Dynamic exclusion was set to 30 s. The temperature of the ion transfer tube was 300°C and the S-lens RF level was set to 30.

Raw data processing and quantification

The LFQ analysis was performed using Proteome Discoverer 2.4 (Thermo Fisher Sci.) against a composite database comprising the P. papillatum predicted proteome and mitochondrial ORFs. Raw files were searched with SequestHT using trypsin as the enzyme, allowing up to three missed cleavages. Peptide length was set to 6–144 amino acids, with precursor ion mass tolerance at 20 ppm, fragment mass tolerance at 0.5 Da, and a minimum of one peptide identified. Carbamidomethyl (C) was a fixed modification, while Acetyl (N-terminus), Met-loss (N-terminus), and oxidation of Met were dynamic modifications. A target/decoy strategy was used and a 1.0% FDR was calculated using Percolator. Data were imported into Proteome Discoverer 2.4, and features were detected using the Minora Feature Detector algorithm. The area-under-the-curve for aligned ion chromatograms was calculated to determine relative abundances. The RAW data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier XXXXXX.

Proteins and their corresponding LFQ abundance values were imported into the R programming language and converted into an MSnSet object using the Bioconductor packages MSnbase (v 2.24.2) and pRoloc (v 1.38.2) (Crook, et al. 2019). The data were examined, and proteins with low confidence (PSM < 3 and without unique peptides) were filtered out. Triplicates were averaged to generate a 33-dimensional dataset of relative protein abundance. The dataset was split into its respective experiments (i.e., columns 1-11, 12-22, 23-33) to perform hybrid imputation and sum-normalization across rows.

Missing data were imputed first by nearest-neighbor averaging, after which zeros were imputed for all remaining empty cells. Principal component analysis and t-distributed Stochastic Neighbor Embedding (t-SNE) were applied for dimensional reduction and data visualization.
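The filtering, averaging, imputation and normalization steps described above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the MSnbase/pRoloc workflow used in the study: the function names, the use of `None` for missing values, the Euclidean distance over shared columns, the default `k=2`, and the order "impute, then sum-normalize" within each experiment are all choices made for this example.

```python
import math

def keep_protein(psm_count, unique_peptides):
    """Keep a protein unless it has PSM < 3 AND no unique peptides
    (one reading of the low-confidence filter in the text)."""
    return not (psm_count < 3 and unique_peptides == 0)

def average_triplicates(runs):
    """Average technical replicates column-wise, ignoring missing values (None)."""
    averaged = []
    for values in zip(*runs):
        observed = [v for v in values if v is not None]
        averaged.append(sum(observed) / len(observed) if observed else None)
    return averaged

def sum_normalize(row):
    """Scale one complete protein profile so that its values sum to 1."""
    total = sum(row)
    return [v / total for v in row] if total else list(row)

def _distance(a, b):
    """Root-mean-square distance over the columns observed in both profiles."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_then_zero_impute(rows, k=2):
    """Hybrid imputation: fill each gap with the mean of that column in the
    k nearest profiles that have it; any gap left over becomes zero."""
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            donors = [(d, v) for d, v in donors if d != math.inf][:k]
            if donors:
                imputed[i][j] = sum(v for _, v in donors) / len(donors)
    return [[0.0 if v is None else v for v in r] for r in imputed]

def process(matrix, n_fractions=11):
    """Impute and sum-normalize each experiment block separately
    (columns 1-11, 12-22, 23-33 when n_fractions is 11)."""
    n_blocks = len(matrix[0]) // n_fractions
    out = [[] for _ in matrix]
    for b in range(n_blocks):
        block = [r[b * n_fractions:(b + 1) * n_fractions] for r in matrix]
        for target, row in zip(out, knn_then_zero_impute(block)):
            target.extend(sum_normalize(row))
    return out
```

Processing each experiment block on its own keeps one replicate's missing values from being filled with abundances borrowed from another replicate, which is one plausible reason for splitting the 33 columns before imputation.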
Supervised and unsupervised classification

A set of 268 manually curated marker proteins (Table S1) was used as the training set for a support vector machine (SVM) model with the 'svmOptimization' and 'svmClassification' functions in the pRoloc package. Initially, 100 rounds of five-fold cross-validation were performed to optimize the SVM parameters based on the marker protein abundance profiles. The optimal parameters for the SVM classifier were then applied to all proteins in the dataset, each receiving an SVM score in the range 0-1, with 1 being the score of marker proteins. The SVM classifier was then applied to unlabeled data (i.e., non-marker proteins) with corresponding weights applied to each marker class. Each protein was thus classified to one compartment. Any protein whose SVM score fell below the global median was reset to 'unknown', while the other half of the dataset was considered 'predicted' to its corresponding compartment owing to its higher SVM score (Table S3).

Unsupervised clustering was performed using the K-means (KM) algorithm implemented in the MLearn function from the MLInterfaces package in RStudio (version 1.78.0). KM generates k random centroids and iteratively assigns surrounding data points to them, such that every data point belongs to one of the k clusters and the distance of points to their cluster centroid is minimized. K-means clustering was run with 22 clusters, corresponding to the number of marker groups (Table S3).

Targeting signal prediction, annotation, and conditional enrichment analysis

The P. papillatum protein database was annotated via BLAST search against the CDS of the parasitic kinetoplastid Trypanosoma brucei 927 (v66) and the free-living Bodo saltans (v66) (https://tritrypdb.org/tritrypdb/app), as well as baker's yeast Saccharomyces cerevisiae (559292) (https://blast.ncbi.nlm.nih.gov/Blast.cgi), with an E-value threshold of 1E-5. Metabolic pathway analysis was also performed via GhostKOALA (Kanehisa, et al. 2016).

SignalP version 6.0 was used for the prediction of signal peptides, using a confidence threshold of >0.9 (Fig. 1C) (Teufel, et al. 2022), with NetGPI 1.1 used on this subset to determine proteins that additionally possessed predicted C-terminal GPI anchors (Gíslason, et al. 2021) (Table S3). TargetP 2.0 was used for prediction of mitochondrial target peptides (Armenteros, et al. 2019), with DeepTMHMM (Hallgren, et al. 2022) used for predictions of TMD (Fig. 1C) (Table S3). Peroxisomal target signal prediction was conducted using a custom regex script designed by Prof. Fred Opperdoes against a broad range of AA combinations, with PTS1 determined by the pattern [SAGCNP][RHKSNQ][LIVFAMY]$, and PTS2 via ^M.[1,10],[RK][LVI].....[HQ][ILA] (Table S3). Hits were then manually inspected for specific enzymes of relevance (Tables S5-7). DeepLoc 2.1 was additionally used to assess protein localization predictions and membranous status (Odum, et al. 2024) (Table S3).

Protein enrichment data for media and conditional cultivation (Škodová-Sveráková, et al. 2021) were mapped onto the dataset, including every protein that displayed enrichment under any condition (Fig. 1D) (Table S3).

Endogenous tagging and P. papillatum microscopy

Cell lines carrying endogenous C-terminal tags on 12 proteins from supervised protein clusters were generated to verify predictions (Table S4).

Proteins DIPPA_11651.mRNA.1, DIPPA_15120.mRNA.1, DIPPA_04811.mRNA.1, DIPPA_32825.mRNA.1 and DIPPA_00315.mRNA.1 were tagged with yellow fluorescent protein, using vector pBA3294 (Akiyoshi, et al. 2025). The PacI and AscI restriction sites of pBA3294 were used to insert two ~2 kb homology arms that were amplified from genomic DNA by PCR using KOD One polymerase (Merck). Primer sequences are provided in Suppl. File 2. The first fragment corresponds to the region downstream of the gene ORF (starting just after its stop codon) flanked by PacI and NotI restriction sites, while the second fragment corresponds to the 2 kb DNA fragment starting 2 kb upstream of the stop codon and ending just before the stop codon, flanked by NotI and AscI. After cutting the fragments with the respective restriction enzymes, the two DNA fragments were ligated into pBA3294 that had been cut with PacI and AscI. Plasmids were validated by nanopore whole-plasmid sequencing (Plasmidsaurus). Tagging constructs were linearized by NotI, transfected into P. papillatum cells by electroporation, and selected by the addition of 75 µg/mL G418.

Cells were pelleted by centrifugation at 1300 x g for 5 min and fixed in 4% formaldehyde solution diluted in PBS for 5 min. Cells were washed twice with 1 mL PBS, resuspended in a small volume of DABCO mounting media (1% w/v 1,4-diazabicyclo[2.2.2]octane, 90% glycerol, 50 mM sodium phosphate pH 8.0) with 100 ng/mL DAPI, and mounted onto glass slides.
Images were captured on an Axioimager.Z2 microscope (Zeiss) installed with ZEN using a Hamamatsu ORCA-Flash4.0 camera with 63x objective lenses (1.40 NA). Typically, 25 z sections spaced 0.24 μm apart were collected.

Proteins DIPPA_07493.mRNA.1, DIPPA_20982.mRNA.1, DIPPA_24150.mRNA.1, DIPPA_16310.mRNA.1, DIPPA_24837.mRNA.1, DIPPA_21158.mRNA.1 and DIPPA_16504.mRNA.1 were tagged with a 3xV5 epitope, using vector pDP011 (GenBank OQ547858) (Faktorová, et al. 2023) (Table S4). A fusion PCR strategy using Q5 High-Fidelity DNA Polymerase (New England Biolabs, M0491S) was used to design and obtain the above DNA constructs, as described previously (Kaur, et al. 2018). Primers used and product sizes are listed in Suppl. File 2. 1-5 µg of gel-purified and ethanol-precipitated DNA constructs were electroporated into P. papillatum cells at 5 × 10^7 cells/ml as described elsewhere (Kaur, et al. 2018; Faktorová, Kaur, et al. 2020). 24 h after electroporation, transfected cells underwent selection in a 24-well plate at 27°C, under increasing concentrations of hygromycin (100-225 µg/mL). After 3 weeks, transfectants were selected and expanded into a volume of 10 ml before downstream analyses.

To address subcellular localization of the tagged proteins, an immunofluorescence assay was performed as described previously (Faktorová, et al. 2023). Briefly, 20 to 30 ml of a log phase culture was harvested by centrifugation at 1,700 x g for 10 min, resuspended in 500 μl of 4% paraformaldehyde (dissolved in sea water), and fixed for 15 min on Superfrost plus slides (Thermo Fisher Sci.) at room temperature. After removing the fixative with 1x PBS, cells were permeabilized in ice-cold methanol for 10 min and rinsed with 1x PBS. From this point on, the slides were kept in a humid chamber.
Next, the slides were blocked in 5.5% (w/v) fetal bovine serum in PBS-T for 45 min at room temperature, and the blocking solution was removed by washing the cells two times with 1x PBS. The slides were incubated with either mouse anti-V5 or rabbit anti-V5 primary antibody diluted (1:500; Thermo Fisher Sci.) in 3% (w/v) bovine serum albumin (Sigma), at 4°C overnight, covered with parafilm. Afterwards, the primary antibody was removed by washing the slides three times with PBS-T and twice with 1x PBS. AlexaFluor555-labelled goat anti-mouse (1:1000; Invitrogen) or AlexaFluor488-labelled goat anti-rabbit (1:1000; Invitrogen) secondary antibody was added and incubated at room temperature for 1 hour in the dark, covered with parafilm. After that, the slides were rinsed three times with PBS-T and twice with 1x PBS. All slides were coated with ProLong Gold Antifade Mountant with DNA Stain DAPI (Life Technol.) and mounted. Samples were imaged with an Olympus BX63 automated fluorescence microscope equipped with an Olympus DP74 digital camera. Pictures were acquired with the cellSens Dimension software (Olympus) and processed through the ImageJ software.
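Two of the computational rules given earlier in this section, the global-median SVM score cutoff and the PTS1/PTS2 sequence patterns, can be sketched as follows. This is an illustrative Python sketch, not the pRoloc code or the custom script used in the study. The function names and the input layout are hypothetical, proteins scoring exactly at the median are kept as predicted here, and the PTS2 quantifier printed in the text as `.[1,10],` is read as `.{1,10}` (1 to 10 residues after the initial methionine), which is an assumption about the intended pattern.

```python
import re
from statistics import median

# PTS1: C-terminal tripeptide, as given in the text.
PTS1 = re.compile(r"[SAGCNP][RHKSNQ][LIVFAMY]$")

# PTS2: N-terminal motif. ASSUMPTION: ".[1,10]," in the text is taken to
# mean ".{1,10}"; the remaining character classes are copied unchanged.
PTS2 = re.compile(r"^M.{1,10}[RK][LVI].....[HQ][ILA]")

def peroxisomal_signal(sequence):
    """Return 'PTS1', 'PTS2', 'PTS1+PTS2' or None for a protein sequence."""
    hits = []
    if PTS1.search(sequence):
        hits.append("PTS1")
    if PTS2.search(sequence):
        hits.append("PTS2")
    return "+".join(hits) or None

def apply_median_cutoff(predictions):
    """predictions maps protein id -> (compartment, svm_score).
    Proteins scoring below the global median are reset to 'unknown'."""
    cutoff = median(score for _, score in predictions.values())
    return {
        pid: (compartment if score >= cutoff else "unknown")
        for pid, (compartment, score) in predictions.items()
    }
```

For instance, a hypothetical sequence ending in `SKL` would be reported as `PTS1`, while one starting `MSRLAAAAAHL` would be reported as `PTS2`. Pattern hits of this kind are only candidates, which is consistent with the manual inspection step described in the text.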

Acknowledgements

We thank A. Zíková (Biology Centre) for the anti-ATP synthase subunit β antibodies. This work is supported by the National Science Foundation BII: Mechanisms of Cellular Evolution DBI-2119963 (to J.W.), the Czech Grant Agency grants 23-06479X and 25-15298S (to J.L.) and a Wellcome Discovery Award 227243/Z/23/Z (to B.A.).

Author contributions

Conceptualization, M.H., J.L. and J.G.W.; Methodology, M.H., D.F., B.A. and J.G.W.; Software, M.H., Y.P. and T.L.; Validation, M.H.; Formal Analysis, M.H.; Investigation, M.H., O.I., D.F., M.S. and B.A.; Data Curation, M.H. and Y.P.; Writing – Original Draft, M.H., O.I., D.F., M.S., B.A., J.L. and J.G.W.; Writing – Review & Editing, M.H., J.L. and J.G.W.; Visualization, M.H.; Supervision, M.H., D.F., J.L. and J.G.W.; Project Administration, M.H. and J.G.W.; Funding Acquisition, M.H., B.A., J.L. and J.G.W.

Declaration of interests

The authors declare no competing interests.

Reference list

Akiyoshi B, Faktorová D, Lukeš J. 2025. Discovery of unique mitotic mechanisms in Paradiplonema papillatum. bioRxiv:2025.2003.2021.644664.
Armenteros J, Salvatore M, Emanuelsson O, Winther O, von Heijne G, Elofsson A, Nielsen H. 2019. Detecting sequence signals in targeting peptides using deep learning. Life Sci Alliance 2.
Benz C, Raas MWD, Tripathi P, Faktorová D, Tromer EC, Akiyoshi B, Lukeš J. 2024. On the possibility of yet a third kinetochore system in the protist phylum Euglenozoa. mBio 15:e02936-02924.
Billington K, Halliday C, Madden R, Dyer P, Barker A, Moreira-Leite F, Carrington M, Vaughan S, Hertz-Fowler C, Dean S, et al. 2023. Genome-wide subcellular protein map for the flagellate parasite Trypanosoma brucei. Nature Microbiology 8:533-547.
Breckels L, Holden S, Wojnar D, Mulvey C, Christoforou A, Groen A, Trotter M, Kohlbacher O, Lilley K, Gatto L. 2016. Learning from heterogeneous data sources: An application in spatial proteomics. PLoS Comput Biol 12.
Bártulos C, Rogers M, Williams T, Gentekaki E, Brinkmann H, Cerff R, Liaud M, Hehl A, Yarlett N, Gruber A, et al. 2018. Mitochondrial glycolysis in a major lineage of Eukaryotes. Genome Biol and Evol 10:2310-2325.
Chou C, Yang R, Chan L, Li C, Sun L, Lee H, Lee P, Sher Y, Ying H, Hung M. 2020. The stabilization of PD-L1 by the endoplasmic reticulum stress protein GRP78 in triple-negative breast cancer. Am J Cancer Res 10:2621-2634.
Crook OM, Breckels LM, Lilley KS, Kirk PDW, Gatto L. 2019. A Bioconductor workflow for the Bayesian analysis of spatial proteomics. F1000Res 8:446.
Faktorová D, Kaur B, Valach M, Graf L, Benz C, Burger G, Lukeš J. 2020. Targeted integration by homologous recombination enables in situ tagging and replacement of genes in the marine microeukaryote Diplonema papillatum. Environ Microbiol 22:3660-3670.
Faktorová D, Nisbet R, Robledo J, Casacuberta E, Sudek L, Allen A, Ares M, Aresté C, Balestreri C, Barbrook A, et al. 2020. Genetic tool development in marine protists: emerging model organisms for experimental cell biology. Nature Methods 17:481-494.
Faktorová D, Záhonová K, Benz C, Dacks J, Field M, Lukeš J. 2023. Functional differentiation of Sec13 paralogues in the euglenozoan protists. Open Biol 13:220364.
Flegontova O, Flegontov P, Malviya S, Audic S, Wincker P, de Vargas C, Bowler C, Lukeš J, Horák A. 2016. Extreme diversity of diplonemid eukaryotes in the ocean. Curr Biol 26:3060-3065.
Freitag J, Stehlik T, Stiebler AC, Bölker M. 2018. The obvious and the hidden: Prediction and function of fungal peroxisomal matrix proteins. Subcell Biochem 89:139-155.
Gawryluk RMR, Del Campo J, Okamoto N, Strassert JFH, Lukeš J, Richards TA, Worden AZ, Santoro AE, Keeling PJ. 2016. Morphological identification and single-cell genomics of marine diplonemids. Curr Biol 26:3053-3059.
Geladaki A, Britovšek N, Breckels L, Smith T, Vennard O, Mulvey C, Crook O, Gatto L, Lilley K. 2019. Combining LOPIT with differential ultracentrifugation for high-resolution spatial proteomics. Nat Commun 10.
George EE, Tashyreva D, Kwong WK, Okamoto N, Horák A, Husnik F, Lukeš J, Keeling PJ. 2022. Gene transfer agents in bacterial endosymbionts of microbial eukaryotes. Genome Biol Evol 14,7.
Gíslason M, Nielsen H, Armenteros J, Johansen A. 2021. Prediction of GPI-anchored proteins with pointer neural networks. Curr Res in Biotech 3:6-13.
Güther M, Urbaniak M, Tavendale A, Prescott A, Ferguson M. 2014. High-confidence glycosome proteome for procyclic form Trypanosoma brucei by epitope-tag organelle enrichment and SILAC proteomics. Journal of Proteome Res 13:2796-2806.
Hallgren J, Tsirigos K, Pederson M, Armenteros J, Marcatili P, Nielsen H, Krogh A, Winther O. 2022. DeepTMHMM predicts alpha and beta transmembrane proteins using deep neural networks. In.
Hammond M, Nenarokova A, Butenko A, Zoltner M, Dobáková E, Field M, Lukeš J. 2020. A uniquely complex mitochondrial proteome from Euglena gracilis. Mol Biol and Evol 37:2173-2191.
Jirsová D, Licknack TJ, Poh Y-P, Qiu Y, Quan N, Chou T-F, Karr T, Lynch M, Wideman JG. 2025. Subcellular proteomics of Paramecium tetraurelia reveals mosaic localization of glycolysis and gluconeogenesis. bioRxiv:2025.2004.2024.650466.
Joseph A, Adhihetty P, Wawrzyniak N, Wohlgemuth S, Picca A, Kujoth G, Prolla T, Leeuwenburgh C. 2013. Dysregulation of mitochondrial quality control processes contribute to sarcopenia in a mouse model of premature aging. PLoS One 8.
Kanehisa M, Sato Y, Morishima K. 2016. BlastKOALA and GhostKOALA: KEGG Tools for functional characterization of genome and metagenome sequences. J Mol Biol 428:726-731.
Kaur B, Valach M, Peña-Diaz P, Moreira S, Keeling P, Burger G, Lukeš J, Faktorová D. 2018. Transformation of Diplonema papillatum, the type species of the highly diverse and abundant marine microeukaryotes Diplonemida (Euglenozoa). Environ Microbiol 20:1030-1040.
Kovárová J, Barrett M. 2016. The pentose phosphate pathway in parasitic trypanosomatids. Trends Parasitol 32:622-634.
Lax G, Okamoto N, Keeling PJ. 2024. Phylogenomic position of eupelagonemids, abundant, and diverse deep-ocean heterotrophs. ISME J 18.
Makiuchi T, Annoura T, Hashimoto M, Hashimoto T, Aoki T, Nara T. 2011. Compartmentalization of a glycolytic enzyme in Diplonema, a non-kinetoplastid Euglenozoan. Protist 162:482-489.
Michels P, Bringaud F, Herman M, Hannaert V. 2006. Metabolic functions of glycosomes in trypanosomatids. Biochim Biophys Acta-Mol Cell Res 1763:1463-1477.
Moloney N, Barylyuk K, Tromer E, Crook O, Breckels L, Lilley K, Waller R, MacGregor P. 2023. Mapping diversity in African trypanosomes using high resolution spatial proteomics. Nat Commun 14.
Morales J, Hashimoto M, Williams T, Hirawake-Mogi H, Makiuchi T, Tsubouchi A, Kaga N, Taka H, Fujimura T, Koike M, et al. 2016. Differential remodelling of peroxisome function underpins the environmental and metabolic adaptability of diplonemids and kinetoplastids. Proc R Soc B-Biolog Sci 283.
Mukherjee I, Salcher MM, Andrei A, Kavagutti VS, Shabarova T, Grujčić V, Haber M, Layoun P, Hodoki Y, Nakano SI, et al. 2020. A freshwater radiation of diplonemids. Environ Microbiol 22:4658-4668.
Newell S. 1981. Fungi and bacteria in or on leaves of Eelgrass (Zostera marina L.) from Chesapeake Bay. Appl Environ Microbiol 41:1219-1224.
Obiol A, Giner CR, Sánchez P, Duarte CM, Acinas SG, Massana R. 2020. A metagenomic assessment of microbial eukaryotic diversity in the global ocean. Mol Ecol Resour 20.
Odum M, Teufel F, Thumuluri V, Armenteros J, Johansen A, Winther O, Nielsen H. 2024. DeepLoc 2.1: multi-label membrane protein type prediction using protein language models. Nucleic Acids Res 52:W215-W220.
Opperdoes FR, Michels PA. 1993. The glycosomes of the Kinetoplastida. Biochimie 75:231-234.
Orsburn B. 2021. Proteome discoverer-A community enhanced data processing suite for protein informatics. Proteomes 9.
Porter D. 1973. Isonema papillatum sp. n., a new colorless marine flagellate: A light- and electronmicroscopic study. J Protozool 20:351-356.
Prokopchuk G, Korytár T, Juricová V, Majstorovic J, Horák A, Šimek K, Lukeš J. 2022. Trophic flexibility of marine diplonemids-switching from osmotrophy to bacterivory. ISME J 16:1409-1419.
Richards T, Eme L, Archibald J, Leonard G, Coelho S, de Mendoza A, Dessimoz C, Dolezal P, Fritz-Laylin L, Gabaldon T, et al. 2024. Reconstructing the last common ancestor of all eukaryotes. PLoS Biol 22.
Tashyreva D, Faktorová D, Horák A, Lukeš J, Archibald J, Oatley G, Sinclair E, Santos C, Paulini M, Aunin E, et al. 2025. The genome sequences of the diplonemid protist Rhynchopus euleeides YPF1915 and its bacterial endosymbiont Candidatus Syngnamydia salmonis (Chlamydiota). Wellcome Open Res 10.
Tashyreva D, Faktorová D, Stříbrná E, Horák A, Lukeš J, Archibald JM, Oatley G, Sinclair E, Aunin E, Gettle N, et al. 2025. The genome sequences of the diplonemid protist Diplonema japonicum YFP1604 and its bacterial endosymbiont Ca. Cytomitobacter primus and Ca. Nesciobacter abundans. 10.
Tashyreva D, Simpson A, Prokopchuk G, Škodová-Sveráková I, Butenko A, Hammond M, George E, Flegontova O, Záhonová K, Faktorová D, et al. 2022. Diplonemids-a review on "new" flagellates on the oceanic block. Protist 173:125868.
Tashyreva D, Týč J, Horák A, Lukeš J. 2023. Ultrastructure and 3D reconstruction of a diplonemid protist (Diplonemea) and its novel membranous organelle. mBio 14:e01921-01923.
Tashyreva D, Votýpka J, Yabuki A, Horák A, Lukeš J. 2025. Description of new diplonemids (Diplonemea, Euglenozoa) and their endosymbionts: Charting the morphological diversity of these poorly known heterotrophic flagellates. Protist 177.
Teufel F, Armenteros J, Johansen A, Gíslason M, Pihl S, Tsirigos K, Winther O, Brunak S, von Heijne G, Nielsen H. 2022. SignalP 6.0 predicts all five types of signal peptides using protein language models. Nat Biotechnol 40:1023-1025.
Valach M, Benz C, Aguilar L, Gahura O, Faktorová D, Zíková A, Oeffinger M, Burger G, Gray M, Lukeš J. 2023. Miniature RNAs are embedded in an exceptionally protein-rich mitoribosome via an elaborate assembly pathway. Nucleic Acids Res 51:6443-6460.
Valach M, Léveillé-Kunst A, Gray MW, Burger G. 2018. Respiratory chain Complex I of unparalleled divergence in diplonemids. J Biol Chem 293:16043-16056.
Valach M, Moreira S, Petitjean C, Benz C, Butenko A, Flegontova O, Nenarokova A, Prokopchuk G, Batstone T, Lapébie P, et al. 2023. Recent expansion of metabolic versatility in Diplonema papillatum, the model species of a highly speciose group of marine eukaryotes. BMC Biol 21.
Záhonová K, Lukeš J, Dacks JB. 2025. Diplonemid protists possess exotic endomembrane machinery, impacting models of membrane trafficking in modern and ancient eukaryotes. Curr Biol 35:1508-1520.e1502.
Škodová-Sveráková I, Záhonová K, Juricová V, Danchenko M, Moos M, Baráth P, Prokopchuk G, Butenko A, Lukáčová V, Kohútová L, et al. 2021. Highly flexible metabolism of the marine euglenozoan protist Diplonema papillatum. BMC Biol 19:251.
Šubrtová K, Panicucci B, Zíková A. 2015. ATPaseTb2, a unique membrane-bound FoF1-ATPase component, is essential in bloodstream and dyskinetoplastic trypanosomes. PLoS Pathog 11:e1004660.

Figure Legends

Fig. 1: Clustered protein predictions of Paradiplonema papillatum align with predicted protein features and clarify conditional enrichment trends. (A) Neighbor-average imputed t-SNE of the dataset showing clustered predictions for 2,797 proteins across 22 cell compartments. Predictions were generated via support vector modelling conducted on fractional profiles of marker proteins, applied to the remaining dataset. (B) Selected fractional abundances of marker proteins across one replicate of this experiment, representing distinct profiles that facilitate predictive clustering (SUP, Supernatant). (C) Software prediction for protein features of signal peptides, transmembrane domains and mitochondrial target peptides across the dataset, demonstrating accumulation across certain defined compartments. (D) Proteins determined to be enriched in varying nutrient media (Diplo or Hemi) or cultivation conditions (aerobic or anaerobic) from a conditional study of P. papillatum (Škodová-Sveráková et al. 2021). Additional information for all proteins is available in Tables S1 and S3.

Fig. 2: Endogenous tagging of novel proteins confirms supervised cluster predictions. Tagged proteins are highlighted (black) among relevant predicted clusters, resolved on neighbor-averaged imputed t-SNE. Individual cell lines were generated via endogenous tagging and imaged through fluorescence microscopy for comparison with the compartment to which the relevant protein was predicted. Merged microscopy images show protein signal (green) together with nuclear and mitochondrial DNA (blue). All imaged cells are oriented with their apical regions facing right and posterior regions facing left. Cell membrane outlines are traced for all images except L, which shows only a trace of the papilla, which lacks signal. Scale bar represents 5 µm. Proteins A, E, G and J are resolved in zero and neighbor-averaged imputed t-SNE in Suppl. Fig. 3, for which separate channels of each cell line are also shown. Further information on cell lines is available in Table S4.

Fig. 3: Secreted Carbohydrate Active Enzymes (CAZymes) primarily localized on cell membrane and lysosomes. Distribution of signal peptide-enriched CAZymes, which are predicted with high confidence on neighbor-average imputed t-SNE, corresponding to highlighted cluster predictions of the cell membrane (A), endocytic membrane trafficking (B), lysosome (C) and cytosol (D). Proteins of the cell membrane (A) have schematic representations showing software predictions for signal peptides, transmembrane domains (TMD) and/or GPI attachment sites, which demonstrate extracellular exposure of CAZyme domains in accordance with conventional membrane topology. Bordered outlines indicate separate enzymatic reactions for CAZymes, with carbohydrate substrates and products in black. Further information on CAZymes of P. papillatum is available in Table S5.

Fig. 4: Metabolic reconstruction of glycolysis/gluconeogenesis and pentose phosphate pathway demonstrates altered glucose metabolism in P. papillatum. Localization of relevant enzymes across glycolysis/gluconeogenesis (A) and pentose phosphate pathway (B), resolved on neighbor-average imputed t-SNE (C) with relevant localization clusters highlighted. Peroxisomal target sequences (PTS), mitochondrial target peptides (mTP) and transmembrane domains (TMDs) are indicated. Proteins previously localized via anti-sera immunolocalizations are indicated with *; metabolite shunts between the two pathways are indicated with dotted arrows. Split coloring of proteins represents their manual designations to the cytosol (24,25,38) or indicates the possibility of dual localization between the cytosol and glycosomes (1,2,5,9,12,20), based on inspection of fractionation profiles (Suppl. Fig. 6) and targeting signals. Protein numbers highlighted in white represent those only resolved on zero and neighbor-average imputed t-SNE (Suppl. Fig. 5). Further information is available in Table S6.

Supplementary Figures, Files and Tables

Suppl. Fig. 1: Immunoblot analysis used to resolve fractional distribution across triplicate samples. 10 µg of protein was loaded for each fraction generated via differential centrifugation, in addition to the initial cell lysate (CL). ATP synthase-β antibody was used at a 1:10,000 ratio (A), Grp75 antibody at 1:1,000 (B) and Grp78 (C), which displays non-specific signal. Unlysed cells (UC), Supernatant (S). Marker band molecular weights (kDa) are indicated in dark grey on the leftmost lane of blots.

Suppl. Fig. 2: Neighbor-averaged and zero imputed t-SNE of clustered protein predictions, protein features and conditional enrichment of dataset. (A) Full dataset showing clustered predictions for 4,780 proteins across 22 cell compartments. Predictions were generated via support vector modelling conducted on fractional profiles of marker proteins, applied to the remaining dataset. (B) Selected fractional abundances of marker proteins across one replicate of this experiment, representing distinct profiles that facilitate predictive clustering (SUP, Supernatant). (C) Software prediction for protein features of signal peptides, transmembrane domains and mitochondrial target peptides across the dataset, demonstrating accumulation across certain defined compartments. (D) Proteins determined to be enriched in varying nutrient media (Diplo or Hemi) or cultivation conditions (aerobic or anaerobic) from a conditional study of P. papillatum (Škodová-Sveráková et al. 2021). Additional information for all proteins is available in Tables S1 and S3.

Suppl. Fig. 3: Neighbor-averaged and zero imputed t-SNE of endogenous tagged cell lines. Tagged proteins are highlighted (black) among relevant predicted clusters, resolved on neighbor-averaged and zero imputed t-SNE. Individual cell lines were generated via endogenous tagging and imaged through fluorescence microscopy for comparison with the compartment to which the relevant protein was predicted. In descending order, panels depict phase contrast, epitope signal (green), and nuclear and mitochondrial DNA (blue), with merges below additionally displaying cell membrane outline traces for all images except L, which shows only a trace of the papilla, which lacks epitope signal. All imaged cells are oriented with their apical regions facing right and posterior regions facing left. Scale bar represents 5 µm. Further information on cell lines is available in Table S4.

Suppl. Fig. 4: Cell membrane cluster shows enrichment of proteins possessing both predicted signal peptides (SP) and glycosylphosphatidylinositol (GPI) anchors. t-SNE imputed via neighbor-averaging (A) as well as zeroed dataset (B). Signal peptides were predicted via SignalP 6.0 with a confidence threshold greater than 0.9, in tandem with NetGPI 1.1 used for GPI predictions. Further information is available in Table S3.

Suppl. Fig. 5: Metabolic reconstruction of glycolysis/gluconeogenesis and pentose phosphate pathway on neighbor-averaged and zero imputed t-SNE. Localization of relevant enzymes across glycolysis/gluconeogenesis (A) and pentose phosphate pathway (B), resolved on neighbor-average and zero imputed t-SNE (C) with relevant localization clusters highlighted. Peroxisomal target sequences (PTS), mitochondrial target peptides (mTP) and transmembrane domains (TMDs) are indicated. Proteins previously localized via anti-sera immunolocalizations are indicated with *; metabolite shunts between the two pathways are indicated with dotted arrows. Split coloring of proteins represents their manual designations to the cytosol (24,25,38) or indicates the possibility of dual localization between the cytosol and glycosomes (1,2,5,9,12,20), based on inspection of fractionation profiles (Suppl. Fig. 6) and targeting signals. Further information is available in Table S6.

Suppl. Fig. 6: Fractional and schematic analysis of specific enzymes mediating carbohydrate metabolism. (A) Schematic depiction of DIPPA_21987, phosphofructokinase 1, showing phosphofructokinase (PFK) domains, transmembrane domain (TMD) and Peroxisomal Target Signal, along with fractional analysis. (B) Fractional profiles of relevant enzymes across glycolysis/gluconeogenesis compared to marker proteins of the cytosol, glycosomes and endocytic membrane trafficking.

Suppl. Fig. 7: Metabolic reconstruction of Amino Acid (AA) breakdown for incorporation in the TCA cycle, localized across cell compartments. AAs and metabolites of the TCA cycle are indicated in bold. Propanoate metabolism, which involves intermediates of certain AA digestion, is also depicted. Split coloring indicates manual annotation for specific enzymes based on certain target peptides or candidate function, on top, versus contrasting predictions below (e.g. Enzyme 2, proline dehydrogenase, which we designate to the mitochondrion despite low-confidence predictions to the nucleus). Further information is available in Table S7.

Suppl. File 1: Tables S1-7.

Suppl. File 2: Primer sequences used for endogenous tagging of P. papillatum.
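
The selection criterion stated in the Suppl. Fig. 4 legend (a SignalP 6.0 signal peptide confidence greater than 0.9, combined with a NetGPI 1.1 GPI-anchor prediction) can be illustrated with a minimal sketch. It assumes that the outputs of both tools have already been parsed into simple records; the field names (`sp_probability`, `gpi_anchored`), the function name and the example protein identifiers are hypothetical and are not taken from the paper or from either tool's output format.

```python
from dataclasses import dataclass

# Confidence cut-off stated in the Suppl. Fig. 4 legend.
SP_THRESHOLD = 0.9


@dataclass(frozen=True)
class Prediction:
    protein_id: str        # hypothetical identifier, for illustration only
    sp_probability: float  # assumed: parsed signal peptide probability
    gpi_anchored: bool     # assumed: parsed GPI-anchor call


def select_sp_and_gpi(predictions, threshold=SP_THRESHOLD):
    """Return IDs of proteins with a signal peptide probability strictly
    above the threshold and a predicted GPI anchor."""
    return [
        p.protein_id
        for p in predictions
        if p.sp_probability > threshold and p.gpi_anchored
    ]


# Made-up example records (not data from the paper).
example = [
    Prediction("protein_a", 0.97, True),   # passes both criteria
    Prediction("protein_b", 0.95, False),  # no GPI anchor predicted
    Prediction("protein_c", 0.62, True),   # signal peptide below threshold
    Prediction("protein_d", 0.90, True),   # exactly 0.9 is not "greater than"
]
selected = select_sp_and_gpi(example)
```

The comparison is strict because the legend specifies a threshold "greater than 0.9"; whether the authors treated boundary values differently is not stated.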

License: CC-BY-4.0