Genetic architecture of the tomato fruit lipidome; new insights link lipid and volatile compounds

preprint OA: closed CC-BY-NC-ND-4.0

Abstract

Tomato (Solanum lycopersicum L.) fruit flavor is determined by a combination of multiple volatile compounds, including several derived from lipids and fatty acids. Although fruit flavor has been intensively studied, the linkage between lipid metabolism and flavor remains largely undefined. Here, we performed a genome-wide association study (GWAS) and QTL mapping for the fruit lipid content from 550 tomato accessions and 107 backcross inbred lines (BILs) in two consecutive seasons. Over 130 lipid compounds were identified and mapped, allowing for the identification of over 600 metabolic QTL (mQTL). We further described and validated candidate genes associated with lipid content. Among them is a lipase-like protein (TomLLP) whose function was validated in vivo using overexpression lines in tomato and knockout mutants in Arabidopsis. We also identified functions for three enzymes: a class III lipase (Sl-LIP8), a cyclopropane-fatty-acyl-phospholipid synthase (CFAPS1), and Lipoxygenase C (TomLoxC). By utilizing knockout lines for CFAPS1 and CRISPR-Cas9 loss-of-function lines for Sl-LIP8 and TomLoxC, we demonstrated the functional importance of these enzymes in fruit lipid metabolism. Our study provides a comprehensive analysis of the tomato fruit lipidome and insights into key genes that shaped the natural variation in tomato lipid content and their links to flavor-associated volatile compounds.

Introduction

Tomato (Solanum lycopersicum L.) is one of the most economically important crops in the world (Knapp et al., 2004), serving as an important source of micronutrients for the human diet. In recent years, great progress has been made in understanding a wide range of metabolic traits associated with fruit compositional quality (Klee & Tieman, 2018). Lipids in plants have essential structural roles within cells as they are major constituents of membranes and constitute much of the cuticle layer that protects plant outer surfaces (Yeats et al., 2012; Fernandez-Moreno et al., 2017; García-Coronado et al., 2022). In addition, lipids act as signaling molecules. One of the main defense hormones, jasmonic acid, is derived from linolenic acid (Vick & Zimmerman, 1984), and fatty acids have been shown to directly induce the expression of defense-related R genes (Chandra-Shekara et al., 2007). Moreover, lipids serve as precursors for many compounds that contribute to flavor, which regulates attraction and repulsion of herbivores and, ultimately, seed distribution (Wang et al., 2001; Chen et al., 2004; Tieman et al., 2012; Garbowicz et al., 2018). In addition, so-called fatty-acid-derived volatile organic compounds (FA-VOCs) are important contributors to human liking of food crops (Schwab et al., 2008; Tieman et al., 2017; Cortina et al., 2018). In tomato, several FA-VOCs, including multiple C5, C6, C7, C8, and C10 volatiles, are significantly linked with overall liking and flavor intensity (Chen et al., 2004; J. Zhang et al., 2015; Tieman et al., 2017). Even within the S. lycopersicum species there is tremendous variation in FA-VOC content; levels of these chemicals within ripe fruit can vary by several orders of magnitude (Tieman et al., 2017). Knowledge of how these volatiles are synthesized and how the pathways are regulated is important for the development of varieties with superior flavor that do not compromise plant defense.

bioRxiv preprint doi: https://doi.org/10.1101/2024.07.08.602461; this version posted July 8, 2024. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Genetic mapping has been used to characterize and clone a large number of qualitative and quantitative traits in tomatoes, including pathogen resistance (Martin et al., 1993), fruit ripening (Manning et al., 2006), β-carotene formation (Ronen et al., 2000), and fruit morphology and size (Frary et al., 2000; J. Liu et al., 2002; Xiao et al., 2008; Rodríguez-Leal et al., 2017). Genetic mapping of metabolite abundances enables the identification of metabolite QTL and potentially provides insights into the complex mechanisms underlying the regulation of metabolic pathways (Chen et al., 2004; Grandillo et al., 2007; H. Li et al., 2013; Wu et al., 2016, 2018; Alseekh et al., 2015; Matsuda et al., 2015; Luo, 2015; Garbowicz et al., 2018; Luzarowska et al., 2020; Brouckaert et al., 2023). Advances in metabolomic profiling coupled with an increase in genomic resources have enabled the identification of numerous mQTL in tomato, for both primary (Causse, 2004; Overy, 2004; Schauer et al., 2006, 2008; Toubiana et al., 2012, 2015) and secondary metabolites (Rousseaux et al., 2005; Tieman et al., 2006; Minutolo et al., 2013; Rambla et al., 2013, 2016; Alseekh et al., 2015, 2017; Schilmiller et al., 2015; Szymański et al., 2020). However, given the relatively low resolution reached using this approach, cloning of the causal genes can be challenging. Genome-wide association studies (GWAS) provide better QTL resolution (Mitchell-Olds, 2010; Korte & Farlow, 2013).
The successful identification of complex traits in various crop species has been achieved by combining linkage QTL mapping using biparental populations and GWAS, which helps overcome the limitations of each approach (Wen et al., 2016; Wu et al., 2016; Garbowicz et al., 2018). In tomato, several GWAS have been conducted, leading to, among other insights, an understanding of the history of tomato breeding and domestication. Examples include the 100-fold increased size of the modern tomato relative to its ancestor (Lin et al., 2014) as well as identification of QTL controlling morphological traits (Shirasawa et al., 2013), volatile compounds (J. Zhang et al., 2015; Tieman et al., 2017), and fruit metabolites (Sauvage et al., 2014; Zhu et al., 2018; X. Li et al., 2020). Nevertheless, these approaches have not yet been thoroughly applied to natural variation in the tomato fruit lipidome.

Here, we describe large-scale lipid profiling of fruit pericarp tissue extracts of a GWAS panel as well as a S. neorickii backcross inbred line (BIL) population. We identified 436 and 175 mQTL using GWAS and linkage mapping, respectively. We identified 384 candidate genes associated with lipid content in fruit. To provide deeper insights into lipid metabolism in tomato fruit and the relationship to volatile compounds, we selected five genes for characterization at the molecular level, namely: acetyl-coenzyme A synthetase (Solyc06g008920), Sl-LIP8 (Solyc09g091050) encoding a class III lipase, CFAPS1 (Solyc09g090510) encoding a putative cyclopropane-fatty-acyl-phospholipid synthase, a lipase-like protein (TomLLP, Solyc03g119980), and TomLoxC (Solyc01g006540), encoding lipoxygenase C.
To do so, we created and analyzed CRISPR-Cas9 knockout lines for CFAPS1, Sl-LIP8, and TomLoxC, and overexpression lines of TomLLP. In addition, we analyzed mutant lines of the TomLLP Arabidopsis orthologue (CSE, At1g52760). We further performed a correlation-based network analysis of lipids, volatiles, and RNA-Seq expression data across 340 tomato accessions. We were able to identify an additional 85 potential lipid-related genes and provide new insights regarding tomato fruit metabolism.

Results

A genome-wide lipidomic profile for tomato fruit

To evaluate the genetic underpinnings of the tomato fruit lipidome, we performed family-based QTL mapping using 107 BILs in addition to GWAS on 550 tomato accessions grown in two harvest seasons (see Supplemental Fig. S1 and Supplemental Data Sets S1-6). Using high-throughput UHPLC-MS, we were able to detect and quantify 138 lipid compounds (Fig. 1A, Supplemental Data Set S6), classified into six major classes: seven diacylglycerols (DAGs), 19 digalactosyldiacylglycerols (DGDGs), 15 monogalactosyldiacylglycerols (MGDGs), 24 phosphatidylcholines (PCs), 14 phosphatidylethanolamines (PEs), and 59 triacylglycerols (TAGs). We also quantified 2179 distinct lipophilic compounds from two populations (see Supplemental Data Sets S1-4). For the GWAS panel, a principal component analysis (PCA) of all annotated lipid compounds revealed two main groups, consistent with the evolutionary and domestication relationship: tomato wild species including S. pimpinellifolium, and domesticated red-fruited accessions, which largely overlapped with S. lycopersicum var. cerasiforme (Fig. 1B). The abundance of different lipid classes was variable across the three groups (Fig. 1C, Supplemental Fig. S1). The wild tomato species had higher lipid levels compared to cultivated varieties, including glycero-, galacto-, and phospholipids. Moreover, most detected non-annotated lipophilic compounds were more abundant in older tomato varieties compared to domesticated ones (Fig. 1, B-C).
In addition to exploring the lipidome profiles in the GWAS panel, we quantified 83 lipids and 826 distinct lipophilic compounds in fruits from a 107-member S. neorickii BIL population (Supplemental Data Sets S3 and S4). As expected, the lipid profiles exhibited a large variance across the S. neorickii lines (Supplemental Fig. S4).

Genetic foundation of the tomato fruit lipidome

In order to uncover the genetic components of lipid abundances in fruit, GWAS was conducted in two consecutive seasons. We mapped the abundance of 134 annotated and 675 unidentified lipophilic compounds using 1.8 million SNPs (Tieman et al., 2017). In addition, we used 16,526 SNPs generated by genotyping-by-sequencing (GBS) analysis (Zemach et al., 2023) on the GWAS population. We performed QTL mapping on 83 annotated and 826 unidentified lipophilic compounds from BILs using the 10K SolCAP single nucleotide polymorphism chip for linkage mapping.

The GWAS identified 436 significant mQTL (p ≤ 1.0E-04; Supplemental Data Set S6). Linkage mapping using the S. neorickii BILs identified an additional 175 significant mQTL in homozygous and heterozygous lines (p ≤ 0.05; Supplemental Data Set S6). Visualizing the distribution of the mQTL in both populations (GWAS and BILs) pointed to a few hotspots for the regulation of a large number of lipids and lipophilic compounds in the tomato genome (Fig. 2; Supplemental Fig. S5), particularly on chromosomes 1, 2, 3, and 6; these represent 12.8%, 15.5%, 11.4%, and 17.3% of the detected loci, respectively (Fig. 2, Supplemental Data Set S6).
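The per-chromosome shares quoted above amount to a simple tally of detected loci. The following is a minimal sketch of that arithmetic, not the authors' pipeline: the function names, the toy chromosome labels, and the 20% hotspot cutoff are all assumptions made for illustration.

```python
from collections import Counter


def chromosome_shares(mqtl_chromosomes):
    """Return {chromosome: percent of all mQTL}, rounded to one decimal."""
    counts = Counter(mqtl_chromosomes)
    total = sum(counts.values())
    return {chrom: round(100 * n / total, 1) for chrom, n in counts.items()}


def hotspots(shares, min_percent):
    """Chromosomes whose share meets a (hypothetical) cutoff, largest first."""
    selected = [c for c, pct in shares.items() if pct >= min_percent]
    return sorted(selected, key=lambda c: -shares[c])


# Toy input: one chromosome label per detected mQTL (made-up values,
# not the loci reported in Supplemental Data Set S6).
toy_mqtl = [1] * 3 + [2] * 4 + [3] * 3 + [6] * 5 + [9] * 2 + [4] * 3
toy_shares = chromosome_shares(toy_mqtl)
toy_hotspots = hotspots(toy_shares, min_percent=20.0)
```

With real data, the input would be the chromosome assigned to each significant mQTL after applying the study's thresholds (p ≤ 1.0E-04 for GWAS, p ≤ 0.05 for the BILs).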
The vast majority of detected mQTL are for non-annotated lipophilic compounds, followed by glycero- and galactolipids. Taken together, we were able to identify 34 conserved lipid mQTL across the tomato genome by combining both GWAS and linkage mapping approaches.

Key genes controlling lipid metabolism in tomato fruit

Having identified mQTL associated with lipid composition in fruit, we investigated several potentially causative genes. Among the identified loci, 384 candidate genes were predicted, based on sequence homology to genes that participate in fatty acid and lipid metabolism (Supplemental Data Set S6). For example, phospholipase D (Solyc04g082000) is a potential causal gene for the observed changes in the level of phospho-, glycero- and galactolipids in the QTL on chromosome 4 detected by GWAS in two consecutive seasons. Furthermore, an mQTL located on chromosome 3, identified through both GWAS and BIL mapping, contains a gene annotated as a class III TAG lipase, Solyc03g123750 (SlLIP2). Another gene, Solyc06g008920, mapped only in the BIL population (Supplemental Fig. S6), affecting a wide range of long-chain saturated and unsaturated fatty acids, encodes an acyl-CoA synthetase/AMP acid ligase II.

An mQTL detected on chromosome 9 (Fig. 3A) contains an interval of about 50 genes, including Sl-LIP8 (Solyc09g091050), which regulates the biosynthesis of short-chain FA-VOCs by cleaving 18:2 and 18:3 acyl groups from glycerolipids (X. Li et al., 2020). Next, based on the lead SNP, accessions representing the allelic variation were identified for each of the mapped lipids (Fig. 3B). The level of these lipids was plotted according to their domestication status: from S. pimpinellifolium, considered to be the ancestor of the modern tomato, S. lycopersicum var. cerasiforme, to the domesticated tomato S. lycopersicum, as well as other wild tomato species (Fig. 3C).
Of note, the levels of lipids, particularly phosphatidylethanolamine, in S. pimpinellifolium differ from the cultivated tomato group. Expression analysis using previously reported RNA-Seq data from fruits of the entire GWAS population (Zhu et al., 2018) showed that Sl-LIP8 is more highly expressed in wild species than in S. lycopersicum var. cerasiforme and cultivated tomato (Fig. 3D).

In order to investigate the role of Sl-LIP8 in the tomato fruit metabolome, we generated Sl-LIP8 CRISPR-Cas9 knockout (KO) lines in the Fla. 8059 background and characterized their metabolites. LC-MS analysis of lipids in fully ripe fruit from KO lines and Fla. 8059 demonstrated notable alterations across various lipid classes (Supplemental Data Set S7). The most substantial changes were observed in the profiles of glycerolipids and phospholipids. Specifically, major differences were observed for polyunsaturated glycerolipids (Fig. 3E), confirming the influence of Sl-LIP8 on lipid metabolism in tomato fruit.

We also detected another prominent mQTL located on chromosome 9 with a significant impact on the levels of phospho- and glycerolipids using both the GWAS mapping (Fig. 4A) and whole genome sequencing (WGS) SNPs (Fig. 4B). The mQTL harbors two genes encoding putative cyclopropane-fatty-acyl-phospholipid synthases (Solyc09g090500 and Solyc09g090510). One of these genes, Solyc09g090510 (CFAPS1), is expressed in developing fruits (https://bar.utoronto.ca/eplant_tomato/).
Accessions responsible for the allelic variation for each of the mapped lipids were identified based on the GBS lead SNP (Fig. 4C) and WGS (Fig. 4D). Lipid levels were plotted along the tomato domestication track, spanning from S. pimpinellifolium to S. lycopersicum var. cerasiforme, and wild tomato species. Notably, the lipid levels, particularly TAG 54:3 and TAG 52:1, across S. pimpinellifolium and different wild tomato accessions differ from those found in the cultivated tomato group (Fig. 4E). RNA-Seq data obtained from fruits of the previously characterized GWAS population (Zhu et al., 2018) showed that CFAPS1 mRNA was more abundant in cultivated tomatoes (Fig. 4F).

We created CFAPS1 CRISPR-Cas9 KO lines and identified mutants with a deletion of 166 bp and an insertion of 19 bp in the promoter region and coding sequence (exon 1) (Supplemental Fig. S7), which resulted in a premature translation stop at the beginning of the protein. LC-MS analysis of lipidomic profiles in fully ripened fruit from KO and control (Fla. 8059) lines showed significant differences in the contents of various galactolipids (Fig. 4G and Supplemental Data Set S8). In addition, we noticed significant changes in the levels of short-chain (C5, C6) FA-VOCs (Fig. 4H) as well as longer-chain (C7, C8) FA-VOCs (Fig. 4H and Supplemental Data Set S8).

In addition to the above examples, both GWAS and linkage mapping identified a major mQTL influencing the levels of phospholipids and galactolipids at the end of chromosome 3 (Fig. 5, A-E).
Linkage disequilibrium analysis of the GWAS panel and recombination breakpoints in the S. neorickii BIL population narrowed the mQTL region to 0.33 Mb. This region contains 40 genes, including TomLLP (Solyc03g119980), annotated as a lipase-like protein. The population was divided into two haplotypes based on the lead SNP. These two haplotypes exhibited significantly different TAG 50:3 content (Fig. 5C). The TomLLP orthologue in Arabidopsis thaliana, CSE (At1g52760), encodes a caffeoyl-shikimate esterase that has been demonstrated to be involved in lignin biosynthesis (Vanholme et al., 2013). CSE has a dual enzymatic activity as both a monoacylglycerol acyltransferase and an acyl hydrolase (W. Gao et al., 2010; Vijayaraj et al., 2012). TomLLP exhibits a strong phylogenetic relationship with caffeoyl shikimate esterase from other plant species (Supplemental Fig. S8 and Supplemental Data Set S9). Next, we selected five S. neorickii BILs covering the QTL interval and measured transcript levels in the same fruit materials used to perform the lipid analysis. We detected higher expression of TomLLP in BILs harboring the S. neorickii allele compared to the BILs harboring the cultivated allele from the cv. TA209 (Supplemental Fig. S9). To validate our finding, we generated a TomLLP overexpression (OE) line in the M82 tomato background carrying the S. neorickii allele driven by the figwort mosaic virus 35S promoter (Fig. 5F and Supplemental Data Set S10). Lipid profiling of red ripe fruits from T2 plants showed significant differences in lipid contents between the overexpression line and wild type (Fig. 5G, Supplemental Data Set S11), indicating a role for TomLLP in lipid metabolism and supporting its role as the causative gene associated with the QTL.
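The haplotype comparison described above (splitting the population by the lead SNP and comparing TAG 50:3 content between the two groups) can be sketched as follows. The text does not state which statistical test was used, so Welch's t statistic is an assumption here, as are the function names and the toy values.

```python
from math import sqrt
from statistics import mean, variance


def split_by_allele(genotypes, values):
    """Group trait values by the allele each accession carries at the lead SNP."""
    groups = {}
    for allele, value in zip(genotypes, values):
        groups.setdefault(allele, []).append(value)
    return groups


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Toy data: lead-SNP alleles and a lipid level per accession (made-up numbers).
toy_genotypes = ["A", "A", "A", "G", "G", "G"]
toy_levels = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

toy_groups = split_by_allele(toy_genotypes, toy_levels)
toy_t = welch_t(toy_groups["A"], toy_groups["G"])
```

A p-value would additionally require the Welch-Satterthwaite degrees of freedom and a t distribution, which are left out of this sketch.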
Furthermore, lipid profiling of leaves from three loss-of-function Arabidopsis CSE (At1g52760) lines demonstrated alterations in the level of multiple lipids belonging to six lipid classes (Fig. 6, A-B, Supplemental Data Set S12). Taken together, the results support a role for TomLLP in tomato fruit lipid metabolism.

Lipoxygenase, a key player affecting tomato fruit lipidome

A robust association between lipids and the locus harboring TomLoxC (Solyc01g006540) was identified by GWAS (Fig. 7A). Previously, TomLoxC was shown to be involved in FA-VOC production (L. Gao et al., 2019). In order to identify expression differences across the GWAS population, we analyzed previously published TomLoxC expression data from 340 accessions (Zhu et al., 2018) and performed eGWAS using 1.8 million SNPs (Tieman et al., 2017). A cis-eQTL was detected for TomLoxC (Fig. 7B), indicating that this gene likely underlies the variation observed in the lipid levels mapped to the TomLoxC locus. Accessions responsible for driving the allelic variation for each of the mapped lipids were identified based on the GBS lead SNP (Fig. 7C), and variation in the TomLoxC transcript level was mapped based on the WGS lead SNP (Fig. 7D). When plotting the median expression level and lipid level associated with the lead SNP, we observed that low gene expression coincides with a high level of lipids (Fig. 7, E-F). Interestingly, the levels of some of the lipids mapped to the TomLoxC locus were higher in S. pimpinellifolium than in cultivated tomatoes (Fig. 7E).
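An eQTL is usually called cis when the associated SNP lies close to the gene whose expression it explains. A minimal sketch of that check is given below; the 1 Mb window, the function name, and all coordinates are hypothetical and are not taken from the study.

```python
def is_cis_eqtl(gene, snp, window_bp=1_000_000):
    """True when the SNP is on the gene's chromosome and within window_bp of it."""
    if gene["chrom"] != snp["chrom"]:
        return False
    return gene["start"] - window_bp <= snp["pos"] <= gene["end"] + window_bp


# Made-up coordinates for illustration only (not the real TomLoxC position).
toy_gene = {"chrom": 1, "start": 5_000_000, "end": 5_010_000}
nearby_snp = {"chrom": 1, "pos": 5_200_000}
distant_snp = {"chrom": 1, "pos": 40_000_000}
other_chrom_snp = {"chrom": 9, "pos": 5_005_000}
```

The window size is a design choice; in practice it would be set from the linkage disequilibrium decay of the panel rather than fixed in advance.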
These results are consistent with the expression of the mapping panel.

To provide further insight into the role of TomLoxC in lipid metabolism and biologically validate our results, we generated TomLoxC CRISPR-Cas9 KO lines in the M82 background. LC-MS lipidomic profiling of red ripe fruit from KO lines and M82 revealed significant changes in a wide range of lipid classes (Fig. 7, G-H, Supplemental Data Set S13). The largest changes were observed in galactolipids followed by phospho- and glycerolipids. Specifically, large differences were observed for polyunsaturated galactolipids (e.g. DGDG 32:2, DGDG 32:3, DGDG 34:3, DGDG 36:6, MGDG 34:2, MGDG 34:5, MGDG 36:5) (Fig. 7H). These observations provide further support for an important role for TomLoxC in fruit lipid abundance.

Transcript-metabolite correlation-based network

In recent years, large-scale gene expression and metabolite data sets have been published for tomato fruit. We used the RNA-Seq data for 340 tomato accessions (Zhu et al., 2018), volatile profiling for 398 tomato accessions (Tieman et al., 2017), and the lipidome data on the GWAS panel obtained in both seasons generated in this study. We constructed a transcript-metabolite correlation-based network using these data (Fig. 8). The result showed that the large majority of the overall correlations were positive (97%) (Supplemental Data Set S14). Negative correlations were observed between phospholipids, glycerolipids, and genes involved in lipid degradation processes and fatty acid oxidation. FA-VOCs were correlated with genes taking part in lipid biosynthesis, signaling, and degradation processes. Genes with lipoxygenase activity showed a connection with different classes of polyunsaturated lipids. For example, TomLoxC, involved in FA-VOC biosynthesis (Chen et al., 2004; Tieman et al., 2017; Klee & Tieman, 2018), was positively correlated with phospholipids PC 36:2 (r=0.32) and PE 36:5 (r=0.33). Of interest, TomLLP showed a negative correlation with the FA-VOC hexyl alcohol (r=-0.31). Finally, four genes (Solyc02g071700, Solyc02g077430, Solyc06g053900, and Solyc08g063090) showed positive correlations with phospho-, galacto-, and glycerolipids, and these genes fell within the mQTL intervals we identified as candidate genes in this study (Supplemental Data Set S6).
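A correlation-based network of this kind can be sketched as pairwise Pearson correlations between transcript and metabolite profiles, keeping only pairs above a cutoff as edges. The sketch below is illustrative rather than the authors' method: it assumes all profiles are already aligned to the same accessions, and the |r| ≥ 0.3 cutoff, the names, and the toy values are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def build_edges(transcripts, metabolites, min_abs_r=0.3):
    """Return (gene, metabolite, r) for every pair with |r| >= min_abs_r."""
    edges = []
    for gene, expression in transcripts.items():
        for metabolite, level in metabolites.items():
            r = pearson(expression, level)
            if abs(r) >= min_abs_r:
                edges.append((gene, metabolite, round(r, 2)))
    return edges


# Toy profiles across four accessions (made-up values and names).
toy_transcripts = {"geneA": [1, 2, 3, 4], "geneB": [4, 3, 2, 1]}
toy_metabolites = {"lipidX": [2, 4, 6, 8], "volatileY": [1, -1, -1, 1]}

toy_edges = build_edges(toy_transcripts, toy_metabolites)
positive_share = sum(1 for _, _, r in toy_edges if r > 0) / len(toy_edges)
```

Because the expression and volatile data sets cover different numbers of accessions (340 and 398), a real analysis would first restrict each pair to the accessions present in both.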

Discussion

In recent years, many metabolic GWAS have focused on dissecting the genetic architecture underlying regulation and biosynthesis of metabolic pathways (Alseekh et al., 2021). However, this approach has not been applied to the fruit lipidome of tomato and its link to fruit flavor. Here, through a combination of GWAS and linkage analysis, we identified over 600 mQTL and many genes that affect lipid composition in tomato fruit. In order to validate the results, we determined the functions of four lipid biosynthesis genes by transgenic analysis.

Metabolite GWAS has become increasingly common in recent years, particularly for lipids (Fang & Luo, 2019; Luzarowska et al., 2020; Brouckaert et al., 2023). These studies have revealed novel associations between structural genes and lipids. In tomato, limited work based on the natural variation in a bi-parental population has been carried out to study cuticle lipids (Yeats et al., 2012; Fernandez-Moreno et al., 2017). Here, we utilized both S. neorickii BILs and a GWAS panel (Supplemental Fig. S1) to map the lipidome of tomato fruit pericarp (Fig. 1), revealing genetic associations between lipids, volatiles, and gene expression (Fig. 2 and Supplemental Fig. S5). We examined the variation in lipid levels among cultivated and wild species (Fig. 1, A-B). Wild tomato accessions had higher lipid levels as compared to cultivated tomatoes (Fig. 1C). Domestication of tomato has resulted in a narrower genetic and phenotypic variation in cultivated species (Ranc et al., 2008; Bergougnoux, 2014; Blanca et al., 2015). The utility of exotic germplasm as a source of new traits has been extensively exploited (Bessey, 1906; Zamir, 2001; McCouch, 2004; Doebley et al., 2006; Fernie et al., 2006; Grandillo et al., 2007).
For example, introgression populations containing portions of the genomes of the wild tomato relatives S. pennellii and S. neorickii in cultivated tomato have provided useful tools to explore multiple genes with roles in morphological and metabolic traits (Schauer et al., 2008; Chitwood et al., 2013; Alseekh et al., 2015; Brog et al., 2019), including lipids (Garbowicz et al., 2018). The large differences in chemical composition between fruits of the different species mean that introgressions of defined segments from those wild relatives frequently result in major perturbations of metabolic pathways. Our integrative approach using two types of populations proved particularly useful, allowing us to cross-validate results (Fig. 1-2). We identified 436 mQTL and 175 mQTL using GWAS and linkage mapping (S. neorickii BILs) approaches, respectively (Supplemental Data Set S6). Thirty-four mQTL and 38 candidate lipid-related genes were common between the two mapping populations.

The mGWAS uncovered an association between the levels of multiple lipid classes and the genomic locus harboring Sl-LIP8 (Fig. 3A). Additionally, Sl-LIP8 mutants showed significantly increased levels of several TAGs (Fig. 3E). Sl-LIP8 encodes a class III lipase, which has been demonstrated to cleave TAGs and DAGs, leading to the subsequent release of volatile compounds (Garbowicz et al., 2018). Moreover, Garbowicz et al. precisely mapped the mQTL containing Sl-LIP8 to a shared region within S. pennellii introgression lines (IL 9-3, IL 9-3-1, and IL 9-3-2), where a noticeable reduction in the levels of DAGs, DGDGs, MGDGs, and TAGs was observed in comparison to the control (Garbowicz et al., 2018). Sl-LIP8 participates in the biosynthesis of important flavor-associated FA-VOCs (X. Li et al., 2020) (Supplemental Fig. S10B). Therefore, Sl-LIP8 acts as a bridge connecting lipid and volatile metabolism, presumably through the cleavage of glycero- and phospholipids, leading to the release of free fatty acids which are further metabolized to short-chain FA-VOCs (Garbowicz et al., 2018; X. Li et al., 2020) (Supplemental Fig. S10B). Previous results indicated that the transcript level of Sl-LIP8 is higher in the S. lycopersicum cv. M82 tomato variety than in S. pennellii (Garbowicz et al., 2018), potentially due to the structural variation in the promoter regions (Kuhalskaya et al., 2020). In our study, we observed increased Sl-LIP8 expression in wild tomato species such as S. habrochaites, S. arcanum, S. chmielewskii and S. peruvianum. These species are positioned closer to S. lycopersicum on the domestication track than S. pennellii (Koenig et al., 2013; Bolger et al., 2014; Lin et al., 2014). Additionally, the elevated Sl-LIP8 expression was correlated with various lipid alterations (Fig. 3, C-D).

The GWAS mapping uncovered an mQTL harboring the CFAPS1 gene, which is responsible for observed changes in the level of various lipid classes and of FA-VOCs (Fig. 4, A-B).
CFAPS1 exhibited higher transcript levels in cultivated tomato accessions with corresponding differences in the lipid levels (Fig. 4, E-F). The CFAPS1 gene encodes a cyclopropane-fatty-acyl-phospholipid synthase, previously undescribed in tomatoes. In E. coli, this enzyme introduces cyclopropane rings into unsaturated membrane phospholipids, resulting in generation of cyclopropane fatty acids (Cronan & Luk, 2022). In plants, these unusual cyclopropane fatty acids can be utilized for TAG and DAG synthesis (Shockey et al., 2018), and the resulting lipids can then be used as substrates to produce volatile compounds (Garbowicz et al., 2018) (Supplemental Fig. S10, B-C). Our study demonstrated that the tomato CFAPS1 targets monounsaturated or polyunsaturated TAGs and membrane lipids (Fig. 4). Membrane lipids in plants include phospholipids and galactolipids (Li-Beisson et al., 2013). The CFAPS1 mutants showed a significant increase in the levels of various galactolipids (Fig. 4G). Furthermore, the KO lines exhibited substantial variations in the composition of short-chain (C5, C6) and longer-chain (C7, C8) FA-VOCs, with remarkable increases of hexanal (C6), E-2-hexenal (C6), E-2-heptenal (C7), and 1-octen-3-one (C8) (Fig. 4H). In tomato fruit, certain C5, C6, C7, C8, and C10 FA-VOCs exhibit significant correlations with flavor intensity and overall preference (J. Zhang et al., 2015; Tieman et al., 2017). Thus, our validation establishes CFAPS1 as the causal gene responsible for the natural variation in lipid abundances associated with this locus. Moreover, our study demonstrates that CFAPS1 is another bridge connecting lipid and volatile metabolic pathways (Supplemental Fig. S10C).
Another gene that was shown to participate in the synthesis of FA-VOCs, through both lipase-dependent and lipase-independent pathways, is TomLoxC (Klee, 2010; Klee & Tieman, 2013; Mwenda & Matsui, 2014; Garbowicz et al., 2018; Zhao et al., 2019) (Supplemental Fig. S10C). Quantitative variation in several lipids was mapped to the TomLoxC locus (Fig. 7A). Further support for TomLoxC as the causal gene is provided by a cis-eQTL (Fig. 7B). TomLoxC has higher transcript abundance in cultivated varieties (Fig. 7F). Earlier research documented that TomLoxC is a chloroplast-targeted lipoxygenase active during ripening of tomato fruit. TomLoxC utilizes both linoleic and linolenic acids as substrates, resulting in the production of flavor-associated short-chain FA-VOCs (Chen et al., 2004; Shen et al., 2014). Lipid profiling of a TomLoxC loss-of-function line revealed that loss of TomLoxC function is associated with significant increases in multiple lipid species, with the most significant changes occurring in galactolipids (Fig. 7H). Thus, the association of various phospholipids with the TomLoxC gene and the accumulation of galacto-, phospho-, and glycerolipids in the KO lines strongly suggest a role for this enzyme in chloroplast lipid degradation during fruit ripening, concomitant with the release of free FAs that are the precursors for ripening-associated FA-VOCs (Supplemental Fig. S10D). Taken together, we validated a role for TomLoxC as a causal gene responsible for at least part of the natural variation of lipid abundance in tomato and its close relatives.
Our results showed that Sl-LIP8, CFAPS1, and TomLoxC provide a connection between lipid metabolism and FA-VOC biosynthesis. Several of these FA-VOCs have been demonstrated to affect consumer preferences (Tieman et al., 2012, 2017; Cortina et al., 2018). The TomLoxC locus appears to be under positive selection within a domestication sweep (Lin et al., 2014). Thus, these genes are important targets for breeding tomato with improved flavor.

We also identified a region containing TomLLP on chromosome 3 that affects the levels of phospholipids and galactolipids using both mapping populations (Fig. 5, A-B). TomLLP is annotated as a lipase-like protein, and its Arabidopsis orthologue (At1g52760), caffeoyl-shikimate esterase (CSE), participates in lignin and lipid biosynthesis (Vanholme et al., 2013) (Supplemental Fig. S10E). Additionally, prior research has demonstrated that At1g52760 exhibits lysophospholipase activity, utilizing lysophospholipids as substrates, which gives it a role in phospholipid metabolism (W. Gao et al., 2010; Vijayaraj et al., 2012; Miao et al., 2019), and that it also has acyltransferase activity, facilitating the synthesis of diacyl- and triacylglycerol (Vijayaraj et al., 2012). Phylogenetic analysis revealed that TomLLP clusters closely with the caffeoyl shikimate esterases found in S. tuberosum and Capsicum annuum, rather than with other tomato lipases (Supplemental Fig. 8). Overexpressing TomLLP in tomato resulted in decreased glycerolipid contents (Fig. 5, F-G), while lipid profiling of three Arabidopsis loss-of-function cse mutants (At1g52760) revealed a significant accumulation of multiple lipid species that belong to six lipid classes (Fig. 6, Supplemental Data Sets S11-12). Our results illustrate that both TomLLP and CSE function in lipid metabolism. Further study is needed to explore whether TomLLP also has a dual function, as CSE does.

We performed correlation-based network analysis between lipophilic compounds, transcript abundance of lipid-related genes, and levels of FA-VOCs (Fig. 8). Our analysis showed positive interactions between lipophilic compounds and the expression of lipid-related genes in various metabolic pathways. We identified several lipid-related genes correlated with multiple lipid classes and several lipophilic compounds connected to multiple lipid-related genes. Solyc02g071700, Solyc02g090930, Solyc02g077430, Solyc06g053900, and Solyc08g063090 were among the genes with numerous connections to lipids. Genetic mapping and correlation-based network analysis revealed predominantly inter-class regulation, with loci controlling several classes of lipids (Fig. 3-5, 7, 8). The network also reveals intra-class regulation, exemplified by the correlation between 16 glycerolipids and Solyc09g009570, which encodes a hexadecanal dehydrogenase.

This study combined lipidomics with mGWAS and family-based QTL mapping to identify novel lipid-metabolism genes and expand our understanding of genome-level regulation of lipid biosynthesis in tomato, establishing a direct genetic link between lipid metabolism and FA-VOC production in tomato fruits. Hence, these genes are likely to be useful in breeding for the improvement of palatability and nutrient content.

Materials and methods

Plant material
The GWAS mapping population used in this study comprised a collection of 550 S. lycopersicum accessions that were selected after a phenotype-guided screen of over 7900 tomato accessions from around the world (Zemach et al., 2023). The GWAS panel contained modern cultivars, heirloom strains, and wild tomato species harvested from two independent greenhouse experiments during fall 2014 and fall 2015. The fall 2015 experiment (season) comprised a collection containing 388 S. lycopersicum accessions (cultivated tomato), 61 accessions of S. lycopersicum var. cerasiforme, 30 S. pimpinellifolium accessions, and 25 wild accessions. The 2014 season comprised a panel of 295 accessions including 24 S. lycopersicum var. cerasiforme cherry tomato accessions and 271 cultivated varieties.

The S. neorickii BILs resulted from a cross between the green-fruited, self-compatible wild accession LA2133 and the processing tomato inbred variety cv. TA209 (S. lycopersicum). F1 hybrids were then backcrossed for three generations to cv. TA209 as described by Brog et al. (2019), followed by 10 generations of self-pollination to achieve BILs with maximum homozygosity of the wild genomic introgressions (Supplemental Fig. S11). The S. neorickii BIL population contained 142 lines. Due to poor germination, only 107 lines were used for this study. Additionally, hybrids for all BILs were produced in the background of the cv. TA209 recurrent parent to evaluate the wild introgressions in a heterozygous state.

For both the GWAS panel and the S.
neorickii BILs population, pericarp tissue was isolated from ripe fruits, snap frozen in liquid nitrogen, and stored at −80 °C before extraction.

Sl-LIP8 and CFAPS1 KO lines and wild type Fla. 8059 were grown in randomized, replicated plots in a heated greenhouse on the University of Florida campus or in a field in Live Oak, Florida, using recommended commercial practices. All fruits for lipid and FA-VOC quantification were harvested at the full-red ripe stage.

Lipid extraction protocol
Lipids were extracted from the GWAS panel harvested in two consecutive years from plants grown in the greenhouse and from 2–4 independent biological replicates of S. neorickii BILs from fruit pericarp (Hummel et al., 2011). Briefly, 120 mg of ripe frozen fruits were used to make aliquots. Lipids were extracted with 1 ml of pre-cooled (–20 °C) extraction buffer (homogeneous methanol:methyl-tert-butyl-ether (1:3) mixture + internal standards). After 10 min incubation at 4 °C and sonication for 10 min in a sonic bath, 500 µL of water/methanol mixture was added. Samples were then centrifuged (5 min, 14,000 x g). The lipophilic phase was collected and dried under vacuum.

UPLC–FT–MS measurement
Samples were processed using ultra-performance liquid chromatography coupled with Fourier transform mass spectrometry (UPLC–FT–MS) on a C8 reverse-phase column (100 x 2.1 mm x 1.7 µm particle size, Waters) at 60 °C. Samples were first subjected to liquid chromatography to separate them into their components.
The mobile phase consisted of 1% 1 M NH4OAc and 0.1% acetic acid in water (Buffer A) and acetonitrile/isopropanol (7:3, UPLC grade BioSolve) supplemented with 1 M NH4OAc, 0.1% acetic acid (Buffer B). The dried lipid extracts were re-suspended in 500 µL Buffer B. The following gradient profile was applied: 1 min 45% A, 3 min linear gradient from 45% A to 35% A, 8 min linear gradient from 25 to 11% A, 3 min linear gradient from 11 to 1% A. After washing the column for 3 min with 1% A, the buffer was set back to 45% A and the column was re-equilibrated for 4 min, leading to a total run time of 22 min. The flow rate of the mobile phase was 400 µL/min.

Mass spectra were acquired using an Exactive mass spectrometer (Thermo Fisher, http://www.thermofisher.com) equipped with an ESI interface. All spectra were recorded using alternating full-scan and all-ion fragmentation scan modes, covering a mass range from 100–1500 m/z at a capillary voltage of 3.0 kV with a sheath gas flow value of 60 and an auxiliary gas flow of 35. The resolution was set to 10,000 with 10 scans per second, restricting the Orbitrap loading time to a maximum of 100 ms with a target value of 1E6 ions. The capillary temperature was set to 150 °C, while the drying gas in the heated electrospray source was set to 350 °C. The skimmer voltage was held at 25 V while the tube lens was set to a value of 130 V. The spectra were recorded from min 1 to min 20 of the UPLC gradient.

Targeted lipid profiling by LC–MS and data acquisition
Processing of chromatograms, peak detection, and integration were performed using REFINER MS® 10.0 (GeneData, www.genedata.com). The workflow comprised peak detection, retention time alignment, and removal of chemical noise and isotopic peaks from the MS data.
The obtained mass features, characterized by specific peak ID, retention time, m/z values, and intensity, were further processed using in-house R scripts (Team, 2000). Clusters with mean signal intensities lower than 40,000 were removed and only peaks present in at least 80% of the samples were kept for analysis. Every day, more than 65 samples were processed using LC-MS, including quality controls at the beginning, the middle, and the end of each day's run. Peak intensities were weight- and day-normalized and log2-transformed. After that, the obtained molecular features were queried against a lipid database for annotation. The in-house database used includes 219 lipid species of the following classes: diacylglycerols (DAGs), digalactosyldiacylglycerols (DGDGs), monogalactosyldiacylglycerols (MGDGs), phosphatidylcholines (PCs), phosphatidylethanolamines (PEs), and triacylglycerols (TAGs) (Giavalisco et al., 2011). Their retention times and exact masses were compared against those of the reference compounds, allowing maximal deviations of 0.1 minutes and 10 ppm. Identified lipids were confirmed by manual verification of the chromatograms using Xcalibur (Version 3.0, Thermo-Fisher, Bremen, Germany).

General statistical and multivariate analysis of lipidomic data
The principal component analysis plots (Fig. 1, 7), boxplots (Fig. 1, 3-7, Supplemental Fig. S6), pleiotropic map (Fig. 2), volcano plot (Fig. 3), and heatmaps (Fig. 4, 6, 7) for the lipidomic data were obtained using R software version 4.3.1.
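The annotation rule stated for targeted lipid profiling (a feature is assigned to a reference compound when its retention time deviates by at most 0.1 minutes and its exact mass by at most 10 ppm) can be illustrated with a minimal Python sketch. This is not the authors' in-house R code. The `Reference` class, the function names, and the library entries are hypothetical placeholders, and choosing the hit with the smallest mass error when several references match is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

RT_TOL_MIN = 0.1   # maximal retention-time deviation given in the text (minutes)
PPM_TOL = 10.0     # maximal mass deviation given in the text (ppm)


@dataclass(frozen=True)
class Reference:
    """One entry of a reference library (hypothetical structure)."""
    name: str
    mz: float   # exact mass-to-charge ratio of the reference compound
    rt: float   # retention time of the reference compound, in minutes


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Signed mass deviation of an observed m/z from a reference m/z, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def annotate(feature_mz: float, feature_rt: float,
             library: List[Reference]) -> Optional[Reference]:
    """Return the matching reference with the smallest absolute ppm error,
    or None when no reference lies within both tolerances."""
    hits = [
        ref for ref in library
        if abs(feature_rt - ref.rt) <= RT_TOL_MIN
        and abs(ppm_error(feature_mz, ref.mz)) <= PPM_TOL
    ]
    # Tie-breaking by smallest mass error is an assumption, not from the source.
    return min(hits, key=lambda r: abs(ppm_error(feature_mz, r.mz)), default=None)


if __name__ == "__main__":
    # Placeholder values only; these are not real lipid masses or retention times.
    library = [
        Reference("hypothetical_lipid_A", 800.0000, 10.00),
        Reference("hypothetical_lipid_B", 800.0100, 12.50),
    ]
    print(annotate(800.004, 10.05, library))  # about 5 ppm and 0.05 min from A
    print(annotate(800.012, 10.05, library))  # about 15 ppm from A, wrong RT for B
```

Requiring both tolerances at once is what separates near-isobaric species: in the placeholder library, the second feature is within 10 ppm of lipid B but is rejected because its retention time is more than two minutes away.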
Complete pairwise correlations were calculated for lipid quantification from the GWAS panel of two consecutive years and gene expression information (Zhu et al., 2018). Correlations passing a significance threshold of 0.05 (adjusted by FDR) with a Pearson correlation coefficient of ≥ 0.3 or ≤ −0.3 were assigned as edges in the network. Calculations were performed in R using the package “Hmisc”, processed for visualization with the package “RCy3”, and visualized in Cytoscape 3.9.1 (see https://cytoscape.org/).

The chromosomal distribution of the identified mQTL (Supplemental Fig. S5) was obtained by applying the RIdeogram R package to visualize and map genome-wide data on idiograms using R software version 4.2.2 (Hao et al., 2020).

Heatmaps (Supplemental Fig. S2-4) were created using MultiExperiment Viewer (MeV) software. Lipids were clustered using complete-linkage clustering. To detect the fold change in lipid-feature levels across all S. neorickii BIL lines and for the GWAS panel, the fold change for each lipid feature was calculated by dividing the raw peak height of a given lipid by the average of values across all the lines for each trait.

GWAS and linkage QTL mapping
Accessions for mGWAS were genotyped by sequencing (GBS) at Cornell University following an established protocol (Elshire et al., 2011), which yielded 16,526 SNP markers (Zemach et al., 2023); all marker data are deposited in the Phenome Networks database (http://unity.phenome-networks.com).
The final SNP matrix (16,526) used for the analysis was obtained by filtering for minor allele frequency ≥ 5% (MAF ≥ 5%). Three principal components were included in the MLM to account for population structure and the SNP fraction considered for PCA. The kinship matrix and other parameters were set to default values. Association analysis was conducted using both the 16,526 SNPs obtained from the GBS on 550 accessions and 1.8 million SNPs across 367 overlapping (same) accessions that were previously characterized (Tieman et al., 2017). GWAS was performed using a compressed mixed linear model (MLM) (Z. Zhang et al., 2010) implemented in the Genome Association and Prediction Integrated Tool (GAPIT) in R (Lipka et al., 2012).

The S. neorickii BILs were genotyped with a 10K SolCAP single nucleotide polymorphism chip, and 3,111 polymorphic markers were used for mapping recombination breakpoints on the background of the physical map of S. lycopersicum. On average, the BILs harbor 4.3 introgressions per line, with a mean introgression length of 34.7 Mbp, allowing the division of the genome into 340 bins and enabling rapid trait mapping (Supplemental Fig. S11). Linkage QTL mapping was done using the “qtl” package version 1.40-8 with Haley–Knott regression (Brog et al., 2019). Each S. neorickii BIL and the appropriate controls of LA2133, TA209, and their F1 hybrid were genotyped using an Illumina 10K SNP chip (www.illumina.com). The genetic map contains 3,111 genome-anchored SNP markers that were found to be polymorphic between S. neorickii and cv. TA209. The markers were divided into 340 bins with an average length of 2.07 Mbp per bin, each composed of an average of 9.15 SNPs (Brog et al., 2019).

QTL identification and candidate gene selection
We extracted SNPs associated with lipid species with significant p-values.
For the mGWAS with GBS SNP data (N = 16,526), the significance threshold was set at p < 1/N. For the mGWAS with the 1.8 million SNP data, the significance threshold was likewise set at p < 1/N (N = 1,800,000) (Wu et al., 2016). Candidate loci in the GWAS were identified based on the logarithm (base 10) of odds (LOD) score. A LOD score ≥ 4.5 was chosen as significant for mGWAS using GBS SNP data. A LOD score ≥ 6.5 was chosen as significant for mGWAS using 1.8 million SNP data. All genes in a given QTL were taken as putative candidates. Candidate genes were selected for validation based on their sequence homology with Arabidopsis genes related to lipid metabolism, their tissue-specific expression, and functional annotation.

The lipid profile of each S. neorickii BIL was compared (i.e., analysis of variance [ANOVA], using a permissive threshold, p < 0.05) to the lipid content of TA209. If it was significantly different from the TA209 genotype, the introgression was considered as harboring an mQTL. Causal genes responsible for mQTL were identified considering the margins of introgressed regions from S. neorickii delimited by the genetic markers used in this work. The upstream and downstream borders of each introgression were established to be halfway between the inclusive and exclusive wild-species SNP (Brog et al., 2019).

Linkage disequilibrium and haplotype analysis
Haplotype analysis was performed by using available SNP data.
Accessions belonging to the same haplotype were grouped and the haplotype median was obtained for each lipid feature. To cluster haplotypes, a combination of allele sharing distance and Ward's minimum variance was used (X. Gao & Starmer, 2007). One-way ANOVA for multiple comparisons was performed across all haplotypes (p < 0.05).

Cloning of lipase-like protein (TomLLP) and generation of transgenic plants
Gateway® Technology (Invitrogen) was used in this work for overexpression of the lipase-like protein (Solyc03g119980) under the control of the Cauliflower mosaic virus 35S (CaMV35S) promoter. Transformation of S. lycopersicum was performed as described in previous studies (Earley et al., 2006). The pDONR221 vector (4761 bp) for the attB-attP reaction, with a selectable marker for kanamycin resistance, was used. The pK7WG2 (11,159 bp) vector for overexpression contained the Solyc03g119980 gene in the sense orientation under the control of the CaMV35S promoter.

qPCR gene expression analysis
Three parental M82 and six independent transgenic lines (with overexpressed Solyc03g119980) were grown in a greenhouse at the Max Planck Institute of Molecular Plant Physiology (Golm-Potsdam, Germany; Supplemental Data Set S10). The plants were grown in three-liter pots with a 20 cm diameter. At least three biological replicates were grown for each genotype. At least six fruit samples were collected from each replicate for transgenic lines and ten samples for the parental M82 line. Fruit pericarp was frozen in liquid nitrogen before storage at –80 °C.
Frozen fruit pericarp was ground and RNA was extracted using a Thermo Fisher Scientific kit. The quality of the extracted RNA was assessed by visualizing RNA integrity on a 1% agarose gel and by using a spectrophotometer (NanoDrop; Thermo Scientific). The final RNA concentration was adjusted to be equal in all samples and cDNA was synthesized for each sample using a TAKARA cDNA synthesis kit for real-time gene quantification. Each cDNA sample was diluted 1:10 by adding RNase-free water and stored at –80 °C. The primers for Solyc03g119980 were designed using the Primer3 tool. The housekeeping genes were chosen according to the relevant literature (Supplemental Data Set S15) (Expósito-Rodríguez et al., 2008). The amplification efficiencies are indicated for every primer pair. The same cDNA in four dilutions (1:10; 1:100; 1:1000; 1:10000) was amplified with every primer pair (qPCR). The calibration curve slope (R²) was used for the identification of primer efficiency according to the formula:

$\mathrm{Efficiency}\ (\%) = (E - 1) \times 100$,

where E was obtained from the calibration curve according to the following formula:

$E = 10^{-1/\mathrm{slope}}$

The differences in gene expression were calculated as fold change between independent transgenic lines and M82 according to the method of Schmittgen and Livak (Schmittgen & Livak, 2008). The ΔCt value was calculated as the difference between the Ct (cycle threshold) of the candidate gene and the Ct of the control gene for normalization of gene expression, according to Schmittgen and Livak (Schmittgen & Livak, 2008).
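The qPCR calculations described above (primer efficiency from the calibration-curve slope, ΔCt normalization against a control gene, and fold change relative to M82) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' analysis script. It assumes the standard form of the efficiency formula, E = 10^(−1/slope), and the comparative Ct (2^−ΔΔCt) calculation of Schmittgen and Livak (2008), which itself assumes amplification efficiencies close to 100%. All function names and Ct values are hypothetical.

```python
def amplification_factor(slope: float) -> float:
    """Amplification factor E from the slope of the standard curve
    (Ct plotted against log10 of the template dilution)."""
    return 10 ** (-1.0 / slope)


def efficiency_percent(slope: float) -> float:
    """Primer efficiency in percent: (E - 1) * 100."""
    return (amplification_factor(slope) - 1.0) * 100.0


def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Delta Ct: Ct of the candidate gene minus Ct of the control gene."""
    return ct_target - ct_reference


def fold_change(ct_target_line: float, ct_ref_line: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression of a transgenic line versus the control genotype,
    computed as 2 ** (-delta delta Ct)."""
    ddct = (delta_ct(ct_target_line, ct_ref_line)
            - delta_ct(ct_target_control, ct_ref_control))
    return 2 ** (-ddct)


if __name__ == "__main__":
    # A slope near -3.32 corresponds to a doubling per cycle (about 100%).
    print(round(efficiency_percent(-3.32), 1))
    # Hypothetical Ct values: the target gene crosses the threshold three
    # cycles earlier in the overexpression line than in the control.
    print(fold_change(22.0, 18.0, 25.0, 18.0))
```

With the hypothetical Ct values shown, ΔCt is 4 in the transgenic line and 7 in the control, so ΔΔCt is −3 and the estimated fold change is 2³ = 8.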
CRISPR lines
Three CRISPR lines for TomLoxC (Solyc01g006540) were created following the protocol described by Reem and Van Eck (Reem & Van Eck, 2019) with minor modifications. CRISPR constructs were created with a two-step Golden Gate cloning procedure to assemble a vector containing two guide-RNA-expressing cassettes, a kanamycin resistance gene, and the Cas9 nuclease, which was introduced into Agrobacterium tumefaciens GV2260. Guide RNAs were designed using the online tool CRISPR-P 2.0 (H. Liu et al., 2017). Transgenic plants were produced as previously described (Reem & Van Eck, 2019). Cas9-free plants homozygous for the gene of interest were transplanted in the greenhouse.

For CFAPS1, the design procedure and efficacy test of sgRNAs were performed using the CRISPR-P (http://cbi.hzau.edu.cn/cgi-bin/CRISPR) tool and the Guide-it™ sgRNA In Vitro Transcription and Screening System (Takara, Mountain View, CA, USA) according to the manufacturer's instructions. Vector construction and tomato transformation were performed as described previously (X. Li et al., 2020). Briefly, two 20-bp sgRNAs were inserted into a CRISPR/Cas9 binary vector (pCAMBIA2300_CR3-EF), in which the target sequence was driven by the Arabidopsis U6-26 promoter and Cas9 by 2 x 35S. The sgRNA sequences are listed in Supplemental Data Set S15. The final binary vector was transformed into cultivar Fla. 8059 by Agrobacterium-mediated transformation (X. Li et al., 2020). Genomic DNA was extracted from T1 and T3 homozygous cfaps1 leaves and flanking regions containing the target sites were amplified using the specific primers CFAPS1-F and CFAPS1-R. The homozygous cfaps1 allele was verified in the T1 and T3 generations by PCR-based sequencing. Cas9-free plants were used for quantitative analysis.
The primers used for amplification and genotyping are listed in Supplemental Data Set S15.

Plant material from CRISPR lines targeting Sl-LIP8 (Solyc09g091050) has been previously published (X. Li et al., 2020). The sgRNAs were inserted into the pCAMBIA2300_CR3-EF vector and transformed into Fla. 8059 by Agrobacterium-mediated transformation. Genomic DNA from an F2 plant backcrossed to WT and T3 tomato leaves was used for amplification with specific primers (Sl-LIP8-F and Sl-LIP8-R) for genotyping. Quantitative analysis used Cas9-free plants (X. Li et al., 2020).

Experimental design for TomLLP (Solyc03g119980) validation
Sterilized seeds were grown on Murashige & Skoog selective plates with 20% sucrose (2MS) and Km under long-day conditions (16 h light, 8 h dark); temperature was kept at 21/16 °C (day and night, respectively), light intensity at 150 μE m⁻² sec⁻¹, and humidity at 75%. After 3 weeks, seedlings that survived selection were transferred to the greenhouse in individual round pots with soil (potting compost) for fruit production and seeds for the next generation. The plants with empty-vector control and wild-type plants were grown in the same conditions as plants with overexpressed candidate genes.

Experimental design of the Arabidopsis orthologue (CSE, At1g52760) of candidate gene (TomLLP, Solyc03g119980) validation
AtCSE_KD (cse-1, SALK_008202C) and AtCSE_KO1 (cse-2, SALK_023077) have been previously described (W.
Gao et al., 2010; Vanholme et al., 2013). AtCSE_KO2 (GABI_368D11) is a T-DNA insertion mutant and was obtained from the GABI-Kat collection (Rosso et al., 2003). The T-DNA flanking sequence was analyzed via PCR with the primers 5'-ACCATTAGATGGTGAAATCAAAGG-3' (1) and 5'-ATAATAACGCTGCGGACATCTACA-3' (2), whereas the absence of the T-DNA was analyzed via PCR with the primers 1 and 5'-CTTGATAGCCTTCCCAACCA-3' (3). The AtCSE_KO2 T-DNA insertion was confirmed to be positioned in the second exon (Vanholme et al., 2013).

Figures:

Figure 1.
Characterization of Natural Variation in Lipophilic Metabolites Across 550 Different Tomato Accessions. A) Numbers of lipid compounds measured by LC-MS in 550 tomato accessions and their compound classes. B) PCA of lipid content for tomato lines representing green-fruited wild species (green dots), cultivated varieties (red dots), cherry tomato varieties S. lycopersicum var. cerasiforme (pink dots), and red-fruited wild accessions of S. pimpinellifolium (blue dots). Each dot represents a single accession. C) Box plots indicating the average value of all compounds for each lipid class in diverse wild accessions (n = 29), S. pimpinellifolium (n = 30), S. lycopersicum var. cerasiforme (n = 62), and S. lycopersicum (n = 398). Significances are indicated by * < 0.05, ** < 0.01, *** < 0.001 using Student’s t-test.

Figure 2.
Pleiotropic Map Summarizing Quantitative Fruit Mapping.
A) Chromosomal distribution of the QTL derived from GWAS, representing the combined results from the 2014 and 2015 seasons using SNP markers generated from Genotyping by Sequencing (GBS) and Whole Genome Sequencing (WGS). Colors indicate different lipid classes. The inner circle specifies the number of lipids mapped to the identified region. QTL harboring candidate genes are highlighted. B) Bar charts show the number of significant SNPs associated with each lipid class chromosome-wise. C) Number of traits associated with significant markers for GWAS on each chromosome (upper panel) and BIL (lower panel). The corresponding lipid compounds and number of QTL are provided in Supplemental Data Sets S1-5.

Figure 3.
Lipid Contents are Associated with the Locus Harboring Sl-LIP8 (Solyc09g091050). A) Manhattan plots of the mGWAS results using GBS SNP data. B) Accessions were separated by the lead SNP and the average lipid level was determined. Zero represents the homozygous genotype for the first allele, one represents the heterozygote, and two represents the homozygous genotype for the other allele. C) The average lipid level in each of the following groups: S. lycopersicum (n = 398), S. lycopersicum var. cerasiforme (n = 62), S. pimpinellifolium (n = 30), and diverse wild tomato species (n = 27). D) Sl-LIP8 transcript levels in fruits of S. lycopersicum (n = 258), S. lycopersicum var. cerasiforme (n = 56), and diverse wild tomato species (n = 6). E) Volcano plot showing the abundance of selected lipids in Sl-LIP8 KO and wild type (Fla. 8059). Lipid levels were calculated as a log2 fold change relative to Fla. 8059. Significances are indicated by * < 0.05, ** < 0.01, *** < 0.001 using Student’s t-test.

Figure 4.
Phospho-, Galacto- and Glycerolipid Contents are Associated with the CFAPS1 (Solyc09g090510) Locus. A) Manhattan plots of mGWAS using GBS SNP data.
B) Manhattan plots of the mGWAS using WGS SNP data. C) Lipid contents in different haplotypes based on the lead mGWAS SNP. Zero is homozygous for the first allele; one is heterozygous; two is homozygous for the second allele. D) Lipid analysis of accessions with different haplotypes. E) The average lipid level in each of the following: S. lycopersicum (n = 398), S. lycopersicum var. cerasiforme (n = 62), S. pimpinellifolium (n = 30), and diverse wild tomato accessions (n = 27). F) CFAPS1 transcript level in fruits of S. lycopersicum (n = 240) and S. lycopersicum var. cerasiforme (n = 43). G) Abundance of selected lipids in CFAPS1 KO and control (Fla. 8059) fruits. H) Abundance of short-chain FA-VOC (C5, C6) and longer-chain FA-VOC (C7, C8) volatiles in CFAPS1 KO and control (Fla. 8059) fruits. Significances are indicated by * < 0.05, ** < 0.01, *** < 0.001 using Student’s t-test.

Figure 5.
Linkage Mapping Identifies a Role for TomLLP (Solyc03g119980) in Fruit Lipid Metabolism. A) Association plot of PC 38:3 obtained with linkage mapping using the S. neorickii BIL population. B) Manhattan plot of mGWAS of TAG 50:3 using WGS SNP data. C) Lipid contents of two haplotypes for accessions separated by the lead mGWAS SNP. D) S. neorickii tomato segments introgressed into cultivated tomato variety TA209 on chromosome 3. E) Levels of PC 36:1 and PC 38:3 in BILs sharing the S. neorickii introgression on chromosome 3 and BILs with the TA209 background. F) TomLLP transcript levels in the TomLLP overexpression line and wild type M82.
G) Levels of selected lipids in the TomLLP overexpression line and M82. Significance is indicated by * < 0.05, ** < 0.01, *** < 0.001 using Student’s t-test.

Figure 6.
CSE (At1g52760) Influences the Lipid Metabolism in Arabidopsis. A) Heatmap shows the significant (p ≤ 0.05) changes in lipid levels between wild-type and the cse knock-out (KO) and knock-down (KD) lines. B) Changes in lipid levels of selected lipid classes between the cse KO and KD lines and the wild type.

Figure 7.
Phospho-, Galacto- and Glycerolipid Content is Associated with the TomLoxC (Solyc01g006540) Locus. A) Manhattan plots for mGWAS using GBS SNP data. B) Manhattan plot for eGWAS using WGS SNP data. C) Lipid contents for a group of accessions separated by the lead mGWAS SNP. Zero, homozygous for the first allele; one, heterozygous; two, homozygous for the second allele. D) Lipid contents for accessions grouped by the lead eGWAS SNP. E) Average lipid levels for S. lycopersicum (n = 398), S. lycopersicum var. cerasiforme (n = 62), S. pimpinellifolium (n = 30), and diverse wild tomato species (n = 27). F) TomLoxC transcript level in fruits of S. lycopersicum (n = 258), S. lycopersicum var. cerasiforme (n = 56), and S. pimpinellifolium (n = 6). G) PCA plot of lipid levels in TomLoxC KO and wild type M82. H) Heatmap representing the abundance of short-chain FA-VOC (C5, C6) and longer-chain FA-VOC (C7, C8) in TomLoxC KO and wild type M82. Significance is indicated by * < 0.05, ** < 0.01, *** < 0.001 using Student’s t-test.

Figure 8.
Metabolite-Transcript-Volatile Correlation-based Network. Each node represents a metabolite or gene transcript; edges connecting two nodes show a correlation (R ≤ -0.3, or R ≥ 0.3) between the two nodes. In total, the network is composed of 185 nodes and about 335 edges assembled into three large groups: lipophilic metabolites comprise 74 nodes, gene expression data comprise 107 nodes, and volatile organic compounds (VOCs) comprise four nodes (Supplemental Data Set S14). There are 672 genes with homology to genes known to be involved in lipid metabolism (Garbowicz et al., 2018). Transcript levels (Zhu et al., 2018) and VOC data (Tieman et al., 2017) were used to construct the network, together with the lipid metabolite data derived from the current study.

Supplemental Figure S1.
Schematic model of the experiments conducted to investigate genes underlying lipid metabolism in tomato fruit pericarp. Forward genetic approaches were applied to association panels represented by A) the S. neorickii biparental population, and B) unrelated cultivated tomato genotypes for genome-wide association study (GWAS).

Supplemental Figure S2.
Heatmap of lipid levels across 550 accessions of the GWAS panel. The data represent lipidomic profiling of material harvested in two consecutive years, A) 2014 and B) 2015, from plants grown in the greenhouse. For each lipid species, the mean lipid level was calculated, and the level of that lipid in each accession was normalized by dividing it by this mean. Each season was normalized separately and presented on a logarithmic scale (log2). Red and blue regions indicate levels lower or higher than the average of each lipid species, respectively.

Supplemental Figure S3.
Heatmap of lipid levels across 550 accessions of the GWAS panel belonging to S. lycopersicum, S. lycopersicum var. cerasiforme, S.
pimpinellifolium, and wild tomato groups. For each lipid species, the mean lipid level was calculated, and the level of that lipid in each accession was normalized by dividing it by this mean. The data are presented on a logarithmic scale (log2). Red and blue regions indicate levels lower or higher than the average of each lipid species, respectively.

Supplemental Figure S4.
Heatmap of lipid profiling across S. neorickii backcross inbred lines (BILs). The data represent lipidomic profiling of material harvested from the S. neorickii BIL population: A) heterozygous and B) homozygous lines. For each lipid species, the mean lipid level was calculated, and the level of that lipid in each BIL was normalized by dividing it by this mean. Each season was normalized separately and presented on a logarithmic scale (log2). Red and blue regions indicate levels lower or higher than the average of each lipid species, respectively. White regions indicate that many of the chromosomal segment substitutions do not affect lipid levels.

Supplemental Figure S5.
Chromosomal distribution of identified mQTL. A) Idiogram representing the chromosomal distribution of the mQTL resulting from GWAS of material harvested in two consecutive years using GBS and WGS SNP data. B) Chromosomal distribution of the mQTL found in the BIL mapping of heterozygous and homozygous lines.

Supplemental Figure S6.
Linkage mapping of lipids in the BIL population revealed an mQTL comprising 61 genes, including Solyc06g008920, which encodes an acetyl-CoA synthetase. A) Association plot of TAG 54:8 obtained with linkage mapping using the S. neorickii BIL population. B) S. neorickii tomato segments introgressed into cultivated tomato variety TA209 on chromosome 6. C) Levels of DAG 36:5, DGDG 36:6, and TAG 54:8 in BILs sharing the S. neorickii introgression on chromosome 6 and BILs with the TA209 background. D) Average lipid levels for S. lycopersicum (n = 388), S. lycopersicum var. cerasiforme (n = 61), S. pimpinellifolium (n = 30), and diverse wild tomato species (n = 25). E) Solyc06g008920 transcript level in fruits of S. lycopersicum (n = 258), S. lycopersicum var. cerasiforme (n = 56), and S. pimpinellifolium (n = 6). Significance (p-value) is indicated by letters using Student’s t-test.

Supplemental Figure S7.
Construction and characterization of CFAPS1-edited lines. The CFAPS1 KO line (Fla. 8059 background) exhibits a deletion of 166 bp and an insertion of 19 bp in the first exon.

Supplemental Figure S8.
Phylogenetic analysis of the caffeoyl shikimate esterase (CSE) family. Coding sequences of genes with confirmed function as CSE or putatively annotated as CSE were used for the construction of a phylogenetic tree. The frame highlights two genes, the tomato TomLLP and CSE from Arabidopsis. Gene IDs are specified in Supplemental Data Set S9. The phylogenetic tree was reconstructed with the neighbor-joining method.

Supplemental Figure S9.
Expression level of TomLLP across six S. neorickii BILs. TomLLP expression was monitored by PCR in six BILs: in the region containing TomLLP, three BILs have the S. neorickii background and three have the TA209 background.

Supplemental Figure S10.
Metabolic pathways involving the identified lipid-related candidate genes. A) Schematic representation of fatty acid synthesis by acetyl-CoA synthetase (Solyc06g008920) using pyruvate as a substrate. B) Schematic representation of the pathway of volatile synthesis from the free fatty acids liberated from triacylglycerol by class III lipase (Solyc09g091050). C) Schematic representation of the conversion of membrane lipids (phospho- and galactolipids) to acylglycerols via cyclopropane-fatty-acyl-phospholipid with subsequent volatile production. D) Schematic representation of oxylipin and volatile production by the lipoxygenase enzymes (Solyc01g006540) in the lipase-independent pathway. E) The role of CSE (At1g52760, orthologue of Solyc03g119980) in the lignin biosynthetic pathway with additional identified acyltransferase and hydrolase activities.

Supplemental Figure S11.
Breeding scheme and genetic map for backcross inbred lines (BILs). A) First cross: pollen from S. neorickii was placed onto the stigma of cv. TA209 to obtain F1 plants. Additional backcrosses with cv. TA209 were performed to decrease the S. neorickii genome introgression in the BILs. For each generation, the number of plants is shown in parentheses. A final cross of the homozygous BILs with cv. TA209 was
performed to obtain heterozygous lines (Brog et al., 2019). B) Schematic representation of S. neorickii backcross inbred lines. The BILs harbor on average 4.3 introgressions per line, with a mean introgression length of 34.7 Mbp, allowing the division of the genome into 340 bins and enabling rapid trait mapping (Brog et al., 2019).

Supplemental Data Sets:
Supplemental Data Set S1: Lipid content of the GWAS panel from the 2015 harvest season.
Supplemental Data Set S2: Lipid content of the GWAS panel from the 2014 harvest season.
Supplemental Data Set S3: Lipid content of homozygous S. neorickii BILs.
Supplemental Data Set S4: Lipid content of heterozygous S. neorickii BILs.
Supplemental Data Set S5: List of lipid compounds identified and quantified in both GWAS and S. neorickii BILs.
Supplemental Data Set S6: Number and distribution of the identified mQTL in both GWAS and S. neorickii BILs.
Supplemental Data Set S7: Lipid content in the Sl-LIP8 KO and the wild-type tomato (cv. Fla. 8059). ±SE; n = 4; Significance is indicated by * < 0.05, ** < 0.01, *** < 0.001 using Student’s t-test.
Supplemental Data Set S8: Lipid and FA-VOC contents in the CFAPS1 KO lines and the control WT tomato. ±SE; n ≥ 10; Significance is indicated by * < 0.05, ** < 0.01, *** < 0.001 using Student’s t-test.
Supplemental Data Set S9: Coding sequences of CSE genes used for the construction of a phylogenetic tree.
Supplemental Data Set S10: Expression levels of Solyc03g119980 in OE lines using qPCR.
Supplemental Data Set S11: Lipid content in the TomLLP OE lines and the control WT (M82) tomato. ±SE; n = 5; Significance is indicated by * < 0.05, ** < 0.01, *** < 0.001 using Student’s t-test.
Supplemental Data Set S12: Lipid content in CSE knock-down and CSE knock-out lines and A. thaliana wild type (A.th_WT).
±SE; n ≥ 4; Significance is indicated by * < 0.05, ** < 0.01, *** < 0.001 using Student’s t-test.
Supplemental Data Set S13: Lipid content in the TomLoxC KO lines and the wild-type tomato. ±SE; n = 7; Significance is indicated by * < 0.05, ** < 0.01, *** < 0.001 using Student’s t-test.
Supplemental Data Set S14: Summary data sets of correlation-based network analysis.
Supplemental Data Set S15: Primers used in this study.

Acknowledgments

S.A. acknowledges funding from the PlantaSYST project of the European Union’s Horizon 2020 Research and Innovation Programme (SGA-CSA nos 664621 and 739582 under FPA no. 664620), and the NatGenCrop project: HORIZON-WIDERA-2022-TALENTS-01, No. 101087091. W.B. acknowledges financial support from the ERC Advanced grant POPMET.

Authors’ contribution

A.K., X.L. performed experiments. A.K., X.L., J.L., JvS, M.B., E.K., K.K., L.R., M.W.A., K.G., A.K., A.-K.R. performed data analysis. A.D., A.I. provided technical and computer support. I.G., D.T., J.F., R.V., W.B., D.Z. provided plant materials. H.K., S.A. conceptualized the experiment. A.K., X.L., H.K., S.A. wrote the manuscript with input from all authors.

Conflict of Interests

The authors declare no conflict of interest.

References

Alseekh, S., Kostova, D., Bulut, M., & Fernie, A. R. (2021). Genome-wide association studies: assessing trait characteristics in model and crop plants. Cell Mol Life Sci., 78(15), 5743–5754. https://doi.org/10.1007/s00018-021-03868-w
Alseekh, S., Tohge, T., Wendenberg, R., Scossa, F., Omranian, N., Li, J., Kleessen, S., Giavalisco, P., Pleban, T., Mueller-Roeber, B., Zamir, D., Nikoloski, Z., & Fernie, A. R. (2015). Identification and Mode of Inheritance of Quantitative Trait Loci for Secondary Metabolite Abundance in Tomato. Plant Cell, 27(3), 485–512. https://doi.org/10.1105/tpc.114.132266
Alseekh, S., Tong, H., Scossa, F., Brotman, Y., Vigroux, F., Tohge, T., Ofner, I., Zamir, D., Nikoloski, Z., & Fernie, A. R. (2017). Canalization of tomato fruit metabolism. Plant Cell, 29(11). https://doi.org/10.1105/tpc.17.00367
Bergougnoux, V. (2014). The history of tomato: From domestication to biopharming. Biotechnol Adv, 32(1), 170–189. https://doi.org/10.1016/j.biotechadv.2013.11.003
Bessey, C. E. (1906). Crop improvement by utilizing wild species. J Hered, os-2(1). https://doi.org/10.1093/jhered/os-2.1.112
Blanca, J., Montero-Pau, J., Sauvage, C., Bauchet, G., Illa, E., Díez, M. J., Francis, D., Causse, M., van der Knaap, E., & Cañizares, J. (2015). Genomic variation in tomato, from wild ancestors to contemporary breeding accessions. BMC Genomics, 16(1), 257. https://doi.org/10.1186/s12864-015-1444-1
Bolger, A., Scossa, F., Bolger, M. E., Lanz, C., Maumus, F., Tohge, T., Quesneville, H., Alseekh, S., Sørensen, I., Lichtenstein, G., Fich, E.
A., Conte, M., Keller, H., Schneeberger, K., Schwacke, R., Ofner, I., Vrebalov, J., Xu, Y., Osorio, S., … Fernie, A. R. (2014). The genome of the stress-tolerant wild tomato species Solanum pennellii. Nat. Genet., 46(9), 1034–1038. https://doi.org/10.1038/ng.3046
Brog, Y. M., Osorio, S., Yichie, Y., Alseekh, S., Bensal, E., Kochevenko, A., Zamir, D., & Fernie, A. R. (2019). A Solanum neorickii introgression population providing a powerful complement to the extensively characterized Solanum pennellii population. Plant J., 97(2), 391–403. https://doi.org/10.1111/tpj.14095
Brouckaert, M., Peng, M., Höfer, R., El Houari, I., Darrah, C., Storme, V., Saeys, Y., Vanholme, R., Goeminne, G., Timokhin, V. I., Ralph, J., Morreel, K., & Boerjan, W. (2023). QT–GWAS: A novel method for unveiling biosynthetic loci affecting qualitative metabolic traits. Mol. Plant, 16(7), 1212–1227. https://doi.org/10.1016/j.molp.2023.06.004
Causse, M. (2004). A genetic map of candidate genes and QTLs involved in tomato fruit size and composition. J. Exp. Bot., 55(403), 1671–1685. https://doi.org/10.1093/jxb/erh207
Chandra-Shekara, A. C., Venugopal, S. C., Barman, S. R., Kachroo, A., & Kachroo, P. (2007). Plastidial fatty acid levels regulate resistance gene-dependent defense signaling in Arabidopsis. Proc Natl Acad Sci., 104(17), 7277–7282. https://doi.org/10.1073/pnas.0609259104
Chen, G., Hackett, R., Walker, D., Taylor, A., Lin, Z., & Grierson, D. (2004). Identification of a Specific Isoform of Tomato Lipoxygenase (TomloxC) Involved in the Generation of Fatty Acid-Derived Flavor Compounds. Plant Physiol., 136(1), 2641–2651. https://doi.org/10.1104/pp.104.041608
Chitwood, D. H., Kumar, R., Headland, L. R., Ranjan, A., Covington, M. F., Ichihashi, Y., Fulop, D., Jimenez-Gomez, J. M., Peng, J., Maloof, J. N., & Sinha, N. R. (2013).
A Quantitative Genetic Basis for Leaf Morphology in a Set of Precisely Defined Tomato Introgression Lines. Plant Cell, 25(7), 2465–2481. https://doi.org/10.1105/tpc.113.112391
Cortina, P. R., Santiago, A. N., Sance, M. M., Peralta, I. E., Carrari, F., & Asis, R. (2018). Neuronal network analyses reveal novel associations between volatile organic compounds and sensory properties of tomato fruits. Metabolomics, 14(5), 57. https://doi.org/10.1007/s11306-018-1355-7
Cronan, J. E., & Luk, T. (2022). Advances in the Structural Biology, Mechanism, and Physiology of Cyclopropane Fatty Acid Modifications of Bacterial Membranes. Microbiol Mol. Biol Rev., 86(2). https://doi.org/10.1128/mmbr.00013-22
Doebley, J. F., Gaut, B. S., & Smith, B. D. (2006). The Molecular Genetics of Crop Domestication. Cell, 127(7), 1309–1321. https://doi.org/10.1016/j.cell.2006.12.006
Earley, K. W., Haag, J. R., Pontes, O., Opper, K., Juehne, T., Song, K., & Pikaard, C. S. (2006). Gateway-compatible vectors for plant functional genomics and proteomics. Plant J., 45(4), 616–629. https://doi.org/10.1111/j.1365-313X.2005.02617.x
Elshire, R. J., Glaubitz, J. C., Sun, Q., Poland, J. A., Kawamoto, K., Buckler, E. S., & Mitchell, S. E. (2011). A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species. PLoS ONE, 6(5), e19379. https://doi.org/10.1371/journal.pone.0019379
Expósito-Rodríguez, M., Borges, A. A., Borges-Pérez, A., & Pérez, J. A. (2008).
Selection of internal control genes for quantitative real-time RT-PCR studies during tomato development process. BMC Plant Biol., 8(1), 131. https://doi.org/10.1186/1471-2229-8-131
Fang, C., & Luo, J. (2019). Metabolic GWAS-based dissection of genetic bases underlying the diversity of plant metabolism. Plant J., 97(1), 91–100. https://doi.org/10.1111/tpj.14097
Fernandez-Moreno, J.-P., Levy-Samoha, D., Malitsky, S., Monforte, A. J., Orzaez, D., Aharoni, A., & Granell, A. (2017). Uncovering tomato quantitative trait loci and candidate genes for fruit cuticular lipid composition using the Solanum pennellii introgression line population. J. Exp. Bot., 68(11), 2703–2716. https://doi.org/10.1093/jxb/erx134
Fernie, A. R., Tadmor, Y., & Zamir, D. (2006). Natural genetic variation for improving crop quality. Curr Opin Plant Biol., 9(2), 196–202. https://doi.org/10.1016/j.pbi.2006.01.010
Frary, A., Nesbitt, T. C., Frary, A., Grandillo, S., Knaap, E. van der, Cong, B., Liu, J., Meller, J., Elber, R., Alpert, K. B., & Tanksley, S. D. (2000). fw2.2: A Quantitative Trait Locus Key to the Evolution of Tomato Fruit Size. Science, 289(5476), 85–88. https://doi.org/10.1126/science.289.5476.85
Gao, L., Gonda, I., Sun, H., Ma, Q., Bao, K., Tieman, D. M., Burzynski-Chang, E. A., Fish, T. L., Stromberg, K. A., Sacks, G. L., Thannhauser, T. W., Foolad, M. R., Diez, M. J., Blanca, J., Canizares, J., Xu, Y., van der Knaap, E., Huang, S., Klee, H. J., … Fei, Z. (2019). The tomato pan-genome uncovers new genes and a rare allele regulating fruit flavor. Nat. Genet., 51(6), 1044–1051. https://doi.org/10.1038/s41588-019-0410-2
Gao, W., Li, H.-Y., Xiao, S., & Chye, M.-L. (2010). Acyl-CoA-binding protein 2 binds lysophospholipase 2 and lysoPC to promote tolerance to cadmium-induced oxidative stress in transgenic Arabidopsis. Plant J., no-no.
https://doi.org/10.1111/j.1365-313X.2010.04209.x
Gao, X., & Starmer, J. (2007). Human population structure detection via multilocus genotype clustering. BMC Genet., 8(1), 34. https://doi.org/10.1186/1471-2156-8-34
Garbowicz, K., Liu, Z., Alseekh, S., Tieman, D., Taylor, M., Kuhalskaya, A., Ofner, I., Zamir, D., Klee, H. J., Fernie, A. R., & Brotman, Y. (2018). Quantitative Trait Loci Analysis Identifies a Prominent Gene Involved in the Production of Fatty Acid-Derived Flavor Volatiles in Tomato. Mol. Plant, 11(9), 1147–1165. https://doi.org/10.1016/j.molp.2018.06.003
García-Coronado, H., Tafolla-Arellano, J. C., Hernández-Oñate, M. Á., Burgara-Estrella, A. J., Robles-Parra, J. M., & Tiznado-Hernández, M. E. (2022). Molecular Biology, Composition and Physiological Functions of Cuticle Lipids in Fleshy Fruits. Plants, 11(9), 1133. https://doi.org/10.3390/plants11091133
Giavalisco, P., Li, Y., Matthes, A., Eckhardt, A., Hubberten, H., Hesse, H., Segu, S., Hummel, J., Köhl, K., & Willmitzer, L. (2011). Elemental formula annotation of polar and lipophilic metabolites using 13C, 15N and 34S isotope labelling, in combination with high-resolution mass spectrometry. Plant J., 68(2), 364–376. https://doi.org/10.1111/j.1365-313X.2011.04682.x
Grandillo, S., Tanksley, S. D., & Zamir, D. (2007). Exploitation of natural biodiversity through genomics. INT J Plant Genomics (Vol. 1). https://doi.org/10.1007/978-1-4020-6295-7_6
Hao, Z., Lv, D., Ge, Y., Shi, J., Weijers, D., Yu, G., & Chen, J. (2020).
RIdeogram: drawing SVG graphics to visualize and map genome-wide data on the idiograms. PeerJ Comput Sci, 6, e251. https://doi.org/10.7717/peerj-cs.251
Hummel, J., Segu, S., Li, Y., Irgang, S., Jueppner, J., & Giavalisco, P. (2011). Ultra Performance Liquid Chromatography and High Resolution Mass Spectrometry for the Analysis of Plant Lipids. Front in Plant Sci, 2. https://doi.org/10.3389/fpls.2011.00054
Klee, H. J. (2010). Improving the flavor of fresh fruits: genomics, biochemistry, and biotechnology. New Phytol, 187(1), 44–56. https://doi.org/10.1111/j.1469-8137.2010.03281.x
Klee, H. J., & Tieman, D. M. (2013). Genetic challenges of flavor improvement in tomato. Trends Genet., 29(4), 257–262. https://doi.org/10.1016/j.tig.2012.12.003
Klee, H. J., & Tieman, D. M. (2018). The genetics of fruit flavour preferences. Nat. Rev. Genet., 19(6), 347–356. https://doi.org/10.1038/s41576-018-0002-5
Knapp, S., Bohs, L., Nee, M., & Spooner, D. M. (2004). Solanaceae—A Model for Linking Genomics with Biodiversity. Comp Funct Genomics, 5(3), 285–291. https://doi.org/10.1002/cfg.393
Koenig, D., Jiménez-Gómez, J. M., Kimura, S., Fulop, D., Chitwood, D. H., Headland, L. R., Kumar, R., Covington, M. F., Devisetty, U. K., Tat, A. V., Tohge, T., Bolger, A., Schneeberger, K., Ossowski, S., Lanz, C., Xiong, G., Taylor-Teeples, M., Brady, S. M., Pauly, M., … Maloof, J. N. (2013). Comparative transcriptomics reveals patterns of selection in domesticated and wild tomato. Proc Natl Acad Sci, 110(28). https://doi.org/10.1073/pnas.1309606110
Korte, A., & Farlow, A. (2013). The advantages and limitations of trait analysis with GWAS: a review. Plant Methods, 9(1), 29. https://doi.org/10.1186/1746-4811-9-29
Kuhalskaya, A., Wijesingha Ahchige, M., Perez de Souza, L., Vallarino, J., Brotman, Y., & Alseekh, S. (2020).
Network Analysis Provides Insight into Tomato Lipid Metabolism. Metabolites, 10(4), 152. https://doi.org/10.3390/metabo10040152
Li, H., Peng, Z., Yang, X., Wang, W., Fu, J., Wang, J., Han, Y., Chai, Y., Guo, T., Yang, N., Liu, J., Warburton, M. L., Cheng, Y., Hao, X., Zhang, P., Zhao, J., Liu, Y., Wang, G., Li, J., & Yan, J. (2013). Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat. Genet., 45(1), 43–50. https://doi.org/10.1038/ng.2484
Li, X., Tieman, D., Liu, Z., Chen, K., & Klee, H. J. (2020). Identification of a lipase gene with a role in tomato fruit short-chain fatty acid-derived flavor volatiles by genome-wide association. Plant J., 104(3), 631–644. https://doi.org/10.1111/tpj.14951
Li-Beisson, Y., Shorrosh, B., Beisson, F., Andersson, M. X., Arondel, V., Bates, P. D., Baud, S., Bird, D., DeBono, A., Durrett, T. P., Franke, R. B., Graham, I. A., Katayama, K., Kelly, A. A., Larson, T., Markham, J. E., Miquel, M., Molina, I., Nishida, I., … Ohlrogge, J. (2013). Acyl-Lipid Metabolism. The Arabidopsis Book, 11, e0161. https://doi.org/10.1199/tab.0161
Lin, T., Zhu, G., Zhang, J., Xu, X., Yu, Q., Zheng, Z., Zhang, Z., Lun, Y., Li, S., Wang, X., Huang, Z., Li, J., Zhang, C., Wang, T., Zhang, Y., Wang, A., Zhang, Y., Lin, K., Li, C., … Huang, S. (2014). Genomic analyses provide insights into the history of tomato breeding. Nat. Genet., 46(11), 1220–1226. https://doi.org/10.1038/ng.3117
Lipka, A. E., Tian, F., Wang, Q., Peiffer, J., Li, M., Bradbury, P. J., Gore, M.
A., Buckler, E. S., & Zhang, Z. (2012). GAPIT: genome association and prediction integrated tool. Bioinformatics, 28(18), 2397–2399. https://doi.org/10.1093/bioinformatics/bts444
Liu, H., Ding, Y., Zhou, Y., Jin, W., Xie, K., & Chen, L.-L. (2017). CRISPR-P 2.0: An Improved CRISPR-Cas9 Tool for Genome Editing in Plants. Mol. Plant, 10(3), 530–532. https://doi.org/10.1016/j.molp.2017.01.003
Liu, J., Van Eck, J., Cong, B., & Tanksley, S. D. (2002). A new class of regulatory genes underlying the cause of pear-shaped tomato fruit. Proc Natl Acad Sci, 99(20), 13302–13306. https://doi.org/10.1073/pnas.162485999
Luo, J. (2015). Metabolite-based genome-wide association studies in plants. Curr Opin Plant Biol, 24, 31–38. https://doi.org/10.1016/j.pbi.2015.01.006
Luzarowska, U., Ruß, A.-K., Joubès, J., Batsale, M., Szymański, J., Thirumalaikumar, V. P., Luzarowski, M., Wu, S., Zhu, F., Endres, N., Khedhayir, S., Schumacher, J., Cordoba, S. M. C., Skirycz, A., Fernie, A. R., Li-Beisson, Y., Fusari, C. M., & Brotman, Y. (2020). Hello darkness, my old friend: 3-Ketoacyl-Coenzyme A Synthase4 is a branch point in the regulation of triacylglycerol synthesis in Arabidopsis by re-channeling fatty acids under carbon starvation. Plant Cell
Manning, K., Tör, M., Poole, M., Hong, Y., Thompson, A. J., King, G. J., Giovannoni, J. J., & Seymour, G. B. (2006). A naturally occurring epigenetic mutation in a gene encoding an SBP-box transcription factor inhibits tomato fruit ripening. Nat. Genet., 38(8), 948–952. https://doi.org/10.1038/ng1841
Martin, G. B., Brommonschenkel, S. H., Chunwongse, J., Frary, A., Ganal, M. W., Spivey, R., Wu, T., Earle, E. D., & Tanksley, S. D. (1993). Map-Based Cloning of a Protein Kinase Gene Conferring Disease Resistance in Tomato. Science, 262(5138), 1432–1436.
https://doi.org/10.1126/science.7902614
Matsuda, F., Nakabayashi, R., Yang, Z., Okazaki, Y., Yonemaru, J., Ebana, K., Yano, M., & Saito, K. (2015). Metabolome-genome-wide association study dissects genetic architecture for generating natural variation in rice secondary metabolism. Plant J., 81(1), 13–23. https://doi.org/10.1111/tpj.12681
McCouch, S. (2004). Diversifying Selection in Plant Breeding. PLoS Biol., 2(10), e347. https://doi.org/10.1371/journal.pbio.0020347
Miao, R., Lung, S.-C., Li, X., Li, X. D., & Chye, M.-L. (2019). Thermodynamic insights into an interaction between ACYL-CoA–BINDING PROTEIN2 and LYSOPHOSPHOLIPASE2 in Arabidopsis. J Biol Chem, 294(16), 6214–6226. https://doi.org/10.1074/jbc.RA118.006876
Minutolo, M., Amalfitano, C., Evidente, A., Frusciante, L., & Errico, A. (2013). Polyphenol distribution in plant organs of tomato introgression lines. Nat Prod Res, 27(9), 787–795. https://doi.org/10.1080/14786419.2012.704371
Mitchell-Olds, T. (2010). Complex-trait analysis in plants. Genome Biol., 11(4), 113. https://doi.org/10.1186/gb-2010-11-4-113
Mwenda, C. M., & Matsui, K. (2014). The importance of lipoxygenase control in the production of green leaf volatiles by lipase-dependent and independent pathways. Plant Biotechnol. (Vol. 31, Issue 5). https://doi.org/10.5511/plantbiotechnology.14.0924a
Overy, S. A. (2004). Application of metabolite profiling to the identification of traits in a population of tomato introgression lines. J Exp Bot, 56(410), 287–296. https://doi.org/10.1093/jxb/eri070
Rambla, J.
L., Medina, A., Fernández-del-Carmen, A., Barrantes, W., Grandillo, S., Cammareri, M., López-Casado, G., Rodrigo, G., Alonso, A., García-Martínez, S., Primo, J., Ruiz, J. J., Fernández-Muñoz, R., Monforte, A. J., & Granell, A. (2016). Identification, introgression, and validation of fruit volatile QTLs from a red-fruited wild tomato species. J Exp Bot, erw455. https://doi.org/10.1093/jxb/erw455
Rambla, J. L., Tikunov, Y. M., Monforte, A. J., Bovy, A. G., & Granell, A. (2013). The expanded tomato fruit volatile landscape. J Exp Bot, 65(16), 4613–4623. https://doi.org/10.1093/jxb/eru128
Ranc, N., Muños, S., Santoni, S., & Causse, M. (2008). A clarified position for Solanum lycopersicum var. cerasiforme in the evolutionary history of tomatoes (Solanaceae). BMC Plant Biol., 8(1), 130. https://doi.org/10.1186/1471-2229-8-130
Reem, N. T., & Van Eck, J. (2019). Application of CRISPR/Cas9-Mediated Gene Editing in Tomato. Methods Mol Biol (pp. 171–182). https://doi.org/10.1007/978-1-4939-8991-1_13
Rodríguez-Leal, D., Lemmon, Z. H., Man, J., Bartlett, M. E., & Lippman, Z. B. (2017). Engineering Quantitative Trait Variation for Crop Improvement by Genome Editing. Cell, 171(2), 470-480.e8. https://doi.org/10.1016/j.cell.2017.08.030
Ronen, G., Carmel-Goren, L., Zamir, D., & Hirschberg, J. (2000). An alternative pathway to β-carotene formation in plant chromoplasts discovered by map-based cloning of Beta and old-gold color mutations in tomato. Proc Natl Acad Sci, 97(20), 11102–11107. https://doi.org/10.1073/pnas.190177497
Rosso, M. G., Li, Y., Strizhov, N., Reiss, B., Dekker, K., & Weisshaar, B. (2003). An Arabidopsis thaliana T-DNA mutagenized population (GABI-Kat) for flanking sequence tag-based reverse genetics. Plant Mol. Biol, 53(1/2), 247–259. https://doi.org/10.1023/B:PLAN.0000009297.37235.4a
Rousseaux, M. C., Jones, C.
M., Adams, D., Chetelat, R., Bennett, A., & Powell, A. (2005). QTL analysis of fruit antioxidants in tomato using Lycopersicon pennellii introgression lines. Theor Appl Genet, 111(7), 1396–1408. https://doi.org/10.1007/s00122-005-0071-7
Sauvage, C., Segura, V., Bauchet, G., Stevens, R., Do, P. T., Nikoloski, Z., Fernie, A. R., & Causse, M. (2014). Genome-Wide Association in Tomato Reveals 44 Candidate Loci for Fruit Metabolic Traits. Plant Physiol., 165(3), 1120–1132. https://doi.org/10.1104/pp.114.241521
Schauer, N., Semel, Y., Balbo, I., Steinfath, M., Repsilber, D., Selbig, J., Pleban, T., Zamir, D., & Fernie, A. R. (2008). Mode of Inheritance of Primary Metabolic Traits in Tomato. Plant Cell, 20(3), 509–523. https://doi.org/10.1105/tpc.107.056523
Schauer, N., Semel, Y., Roessner, U., Gur, A., Balbo, I., Carrari, F., Pleban, T., Perez-Melis, A., Bruedigam, C., Kopka, J., Willmitzer, L., Zamir, D., & Fernie, A. R. (2006). Comprehensive metabolic profiling and phenotyping of interspecific introgression lines for tomato improvement. Nat. Biotechnol., 24(4), 447–454. https://doi.org/10.1038/nbt1192
Schilmiller, A. L., Moghe, G. D., Fan, P., Ghosh, B., Ning, J., Jones, A. D., & Last, R. L. (2015). Functionally Divergent Alleles and Duplicated Loci Encoding an Acyltransferase Contribute to Acylsugar Metabolite Diversity in Solanum Trichomes. Plant Cell, 27(4), 1002–1017. https://doi.org/10.1105/tpc.15.00087
Schmittgen, T. D., & Livak, K. J. (2008). Analyzing real-time PCR data by the comparative CT method. Nat.
Protoc., 3(6), 1101–1108. https://doi.org/10.1038/nprot.2008.73
Schwab, W., Davidovich-Rikanati, R., & Lewinsohn, E. (2008). Biosynthesis of plant-derived flavor compounds. Plant J., 54(4), 712–732. https://doi.org/10.1111/j.1365-313X.2008.03446.x
Shen, J., Tieman, D., Jones, J. B., Taylor, M. G., Schmelz, E., Huffaker, A., Bies, D., Chen, K., & Klee, H. J. (2014). A 13-lipoxygenase, TomloxC, is essential for synthesis of C5 flavour volatiles in tomato. J Exp Bot, 65(2), 419–428. https://doi.org/10.1093/jxb/ert382
Shirasawa, K., Fukuoka, H., Matsunaga, H., Kobayashi, Y., Kobayashi, I., Hirakawa, H., Isobe, S., & Tabata, S. (2013). Genome-Wide Association Studies Using Single Nucleotide Polymorphism Markers Developed by Re-Sequencing of the Genomes of Cultivated Tomato. DNA Res, 20(6), 593–603. https://doi.org/10.1093/dnares/dst033
Shockey, J., Kuhn, D., Chen, T., Cao, H., Freeman, B., & Mason, C. (2018). Cyclopropane fatty acid biosynthesis in plants: phylogenetic and biochemical analysis of Litchi Kennedy pathway and acyl editing cycle genes. Plant Cell Rep., 37(11), 1571–1583. https://doi.org/10.1007/s00299-018-2329-y
Szymański, J., Bocobza, S., Panda, S., Sonawane, P., Cárdenas, P. D., Lashbrooke, J., Kamble, A., Shahaf, N., Meir, S., Bovy, A., Beekwilder, J., Tikunov, Y., Romero de la Fuente, I., Zamir, D., Rogachev, I., & Aharoni, A. (2020). Analysis of wild tomato introgression lines elucidates the genetic basis of transcriptome and metabolome variation underlying fruit traits and pathogen response. Nat. Genet., 52(10), 1111–1121. https://doi.org/10.1038/s41588-020-0690-6
Tieman, D., Bliss, P., McIntyre, L. M., Blandon-Ubeda, A., Bies, D., Odabasi, A. Z., Rodríguez, G. R., van der Knaap, E., Taylor, M. G., Goulet, C., Mageroy, M. H., Snyder, D. J., Colquhoun, T., Moskowitz, H., Clark, D.
G., Sims, C., Bartoshuk, L., & Klee, H. J. (2012). The Chemical Interactions Underlying Tomato Flavor Preferences. Curr Biol., 22(11), 1035–1039. https://doi.org/10.1016/j.cub.2012.04.016
Tieman, D., Taylor, M., Schauer, N., Fernie, A. R., Hanson, A. D., & Klee, H. J. (2006). Tomato aromatic amino acid decarboxylases participate in synthesis of the flavor volatiles 2-phenylethanol and 2-phenylacetaldehyde. Proc Natl Acad Sci, 103(21), 8287–8292. https://doi.org/10.1073/pnas.0602469103
Tieman, D., Zhu, G., Resende, M. F. R., Lin, T., Nguyen, C., Bies, D., Rambla, J. L., Beltran, K. S. O., Taylor, M., Zhang, B., Ikeda, H., Liu, Z., Fisher, J., Zemach, I., Monforte, A., Zamir, D., Granell, A., Kirst, M., Huang, S., & Klee, H. (2017). A chemical genetic roadmap to improved tomato flavor. Science, 355(6323), 391–394. https://doi.org/10.1126/science.aal1556
Toubiana, D., Batushansky, A., Tzfadia, O., Scossa, F., Khan, A., Barak, S., Zamir, D., Fernie, A. R., Nikoloski, Z., & Fait, A. (2015). Combined correlation-based network and mQTL analyses efficiently identified loci for branched-chain amino acid, serine to threonine, and proline metabolism in tomato seeds. Plant J., 81(1), 121–133. https://doi.org/10.1111/tpj.12717
Toubiana, D., Semel, Y., Tohge, T., Beleggia, R., Cattivelli, L., Rosental, L., Nikoloski, Z., Zamir, D., Fernie, A. R., & Fait, A. (2012). Metabolic Profiling of a Mapping Population Exposes New Insights in the Regulation of Seed Metabolism and Seed, Fruit, and Plant Relations. PLoS Genet., 8(3), e1002612.
https://doi.org/10.1371/journal.pgen.1002612
Vanholme, R., Cesarino, I., Rataj, K., Xiao, Y., Sundin, L., Goeminne, G., Kim, H., Cross, J., Morreel, K., Araujo, P., Welsh, L., Haustraete, J., McClellan, C., Vanholme, B., Ralph, J., Simpson, G. G., Halpin, C., & Boerjan, W. (2013). Caffeoyl Shikimate Esterase (CSE) Is an Enzyme in the Lignin Biosynthetic Pathway in Arabidopsis. Science, 341(6150), 1103–1106. https://doi.org/10.1126/science.1241602
Vick, B. A., & Zimmerman, D. C. (1984). Biosynthesis of Jasmonic Acid by Several Plant Species. Plant Physiol., 75(2), 458–461. https://doi.org/10.1104/pp.75.2.458
Vijayaraj, P., Jashal, C. B., Vijayakumar, A., Rani, S. H., Venkata Rao, D. K., & Rajasekharan, R. (2012). A Bifunctional Enzyme That Has Both Monoacylglycerol Acyltransferase and Acyl Hydrolase Activities. Plant Physiol., 160(2), 667–683. https://doi.org/10.1104/pp.112.202135
Wang, C., Xing, J., Chin, C.-K., Ho, C.-T., & Martin, C. E. (2001). Modification of fatty acids changes the flavor volatiles in tomato leaves. Phytochem., 58(2), 227–232. https://doi.org/10.1016/S0031-9422(01)00233-3
Wen, W., Brotman, Y., Willmitzer, L., Yan, J., & Fernie, A. R. (2016). Broadening Our Portfolio in the Genetic Improvement of Maize Chemical Composition. Trends Genet., 32(8), 459–469. https://doi.org/10.1016/j.tig.2016.05.003
Wu, S., Alseekh, S., Cuadros-Inostroza, Á., Fusari, C. M., Mutwil, M., Kooke, R., Keurentjes, J. B., Fernie, A. R., Willmitzer, L., & Brotman, Y. (2016). Combined Use of Genome-Wide Association Data and Correlation Networks Unravels Key Regulators of Primary Metabolism in Arabidopsis thaliana. PLoS Genet., 12(10), e1006363. https://doi.org/10.1371/journal.pgen.1006363
Wu, S., Tohge, T., Cuadros-Inostroza, Á., Tong, H., Tenenboim, H., Kooke, R., Méret, M., Keurentjes, J. B., Nikoloski, Z., Fernie, A. R., Willmitzer, L., & Brotman, Y.
(2018). Mapping the Arabidopsis Metabolic Landscape by Untargeted Metabolomics at Different Environmental Conditions. Mol. Plant, 11(1), 118–134. https://doi.org/10.1016/j.molp.2017.08.012
Xiao, H., Jiang, N., Schaffner, E., Stockinger, E. J., & van der Knaap, E. (2008). A Retrotransposon-Mediated Gene Duplication Underlies Morphological Variation of Tomato Fruit. Science, 319(5869), 1527–1530. https://doi.org/10.1126/science.1153040
Yeats, T. H., Buda, G. J., Wang, Z., Chehanovsky, N., Moyle, L. C., Jetter, R., Schaffer, A. A., & Rose, J. K. C. (2012). The fruit cuticles of wild tomato species exhibit architectural and chemical diversity, providing a new model for studying the evolution of cuticle function. Plant J., 69(4), 655–666. https://doi.org/10.1111/j.1365-313X.2011.04820.x
Zamir, D. (2001). Improving plant breeding with exotic genetic libraries. Nat. Rev. Genet., 2(12), 983–989. https://doi.org/10.1038/35103590
Zemach, I., Alseekh, S., Tadmor-Levi, R., Fisher, J., Torgeman, S., Trigerman, S., Nauen, J., Hayut, S. F., Mann, V., Rochsar, E., Finkers, R., Wendenburg, R., Osorio, S., Bergmann, S., Lunn, J. E., Semel, Y., Hirschberg, J., Fernie, A. R., & Zamir, D. (2023). Multi-year field trials provide a massive repository of trait data on a highly diverse population of tomato and uncover novel determinants of tomato productivity. Plant J., 116(4), 1136–1151. https://doi.org/10.1111/tpj.16268
Zhang, J., Zhao, J., Xu, Y., Liang, J., Chang, P., Yan, F., Li, M., Liang, Y., & Zou, Z. (2015).
Genome-Wide Association Mapping for Tomato Volatiles Positively Contributing to Tomato Flavor. Front. Plant Sci., 6. https://doi.org/10.3389/fpls.2015.01042
Zhang, Z., Ersoz, E., Lai, C.-Q., Todhunter, R. J., Tiwari, H. K., Gore, M. A., Bradbury, P. J., Yu, J., Arnett, D. K., Ordovas, J. M., & Buckler, E. S. (2010). Mixed linear model approach adapted for genome-wide association studies. Nat. Genet., 42(4), 355–360. https://doi.org/10.1038/ng.546
Zhao, J., Sauvage, C., Zhao, J., Bitton, F., Bauchet, G., Liu, D., Huang, S., Tieman, D. M., Klee, H. J., & Causse, M. (2019). Meta-analysis of genome-wide association studies provides insights into genetic control of tomato flavor. Nat. Commun., 10(1), 1534. https://doi.org/10.1038/s41467-019-09462-w
Zhu, G., Wang, S., Huang, Z., Zhang, S., Liao, Q., Zhang, C., Lin, T., Qin, M., Peng, M., Yang, C., Cao, X., Han, X., Wang, X., van der Knaap, E., Zhang, Z., Cui, X., Klee, H., Fernie, A. R., Luo, J., & Huang, S. (2018). Rewiring of the Fruit Metabolome in Tomato Breeding. Cell, 172(1–2), 249-261.e12. https://doi.org/10.1016/j.cell.2017.12.019
Main Figures 1-8 for Genetic architecture of the tomato fruit lipidome; new insights link lipid and volatile compounds
Kuhalskaya et al.
[Figure 1 image, panels A-C. Legend groups: S. lycopersicum; S. lycopersicum var. cerasiforme; S. pimpinellifolium; wild tomato species]
Figure 1. Characterization of Natural Variation in Lipophilic Metabolites Across 550 Different Tomato Accessions. A) Numbers of lipid compounds measured by LC-MS in 550 tomato accessions and their compound classes. B) PCA of lipid content for tomato lines representing green-fruited wild species (green dots), cultivated varieties (red dots), cherry tomato varieties S. lycopersicum var. cerasiforme (pink dots), and red-fruited wild accessions of S. pimpinellifolium (blue dots). Each dot represents a single accession. C) Box plots indicating the average value of all compounds for each lipid class in diverse wild accessions (n = 29), S. pimpinellifolium (n = 30), S. lycopersicum var. cerasiforme (n = 62), and S. lycopersicum (n = 398). Significance is indicated by * p < 0.05, ** p < 0.01, *** p < 0.001 (Student's t-test).
[Figure 2 image, panels A-C]
Figure 2.
Pleiotropic Map Summarizing Quantitative Fruit Mapping. A) Chromosomal distribution of the QTL derived from GWAS, representing the combined results from the 2014 and 2015 seasons using SNP markers generated from Genotyping by Sequencing (GBS) and Whole Genome Sequencing (WGS). Colors indicate different lipid classes. The inner circle specifies the number of lipids mapped to the identified region. QTL harboring candidate genes are highlighted. B) Bar charts showing the number of significant SNPs associated with each lipid class on each chromosome. C) Number of traits associated with significant markers on each chromosome for GWAS (upper panel) and BIL (lower panel). The corresponding lipid compounds and number of QTL are provided in Supplemental Data Sets S1-5.
[Figure 3 image, panels A-E. Image label: Sl-LIP8 (Solyc09g091050)]
Figure 3. Lipid Contents are Associated with the Locus Harboring Sl-LIP8 (Solyc09g091050). A) Manhattan plots of the mGWAS results using GBS SNP data. B) Accessions were separated by the lead SNP and the average lipid level was determined.
Zero represents the homozygous genotype for the first allele, one represents the heterozygote, and two represents the homozygous genotype for the other allele. C) The average lipid level in each of the following groups: S. lycopersicum (n = 398), S. lycopersicum var. cerasiforme (n = 62), S. pimpinellifolium (n = 30), and diverse wild tomato species (n = 27). D) Sl-LIP8 transcript levels in fruits of S. lycopersicum (n = 258), S. lycopersicum var. cerasiforme (n = 56), and diverse wild tomato species (n = 6). E) Volcano plot showing the abundance of selected lipids in Sl-LIP8 KO and wild type (Fla. 8059). Lipid levels were calculated as the log2 fold change relative to Fla. 8059. Significance is indicated by * p < 0.05, ** p < 0.01, *** p < 0.001 (Student's t-test).
[Figure 4 image, panels A-H. Image label: CFAPS1 (Solyc09g090510)]
Figure 4. Phospho-, Galacto- and Glycerolipid Contents are Associated with the CFAPS1 (Solyc09g090510) Locus. A) Manhattan plots of mGWAS using GBS SNP data. B) Manhattan plots of the mGWAS using WGS SNP data. C) Lipid contents in different haplotypes based on the lead mGWAS SNP. Zero is homozygous for the first allele; one is heterozygous; two is homozygous for the second allele. D) Lipid analysis of accessions with different haplotypes. E) The average lipid level in each of the following: S.
lycopersicum (n = 398), S. lycopersicum var. cerasiforme (n = 62), S. pimpinellifolium (n = 30), and diverse wild tomato accessions (n = 27). F) CFAPS1 transcript level in fruits of S. lycopersicum (n = 240) and S. lycopersicum var. cerasiforme (n = 43). G) Abundance of selected lipids in CFAPS1 KO and control (Fla. 8059) fruits. H) Abundance of short-chain FA-VOC (C5, C6) and longer-chain FA-VOC (C7, C8) volatiles in CFAPS1 KO and control (Fla. 8059) fruits. Significance is indicated by * p < 0.05, ** p < 0.01, *** p < 0.001 (Student's t-test).
[Figure 5 image, panels A-G. Image labels: TomLLP (Solyc03g119980); 40 genes; chromosome 3, 68.41 Mb to 68.74 Mb; NEO 030, NEO 95, NEO 113, NEO 116, NEO 129]
Figure 5. Linkage Mapping Identifies a Role for TomLLP (Solyc03g119980) in Fruit Lipid Metabolism. A) Association plot of PC 38:3 obtained with linkage mapping using the S. neorickii BIL population. B) Manhattan plot of mGWAS of TAG 50:3 using WGS SNP data. C) Lipid contents of two haplotypes for accessions separated by the lead mGWAS SNP. D) S. neorickii segments introgressed into the cultivated tomato variety TA209 on chromosome 3. E) Levels of PC 36:1 and PC 38:3 in BILs sharing the S. neorickii introgression on chromosome 3 and BILs with the TA209 background. F) TomLLP transcript levels in the TomLLP overexpression line and wild type M82.
G) Level of selected lipid in the TomLLP overexpression line and M82. Significance is indicated by * p < 0.05, ** p < 0.01, *** p < 0.001 (Student's t-test).
[Figure 6 image, panels A-B]
Figure 6. CSE (At1g52760) Influences the Lipid Metabolism in Arabidopsis. A) Heatmap showing the significant (p ≤ 0.05) changes in lipid levels between the wild type and the cse knock-out (KO) and knock-down (KD) lines. B) Changes in the levels of selected lipid classes between the cse KO and KD lines and the wild type.
[Figure 7 image, panels A-H. Image labels: Solyc01g006540, TomLox (cis-mQTL); Solyc01g006540, TomLox (cis-eQTL)]
Figure 7.
Phospho-, Galacto- and Glycerolipid Content is Associated with the TomLoxC (Solyc01g006540) Locus. A) Manhattan plots for mGWAS using GBS SNP data. B) Manhattan plot for eGWAS using WGS SNP data. C) Lipid contents for accessions grouped by the lead mGWAS SNP. Zero, homozygous for the first allele; one, heterozygous; two, homozygous for the second allele. D) Lipid contents for accessions grouped by the lead eGWAS SNP. E) Average lipid levels for S. lycopersicum (n = 398), S. lycopersicum var. cerasiforme (n = 62), S. pimpinellifolium (n = 30), and diverse wild tomato species (n = 27). F) TomLoxC transcript level in fruits of S. lycopersicum (n = 258), S. lycopersicum var. cerasiforme (n = 56), and S. pimpinellifolium (n = 6). G) PCA plot of lipid levels in TomLoxC KO and wild type M82. H) Heatmap representing the abundance of short-chain FA-VOC (C5, C6) and longer-chain FA-VOC (C7, C8) in TomLoxC KO and wild type M82. Significance is indicated by * p < 0.05, ** p < 0.01, *** p < 0.001 (Student's t-test).
[Figure 8 image. Legend: PCs; PEs; TAGs; DAGs; MGDGs; DGDGs; volatiles; the transcript level of lipid-related genes; positive correlation; negative correlation]
Figure 8. Metabolite-Transcript-Volatile Correlation-based Network.
Each node represents a metabolite or gene transcript; edges connecting two nodes show a correlation (R ≤ -0.3 or R ≥ 0.3) between the two nodes. In total, the network is composed of 185 nodes and about 335 edges assembled into three large groups: lipophilic metabolites comprise 74 nodes, gene expression data comprise 107 nodes, and volatile organic compounds (VOCs) comprise four nodes (Supplemental Data Set S14). There are 672 genes with homology to genes known to be involved in lipid metabolism (Garbowicz et al., 2018). The transcript levels used to construct the network were taken from Zhu et al. (2018) and the VOC data from Tieman et al. (2017); all lipid metabolite data derive from the current study.
Supplemental Figures 1-11 for Genetic architecture of the tomato fruit lipidome; new insights link lipid and volatile compounds
Kuhalskaya et al.
[Supplemental Figure S1 image labels: S. lycopersicum; S. lycopersicum var. cerasiforme; S. pimpinellifolium; wild relatives; lipidomic profiling; candidate genes & biological validation; 117 BILs x 3 replicates x two experiments; 550 GWAS panel x two experiments]
[Supplemental Figure S1 image labels: correlation analysis (RNA-seq, lipids and volatiles); BILs]
Supplemental Figure S1. Schematic model of the experiments conducted to investigate genes underlying lipid metabolism in tomato fruit pericarp, applying forward genetic approaches to association panels represented by A) the S. neorickii biparental population, and B) unrelated cultivated tomato genotypes for the genome-wide association study (GWAS).
[Supplemental Figure S2 image, panels A-B. Labels: tomato accessions; GWAS season 2014; GWAS season 2015; DAGs, DGDGs, MGDGs, PCs, PEs, TAGs]
Supplemental Figure S2. Heatmap of lipid levels across 550 accessions of the GWAS panel. The data represent lipidomic profiling of material harvested from greenhouse-grown plants in two consecutive years, A) 2014 and B) 2015. For each lipid species, the mean lipid level was calculated, and the level of that lipid in each accession was normalized by dividing it by this mean. Each season was normalized separately and is presented on a logarithmic scale (log2). Red and blue regions indicate levels lower and higher than the average of each lipid species, respectively.
Supplemental Figure S3. Heatmap of lipid levels across 550 accessions of the GWAS panel belonging to S. lycopersicum, S. lycopersicum var. cerasiforme, S.
pimpinellifolium, and wild tomato groups. For each lipid species, the mean lipid level was calculated, and the level of that lipid in each accession was normalized by dividing it by this mean. The data are presented on a logarithmic scale (log2). Red and blue regions indicate levels lower and higher than the average of each lipid species, respectively.
Supplemental Figure S4. Heat map of lipid profiling across S. neorickii backcross inbred lines (BILs). The data represent lipidomic profiling of material harvested from the S. neorickii BIL population: A) heterozygous and B) homozygous lines. For each lipid species, the mean lipid level was calculated, and the level of that lipid in each BIL was normalized by dividing it by this mean. Each season was normalized separately and is presented on a logarithmic scale (log2). Red and blue regions indicate levels lower and higher than the average of each lipid species, respectively. White regions indicate that many of the chromosomal segment substitutions do not affect lipid levels.
[Supplemental Figure S5 image labels: HTZ S. neorickii BIL; HMZ S. neorickii BIL]
Supplemental Figure S5. Chromosomal distribution of identified mQTL.
A) Idiogram representing the chromosomal distribution of the mQTL resulting from GWAS of material harvested in two consecutive years using GBS and WGS SNP data. B) Chromosomal distribution of the mQTL found in the BIL mapping of heterozygous and homozygous lines.
[Supplemental Figure S6 image, panels A-E. Image labels: BIL: TAG 54:8; 61 genes; chromosome 6, 2.66 Mb to 3.29 Mb; NEO 002, NEO 91, NEO 99, NEO 100, NEO 103, NEO 142]
Supplemental Figure S6. Linkage mapping of lipids in the BIL population revealed an mQTL comprising 61 genes, including Solyc06g008920, which encodes an acetyl-CoA synthetase. A) Association plot of TAG 54:8 obtained with linkage mapping using the S. neorickii BIL population. B) S. neorickii segments introgressed into the cultivated tomato variety TA209 on chromosome 6. C) Levels of DAG 36:5, DGDG 36:6, and TAG 54:8 in BILs sharing the S. neorickii introgression on chromosome 6 and BILs with the TA209 background. D) Average lipid levels for S. lycopersicum (n = 388), S. lycopersicum var. cerasiforme (n = 61), S. pimpinellifolium (n = 30), and diverse wild tomato species (n = 25). E) Solyc06g008920 transcript level in fruits of S. lycopersicum (n = 258), S. lycopersicum var. cerasiforme (n = 56), and S. pimpinellifolium (n = 6). Significances (p-values) are indicated by letters (Student's t-test).
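The heatmap captions for Supplemental Figures S2 to S4 describe a two-step scaling: each lipid value is divided by the mean of that lipid species, and the ratio is then shown on a log2 scale, one season at a time. A minimal Python sketch of that calculation follows. The function names and the toy values are hypothetical, and the sketch assumes strictly positive lipid levels, since the captions do not say how zeros or missing values were handled.

```python
import math
from statistics import fmean


def normalize_lipid(levels):
    """Return log2(value / mean) for one lipid species across samples.

    Assumption: every value is strictly positive.
    """
    mean_level = fmean(levels)
    return [math.log2(value / mean_level) for value in levels]


def normalize_season(table):
    """Normalize each lipid species within a single season.

    `table` maps a lipid name to its list of levels, one per accession.
    Calling this once per season mirrors "each season was normalized
    separately" in the captions.
    """
    return {lipid: normalize_lipid(levels) for lipid, levels in table.items()}


# Hypothetical toy values for two lipid species in three accessions.
season_2014 = {"TAG 54:8": [1.0, 2.0, 3.0], "PC 38:3": [4.0, 4.0, 4.0]}
normalized = normalize_season(season_2014)
print(normalized)
```

On this scale a value of 0 means the accession sits at the species mean, negative values fall below it, and positive values lie above it, which is what the heatmap colors encode.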
[Supplemental Figure S7 image content]
Solyc09g090510 (WT), 5' to 3': ATGATTATTTTTTTTATTGCAGTTATAATATTTGATTGAAGACGAGAAAA ATGAAAGTAGCAATTGTAGGGGCA GGG--(87-bp)--TAAAACCGTTACCGTTAA CGG
Solyc09g090510 (-166/+19), 5' to 3': ATGAT--(-166-bp)-- CTTGACCTTAAAACCGTTA CCGTTACCGTTAA CGG
Other image labels: sgRNA1; sgRNA2; PAM (twice); +5; +133; Exon1; COG2907; Cfa; +1; +862; Solyc09g090510 (WT); Solyc09g090510 (-166/+19); "Totally eliminated"
Supplemental Figure S7. Construction and characterization of CFAPS1-edited lines. The CFAPS1 KO line (Fla. 8059 background) exhibits a deletion of 166 bp and an insertion of 19 bp in the first exon.
Supplemental Figure S8. Phylogenetic analysis of the caffeoyl shikimate esterase (CSE) family. Coding sequences of genes with confirmed function as CSE or putatively annotated as CSE were used for the construction of a phylogenetic tree. The frame highlights two genes, the tomato TomLLP and CSE from Arabidopsis. Gene IDs are specified in Supplemental Data Set S9. The phylogenetic tree was reconstructed with the neighbor-joining method.
[Supplemental Figure S9 image labels: NEO123.1; NEO124.1; NEO124.2; NEO123.3; NEO120.1; NEO129.1]
Supplemental Figure S9. Expression level of TomLLP across six S. neorickii BILs.
TomLLP expression was monitored by PCR in six BILs. In the region containing TomLLP, three of these BILs have the S. neorickii background and three have the TA209 background.

(Figure S10 panel labels: MAG, DAG, TAG; monoacylglycerol acyltransferase (MGAT); lysophospholipids; phospholipids; lysophospholipase activity; Arabidopsis orthologue (At1g52760); Solyc06g008920; lipid oxylipins; "+" denotes increased level)

Supplemental Figure S10. Metabolic pathways involving the identified lipid-related gene candidates. A) Schematic representation of fatty acid synthesis by acetyl-CoA synthetase (Solyc06g008920) using pyruvate as a substrate. B) Schematic representation of the pathway of volatile synthesis from free fatty acids liberated from triacylglycerol by class III lipase (Solyc09g091050). C) Schematic representation of the conversion of membrane lipids (phospho- and galactolipids) to acylglycerols via cyclopropane-fatty-acyl-phospholipid synthase, with subsequent volatile production. D) Schematic representation of lipid oxylipin and volatile production through the lipoxygenase enzymes (Solyc01g006540) in the lipase-independent pathway. E) The role of CSE (At1g52760, orthologue of Solyc03g119980) in the lignin biosynthetic pathway, with additional identified acyltransferase and hydrolase activities.
(Figure S11 panel labels: S. lycopersicum (cv. TA209); S. neorickii (LA2133); genotype)

Supplemental Figure S11. Breeding scheme and genetic map for backcross inbred lines (BILs). A) First cross: pollen from S. neorickii was placed onto the stigma of cv. TA209 to obtain F1 plants. Additional backcrosses with cv. TA209 were performed to reduce the S. neorickii genome introgression in the BILs. For each generation, the number of plants is shown in parentheses. A final cross of the homozygous BILs with cv. TA209 was performed to obtain heterozygous lines (Brog et al., 2019). B) Schematic representation of the S. neorickii backcross inbred lines. The BILs harbor on average 4.3 introgressions per line, with a mean introgression length of 34.7 Mbp, allowing the division of the genome into 340 bins and enabling rapid trait mapping (Brog et al., 2019).
