{"paper_id":"21be25d4-5bc2-48ea-97cf-2416cf5c0a64","body_text":"1\t\nLinage priming and cell type proportioning depends on the \ninterplay between stochastic and deterministic factors  \n \nWilliam M. Salvidge1, Chris Brimson1, Nicole Gruenheit1, Li-Yao Huang2, Catherine J. \nPears2, Jason B. Wolf3 and Christopher R.L. Thompson1* \n \n1 Centre for Life's Origins and Evolution, Department of Genetics, Evolution and \nEnvironment, University College London, Darwin Building, Gower Street, London, \nWC1E 6BT, UK \n \n2 Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 \n3QU, UK \n \n3 Milner Centre for Evolution and Department of Life Sciences, University of Bath, \nClaverton Down, Bath, BA2 7AY, UK  \n \n \n* Author for correspondence \n  \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n2\t\nAbstract \n \nEmbryonic stem cells (ESCs) can break symmetry and differentiate along different \nlineages, even when exposed to a seemingly identical environment. It is thought that \nthis priming of cells towards different lineages is due to cell-cell variation, although the \nunderlying mechanisms are poorly understood. To address this, we exploit the \ntractability of the social amoeba Dictyostelium discoideum, where cell fate choice also \ndoes not depend on spatial cues. We develop and test a model to explain quantitative \nexperimental single cell observations of probabilistic differentiation. The model \nsuggests that cell cycle position affects lineage choice, as previously shown, but that \nstochastic cell-cell variation also plays a key role. Single cell sequencing reveals genes \nstrongly associated with fate choice exhibit extensive stochastic cell-cell expression \nvariation. Like lineage priming genes in ESCs, they are associated with specific \nepigenetic modifications, which when perturbed affect their expression and disrupt fate \nchoice. We suggest this represents an adaptive mechanism that increases \ndevelopmental robustness against perturbations that affect deterministic signals.  \n \n  \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n3\t\nIntroduction \n \nThe robust spatial and temporal control of symmetry-breaking and cell type \ndifferentiation is a fundamental feature of developmental systems. The mechanisms \nunderlying the spatial control of type specialization have been extensively studied\t\n(Slack, 2014). However, there are many examples of robust cell fate choice in the \nabsence of spatial cues. Examples include embryonic stem cells (ESCs) in culture \n(Pauklin and Vallier, 2013), lineage choice in the mouse blastocyst (Saiz et al., 2016; \nYamanaka et al., 2010) and ‘bet hedging’ strategies in microbial populations (Maamar \net al., 2007; Süel et al., 2007). In these cases, differentiation is probabilistic and is \nthought to depend on cell-cell heterogeneity. Intrinsic variation in cell physiology, redox \nstate and gene expression arising from differences in cell cycle position, cell size, \ninheritance of cell contents upon cell division and transcriptional bursting have all been \nlinked to fate choice (Huh and Paulsson, 2011; Raj et al., 2006). These factors can \nproduce robust cell type proportioning if the probability of cells being in different states \nis predictable at the cell-population level (if the population size is sufficiently large).  \nDespite the widespread use of heterogeneity to facilitate developmental \ndecision making, we still have a limited understanding of the underlying mechanisms \nor how variation is tuned to result in robust proportioning. Work has centred around \nthe identification of factors that ‘prime’ cells for differentiation before they commit to a \nparticular cell fate and express genes associated with specific cell lineages. For \nexample, genes associated with ESC differentiation exhibit bivalent H3K4me3 and \nH3K27me3 epigenetic marks, suggesting a role in coordinating their expression\t\n(Azuara et al., 2006; Bernstein et al., 2006). H3K4Me3 deposition has also been shown \nto be sensitive to redox state, thus providing a potential explanation for the link \nbetween reactive oxygen species and ESC differentiation\t(Ulfig and Jakob, 2024). It \nhas also been suggested that the level of stochasticity in the expression of lineage \nassociated genes could influence the number of cells that are in the primed state and \nthus the number of  differentiating cells (Desai et al., 2021). Finally, deterministic \ndifferences between cells, such as cell cycle position have been shown to influence \nthe response to differentiation cues and lineage choice (Pauklin and Vallier, 2013). \n The social amoeba Dictyostelium discoideum provides a uniquely tractable \nmodel system to study the mechanisms underlying lineage priming and the control of \nrobust probabilistic differentiation. In response to starvation D. discoideum cells \nundergo a multicellular developmental programme to build a fruiting body consisting \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n4\t\nof hardy spore cells and dead stalk cells that aid dispersal from deteriorating \nenvironments. There is assumed to be selective pressure to ensure that the ‘fittest’ or \nmost energy rich cells are chosen as spores, which requires mechanisms that bias \ndifferentiation. Furthermore, there is assumed to be strong selective pressure to \nensure robust output of ‘optimal’ cell type proportioning.  Excess stalk cell production \nis costly because it reduces the number of viable spores available for dispersal, while \nunderproduction of stalk can compromise fruiting body architecture, limiting the \nsuccess of spore dispersal (Buttery et al., 2009; Madgwick et al., 2018; Rodrigues and \nGardner, 2022; Wolf et al., 2015).  \n Symmetry breaking and initial cell type differentiation in D. discoideum does \nnot depend on spatial cues. Instead, cell fate choice has been associated with \ndifferences in cell cycle position at the time of starvation (Gomer and Firtel, 1987; \nGruenheit et al., 2018). Cell cycle position affects the threshold of responsiveness to \ndiffusible inducers of spore or stalk cell differentiation, such as cAMP and DIF-1 (even \nthough all cells experience the same amount of each signal) (Chattwood et al., 2013; \nGruenheit et al., 2018). Under this scenario, stochastic variation in cell cycle length \ngenerates a steady-state distribution of cell cycle positions. Optimal proportioning of \nstalk and spore cells is presumably reached because cell fate propensity has been \nevolutionarily tuned through the cell cycle when a sufficiently large number of cells is \nrandomly sampled. It is likely that this provides an adaptive mechanism to select the \nleast fit cells to differentiate as stalk\t(Zahavi et al., 2018). Cells that have just divided \nare primed to become stalk and are more sensitive to stalk inducing signals (and less \nsensitive to spore inducing signals). Post mitotic cells have just divided their cytoplasm \nand are likely to be energy poor compared to cells in G2. This idea is supported by the \nfinding that glucose depletion causes cells to arrest around mitosis and differentiate as \nstalk cells (Gruenheit et al., 2018; Thompson and Kay, 2000).  \nThese observations provide insights into the effects of cell cycle position \nheterogeneity on cell fate at a population level. However, we still have a poor \nunderstanding of how single cells interpret heterogeneity to result in probabilistic \ndecision making or how heterogeneity is tuned to result in robust cell type proportions. \nTo understand how cell-cell variation can be exploited to control fate choice and \ngenerate robust proportioning in the absence of cell-cell communication, we have used \na combination of mathematical modelling and experimentation. The model is based on \nobservations of probabilistic differentiation in D. discoideum. We find quantitative \nbehaviour can only be explained by a model that not only encompasses deterministic \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n5\t\ncell cycle effects, but also the effects of stochastic cell-cell variation on the \nresponsiveness of cells to differentiation inducing signals. Experimental observations, \nusing single cell RNA-seq, reveals a set of genes that show extensive stochastic gene \nexpression variation at the time of starvation. These genes show many hallmarks of \nlineage priming genes. They are upregulated as cells undergo a programme of \ndifferentiation and exhibit cell type specific gene expression. A genome wide screen \nreveals that these genes, together with cell cycle associated genes, are also more \nlikely to be required for cell type differentiation. Stochastically expressed genes also \nexhibit differential H3K4 methylation. Perturbation of H3K4 methylation preferentially \naffects the level of their expression (and thus cell-cell variation), as well as cell type \nproportioning during development. These observations suggest that deterministic \nvariation in cell cycle position acts together with stochastic variation in developmental \ngenes to control probabilistic cell fate choice. Finally, we suggest this system is \nevolutionarily advantageous because it allows the use of deterministic information on \nthe status or quality of cells (e.g., their energetic state), yet protects against potentially \ncatastrophic effects of large-scale perturbations that cause cells to exhibit inadequate \nvariation in deterministic properties. These studies provide further evidence supporting \nthe key role of stochastic variation in developmental gene expression and cellular \ndecision making.  \n \nResults \n \nA combination of stochastic and deterministic variation explains lineage priming \nand fate choice in D. discoideum \n \nTo understand the mechanisms underlying lineage priming and fate choice, we first \nconsidered an earlier deterministic model that was designed to explain cell cycle \ndependent fate propensity in D. discoideum\t (Gruenheit et al., 2018). The model \nassumes all cells experience similar levels of a stalk-inducing factor, such as DIF-1. \nDIF-1 is easily diffusible and the rapid movement of cells within an aggregation result \nin a well-mixed population. Furthermore, levels of DIF-1 are regulated by negative \nfeedback that can buffer perturbations in DIF-1 synthesis (Insall et al., 1992). \nExperimental observations instead suggest cells differ in their intrinsic threshold of \nresponsiveness to this signal, with those above this threshold deterministically \ndifferentiating into pre-stalk cells (Chattwood and Thompson, 2011; Stevense et al., \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n6\t\n2010). Consistent with the wealth of experimental evidence, the model assumes that \nthese differences are partly determined by cell cycle\tposition (Gomer and Ammann, \n1996; Gomer and Firtel, 1987). It has been proposed that this is because a cell cycle \nassociated factor (CCAF) rapidly accumulates at mitosis and increases stalk \npropensity (Gruenheit et al., 2018). Following mitosis, the level of CCAF decays, \nresulting in a decreasing propensity of cells to adopt stalk cell fate (and increasing \nspore propensity) as the cell cycle progresses. This model provides a relatively good \nfit to observed population level cell fate proportions. However, a deterministic factor \ngoverning cell fate would be expected to drive cells in the same state towards the same \nfate. Instead, however, cells at the same cell cycle position can adopt different fates \n(Gruenheit et al., 2018). Furthermore, a truly deterministic model would be expected \nto result in a discrete change from stalk to spore fate at the point in the cell cycle where \nCCAF level drops below the threshold that results in stalk fate (see Figure 1 parts A.i \nand A.ii). Instead, there is a gradual decrease in stalk propensity after mitosis.  \nTo develop a model that generates the sort of probabilistic differentiation \nobserved in D. discoideum we first incorporated the deterministic influence of the cell \ncycle on responsiveness or sensitivity to stalk inducing factors (CCAF) (Figure 1A.i):   \n \n!! = !\" − $% (1) \n \nwhere !\" is the starting level of CCAF, $ is its rate of decay, and % is the amount of \ntime after the end of the previous mitosis. CCAF is measured on the scale of its \nbiological effect rather than in terms of its molecular concentration. In theory, the decay \nin the level of CCAF could show a range of shapes (e.g., exponential/convex, linear, \nor parabolic/concave), which depends both on how the underlying factor(s) decay at \nthe molecular level and how molecular concentrations translate into biological effects. \nWe assume linear decay both for simplicity, and because fitting alternative models to \nexperimental data indicate that it is the best fit model (see Supplementary text). Cells \nwith a CCAF level above a threshold value (denoted &) adopt stalk cell fate (in \nresponse to stalk-inducing factors). As a result, stalk propensity at time % after the end \nof the previous mitosis ('!) is determined by the proportion of cells with a value of \nCCAF (given by !!) above &. This model would produce the steplike pattern of cell \nfate through the cell cycle (as described above), where all cells adopt stalk cell fate \nduring the period of the cell cycle where CCAF levels are above the threshold (!! ≥ &) \nand switch to spore cell fate when CCAF levels decay below the threshold (!! < &) \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n7\t\n(Figure 1A.ii). This steplike pattern will occur irrespective of how CCAF levels change \nthrough the cell cycle (i.e. regardless of whether it is linear or nonlinear) since cells in \nthe same cell-cycle state can only be above or below the CCAF threshold that results \nin stalk fate.  \nA purely deterministic model clearly does not recapitulate the quantitative shift \nin cell fate propensity observed experimentally (Figure 1A.iii). We next reasoned that, \nfor the system to show a deterministic change through the cell cycle, but result in a \nprobabilistic output (i.e., stalk propensity), it must integrate another source of cell-cell \nvariation that is independent of the cell cycle (along with a deterministic cell-cycle \ndependent factor, CCAF). We therefore hypothesised that there is also a stochastically \nvariable cell-cycle independent factor (CCIF) contributing to sensitivity to stalk-\ninducing factors. The addition of such intrinsic variability makes cell fate probabilistic \nby allowing otherwise identical cells to adopt a distribution of possible responsiveness \nto stalk-inducing factors. We further hypothesized that this noisiness across cells \narises from stochastic expression of a set of genes contributing to CCIF. For this, we \nadapt the logic of the telegraph model of gene expression (Peccoud and Ycart, 1995), \nwhere expression of gene * at time %,  +# = {0, 1}, is a Bernoulli process with probability \n1# that +# = 1 (meaning that 1# is scaled to the size of the temporal interval). Although \ngenes might logically vary in the value of 1# , such variability does not impact our \nresults, so we simply consider the average probability of being expressed across all \ngenes: 1̅. When gene * is expressed, it contributes 3#  to the total value of CCIF, \nmaking its realised contribution 4# = +# 3# and expected contribution in any time interval \n5[4# ] = 1#3# = 1̅3#. We assume that genes vary in their contribution to CCIF according \nto 3~9[3̅, :$]. The total level of CCIF at time % is the sum of the 4# values for the set \nof ; stochastically expressed genes: < = ∑ 4#\n%\n#&' = ;1̅3̅ and (following the law of total \nvariance\t (Cornell and Benjamin, 1970)) its variance would be >$ = ;1̅:$ +\n1̅(1 − 1̅);3̅$ . This scenario results in a normal distribution of CCIF levels, \nCCIF~9[<, >$], which we use to model the properties of the aggregate effect of all \nstochastically expressed genes on CCIF.   \nIf we were to only consider the contribution of this stochastic variation, the \nproportion of cells adopting stalk fate ('!) would, be determined by the proportion of \nthe distribution that exceeds the threshold value (&) that leads to stalk fate (Figure \n1B.i), which is given by the complementary cumulative distribution of CCIF values \nabove the threshold &: \n \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n8\t\n'! =\n'\n$ B1 + erfF(< − &)/B>√2JKJ, (2) \n \nwhere erf is the Gauss error function, defined as erf(L) = B2 √M⁄ J ∫ +()!\nP3\n*\n\" . Under \nthis scenario, the proportion of prestalk cells would remain constant throughout the \ncell-cycle (Figure 1B.ii), which does not follow the observed progressive shift in stalk \nfate probability after mitosis (Figure 1B.iii).  \nWe integrate the deterministic influence of cell-cycle state and noisy gene \nexpression by considering that the total level of sensitivity to stalk-inducing factors is \nthe sum of the value of CCAF and CCIF. We achieve this integration by assuming that \nthe mean sensitivity across cells has a constant component (CCIF) given by < (cf. eqn. \n2), and a cell-cycle dependent component (CCAF) given by !!  (cf. eqn. 1). This \nproduces an expression for mean sensitivity at time % (Q!) that is analogous to equation \n(1): Q! = < + !\" − $% = Q\" − $%, where Q\" = < + !\" (i.e., is the sum of the mean CCIF \nvalue and starting CCAF value). Note that the change in average sensitivity through \nthe cell cycle depends solely on changes in CCAF levels ($%), but the change in the \naverage stalk propensity does not change linearly through the cell cycle because it \ndepends on the proportion of cells with a value above & (see Figure 1.C.i). Using this \nexpression we can rewrite the overall stalk propensity ('!) at time % by replacing < in \nequation (2) with Q! (see Figure 1C.i): \n \n'! = 1\n2 R1 + erf SQ\" − $% − &\n>√2\nTU (3) \n \n(which represents the integral over the distribution of sensitivity values above \nthe value & at time %). This ‘stochastic-deterministic’ model generates a non-linear \nchange in stalk propensity through time (Figure 1C.ii) that matches the probabilistic \nnature of cell stalk fate after mitosis (Figure 1C.iii). The model is also flexible in that, \nat its limits, it can produce anywhere from the step function (Figure 1A) expected for a \npurely deterministic process (i.e., as > → 0) and the constant proportioning (Figure 1B) \nexpected for a purely stochastic process (i.e., as $ → 0) (see Supplementary text). To \nfacilitate fitting of the model in equation (3) to experimental data, we can reduce the \nparameter space by combining the three constants (Q\", &, and >) into a single term, \ndenoted Q\"\n∗, which represents a sort of reference point for the model. This rescaling is \nachieved by assuming that the parameters are all measured in standard deviation units \nand that sensitivity is measured as a distance from the threshold that leads to stalk cell \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n9\t\nfate (&), such that Q\"\n∗ = Q! − &. This results in a two parameter version of equation (3), \n'\n$ B1 + erfF(Q\"\n∗ − $%) √2⁄ KJ, that can be fitted to the stalk propensity data (see below) to \nestimate the two parameters ( Q\"\n∗  and $ ), where the parameter estimates are in \nstandard deviation units (meaning Q\"\n∗ represents how far mean sensitivity is from the \nthreshold, measured as a Z-score). To test this model empirically, we started by fitting \nthe model to existing data on stalk propensity through the cell cycle (see \nSupplementary Data for the original raw data and adjusted values used in the \nanalysis). These data come from an experiment (Gruenheit et al., 2018) where time \nlapse microscopy was used to group cells according to their cell cycle position at the \ntime of starvation. The fate of each cell was monitored using live cell reporter genes. \nFor our analysis, we used the measures of relative stalk propensity over six hours \n(after which cells stochastically re-enter mitosis). We find that the stochastic-\ndeterministic model provides an excellent fit to the data (adjusted R-squared = 0.92, \nAIC = −56.37), yielding estimates of Q\"\n∗  = 0.57 and $  = 0.41 (Figure 1C.iii) that \ncorrespond to an expected steady-state stalk propensity of 0.35 and an expected \nstarting propensity of 0.72. Moreover, the stochastic deterministic model provides a \nsignificantly better fit to the data than the previously proposed model for exponential \ndecay of CCAF (Gruenheit et al., 2018) (it is 98.35 times more likely; see \nSupplementary text and Supplementary Figure 1). We also evaluated the support for \nour model assumptions by comparing the fit of other models derived using alternative \nassumptions. We evaluated the assumption of linear decay in CCAF by comparison to \nmodels with exponential, quadratic, and cubic decay functions (which together capture \na broad range of possible shapes), and the assumption of Gaussian variation in CCIF \nby fitting a model based on gamma distributed variation (which can capture a diversity \nof distributions, including ones approximating normality). These model comparisons all \nsupport our model assumptions (see the Supplementary text for the methods and \nresults of these comparisons).  \n \nCell cycle independent stochastic gene expression variation is extensive in \ngrowing cells \n \nThe stochastic-deterministic model suggests that cell fate choice in D. discoideum \nshould depend on stochastic variation in the expression of genes associated with fate \nchoice, as well as deterministic cell cycle dependent cell-cell variation. Single cell \nRNA-seq (scRNA-seq) data from D. discoideum cells prior to starvation and exposure \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n10\t\nto differentiation inducing signals has previously been used to identify cell cycle \nposition associated gene expression variation (Gruenheit et al., 2018). Because these \ndata are from a relatively small number of cells, but sequenced to high depth, it allowed \nus to determine the extent to which there is also cell cycle independent gene \nexpression variation.  We first determined the coefficient of variation (CV2) of \nexpression for all genes. As expected, this tends to decrease as average expression \nlevel increases (Supplementary Figure 2). When this trend is accounted for, genes \nwith greater cell-cell variability than expected for their level of expression (FDR < 0.05) \ncan be identified (Figure 2A). Next, we used Area Under the Receiver Operating \nCharacteristic Curve (AUROC) to identify genes with variation that can be explained \nbecause their variation in expression is associated with cells previously shown to be \nin early G2, late G2 or M/S phase (Supplementary Figure 2). Despite using a \nconservative AUROC threshold (>0.65), when these genes were removed, many \ngenes remain that exhibit variation that cannot be explained by differences in cell cycle \nposition (Figure 2B). Unlike cell cycle genes, principal component analysis does not \nresult in cell groupings (Figure 2C), which suggests that this represents stochastic \nvariation rather than a consequence of hitherto unknown extrinsic cues. Finally, we \nwere able to determine the extent to which cell cycle associated genes exhibit \nstochastic gene expression. For this, the CV2 of each gene was recalculated within \ngroups of cells from each of the different cell cycle stages. Most cell cycle dependent \ngenes were found to exhibit greater within group variation than expected (Figure 2D). \nThis is not due to the low level of expression at specific cell cycle stages, as variation \nis higher at all stages, including when they are maximally expressed (Supplementary \nFigure 2). These results thus reveal that stochastic gene expression variation is \nwidespread in growing cells. \n \nStochastically expressed genes are associated with cell fate determination  \n \nTo test whether genes that exhibit variability in their expression are associated with \ncell fate choice, we compared the developmental timing of expression of stochastically \nexpressed genes, cell cycle associated genes and non-variable genes. The average \nexpression of each gene during developmental stages was compared to expression \nduring growth to generate a developmental index (where 0 is exclusive to growth and \n1 is exclusive to development) (de Oliveira et al., 2019). Stochastic genes were greatly \nenriched (p≤0.001, binomial test) for developmental genes (index >= 0.9), whereas cell \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n11\t\ncycle and non-variable genes showed no enrichment compared to the genome wide \nexpectation (p>0.5, binomial test) (Figure 3A). Next, we tested whether any of these \ngroups of genes were associated with prestalk or prespore cell fate. First, RNA-seq \ndata from prestalk and prespore cells was analysed to identify genes that exhibit cell \ntype specific gene expression (where 0 is exclusive to prespore cells and 1 is exclusive \nto prestalk cells). Again, stochastic genes were significantly enriched in cell type \nspecific genes (p≤0.001), with both prestalk (p≤0.001, binomial test) and prespore \ngenes (p= 0.013) contributing to this enrichment (Figure 3B). Next, we determined \nwhether these genes are more likely to be required for fate choice. We performed an \nunbiased large scale REMI-seq forward genetic screen (Gruenheit et al., 2021) to \nidentify genes required for stalk cell differentiation. REMI-seq technology permits the \nabundance of thousands of mutants to be simultaneously quantified before or after a \nselection regime, such as selection imposed by the ability to undergo prestalk cell \ndifferentiation. The REMI-seq library was plated at low cell density and treated with \ncAMP to induce competence to differentiate followed by treatment with DIF-1 to induce \nprestalk cell differentiation (Figure 3C). Prestalk cells terminally differentiate as dead \nstalk cells after prolonged DIF-1 incubation, and thus surviving mutants with defects in \nprestalk cell differentiation can be enriched. After 2 and 6 rounds of growth and \nselection, gDNA was prepared from each biological replicate for sequencing and \nquantitative analysis. In order to ensure that enrichment was due to a failure to respond \nto DIF-1, rather than increased growth rate, we compared this set of mutants to a \ncontrol selection in which cells were simply taken through an equivalent number of \ngenerations of growth (Gruenheit et al., 2021). An additional control screen was \nperformed to identify mutants that are incompetent to differentiate at all (as either stalk \nor spore) because they fail to respond to cAMP. Cells were incubated in the presence \nof the cAMP analogue, 8-Br-cAMP, which triggers spore cell differentiation. Cells that \ncannot differentiate as spores were killed by detergent treatment, thus reducing their \nfrequency in the population. REMI-seq identified 244 mutants with defects in stalk cell \ndifferentiation (Figure 3D, Supplementary Data). A subset of these mutants was \nrecreated in the parental strain and tested in stalk cell induction assays and most \nmutants (8/9) exhibited defects in stalk cell differentiation (Figure 3E), illustrating the \nquantitative success of the REMI-seq approach. This allowed us to test whether genes \nrequired for differentiation are more likely to exhibit cell to cell gene expression \nvariation. REMI mutants were assigned to 199 genes with intragenic insertions, or \nmutations where the REMI insertion lay within upstream promoter sequences (within \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n12\t\n500bp of the transcription start site). These genes were significantly enriched for genes \nwith variable expression (p≤0.001, binomial test) (Figure 3F). This is due to both cell \ncycle associated and stochastically expressed genes, as the relative number of genes \nidentified in each class does not vary significantly from expected (chi square, p=0.77). \nThese genes also exhibit greater variability than expected when the CV2 was \nnormalised to their expression (*** p≤0.001, t-test) (Figure 3F). These results suggest \ngene expression variation is a feature of genes associated with fate choice and cell \ntype proportioning. \n \nStochastically expressed developmental genes exhibit differential patterns of \nH3K4 methylation  \n \nFate choice in D. discoideum shares features with lineage priming in embryonic stem \ncells, including the extensive cell-cell variation of genes associated with differentiation \n(Chang et al., 2008). The expression of genes associated with the differentiation is \nthought to be under epigenetic control due to the presence of specific patterns of \nepigenetic marks, including the co-occurrence of H3K4me3 and H3K27me3.  The role \nof these modifications is not fully understood due to the difficulty with which epigenetic \nmarks can be altered at a genome wide scale in higher organisms. The apparent \nabsence of polycomb like proteins in D. discoideum suggests H3K27me3 modification \nis unlikely to play any role (Kaller et al., 2006). However, H3K4 mono, di, or tri-\nmethylation (H3K4Me1-3), which is dependent on Set1/COMPASS, is present (Chubb \net al., 2006b). Similarly, H3K9/K14 acetylation is present, which is consistent with the \nidea that H3K4me3 targets the Gcn5 H3K9/K14 histone acetyl-transferase to specific \nloci (Huang et al., 2021). Finally, recent studies have linked histone modifications to \nthe control of transcriptional burst frequency (Weinberger et al., 2012; Wu et al., 2017), \nwhich will in turn affect the level of cell-cell variation in transcription. We therefore \ntested whether genes that exhibit cell-cell variation in expression in D. discoideum also \nshow hallmarks of differential regulation by H3K4 methylation. Analysis of ChIP-seq \ndata (Wang et al., 2021) revealed that H3K4 methylation exhibits a characteristic gene \nexpression level dependent pattern around the transcription start site and gene body \n(Barski et al., 2007; Soares et al., 2017) (Figure 4A). To compare patterns in variable \nand non-variable genes, it was thus necessary to divide genes into bins with similar \ngene expression levels. Ten random samples of non-variable genes with the same \ndistribution of expression levels as those seen in variable genes were used to compare \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n13\t\nthe profile and number of genes with H3K4 methylation. This revealed variably \nexpressed genes exhibit different profiles around the gene promoter and gene body \n(Figure 4B). In addition, when these were divided into stochastic and cell cycle genes, \nboth groups were enriched for H3K4Me1 (binomial test p<0.001, Figure 4C), whilst \nstochastic genes were depleted for H3K4Me3 (binomial test p<0.001, Figure 4C).  \n \nH3K4 methylation is required for normal expression of stochastic genes \n \nVariably expressed genes exhibit different patterns of H3K4 methylation compared to \nnon-variable genes. To test whether their expression also depends on H3K4 \nmethylation, we examined a D. discoideum Set1 knockout strain in which H3K4 \nmethylation is abolished (Chubb et al., 2006a) (Figure 5A). Bulk RNA-seq was \nperformed on wild type and Set1 mutant cells. Most genes increased in expression, \nwith 459 up-regulated, compared to 132 down-regulated genes in Set1- mutant cells \n(Figure 5B). The affected genes were strongly enriched for variable genes identified \nby scRNA-seq (binomial test, p<0.01, Figure 5C). The level of enrichment increases \nas the stringency with which variable genes are defined is increased. This suggests \nthat Set1 dependent genes are among the most variably expressed genes during \ngrowth (Figure 5C, Supplementary Figure 5). Furthermore, if variable genes are \ndivided into cell cycle and non-cell cycle associated stochastic genes, Set1 dependent \ngenes are enriched in both groups (binomial test, p<0.01) (Figure 5D), suggesting a \ngeneral role in controlling the expression of variably expressed genes associated with \nfate choice. These findings were validated with reporter genes in which the promoters \nof representative Set1 dependent variable genes (GtaU, HspF-2) were used to drive \nRFP expression (Supplementary Figure 6). The actin15 promoter was used as a \ncontrol because it exhibits little cell-cell expression variation and is unaffected by \ndisruption of Set1 in RNA-seq. Actin15-RFP expression was similar in wild type and \nSet1- mutant cells (Supplementary Figure 6), but both GtaU and HspF-2-RFP \nexpression increased markedly in most cells (Supplementary Figure 6).  \nChanges in gene expression can be caused by changes in burst frequency or \nburst size. To better understand the effects of Set1 on gene expression, we adapted \nthe stochastic deterministic model to examine the properties of expression variation of \nindividual genes instead of their aggregate influence on CCIF levels. For this, the \nprobability that a gene is expressed, 1# , can be interpreted as a measure of the burst \nfrequency, since burst frequency will dictate the probability of observing gene * being \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n14\t\nexpressed in interval %. When gene * is expressed, it produces an amount of transcript \ngiven by W#, which is a measure of burst size. Cells in the ‘off’ state are assigned a \nvalue of W# = 0. There can be variability in the level of expression of gene * among cells \nin the ‘on’ state, denoted X$, which can reflect both error variation in experimental \nestimation and biological noise/variability across cells, such that the distribution among \nthese cells is ~9[W# , X$]. Hence, the average level of expression for gene * across a \nset of cells (which includes cells in both the on and off states) is 1# W#, while its variance \nin expression would be 1# X$ + 1# (1 − 1# )W#\n$ . Consequently, an increase in burst \nfrequency would be expected to lead to an increase in mean expression. Because both \nthe expected mean and variance in expression levels change as a function of a change \nin burst frequency, we can measure the impact of a change in burst frequency on \nexpression variability as the squared coefficient of variation (CV2), which has an \nexpected value of (1 − 1# )/ 1# + X$/(1#W#\n$). The value of CV2 would, therefore, be \nexpected to decrease if burst frequency were to increase (i.e.,\t∆1# > 0) and vice versa.   \nTo test how Set1 affects gene expression, we examined the effects of Set1 \ndisruption on gene expression in single cells. scRNA-seq data was generated from \nwild type and set1- mutant cells. The level of gene expression and cell-cell variation \n(CV2) was determined for genes shown by bulk RNA-seq to be under the control of \nSet1 and by high depth scRNA-seq to exhibit significant variation in wild type cells. We \nfind that the average expression level of these genes was affected in set1- mutant cells \n(Figure 5E). As expected, based on the model predictions, the level of cell-cell \nvariability measured by CV2 significantly decreased in up-regulated genes (Figure 5F), \nwhich is consistent with an increased burst frequency. These findings are supported \nby quantification of GtaU and HspF-2 reporter gene expression levels and cell-cell \nvariability (Figure 5G and H). The observed changes in gene expression are thus \nconsistent with the idea that disruption of Set1 affects gene expression by increasing \nburst frequency and thus the number of expressing cells (which also reduces cell-cell \ngene expression variation) rather than simply further increasing the level of expression \nin the subpopulation of cells that already express these genes.  \n \nLoss of Set1 dependent H3K4 methylation affects cell fate choice \n \n Genes associated with cell fate choice exhibit cell-cell variation in gene \nexpression, they are enriched for Set1 dependent histone modifications and their \nnormal expression is dependent on Set1. We therefore tested whether Set1 disruption \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n15\t\naffects fate choice. Disruption of Set1 did not result in defects in developmental timing, \nbut slugs exhibit a slightly elongated and twisted morphology (Supplementary Figure \n7). Clear defects were, however, evident in chimeric development between wild type \nand set1- mutant cells (Figure 6A and Supplementary Figure 8). set1- mutant cells \nsorted strongly towards the posterior and collar regions in chimeric slugs (Figure 6A) \nand to the upper and lower cup of the fruiting body, which are normally occupied by \nprestalk (pstB and/or pstO) cell subtypes (Supplementary Figure 8). This sorting \nbehaviour suggests H3K4 methylation is required for normal pre-spore cell \ndifferentiation, and that mutant cells instead differentiate as pstB and/or pstO cell \ntypes. Indeed, when RNA-seq was performed on set1- cells separated by FACS after \nchimeric development, set1- cells express pre-stalk genes more strongly, whereas pre-\nspore genes are poorly expressed (Figure 6B). Chimeric development of wild type and \nset1- strains transformed with prestalkO (ecmO-lacZ), prestalkB (ecmB-lacZ), or \nprespore (pspA-RFP) reporter constructs further confirm these findings. Both ecmB-\nlacZ and ecmO-lacZ reporter genes are more strongly expressed in set1- cells than \nwild type (Figure 6C), whilst the pspA-RFP marker gene is only very weakly expressed \n(Figure 6C). We next used RNA-seq transcriptome profiling as it provides a sensitive \nmethod to detect defects in developmental gene expression during clonal \ndevelopment. Genes normally expressed in pre-stalk cells were found to be \noverexpressed, whereas pre-spore genes are poorly expressed (Figure 6D). \nFurthermore, representative pre-stalk lacZ marker genes with a long half-life, are \nstrikingly mis-expressed throughout the slug (Figure 6E).   \nThe data from chimeric and clonal development suggest set1- mutant cells \nexhibit an intrinsic defect in pre-spore cell differentiation and tendency to differentiate \nas pstO and pstB cell types. To determine if this is due to a lack of H3K4 methylation, \nrather than some hitherto unknown function of Set1, a strain was generated in which \nanother COMPASS complex component, Ash2, was knocked out. Again, this resulted \nin a severe reduction in the levels of H3K4 methylation (Figure 6F). A strain was also \ngenerated in which Gcn5, which is required for H3K9/14 acetylation, was disrupted \n(Figure 6G). The effects of H3K4 methylation on gene expression are thought to be, in \npart, due to its effect on targeting H3K9/14 residues for acetylation. Indeed, disruption \nof set1 in D. discoideum has previously been shown to decrease H3K9/14 acetylation \n(Hsu et al., 2012). Both ash2- and gcn5- mutants phenocopied set1- knockout cells in \nchimeric development with wild type cells (Figure 6G). In addition, chimeric \ndevelopment of a hypomorphic Set1-FLAG knock-in allele (which results in reduced \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n16\t\ndi-methylation and absence of tri-methylation but little effect on mono-methylation \n(Supplementary Figure 9)) results in a qualitatively weaker sorting phenotype than that \nseen with set1- or ash2- mutants (Supplementary Figure 9). Together these results \nsupport the idea that H3K4 di and tri methylation is required for normal pre-spore cell \ndifferentiation.  \n \nCell cycle position and gene expression variation interact to control cell type \nproportioning \n \nSet1 disruption predominantly affects the expression of stochastically expressed \ndevelopmental genes and results in defects in cell fate choice during development. \nThis is consistent with predictions of the stochastic-deterministic model at the single \ncell level. A perturbation to the regulation of stochastic expression that increases the \nprobability that genes will be expressed (i.e., increases 1̅) will necessarily lead to an \nincrease in the average level of CCIF (since the mean < = ;1̅\\] , and such a \nperturbation would increase 1̅), and thus will increase stalk cell differentiation. This, \nhowever, assumes that the regulation of CCAF and CCIF are independent and there \nare no effects on the cell cycle or the level of CCAF. To test this, we analysed the \neffects of perturbing cell cycle associated gene expression variation on stochastic \nvariation, and vice versa. Cell cycle progression was blocked through cold shock, \nwhich results in widespread changes in cell cycle associated gene expression \n(Strasser et al., 2012). However, this did not affect the expression of representative \nSet1 dependent reporter genes that are not associated with the cell cycle (Figure 7A \nand B). Next, we perturbed the level of expression of variably expressed genes through \nSet1 disruption. scRNA-seq reveals set1- cells still exhibit cell cycle associated \ndifferential gene expression with the number of M/S and G2 phase cells unaffected \n(Figure 7C, Supplementary Figure 10). Furthermore, cell cycle progression and timing \nof cell division are unaffected by Set1 disruption (Figure 7D). These data thus suggest \nSet1 dependent variable genes and cell cycle dependent variation are independently \ncontrolled.  \nThe stochastic deterministic model predicts an increase in CCIF should \nincrease the likelihood of each single cell responding to stalk inducing signals (such \nas the stalk inducer DIF-1). To test this idea, DIF responsiveness was quantitatively \ncompared at the single cell level in wild type and set1- mutant strains in which a GFP \nreporter gene had been knocked into the DIF sensitive ecmA or ecmB prestalk genes \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n17\t\n(Supplementary Figure 11). The number of ecmB GFP positive cells was significantly \nhigher in set1- mutant cells at all DIF-1 concentrations tested, and the number of ecmA \npositive cells was higher at the lowest concentrations (Figure 8A and B). However, the \nlevel of GFP expression in each responding cell did not significantly change. This \nsuggests that the number of cells that are in a DIF responsive state is dependent on \nthe level of Set1 dependent stochastic gene expression (CCIF), as well as the position \nof in the cell cycle (CCAF). We next analysed the relationship between cell cycle phase \nand cell fate choice in set1- mutant type cells. Cells were filmed growing at low density \nfor 12-14 hours, which allowed each cell to be tracked for at least one cell division. The \ngrowth medium was then removed, and cell type differentiation induced. Cell tracking \nwas continued, and the final fate of each cell (expression of cell type specific GFP \nreporter gene) was compared to the cell cycle position (timing of the last division) when \ndifferentiation was induced. Consistent with expectation from the stochastic-\ndeterministic model (where expected stalk propensity, ;1̅3̅ ,increases will burst \nfrequency), set1- mutant cells exhibit a higher probability of adopting the pre-stalk cell \nfate being higher throughout the cell cycle in set1- mutant cells (Figure 8C). Finally, we \ntested whether the additive effects predicted by the stochastic-deterministic model are \nalso seen during normal development. Developmental timing and fruiting body \nmorphology is normal when cell cycle progression is perturbed in wild type cells (by \ngrowth at 11.5°C). Similarly, although changes in the level of expression of variable \ngenes (through mutation of set1) affect proportioning, normal fruiting bodies are \nformed (Figure 8D). In contrast, when both inputs are simultaneously perturbed, the \neffects on development are dramatic. Few fruiting bodies formed, even 24 hours after \ndevelopment was complete in all other samples (Figure 8D). Together, these results \nare consistent with a model in which symmetry breaking, fate choice and cell type \nproportioning in D. discoideum depends on the integration of two independent variable \ninputs. Since variation is widespread in biological systems, there is immense scope for \nnatural selection to harness these properties for robust biological decision making.  \n  \nDiscussion \n \nThe ability of isogenic cells to break symmetry and adopt different fates allows \nunicellular organisms to cope with dynamically changing environments and underpins \nthe division of labour, which is a key feature of the evolution of multicellularity (West \nand Cooper, 2016). Cell state heterogeneity is an inevitable feature of biological \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n18\t\nsystems and can provide a reliable substrate for symmetry breaking. For example, \nstochastic intrinsic variation can be used to predictably ‘sample’ cells from different \nstates if population sizes are sufficiently large (even if it is impossible to know what \nstate each cell will be in).  In other cases, cells may transition reproducibly between \nstates, such as their position in the cell cycle. Moreover, if cell state reflects properties, \nsuch as energetic or resource state, it can be used to best determine the utility of cells \nfor different roles.  \n \nSet1 dependent methylation controls transcriptional burst frequency \n \nStochastic cell-cell variation can arise from noisy gene expression, which is an \ninevitable consequence of transcriptional bursting. Changes in burst frequency will \naffect the level of transcriptional cell-cell variation, as well as mean expression \nlevels.  Bursting parameters and thus levels of transcriptional noise are affected by \nhistone modifications. Roles in differentiation are suggested by studies in embryonic \nstem cells, as well as other systems suggest that gene networks associated with \nlineage choice are often associated with specific histone modifications (Hong et al., \n2011; Yadav et al., 2018). This includes the presence of marks associated with gene \nactivation (e.g. compass complex dependent H3K4 methylation and polycomb \ncomplex dependent H3K27 methylation). Single-cell RNA-seq analysis of mouse \nembryonic stem cells has also shown that treatment of cells with nucleoside analogues \nthat are removed by the base excision repair pathway result in genome wide increases \nin cell-to-cell variability in transcript abundance of thousands of genes\t(Desai et al., \n2021).  Importantly, this increase in transcriptional noise does not markedly change \nexpression mean. This also affects the likelihood of differentiation, presumably \nbecause it changes the probability of cells being in a responsive state to fate inducing \nsignals. It is, however, unknown whether this reflects endogenous control mechanisms \nthat regulate the levels of transcriptional noise.  \nOur results also support the idea that gene expression noise is important for \ncell fate. Our studies center on Set1 dependent H3K4 methylation, which is known to \nrecruit various HATs, HDAC’s and chromatin remodelers (e.g. ISWI, NURF, CHD ) \n(Bian et al., 2011; Santos-Rosa et al., 2003; Shi et al., 2006; Shi et al., 2007; Sims et \nal., 2005; Taverna et al., 2006; Wysocka et al., 2006). Although Set1 dependent \nhistone modifications are associated with more highly expressed loci (Barski et al., \n2007; Soares et al., 2017), this relationship is not necessarily causal. In fact, genetic \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n19\t\nstudies have revealed that absence of H3K4 methylation only affects a small subset \nof genes, and these are typically upregulated (Hsu et al., 2012; Lorenz et al., 2014; \nMargaritis et al., 2012; Ramakrishnan et al., 2016). In addition, previous studies have \nalso noted a correlation between the breadth of the H3K4 methylation domain and the \nlevel of gene expression variability (Benayoun et al., 2014; Rotem et al., 2015; Sze et \nal., 2020; Wu et al., 2017) . Indeed, disturbance of these patterns affects transcriptional \nconsistency (Sze et al., 2020; Wu et al., 2017). Our studies provide further support for \nthese ideas. Moreover, they support the idea that the modification of chromatin \nstructure provides a means by which the level of gene expression and gene expression \nvariation can be controlled to facilitate cell fate decisions.  \n \nNoisy gene expression has the potential to buffer development against \nenvironmental perturbations \n \nCell fate choice and symmetry breaking in D. discoideum depend on the interplay \nbetween deterministic effects of cell cycle position and stochastically expressed \ndevelopmental genes on responsiveness to fate inducing signals fate. Why should \nsuch a mechanism evolve when a purely stochastic system could achieve correct \nproportioning simply by random sampling of a sufficiently large population of cells? \nSimilarly, a cell cycle system alone could also achieve correct proportioning if cells are \nrandomly distributed through the cell cycle. D. discoideum cells that have recently \ncompleted mitosis (i.e., those that have recently divided their resources) are biased \ntowards the stalk fate, Use of the cell cycle position presumably reflects an adaptation \nthat helps ensure aggregations to disproportionately sacrifice ‘lower value’ cells to the \nstalk for building the stalk and benefit higher value (higher resource state) cells that \nform spores. Resource based division of labour is also seen in Sinorhizobium meliloti, \nwhere cells with highest levels poly-3-hydroxybutyrate (PHB) (which allows survival in \nlong term starvation) do not grow and instead leave resources to be utilised by low \nPHB containing cells (Ratcliff and Denison, 2010).  \nWhile such examples suggest cell state heterogeneity can be utilised \nadaptively to make cellular decisions, it is also vulnerable to systematic perturbations \nthat affect the proportion of cells in different states. As soil dwellers, D. discoideum cell \npopulations are likely to be exposed to varying environmental conditions, such as \ntemperature, pH, or available nutrients that can all potentially affect developmental \nsignalling, cell physiology, and cell type proportions. For example, D. discoideum cells \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n20\t\ndepleted in glucose or that have experienced cold shock will arrest around mitosis (and \nend up biased towards become stalk), which means that certain environments could \nlead to extreme stalk fate bias. Therefore, the evolutionary success of this organism \nmeans that mechanisms likely evolved to ensure development is buffered against \nenvironmental perturbations. Early studies identified a mechanism by which cell type \nproportioning could be corrected by feedback loops where the excess of one cell type \nleads to its trans-differentiation into the other (Belcher et al., 2022; Insall et al., 1992; \nKay and Thompson, 2001). It has also been shown that disturbances that affect the \ncell cycle result in compensatory gene expression changes that alter the threshold of \nresponsiveness (Chattwood et al., 2013). Our findings provide another potential \nsolution to this problem, in which the integration of different sources of cell state \nvariation allows systems to be robust against environmental perturbations, while still \nbeing able to adaptively exploit differences in cell state. This is because stochastic cell-\ncell variation, could buffer against catastrophic shifts in cell-type proportioning, by \nensuring some cells always adopt different fates (Supplementary Figure 12). This \nwould reduce the range over which feedback loops that refine proportioning need to \noperate. It is tempting to speculate that this complexity is a consequence of the \nstrength of selection to ensure the correct ratio of stalk and spore cells under \nfluctuating natural environments that may affect signalling and cell fate choice (e.g. \ntemp, pH, light, salt levels, humidity, food availability). Moreover, it highlights the \nimpacts that cell-cell variation and environmental variation may have on the evolution \nof developmental signalling pathways. Since variation can promote or hinder \ndevelopmental robustness, it is likely that many developmental systems will exhibit the \ngoldilocks principle when it comes to adaptive exploitation of heterogeneity, where they \nhave been evolutionarily tuned so that they are reliably and repeatedly exposed to ‘just \nthe right amount’ of variation. \n \nFigure Legends \n \nFigure 1. Theoretical patterns of stalk proportioning through the cell cycle.  \nPlots illustrate the model structure, theoretical expectations for stalk proportioning ('!) \nthrough the cell cycle, and corresponding fits to empirical data (from (Gruenheit et al., \n2018)) for scenarios in which cell fate is controlled by either cell cycle position (time \npost mitosis), noisy gene expression, or a combination of the two.  The five red points \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n21\t\nin the illustrations of the model structure and corresponding theoretical expectations \napproximately match the five timepoints used in fitting the empirical data.  \n \nA. Stalk proportioning through the cell cycle is deterministic. i) In this scenario, \nsensitivity to stalk-inducing factors depends solely on the level of a cell-cycle \nassociated factor (CCAF), which has a maximal level just after completing mitosis \n(given by Q\") and decays linearly through the cell cycle (at a rate given by $; see eqn. \n1). ii) When the CCAF level is above a threshold value (&), cells adopt stalk fate and \nwhen the level drops below the threshold, they switch to spore fate. This results in a \nstepwise change in cell fate. iii) The best fit to the data shows a clear lack of fit, with \nthe empirical pattern showing neither the predicted all or none pattern of cell fate nor \na sharp shift in stalk propensity.  \n \nB. Stalk proportioning depends solely on noisy gene expression. i) In this \nscenario, sensitivity to stalk-inducing factors depends on a cell-cycle independent \nfactor (CCIF), whose level is normally distributed (with mean <  and a standard \ndeviation >) because it is determined by the sum across many noisily expressed \ngenes. All cells with a value of CCIF above a threshold value (&) adopt stalk fate, while \nthose below the threshold adopt spore fate (see eqn. 2 for the expected stalk \nproportioning). ii) Because the level of CCIF does not depend on cell cycle, expected \nstalk proportioning does not change through time. iii) The best fit to the data shows a \nclear lack of fit, with the empirical pattern showing an obvious shift in stalk propensity \nthrough the cell cycle. \n \nC. Stalk proportioning depends on the combination of the deterministic cell-\ncycle dependent change in CCAF and the stochastic input of CCIF from noisy \ngene expression. i) In this scenario (the stochastic-deterministic model), sensitivity to \nstalk-inducing factors is determined by the sum of the contribution of CCAF, which \ndecays through the cell cycle, and CCIF, which shows a fixed pattern of variation at all \ntimepoints. The proportion of cells at each timepoint that have a total sensitivity above \nthe threshold value adopt stalk fate, while those below the threshold adopt the spore \nfate. ii) As CCAF decays through the cell cycle, the proportion of cells above the \nthreshold declines, resulting in the non-linear pattern of stalk proportioning seen in the \nrighthand plot. iii) The best fit to the data shows a very good match between cell fate \nbehaviour and the pattern predicted by the model.  \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n22\t\n \nFigure 2. Identification of gene expression variation. \n \nA. Gene expression variability in single cells. Plot shows log10 transformed relative \ngene expression variability (observed CV2/expected CV2) versus log10 transformed \naverage gene expression level for all genes in 81 single cells. 2073 genes that show \nsignificantly greater variation (FDR < 0.05; red dashed line) than expected (black \ndashed line) were identified (yellow dots).  \nB. Variable genes can be divided into cell cycle associate and cell cycle \nindependent genes. Variable genes were defined as cell cycle associated (1016 \ngenes) or cell cycle independent (1057 genes). i) Hierarchical clustering reveals cell \ncycle associated genes show a strong cell cycle associated gene expression pattern \n(blue (S/M phase, 669 genes), orange (G2.1 phase, 247 genes) and red (G2.2 phase, \n100 genes). ii) Cell cycle independent genes show little or no pattern with respect to \nthe cell cycle groups. Cell order is the same in both plots. \nC. Non-cell cycle associated genes do not drive cell groupings. i) Plotting PC1 \nagainst PC2 from a PCA of expression variation in cell cycle associated variable genes \nidentifies three cell groups (blue, orange, red circles). ii) PCA of non-cell-cycle variable \ngenes does not identify any grouping of cells. \n \nD.  Cell cycle dependent genes also exhibit extensive stochastic cell-cell \nvariation. The CV2 was calculated for all genes within each group of cells at different \ncell cycle stages and compared to the expected value based on average gene \nexpression level.  Genes with more variation than expected (adjusted p value <0.05, \nabove dashed line) are shown in yellow. The 1016 cell cycle associated genes (black) \nare generally more variable than expected at each cell cycle stage.  \n \nFigure 3. Variably expressed genes are important for prestalk cell differentiation. \n \nA. Developmental index of different groups of genes. Average gene expression \nduring developmental stages was compared to average expression during growth + \naverage expression during development. This developmental index ranges from zero \nto one. Zero is exclusive to growth and one is exclusive to development) (*** p≤0.001). \n \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n23\t\nB. Cell type index of different groups of genes. Gene expression in prestalk cells \nwas compared to expression in prestalk + prespore cells. This cell type index ranges \nfrom zero to one (zero is exclusive to prespore cells and one is exclusive to prestalk \ncells). (*** p≤0.001). \n \nC. Schematic of screens carried out to identify mutants defective in prestalk cell \ndifferentiation. i) Selection for DIF-1 insensitivity. Cells were incubated with cAMP \nand DIF-1 to induce stalk cell differentiation. Only cells that fail to respond to these \nsignals remain as amoebae and can re-enter growth upon addition of growth medium. \ngDNA was extracted for sequencing after two or six rounds of selection. ii) Selection \nfor competence to respond to cAMP. Cells were treated with 8-Br-cAMP to induce \nterminal spore cell differentiation, before treatment with detergent to remove cells that \nfailed to respond and remain as amoeba. A single round of this assay followed by \nsequencing allowed mutants that dropped out (failed to respond to 8-Br-cAMP) to be \nidentified. iii) Selection for mutants with growth advantage. Cells were grown for a \nsimilar number of generations to that estimated from re-growth after stalk cell \ninduction.  \n \nD. Identification of mutations that affect stalk cell differentiation. Heatmap of all \n315 REMI mutant positions that were enriched in round 2 or 6 of the DIF-1 selection. \n244 insertions were classified as DIF signalling mutants that affect stalk cell \ndifferentiation. Other mutants enriched in the screen were removed because they \nincreased in frequency after only growth (63 growth mutants) or failed to differentiate \nin response to Br-cAMP (8 cAMP non-responsive mutants). \n \nE. Validation of mutants identified by REMI-seq. Stalk cell inductions were carried \nout on independently generated mutants from round 2 or 6. 8/9 mutants produce fewer \nstalk cells than wild type (* p≤0.05,**  p≤0.01,*** p≤0.001,**** p ≤ 0.0001). Results are \nan average of at least three independent experiments. \n \nF. REMI mutants that affect development are enriched for variable genes. i. \nComparison of expected vs observed numbers of REMI mutants with variable, cell \ncycle or stochastic expression ii. Comparison of normalised CV2 of genes associated \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n24\t\nwith REMI mutants that affect development vs the genome wide average (* p !\n0.05,**  p!0.01,*** p!0.001). \n \nFigure 4. Variably expressed genes exhibit differential patterns of histone \nmodifications. \n \nA. The H3K4 methylation profile in D. discoideum.  \nGenes were assigned to ten bins according to their average expression level in \nvegetative AX4 cells (Supplementary data). Genes were identified that contain \nH3K4me1 or H3K4me3 peaks around the TSS (-2500bp to +2500bp) and the \ndistribution of peaks for each bin was calculated. Both H3K4me1 (left) and H3K4me3 \n(right) exhibit characteristic distributions around the TSS, which correlate with \nexpression level.  \n \nB. H3K4 methylation patterns differ in variable and non-variable genes.  \nThe peak density of variable genes with at least one H3K4me1 or H3K4me3 peak \nbetween 2500bp upstream and 2500bp downstream of the TSS was plotted (blue line). \nTo control for the effect of expression level on peak density, the distribution of \nexpression levels of variable genes was used to select ten random samples of non-\nvariable genes with the same expression distribution (Supplementary Fig 4). The peak \ndensity of the ten random samples was averaged (red line).  \n \nC. Number of genes with H3K4 methylation  \nComparison of the numbers of observed and expected genes associated with \nH3K4Me1 and H3K4Me3 marks (*** p≤0.001). \n \nFigure 5. Set1 controls cell-cell gene expression variation. \n \nA. Set1 is required for H3K4 methylation. Western blot of mono, di and tri \nmethylation at position H3K4 in wild type Ax4 and mutant Set1- cells. Disruption of \nSet1 leads to the complete loss of all three methylation marks. Histone 3 levels are \nunaffected.  \n \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n25\t\nB. Deletion of Set1 leads to specific changes in gene expression. Volcano plot \nshowing differential gene expression between Set1- mutant and wild type cells. Up-\nregulated (green) and down-regulated (blue) genes are highlighted based on cut offs \nof two-fold change and an adjusted p-value of < 0.01.  \n \nC. Set1 dependent genes are enriched for variable genes. Quantification of \nenrichment of Set1 dependent genes in variable gene populations. As the threshold \nfor the definition of a variable genes increases, the enrichment for Set1 repressed \ngenes also increases. *** binomial test, p<0.01.  \n \nD. Set1 dependent variable genes are enriched for cell cycle associated and cell \ncycle independent variable genes. Plot shows the number of genes mis-expressed \nin Set1 mutant cells compared to a random sampling. \n \nE. Variable genes show increased expression in Set1- single cells. Expression \nlevel was calculated from 310 wild type and 310 Set1- cells. Variable genes (FDR < \n0.05) were defined as those up- or down-regulated in the bulk Set1 mutant sequencing. \nUpregulated genes show a statistically significant increase (p < 0.01) in expression. \nDownregulated genes show a small non-significant decrease in expression  \n \nF. Variable genes show decreased expression variability in Set1- single cells. CV2 \nwas calculated from 310 wild type and 310 Set1- cells. Variable genes (FDR < 0.05) \nwere defined as those up- or down-regulated in the bulk Set1 mutant sequencing. Up-\nregulated genes display a significant decrease in CV2 in the Set1 mutant compared to \nWT cells (p < 0.05). Downregulated genes show a small non-significant decrease in \nvariability. \n \nG. GtaU and HspF-2 show increased expression in Set1- single cells. \nFACS quantification of GtaU and HspF-2 reporter gene expression levels normalised \nto actin15-GFP. \n \nH. GtaU and HspF-2 show decreased cell-cell variability in Set1- single cells. \nCell-cell variability in normalised expression (CV2) from FACS analysis of GtaU and \nHspF-2 reporter genes. \n \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n26\t\nFigure 6. Set1 is required for cell fate choice. \n \nA. set1- mutant cells tend to occupy regions of the slug associated with prestalk \ncell differentiation in chimera with wild type. Schematic showing location of \nprestalk (pstA, pstO, pstB) and prespore (psp) cell types in the slug. Chimeric \ndevelopment of wild type and set1- mutant cells. 10% actin-RFP expressing cells were \nmixed and developed with 90% unlabelled cells. Set1- mutant cells sort to the collar \nand back of slugs when mixed with wild type cells. Scale bar = 0.5mm. \n \nB. RNA-seq of FACS purified Set1 mutant and wild type cells. Actin15-GFP \nexpressing wild type cells were mixed with actin15-RFP expressing Set1- mutant cells \nin a 50:50 ratio. GFP and RFP cells were separated at the slug stage and subjected \nto RNA-seq. The cell type index (prestalk expression/(prestalk + prespore expression)) \nwas calculated for genes differentially expressed in wild type (blue bars) or Set1 \nmutant cells (yellow bars). >0.5 = prestalk, <0.5 = prespore. Interleaved histogram \n(light grey) shows the average distribution of 1000 random samples of the same \nnumbers of genes. Scale bar = 0.5mm. \n \nC. Chimeric development of wild type and set1 mutant cells expressing cell type \nspecific markers. 10% labelled cells were mixed with 90% unlabelled cells. Set1 \nmutant cells strongly express pre-stalk markers, whereas a pre-spore marker is \nexpressed weakly. Scale bar = 0.5mm. \n \nD. RNA-seq reveals Set1 mutant slugs exhibit defects in cell type proportioning. \nClonal wild type or Set1 mutant cells were developed to the slug stage. The cell type \nindex (>0.5 = prestalk, <0.5 = prespore) was calculated for genes identified as \ndifferentially expressed by RNA-seq in Ax4 (blue bars) and Set1- (yellow bars) cells. \nInterleaved histograms (light grey) show the average distribution of 1000 random \nsamples of the same numbers of genes.  \n \nE. Set1 mutant cells overexpress prestalk marker genes. Clonal wild type or Set1 \nmutant cells transformed expressing ecmO-lacZ- or ecmB-lacZ markers were \ndeveloped to the slug stage. ecmB and ecmO are expressed in the collar and posterior \nanterior like cells in wild type. In Set1 mutant slugs, pre-stalk markers are expressed \nthroughout the length of the slug. Scale bar = 0.5mm. \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n27\t\n \nF. Western blot of H3K4 methylation in Ash2 mutant cells. ash2 mutant cells have \nlost H3K4 methylation. Levels of histone H3 are unaffected. \n \nG.  Western blot of H3K9/14 acetylation in Gcn5 mutant cells. Gcn5 mutant cells \nhave lost H3K9 and H3K14 acetylation. Levels of histone H3 are unaffected. \n \nH. Chimeric development of Ash2 mutant cells with wild type cells. Like set1- \ncells, when labelled ash2- cells are mixed with WT cells they show a strong localisation \nin the collar and posterior of the slug. In the reciprocal mix, WT cells occupy the anterior \npre-spore region. Scale bar = 0.5mm. \n \nI. Chimeric development of Gcn5 mutant cells. Like Set1- and Ash2- when labelled \nGcn5- cells are mixed with WT cells they are localised to the collar and back of the \nslug. Scale bar = 0.5mm. \n \n \nFigure 7. Cell cycle and stochastic variation are independent. \n \nA. Variation in expression of Set1 dependent stochastic genes is unaffected by \ncell cycle disruption. Dual reporter lines for actin15, gtaU and hspF-2 were incubated \nat either 22°C or 11.5°C and expression was quantified by FACS.  \n \nB. Level of expression of Set1 dependent stochastic genes is unaffected by cell \ncycle disruption. Dual reporter lines for actin15, gtaU and hspF-2 were incubated at \neither 22°C or 11.5°C and average level of expression calculated. Inset shows \ncomparison of proportion of positive fluorescent cells (above background) at 11 and \n22°C.  \n \nC. Quantification of numbers of cells in different cell cycle phases. Single cells \nwere assigned to different cell cycle stages based on the expression of MS and G2 \nmarkers from scRNA-seq (see Supplementary figure 10 and methods).  \n \nD. Growth rate is unaffected by Set1 disruption. Cells were plated at low density \nand tracked by live imaging to determine cell cycle length of individual cells. \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n28\t\n \nFigure 8. Cell type proportioning and responsiveness to differentiation cues \ndepends on both stochastic and cell cycle variation  \n \nA and B. Set 1 regulates the threshold of responsiveness to differentiation cues. \necmA-GFP (A) and ecmB-GFP (B) knock-in lines were generated in wild type and Set1- \nmutant cells. Cells were induced to differentiate in culture in the presence of varying \namounts of the prestalk inducer DIF-1. Marker gene expression was quantified by flow \ncytometry. Expressing cells were divided into bins of 0.05 (for log10 GFP expression). \n \nC. Cell type differentiation at different cell cycle positions. Cells were grouped into \nhourly blocks according to the time of the last cell division before the addition of \ndifferentiation inducing signals. The percentage of cells that differentiate as prestalk \ncells was calculated. Results show the average of 8 movies. \n \nD. Development is blocked when stochastic and cell cycle gene expression is \naltered simultaneously. WT and Set1 mutant cells were incubated at 22°C, or at \n11.5°C to inhibit cell cycle progression before being induced to develop. \n \n \n \nSupplementary Figure Legends \n \nSupplementary Figure 1 \nFit of experimental data to models of fate choice \nA. Comparison of the fit of empirical measures of stalk proportioning ('!) through the \ncell cycle (from (Gruenheit et al., 2018)) to three different models for the control of \nproportioning. i) Expected stalk proportioning through the cell cycle predicted by a \nmodel where there is deterministic exponential decay in CCAF levels (after Gruenheit \net al., 2018). The expected pattern of stalk proportioning for the best fit model is shown \nby the blue line (see Supplementary Information). ii) Expected stalk proportioning \nthrough the cell cycle predicted by a version of the stochastic-deterministic model for \nthe case where gene expression noise is given by a gamma distribution (instead of \nbeing Gaussian) (see Supplementary Information). The gamma distribution has two \nadditional parameters corresponding to the gamma distribution shape and scale \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n29\t\nparameters \\ and ^. The best fit model corresponds to the gamma distribution shown \nin part B.ii of this figure, which is very similar to the shape of a Gaussian distribution \n(see part B.i of this figure). Iii) Expected stalk proportioning through the cell cycle for a \nversion of the stochastic-deterministic model with an exponential decay in CCAF (see \nSupplementary Information). The best fit of this model corresponds to the pattern of \ndecay in CCAF shown in part C.i of this figure.  \nB. The two figures show a comparison of i) Gaussian distributed noise and ii) the \npattern of gamma distributed noise (see eqn. S1) corresponding to the best fit model \nshown in part A.iI of this figure. \nC. The three figures show the pattern of decay in CCAF associated with the best fit \nmodels for i) a model with exponential decay (see eqn. S2), ii) a quadratic model of \ndecay in CCAF, and iii) a cubic model of decay in CCAF. In each case, the best fit \ndecay model is approximately linear and in all three cases, the model with linear decay \npresented in the main text has a better fit to the data.   \n \nSupplementary Figure 2 \nCell cycle dependent genes defined through AUROC analysis.  \nA. Previous analyses of scRNAseq of 81 single cells reveals cells can be divided into \ntwo or three cell groups based on cell cycle position 31. To identify markers associated \nwith the cell cycle, relaxed AUROC (Area Under the Receiver Operating Characteristic \nCurve) thresholds of > 0.65 and p < 0.05 were used to compare gene expression in 25 \nM/S phase, and cells from different G2 phases (31 cells and 25 cells). Additional \nmarkers were identified by comparing MS cells to all G2 cells (31 + 25 = 56). A total of \n5529 cell cycle markers were identified.  \nB. Intersection of cell cycle markers identified using two or three cell groups. \nC. Intersection of variably expressed genes (2073 genes) with cell cycle markers (5529 \ngenes) shows the cell cycle can account for variation in expression of 1016 of 2073 \nvariable genes. \n \nSupplementary Figure 3 \nMarkers of different cell cycle phases also exhibit more variation within cells at \nthe same cell cycle positions.  \n \nColumns represent the level of variability of all genes when calculated based on cells \nfrom M/S (25 cells), G2-1 (31 cells) and G2-2 (25 cells) phases. Genes that significantly \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n30\t\ndeviate from the trend are highlighted in yellow. Rows show the markers associated \nwith each cell group (black dots) mapped onto these plots. Genes are enriched more \nvariable at all cell cycle stages.  \n \nSupplementary Figure 4 \nNormalisation of expression levels of variable and non-variable genes for ChIP \nanalyses \nA. Distribution of expression levels of variable and non-variable genes. The \naverage expression of variable genes and non-variable genes (limited to the lowest \nand highest expressed variable genes).  \n \nB. Sampling non-variable genes with the same distribution as variable genes. \nTen samples of non-variable genes were taken. To ensure that the samples matched \nthe expression distribution of variable genes, the probabilities of sampling a non-\nvariable gene was weighted based on the percentage of variable genes in 30 different \nexpression bins. Each sample contains 2024 genes, which is the same as the number \nof variable genes. The expression distribution of the samples matches the variable \ngenes. \n \nC and D. Variable genes and sampled non-variable genes containing H3K4me1 \npeaks show the same expression distribution. Variable genes and sampled non-\nvariable genes were overlapped with lists of genes that contained H3K4me1 in their \nTSS adjacent regions. This was done for two ChIP replicates C and D). Sampled non-\nvariable genes with H3K4me1 marks have the same expression distribution as variable \ngenes. \n \nE and F. Variable genes and sampled non-variable genes containing H3K4me3 \npeaks show the same expression distribution. Variable genes and sampled non-\nvariable genes were overlapped with lists of genes that contained H3K4me3 in their \nTSS adjacent regions. This was done for two ChIP replicates E and F). Sampled non-\nvariable genes with H3K4me3 marks have the same expression distribution as variable \ngenes. \n \nG and H. Variable genes tend to be enriched for genes with H3K4me1 marks and \ndepleted for H3K4me3. The number of genes that contained at least one H3K4me1 \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n31\t\n(G) or H3K4me3 (H) peak within the region -2500bp upstream to 2500bp downstream \nof the TSS was calculated. This number was calculated for variable genes and \ncompared to the average of ten samples of non-variable genes.  \n \nSupplementary Figure 5 \nSet1 dependent genes are more variable than expected.  \n \nPlot of gene expression variability (CV2) against log10 expression in wild type cells. \nDifferent thresholds for the definition of variable genes are shown (FDR < 0.05 – red, \nFDR < 0.01 – green, FDR < 0.001 – pink, FDR < 0.0001 – purple). Set1 dependent \ngenes (black dots) are enriched in variable populations. \n \nSupplementary Figure 6 \nDual reporter gene analysis of gene expression variation. \n \nA. Schematic of dual reporter plasmid. Promoters of interest were cloned upstream \nof RFP in the pDM324 vector which also carries an actin15-promoter-GFP-actin8-\nterminator cassette.  \nB. scRNA-seq CV2 plot illustrating the variability of genes used in dual reporter \nassay. GtaU (DM =  4.8, FDR = 0.001) and HspF-2 (DM = 6.4, FDR = 5.2 x 10-5) are \nhighlighted. Actin-15 is not variable (DM = 1.26, FDR = 0.71). \nC. Single cell analysis of representative Set1 dependent reporter genes \nFACS analysis of cells expressing RFP under the control of the actin-15 (i), gtaU (ii) \nand hspf2 (iii) promoters in wild type (blue) and Set1 mutant (yellow) cells.  \n \n \nSupplementary Figure 7 \nDisruption of set1 does not affect developmental timing but results in slightly \naberrant slug morphology \nRepresentative images of wild type and set1- mutant development. \n \nSupplementary Figure 8 \nSet1- mutant cells sort to upper and lower cup of fruiting bodies \n \nA. Schematic of cell type differentiation in the D. discoideum fruiting body \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n32\t\n \nB. Sorting pattern of wild type and Set1 mutant cells in chimeric fruiting bodies \nSet1 mutant cells sort to the upper cup, lower cup and stalk. Wild type cells sort to \nspore region of the fruiting body. \n \nSupplementary Figure 9 \nA hypomorphic Set1 mutant exhibits a weaker developmental phenotype than \nknockout cells. \n \nA. Western blot of H3K4 methylation. Blot of wild type and Set1-FLAG Knock-in cells \nfor different H3K4me marks reveals that tagged cells exhibit a complete loss of \nH3K4me3 (like null mutant cells), but retain ~30% of H3K4me2 and accumulate \nincreased mono-methylation compared to the WT. \n \nB. Chimeric development of Set1-Flag knock-in cells with wild type cells. Mixing \nof 90% WT, FLAG-tagged Set1 or mutant Set1- cells with 10% WT cells labelled with \nconstitutively expressed actin15-RFP. Wild type cells are largely absent from the \nprestalk region and are found throughout the prespore zone when mixed with Set1- \nmutant cells.  \n \nSupplementary Figure 10 \nThe cell cycle is unaffected in Set1 mutant cells. \n \nA. i. Dimensionality reduction of high-depth scRNA-seq data from 81 wild-type cells \nwas performed. using genes previously defined as markers of M/S or G2 cells 31. tSNE \nindicates cells can be assigned to two groups. The cells in each grouping match those \ndefined using the 500 most variable genes in the genome (Gruenheit et al., 2018).Cells \nare colour-coded by the normalised ratio of the expression of all M/S genes to G2 \ngenes (see methods). ii Histogram of cells from split into either M/S (blue) or G2 (pink) \ngroups. \n \nB. scRNA-seq was performed on 310 wild-type and Set1- single cells. The threshold \nratio of M/S to G2 cells from wild type cells sequenced at high depth (A, ii, dotted line) \nwas used as to define cells from wild type (i) or the Set1 mutant (ii) as either M/S or \nG2.  \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n33\t\n \nSupplementary Figure 11 \nGeneration of ecmA and ecmB-GFP knock-in strains \n \nA. Schematic of the construction of ecmA/B-GFP knock-in strains. A plasmid \ncarrying GFP and a blasticidin resistance cassette flanked by loxP sites was integrated \ninto the genome of wild type or Set1 mutant cells through homologous recombination \nat either the ecmA or ecmB locus. After selection the blasticidin cassette was removed \nby expression of Cre-recombinase.  \nB. GFP expression in knock-in strains. Gene replacement does not affect \nexpression timing or positioning. ecmB-GFP expressing cells are found in the stalk \ntube, basal disc, lower and upper cup. ecmA-GFP expressing cells are found in the \nstalk tube, apical tip and lower cup. \n \nSupplementary Figure 12 \nStochastic variation can buffer against changes in deterministic fate cues \n \nA. Stochastic variation smoothens the pattern of stalk proportioning through the \ncell cycle. The lines represent the expected stalk proportioning through the cell cycle \nfor the stochastic-deterministic model (given by eqn. 3) with different levels of \nstochastic variability (>, see embedded legend). When the level of stochastic variability \n(arising from noisy gene expression) is close to zero, we see the emergence of the \nsteplike shift from stalk to spore (cf. Figure 1A.ii), and as the level of stochastic \nvariability increases the system shows a more gradual shift in cell fate through the cell \ncycle. The lines were calculated using equation (3) with the other parameter values \n(i.e., other than >) set to: & = 0.5, Q\" = 0.75, and $ = 1.  \n \nB. Stochastic variation makes the system less sensitive to cell cycle biases.  The \nlines represent the relative change in stalk proportioning resulting from a given level of \ncell cycle bias. The baseline level of stalk proportioning was calculated assuming \nuniform sampling of cells across the cell cycle. Biased sampling was achieved by \nchanging the sampling of cells such that there was a linear increase or decrease in the \nsampling probability through the cell cycle (with the degree of sampling bias given by \n!; see Supplementary Information). The bias values on the X-axis correspond to the \nproportion of cells sampled in the first or second half of the cell cycle as a result of \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n34\t\nnon-uniform sampling (e.g., a value of -¾ indicates that ¾ of cells would be in the first \nhalf of the cell cycle instead of expectation of ½ for unfirm sampling, meaning it \nrepresents an oversampling of ¼ in the first half of the cell cycle and an under-sampling \nof ¼ in the second half of the cycle). The relative change in stalk proportion was \ncalculated by taking the difference between the expected stalk proportion (given the \nlevel of cell cycle bias) and the baseline proportioning (with uniform sampling across \nthe cell cycle) and dividing by this baseline value. For example, a value 0.75 means \nthat the cell cycle bias results in 75% more stalk cells than that expected without any \nbias.  \n \n \nAcknowledgements \n \nThis work was supported by a Wellcome Trust Investigator Award (WT095643AIA) to \nC.R.L.T; and grant from NERC (NE/V012002/1) to C.R.L.T and J.B.W. \n \n \n \n \n \n  \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n35\t\nMethods \n \nStrains, culture and development \nDictyostelium discoideum strains were derived from parental Ax3 or Ax4 strains. Cells \nwere cultured in 10cm tissue culture dishes containing HL5 media including glucose \n(Formedium) or on SM agar plates (Formedium) supplemented with Klebsiella \naerogenes. For development, cells in tissue culture were diluted to 1 x 105 cells/ml and \nallowed to grow for 2-3 divisions before harvesting. Cells were induced to develop at \na density of 5.1 x 105 cells/cm2 on non-nutrient KK2 agar (16.1mM KH2PO4, 3.7mM \nK2HPO4) plates containing 1.5% purified agar (Oxoid). For chimeric development, cells \nwere mixed prior to spreading on agar. Set1 mutant lines were generated using a \nknock-out construct kindly provide by J. Chubb. Ash2 and Gcn5 mutants were selected \nfrom the GWDI project (https://remi-seq.org) and verified by diagnostic PCR. A strain \nwith a random intergenic insertion (GWDI_42_D_7) was utilised as a control. For cell \ncycle synchronisation by cold shock, exponentially growing cells were seeded at 1.5 x \n106 cells/ml and incubated for 20 hours.  \n \nReporter gene analyses \nTo measure cell-cell variation in gene expression of stochastic genes, promoter \nsequences 1kb upstream of the start codon were amplified and cloned into pDM324\t\n(Veltman et al., 2009). The vector pDM327 was digested with NgoMIV to add \nsequences containing the actin15-promoter-GFP-actin8-terminator. ecmA-GFP or \necmB-KI GFP knock-in strains were generated by gene replacement (Chattwood et \nal., 2013). To quantify prestalk cell differentiation by FACS, growing cells were re-\nsuspended at 1 x 105 cells/ml in 1.5ml of 1x stalk solution supplemented with 5mM \ncAMP and incubated in 35mm plastic bottom dishes for 9hrs at 22°C. Inductions were \ncarried out for a further 9hrs at 22°C by replenishing the cAMP and adding different \ndoses of DIF-1. For analysis, cells were washed and collected in 1ml of KK2 + 20mM \nEDTA. Fluorescence was measured by FACS (Nxt Flow Cytometer (Attune)). Relative \nproportion of RFP and GFP positive was assessed using FlowJo software. LacZ \nstaining of developed structures was carried out using strains expressing β-\ngalatosidase under promoter control of fate specific markers ecmO and ecmB. Strains \nwere fixed for 10 minutes in 4% formaldehyde and Z-buffer (10mM KCl, 60mM \nNa2HPO4, 40mM NaH2PO4, 1mM MgSO4). Structures were washed twice in Z-buffer \nbefore being permeabilised for 20 minutes in 0.1% NP40. Another two washes were \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n36\t\ncarried out before adding staining solution (5mM K3Fe(CN)6, 5mM K4Fe(CN)6, \n1mg/ml X-gal, 1mM EGTA).  \n \nWestern blotting \nGrowing cells were lysed in nuclei buffer (40mM Tris-pH7.8, 1.5% Sucrose, 0.1mM \nEDTA, 6mM MgCl2, 40mM KCl, 5mM DTT, 0.4% NP40). Pellets were re-suspended at \na density of 1 x 107 nuclei/ml in 1x PBS including 1x protease inhibitors (Promega \n#G6521) and 1X SDS-PAGE buffer. Western blots were probed with antibodies to \nHistone-3 (ab1791 - polyclonal-rabbit IgG), H3K4-me1 (ab8895 - polyclonal-rabbit \nIgG), -me2 (Millipore #07-030 - polyclonal-rabbit IgG), -me3 (ab8580 - polyclonal-rabbit \nIgG), β-actin (Santa-Cruz Biotechnology – monoclonal- mouse IgG). Gcn5- blots were \ncarried out as previously described (Huang et al., 2021). Blots were imaged using a \nChemidoc MP imaging system (Biorad) and Image Lab 5.2.1 software (Biorad).  \n \nSingle cell analysis of cell cycle and cell fate \nTo follow differentiating cells, cells were transformed with the pspA-GFP cell type \nspecific reporter gene. Log phase cells were resuspended in HL5 glucose media and \n7.5 x 103 cells were plated in 750µl deposited on a 35mm glass bottom dish (WPi). \nCells were grown for a minimum of 16 hours during which time cells were imaged every \n5 minutes. The time between cell divisions was determined by manually tracking single \ncells. After 16 hours, HL5 media was removed and cells were washed five times in \nKK2 buffer to remove residual growth medium. To induce differentiation, 750µl of \nconditioned media supplemented with 10nM DIF-1 was added. Conditioned media was \ncollected from cells developed at a high density (1 x 106 cells/ml) in 1x stalk solution \n(10mM MES pH6.2, 1mM CaCl2, 2mM NaCl, 10mM KCl, 200mg/ml streptomycin \nsulphate) supplemented with 5mM cAMP for 16 hours. Conditioned media was \ncollected and briefly centrifuged before DIF-1 was added to washed cells. Images were \ntaken at 5 minute intervals for another 20 hours. A final fluorescence image was taken \nto determine the number of prestalk and prespore cells. Cells were tracked by hand to \ndetermine the timing of the last cell division prior to the induction of cell differentiation.  \n \nREMI-seq screen for DIF-insensitive mutants \nA REMI mutant library consisting of >10,000 mutants (Gruenheit et al., 2021) was \ngrown to log phase . Cells were plated in triplicate at 2x105 cells/ml in stalk medium in \n10cm diameter tissue culture dishes with 5mM cAMP for 24h. Cells were then washed \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n37\t\ntwice with KK2 before 24h incubation with 10nM DIF. Stalk medium was then removed \nand replaced with growth medium (HL5) and cells were allowed to grow until reaching \nconfluency. Genomic DNA was prepared from the mutant library following 2 and 6 \nrounds of this selection and processed for sequencing. To control for mutants that are \nunable to differentiate in response to cAMP, an 8-Br-cAMP monolayer assay screen \nwas performed. Cells from the mutant library were seeded in triplicate at 2x105 cells/ml \nin stalk medium supplemented with 10mM 8-Br-cAMP. After 48 hours, detergent was \nadded (0.1% NP40, 10mM EDTA) to remove cells that had not formed spores. Stalk \nmedium was then removed and replaced with growth medium (HL5) and the cells were \ngrown until reaching confluency. Genomic DNA was prepared from the mutant library \nfollowing 1 round of this selection and processed for sequencing (Gruenheit et al., \n2021). Analysis of mutant pools was carried out as previously reported (Gruenheit et \nal., 2021) using Z-score thresholds of > 1.5 for enriched mutants and < -1 for depleted \nmutants. Mutants (Supplementary data file 1) were identified as DIF-insensitive if they \nwere enriched in the cAMP removal screen, not down-regulated in 8-Br-cAMP screen \nand not previously identified as having a growth phenotype (Gruenheit et al., 2021). \n \nSample preparation and analysis of bulk RNA-seq data \nRNA was extracted from log phase growing cells or cells developed for 16.5 hours on \n1.5% non-nutrient agar at 22°C.  For RNA sequencing of chimeric development, wild \ntype cells expressing actin15-GFP were mixed in a 1:1 ratio with set1- cells expressing \nactin15-RFP and developed for 16.5 hours. Chimeric slugs were disaggregated by \npassing through a 25G needle before sorting (BD FACSaria) into GFP and RFP \npositive populations. RNA was extracted and RNA integrity number (RIN) of samples \nwas determined by Tapestation (Agilent). Libraries were prepared from samples \ndeemed with a RIN > 8 using an Illumina TruSeq kit and sequencing was undertaken \non a Hiseq-4000 (illumina) using 100bp pair-end chemistry. Sequences were trimmed \nof Truseq adapters and quality controlled (Trimmomatic) by discarding reads shorter \nthan 20bp or those where the average quality score dropped below an average of 15 \nin a sliding 4 base pair window. Leading and trailing bases of reads below a phred \nscore of 30 were also removed from tags. Reads were aligned (Bowtie2) to the D. \ndiscoideum genome (an inverted repeat on chromosome 2 was masked), bamfiles \nwere sorted (Samtools) and reads counted using the RPKM_count.py script (RSeQC). \nDESeq2 v1.26.0 was used for differential expression analyses. Thresholds for \ndifferential expression between samples were set at a p.adj < 0.01 with a fold change \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n38\t\nof > 2 between samples. To calculate differences in cell type specific gene expression, \nprestalk  and prespore RNA-seq data was downloaded from the SRA (PRJNA543665) \n(de Oliveira et al., 2019). Genes with less than 10 read counts were removed. A cell \ntype index was calculated for the remaining 5319 genes (Cell type index = expression \ncount in prestalk cells / (expression count in prestalk cells + expression count in \nprespore cells). \n \nChIP-seq analysis \nBulk RNA-seq data from vegetative cells (this study) was used to rank genes based \non their level of expression. Genes with detectable expression (ie > 0 normalised \nreads) were divided into ten equally sized bins. ChIP-seq data for two H3K4me1 and \nH3K4me3 replicates was downloaded from the GEO database (accession \n#GSE137604 Sub-Series GSE137599) as narrowPeak files 39. Promoter regions \naround the TSS for each gene in each bin were identified (-2500bp to + 2500bp \nup/down-stream of the TSS) and annotated. Using functions from the chIPSeeker \npackage and custom scripts (https://github.com/WilliamSalvidge/dictyChipSeq) \nannotated regions were intersected H3K4-me1 or -me3 peaks as defined by \nnarrowPeak files. Overlaps were averaged for each expression bin and plotted using \nthe plotAvgProf function from the chIPSeeker R package. This accounts for differing \nnumbers of peaks in different expression bins and allows patterns of peak density \nbetween expression bins to be compared. To compare peak distribution in variable \nand non-variable genes, an equal number (2024) of genes was sampled using a \nweighted probability based on the expression of variable genes. \n \nIdentification of variable genes using single cell RNA-sequencing \nData for 81 single wild type cells isolated using the Fluidigm C1 platform were \ndownloaded from the SRA (SAMN07833758 - SAMN0783383). Reads were \nnormalised (DESeq2 v1.26.0) and the coefficient of variation squared (CV2) was \ncalculated and plotted against mean expression. A trend line was fitted to the data \nusing non-linear least squares regression (Scran v1.15.9). Genes were defined as \nvariable (2073 genes) based on a one-sided test assuming a normal distribution \naround the trend but one where deviation changed depending on the mean expression \nof a given gene (Scran v1.15.9 - modelGeneCV2) with a FDR of < 0.05.  To identify \ngenes with a cell cycle signature, single cells were clustered using M3Drop v1.12.0 as \npreviously reported (Gruenheit et al., 2018). Cells could be organised into three \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n39\t\nclusters of 25, 31 and 25 cells. Marker genes for each cluster were identified using \nlogistic regression (M3Drop v1.12.0). A relaxed AUROC threshold (> 0.65 ) ensured \nthat all genes possessing weak cell cycle signature could be identified (5529 genes). \nThis allowed genes variable genes to be identified that where variation is dependent \non cell cycle position (1016 genes) and independent of cell cycle position (1057 \ngenes).  \n \nSingle cell sequencing of wild type and set1- cells \nActin15-GFP expressing wild type or set1- cells were grown on tissue cultures dishes \nand harvested during log-growth. Cells were washed in 1x PBS and incubated with \nDAPI (2.5µg/ml). Cells were resuspended at a density of 2.8 x 104 cells/ml and \ndispensed into a SMARTer ICELL8 3’ DE Chip using the ICELL8 cx Single Cell \nSystem. Wells of the 3’ DE Chip contain pre-printed oligonucleotides possessing well \nspecific barcodes and UMI connected to a polydT region for hybridisation with \npolyadenylated transcripts. For each well of the chip, 50nl of stained cell solution was \naliquoted to maximise the number of wells containing a single cell. Cells were imaged \ndirectly dispensing into nano-wells using DAPI and FITC filters and the 3’ DE Chip was \nfrozen at -80°C. Images taken were analysed to identify wells containing individual \ncells based on DAPI and GFP fluorescence. In total 799 wells were identified that \ncontained single cells (399 wild type and 400 set1-). The 3’ DE Chip was then thawed \nto lyse cells and loaded onto the ICELL8 cx Single Cell System where components for \nreverse transcription (RT) and cDNA amplification were dispensed into chosen wells. \nAfter RT-PCR, products from separate wells were collected into a single sample, \nconcentrated and purified according to the manufacturer’s instructions. Samples were \nprepared for sequencing using a Nextera XT Library Prep Kit (illumina) and sequenced \non a NextSeq 500 (illumina) system utilising one flow cell and a NextSeq High-Output \nkit (2 x 75bp reads). One read was used to sequence the well barcode and transcript \nUMI, with the second reading the 3’ end of the transcript itself. This yielded >500 million \nreads. FASTQ files were demultiplexed (mappa), reads were then quality controlled \n(reads shorter than 15bp discarded, a minimum of 30% N’s allowed, phred score of \n20) and trimmed of adapters (cutadapt). Reads were aligned (STAR) to the latest \nversion of the D. discoideum genome (v2.7), sorted (Samtools) and tags counted (UMI-\ntools). Cells were quality controlled (Scater v1.14.6) and cells over 2 median \nassociated deviations (MADs) from the median for library size, total number of features \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n40\t\nor mitochondrial reads excluded as outliers. This left 310 wild type and 310 set1- cells. \nGenes with < 5 reads were also removed from further analyses. \n \nCell cycle analysis \nGenes previously defined by scRNA-seq as markers of M/S and G2 cells from 81 \nFluidigm-sorted cells (Gruenheit et al., 2018) were used to compare cell cycle patterns \nin iCell8-sorted Ax4 and Set1- cells. The average expression of all 876 M/S genes and \n642 G2 genes with expression in these samples (>0 reads) was determined. The \nnormalised ratio of M/S to G2 marker gene expression was used to define the cell \ncycle position in each cell.  \n \nModel fitting to measurements of stalk cell induction  \nTo evaluate the fit of our stochastic-deterministic model to data, we used data on stalk \nfate measured at different times in the progression through the cell cycle in population \nof cells aligned in the cell cycle\t(Gruenheit et al., 2018). Stalk fate propensity was \nmeasured in two genetically different sets of cells (wildtype AX3 and AX3 with a \nknockout of the gene gefE) grown under two conditions (‘normal’, G+, and low glucose, \nG−, conditions), which alter the stalk propensity of cells. We used data from the first \nsix timepoints, corresponding to zero to five hours after the last division, and combined \nthe four sets of cells by adjusting the values in set such that their mean propensity \nmatched the overall global mean propensity (and hence there is no difference in \naverage propensity of the four sets; see Supplementary Data for the raw and adjusted \nvalues). After combining the four sets, one outlier was identified (gefE− under G− at \nhour 5), which was consistent with those cells passing a checkpoint where they re-\nenter mitosis, and was removed, after which data were re-normalised as described \nabove (see Supplementary Information for a comparison of the model fitting with the \noutlier included). To confirm that the different sets of cells behave similarly, we also \nfitted the model separately to each class and see no evidence of heterogeneity of \nmodel estimates. The stochastic-deterministic model was fitted to these data using the \n‘NonlinearModelFit’ function in Wolfram Mathematica version 14, which uses the \nLevenberg–Marquardt algorithm for least-squares curve fitting. This model fitting \nyielded estimates of Q\"\n∗ and $, the Akaike Information Criterion (AIC) and of the error \nand total sums of squares, which were used to calculate the R-squared. The details of \nthe alternative models that were fitted and the methods used for comparing the fit of \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n41\t\ndifferent models to the stalk propensity data are provided in the Supplementary \nInformation.  \n \nSTAR Methods \n \nSoftware \n \nTrimmomatic - http://www.usadellab.org/cms/index.php?page=trimmomatic \nBowtie2 - http://bowtie-bio.sourceforge.net/index.shtml \nSamtools - http://samtools.sourceforge.net \nRSeQC - http://rseqc.sourceforge.net \nTakara mappa - https://takarabiousa.github.io/mappa_userguide.html \nCutadapt - https://github.com/marcelm/cutadapt \nSTAR aligner manual - \nhttps://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf \nUMI tools manual - https://github.com/CGATOxford/UMI-tools \nD. discoideum genome - \nhttps://protists.ensembl.org/Dictyostelium_discoideum/Info/Index \nMathematica 14.0  - https://www.wolfram.com/mathematica/ \n \nR - packages \n \nDESeq2 - v1.26.0 - \nhttps://bioconductor.org/packages/release/bioc/html/DESeq2.html \nScater - v1.14.6 - https://bioconductor.org/packages/release/bioc/html/scater.html \nScran - v1.15.9 - https://bioconductor.org/packages/release/bioc/html/scran.html \nM3Drop - v1.12.0 - https://bioconductor.org/packages/release/bioc/html/M3Drop.html \nSC3 - v1.14.0 - https://www.bioconductor.org/packages/release/bioc/html/SC3.html \nSuperheat - v.0.1.0 - https://rlbarter.github.io/superheat/ \n \nReferences \n \nAzuara, V., Perry, P., Sauer, S., Spivakov, M., Jorgensen, H.F., John, R.M., Gouti, \nM., Casanova, M., Warnes, G., Merkenschlager, M., et al. (2006). Chromatin \nsignatures of pluripotent cell lines. Nat Cell Biol 8, 532-538. \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n42\t\nBarski, A., Cuddapah, S., Cui, K., Roh, T.Y., Schones, D.E., Wang, Z., Wei, G., \nChepelev, I., and Zhao, K. (2007). High-Resolution Profiling of Histone Methylations \nin the Human Genome. Cell 129, 823-837. \nBelcher, L.J., Madgwick, P.G., Kuwana, S., Stewart, B., Thompson, C.R.L., and Wolf, \nJ.B. (2022). Developmental constraints enforce altruism and avert the tragedy of the \ncommons in a social microbe. Proc Natl Acad Sci U S A 119, e2111233119. \nBenayoun, B.A., Pollina, E.A., Ucar, D., Mahmoudi, S., Karra, K., Wong, E.D., \nDevarajan, K., Daugherty, A.C., Kundaje, A.B., Mancini, E., et al. (2014). H3K4me3 \nBreadth Is Linked to Cell Identity and Transcriptional Consistency. Cell 158, 673-688. \nBernstein, B.E., Mikkelsen, T.S., Xie, X., Kamal, M., Huebert, D.J., Cuff, J., Fry, B., \nMeissner, A., Wernig, M., Plath, K., et al. (2006). A bivalent chromatin structure \nmarks key developmental genes in embryonic stem cells. Cell 125, 315-326. \nBian, C., Xu, C., Ruan, J., Lee, K.K., Burke, T.L., Tempel, W., Barsyte, D., Li, J., Wu, \nM., Zhou, B.O., et al. (2011). Sgf29 binds histone H3K4me2/3 and is required for \nSAGA complex recruitment and histone H3 acetylation. The EMBO journal 30, 2829-\n2842. \nButtery, N.J., Rozen, D.E., Wolf, J.B., and Thompson, C.R.L. (2009). Quantification \nof Social Behavior in D. discoideum Reveals Complex Fixed and Facultative \nStrategies. Current Biology 19, 1373-1377. \nChang, H.H., Hemberg, M., Barahona, M., Ingber, D.E., and Huang, S. (2008). \nTranscriptome-wide noise controls lineage choice in mammalian progenitor cells. \nNature 453, 544-547. \nChattwood, A., Nagayama, K., Bolourani, P., Harkin, L., Kamjoo, M., Weeks, G., and \nThompson, C.R. (2013). Developmental lineage priming in Dictyostelium by \nheterogeneous Ras activation. eLife 2, e01067-e01067. \nChattwood, A., and Thompson, C.R.L. (2011). Non-genetic heterogeneity and cell \nfate choice in Dictyostelium discoideum. Development, Growth & Differentiation 53, \n558-566. \nChubb, J.R., Bloomfield, G., Xu, Q., Kaller, M., Ivens, A., Skelton, J., Turner, B.M., \nNellen, W., Shaulsky, G., Kay, R.R., et al. (2006a). Developmental timing in \nDictyostelium is regulated by the Set1 histone methyltransferase. Developmental \nbiology 292, 519-532. \nChubb, J.R., Bloomfield, G., Xu, Q., Kaller, M., Ivens, A., Skelton, J., Turner, B.M., \nNellen, W., Shaulsky, G., Kay, R.R., et al. (2006b). Developmental timing in \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n43\t\nDictyostelium is regulated by the Set1 histone methyltransferase. Dev Biol 292, 519-\n532. \nCornell, J.R., and Benjamin, C.A. (1970). Probability, Statistics, and Decisions for \nCivil Engineers. (NY: McGraw-Hill). \nde Oliveira, J.L., Morales, A.C., Stewart, B., Gruenheit, N., Engelmoer, J., Brown, \nS.B., de Brito, R.A., Hurst, L.D., Urrutia, A.O., Thompson, C.R.L., et al. (2019). \nConditional expression explains molecular evolution of social genes in a microbe. \nNat Commun 10, 3284. \nDesai, R.V., Chen, X., Martin, B., Chaturvedi, S., Hwang, D.W., Li, W., Yu, C., Ding, \nS., Thomson, M., Singer, R.H., et al. (2021). A DNA repair pathway can regulate \ntranscriptional noise to promote cell fate transitions. Science 373. \nGomer, R.H., and Ammann, R.R. (1996). A cell-cycle phase-associated cell-type \nchoice mechanism monitors the cell cycle rather than using an independent timer. \nDev Biol 174, 82-91. \nGomer, R.H., and Firtel, R.A. (1987). Cell-autonomous determination of cell-type \nchoice in Dictyostelium development by cell-cycle phase. Science 237, 758-762. \nGruenheit, N., Baldwin, A., Stewart, B., Jaques, S., Keller, T., Parkinson, K., \nSalvidge, W., Baines, R., Brimson, C., Wolf, J.B., et al. (2021). Mutant resources for \nfunctional genomics in Dictyostelium discoideum using REMI-seq technology. BMC \nBiology 19, 1-19. \nGruenheit, N., Parkinson, K., Brimson, C.A., Kuwana, S., Johnson, E.J., Nagayama, \nK., Llewellyn, J., Salvidge, W.M., Stewart, B., Keller, T., et al. (2018). Cell Cycle \nHeterogeneity Can Generate Robust Cell Type Proportioning. Developmental Cell \n47, 494–508.e494-494–508.e494. \nHong, S.H., Rampalli, S., Lee, J.B., McNicol, J., Collins, T., Draper, J.S., and Bhatia, \nM. (2011). Cell fate potential of human pluripotent stem cells is encoded by histone \nmodifications. Cell Stem Cell 9, 24-36. \nHsu, D.-W., Chubb, J.R., Muramoto, T., Pears, C.J., and Mahadevan, L.C. (2012). \nDynamic acetylation of lysine-4-trimethylated histone H3 and H3 variant biology in a \nsimple multicellular eukaryote. Nucleic acids research 40, 7247-7256. \nHuang, L.-Y., Hsu, D.-W., and Pears, C.J. (2021). Methylation-directed acetylation of \nhistone H3 regulates developmental sensitivity to histone deacetylase inhibition. \nNucleic Acids Research 49, 3781-3795. \nHuh, D., and Paulsson, J. (2011). Non-genetic heterogeneity from stochastic \npartitioning at cell division. Nature Genetics 43, 95-100. \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n44\t\nInsall, R., Nayler, O., and Kay, R.R. (1992). DIF-1 induces its own breakdown in \nDictyostelium. The EMBO Journal 11, 2849-2854. \nKaller, M., Nellen, W., and Chubb, J.R. (2006). Epigenetics in Dictyostelium. \nMethods Mol Biol 346, 491-505. \nKay, R.R., and Thompson, C.R.L. (2001). Cross-induction of cell types in \nDictyostelium: evidence that DIF-1 is made by prespore cells. Development 128, \n4959-4966. \nLorenz, D.R., Meyer, L.F., Grady, P.J.R., Meyer, M.M., and Cam, H.P. (2014). \nHeterochromatin assembly and transcriptome repression by Set1 in coordination with \na class II histone deacetylase. eLife 3. \nMaamar, H., Raj, A., and Dubnau, D. (2007). Noise in Gene Expression Determines \nCell Fate in Bacillus subtilis. Science 317, 526-529. \nMadgwick, P.G., Stewart, B., Belcher, L.J., Thompson, C.R.L., and Wolf, J.B. (2018). \nStrategic investment explains patterns of cooperation and cheating in a microbe. \nProceedings of the National Academy of Sciences of the United States of America \n115, E4823–E4832-E4823–E4832. \nMargaritis, T., Oreal, V., Brabers, N., Maestroni, L., Vitaliano-Prunier, A., Benschop, \nJ.J., van Hooff, S., van Leenen, D., Dargemont, C., Géli, V., et al. (2012). Two \nDistinct Repressive Mechanisms for Histone 3 Lysine 4 Methylation through \nPromoting 3′-End Antisense Transcription. PLoS Genetics 8, e1002952-e1002952. \nPauklin, S., and Vallier, L. (2013). The Cell-Cycle State of Stem Cells Determines \nCell Fate Propensity. Cell 155, 135-147. \nPeccoud, J., and Ycart, B. (1995). Markovian Modeling of Gene-Product Synthesis. \nTheoretical Population Biology 48, 222-234. \nRaj, A., Peskin, C.S., Tranchina, D., Vargas, D.Y., and Tyagi, S. (2006). Stochastic \nmRNA Synthesis in Mammalian Cells. PLoS Biology 4, e309-e309. \nRamakrishnan, S., Pokhrel, S., Palani, S., Pflueger, C., Parnell, T.J., Cairns, B.R., \nBhaskara, S., and Chandrasekharan, M.B. (2016). Counteracting H3K4 methylation \nmodulators Set1 and Jhd2 co-regulate chromatin dynamics and gene transcription. \nNature Communications 7. \nRatcliff, W.C., and Denison, R.F. (2010). Individual-level bet hedging in the bacterium \nSinorhizobium meliloti. Curr Biol 20, 1740-1744. \nRodrigues, A.M.M., and Gardner, A. (2022). Reproductive value and the evolution of \naltruism. Trends in Ecology & Evolution 37, 346-358. \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n45\t\nRotem, A., Ram, O., Shoresh, N., Sperling, R.A., Goren, A., Weitz, D.A., and \nBernstein, B.E. (2015). Single-cell ChIP-seq reveals cell subpopulations defined by \nchromatin state. Nature biotechnology 33, 1165-1172. \nSaiz, N., Williams, K.M., Seshan, V.E., and Hadjantonakis, A.-K. (2016). \nAsynchronous fate decisions by single cells collectively ensure consistent lineage \ncomposition in the mouse blastocyst. Nature Communications 7, 13463-13463. \nSantos-Rosa, H., Schneider, R., Bernstein, B.E., Karabetsou, N., Morillon, A., Weise, \nC., Schreiber, S.L., Mellor, J., and Kouzarides, T. (2003). Methylation of Histone H3 \nK4 Mediates Association of the Isw1p ATPase with Chromatin. Molecular Cell 12, \n1325-1332. \nShi, X., Hong, T., Walter, K.L., Ewalt, M., Michishita, E., Hung, T., Carney, D., Peña, \nP., Lan, F., Kaadige, M.R., et al. (2006). ING2 PHD domain links histone H3 lysine 4 \nmethylation to active gene repression. Nature 442, 96-99. \nShi, X., Kachirskaia, I., Walter, K.L., Kuo, J.H.A., Lake, A., Davrazou, F., Chan, S.M., \nMartin, D.G.E., Fingerman, I.M., Briggs, S.D., et al. (2007). Proteome-wide analysis \nin Saccharomyces cerevisiae identifies several PHD fingers as novel direct and \nselective binding modules of histone H3 methylated at either lysine 4 or lysine 36. \nJournal of Biological Chemistry 282, 2450-2455. \nSims, R.J., Chen, C.F., Santos-Rosa, H., Kouzarides, T., Patel, S.S., and Reinberg, \nD. (2005). Human but not yeast CHD1 binds directly and selectively to histone H3 \nmethylated at lysine 4 via its tandem chromodomains. Journal of Biological \nChemistry 280, 41789-41792. \nSlack, J. (2014). Establishment of spatial pattern. Wiley Interdiscip Rev Dev Biol 3, \n379-388. \nSoares, L.M., He, P.C., Chun, Y., Suh, H., Kim, T.S., and Buratowski, S. (2017). \nDeterminants of Histone H3K4 Methylation Patterns. Molecular Cell 68, 773–\n785.e776-773–785.e776. \nStevense, M., Muramoto, T., Muller, I., and Chubb, J.R. (2010). Digital nature of the \nimmediate-early transcriptional response. Development 137, 579-584. \nStrasser, K., Bloomfield, G., MacWilliams, A., Ceccarelli, A., MacWilliams, H., and \nTsang, A. (2012). A Retinoblastoma Orthologue Is a Major Regulator of S-Phase, \nMitotic, and Developmental Gene Expression in Dictyostelium. PLoS ONE 7, \ne39914-e39914. \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n46\t\nSüel, G.M., Kulkarni, R.P., Dworkin, J., Garcia-Ojalvo, J., and Elowitz, M.B. (2007). \nTunability and noise dependence in differentiation dynamics. Science (New York, \nNY) 315, 1716-1719. \nSze, C.C., Ozark, P.A., Cao, K., Ugarenko, M., Das, S., Wang, L., Marshall, S.A., \nRendleman, E.J., Ryan, C.A., Zha, D., et al. (2020). Coordinated regulation of \ncellular identity–associated H3K4me3 breadth by the COMPASS family. Science \nAdvances 6, eaaz4764-eaaz4764. \nTaverna, S.D., Ilin, S., Rogers, R.S., Tanny, J.C., Lavender, H., Li, H., Baker, L., \nBoyle, J., Blair, L.P., Chait, Brian T., et al. (2006). Yng1 PHD Finger Binding to H3 \nTrimethylated at K4 Promotes NuA3 HAT Activity at K14 of H3 and Transcription at a \nSubset of Targeted ORFs. Molecular Cell 24, 785-796. \nThompson, C.R., and Kay, R.R. (2000). Cell-fate choice in Dictyostelium: intrinsic \nbiases modulate sensitivity to DIF signaling. Dev Biol 227, 56-64. \nUlfig, A., and Jakob, U. (2024). Redox heterogeneity in mouse embryonic stem cells \nindividualizes cell fate decisions. Dev Cell 59, 2118-2133 e2118. \nVeltman, D.M., Akar, G., Bosgraaf, L., and Van Haastert, P.J.M. (2009). A new set of \nsmall, extrachromosomal expression vectors for Dictyostelium discoideum. Plasmid \n61, 110-118. \nWang, S.Y., Pollina, E.A., Wang, I.H., Pino, L.K., Bushnell, H.L., Takashima, K., \nFritsche, C., Sabin, G., Garcia, B.A., Greer, P.L., et al. (2021). Role of epigenetics in \nunicellular to multicellular transition in Dictyostelium. Genome Biology 2021 22:1 22, \n1-30. \nWeinberger, L., Voichek, Y., Tirosh, I., Hornung, G., Amit, I., and Barkai, N. (2012). \nExpression noise and acetylation profiles distinguish HDAC functions. Molecular cell \n47, 193-202. \nWest, S.A., and Cooper, G.A. (2016). Division of labour in microorganisms: An \nevolutionary perspective. Nature Reviews Microbiology 14, 716-723. \nWolf, J.B., Howie, J.A., Parkinson, K., Gruenheit, N., Melo, D., Rozen, D., and \nThompson, C.R.L. (2015). Fitness trade-offs result in the illusion of social success. \nCurrent Biology 25, 1086-1090. \nWu, S., Li, K., Li, Y., Zhao, T., Li, T., Yang, Y.-F., and Qian, W. (2017). Independent \nregulation of gene expression level and noise by histone modifications. PLOS \nComputational Biology 13, e1005585-e1005585. \nWysocka, J., Swigut, T., Xiao, H., Milne, T.A., Kwon, S.Y., Landry, J., Kauer, M., \nTackett, A.J., Chait, B.T., Badenhorst, P., et al. (2006). A PHD finger of NURF \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n47\t\ncouples histone H3 lysine 4 trimethylation with chromatin remodelling. Nature 442, \n86-90. \nYadav, T., Quivy, J.P., and Almouzni, G. (2018). Chromatin plasticity: A versatile \nlandscape that underlies cell fate and identity. Science 361, 1332-1336. \nYamanaka, Y., Lanner, F., and Rossant, J. (2010). FGF signal-dependent \nsegregation of primitive endoderm and epiblast in the mouse blastocyst. \nDevelopment (Cambridge, England) 137, 715-724. \nZahavi, A., Harris, K.D., and Nanjundiah, V. (2018). An individual-level selection \nmodel for the apparent altruism exhibited by cellular slime moulds. J Biosci 43, 49-\n58. \n \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nCCAF level\nThreshold\nExpression level\nThreshold\nTime (post mitosis)\nDecay in CCAF levelThreshold\nExpression level\nTime (post mitosis)\nStalk proportion Stalk proportion Stalk proportion\nTime (hours)\nStalk proportion Stalk proportion Stalk proportion\nA.\nB.\nC.\ni. ii. iii.\ni. ii. iii.\ni. ii. iii.\nFigure 1\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nA. B.\nC. D.\n−40\n−20\n0\n20\n−60 −30 0 30 60\nPC1 − 23 %\nPC2 − 10%\n−40\n−20\n0\n20\n−60 −30 0 30 60\nPC1 − 7 %\nPC2 − 3%\nCell cycle dependent\nCell cycle independent\ncell cycle dependent - 1016 genes\ncell cycle independent - 1057 genes\ni. ii.\ni.\nii.\n−1.0\n−0.5\n0.0\n0.5\n1.0\n1.5\n0 1 234\n−1.0\n−0.5\n0.0\n0.5\n1.0\n1.5\n0 1 234\nLog10 observed variability (CV2) / expected variability (CV2)\n−1.0\n−0.5\n0.0\n0.5\n1.0\n1.5\n0 1 234\nLog10 average expression\nM/S\nG2.1\nG2.2\nGroup 1 \ncells\nGroup 2 \ncells\nGroup 3 \ncells\nGroup 1 \ncells\nGroup 2 \ncells\nGroup 3 \ncells\nM/S\nG2.1\nG2.2\nLog10 average expression\nLog10 observed variability (CV2) /\n expected variability (CV2)\n−1\n0\n1\n2\n0 1 234\nNot variable\nVariable\nFigure 2\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nBA\nC\nPool of 28,000 \nREMI Mutants\nPool of 28,000 \nREMI Mutants\nPool of 28,000 \nREMI Mutants\n+ Br-cAMP Culture in HL5\n+ detergent\n+ cAMP + 10nM DIF-1 Culture in HL5\nSelection for Terminal Development\nSelection for Growth Advantage\nx6\nSelection for Insensitivity\nto DIF-1\nMutants that can\n differentiate as spores\nsequence\nsequencex2\nx6\nx3\nx6\nsequence\nRound 2\nRound 6\nBrcAMP\nRound 3\nRound 6\nGrowth\nDIF-1\ninsensitive\n-4 -3 -1 0.4 2\nDIF signalling mutants Other mutants\ni\nii\niii\n* ns **** **** ****** **** ** ****\n0\n25\n50\n75\n100\nWild type\nDDB_G0281087DDB_G0293590DDB_G0274923DDB_G0280389DDB_G0285055DDB_G0270586DDB_G0281971DDB_G0272474DDB_G0279425\n% Stalk Cell\nWT\nR2\nR6\nR2 & R6\nZ-score\n0\n0.05\n0.1\n0.15\n0.2\n0.25\n0.3\n0.35\n0-0.1 0.1-0.2 0.2-0.3 0.3-0.4 0.4-0.5 0.5-0.6 0.6-0.7 0.7-0.8 0.8-0.9 0.9-1\n***\n***\nE\nD\nDevelopmental index \n(developmental expression vs vegetative expression)\nFrequency\nStochastic\nCell cycle\nNon variable\nF\n0\n10\n20\n30\n40\n50\n60\nstochastic cell cycle stochastic \n+ cell cycle\nstochastic\ncell cycle\nexpected\n***\n***\nNumber of REMI mutants\n0\n0.05\n0.1\n0.15\n0.2\n0.25\n0.3\n0.35\n0.4\n0.45\n0-0.2 0.2-0.4 0.4-0.6 0.6-0.8 0.8-1\nStochastic\nCell cycle\nNon variable\nCell type index \n(prestalk expression vs prespore expression)\nFrequency ***\n***\n***\n***\nNS\nNS\n***\n***\nNS\nNS\n0\n1\n2\n3\n4\n5\n6\n7\n8\n9\n10\nCV\n2 (normalised to expression)\nall genes Cell fate \nREMI mutants\n***\nFigure 3 .CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n0e+00\n2e−04\n4e−04\n−2000−1000 0 1000 2000\nPosition relative to TSS\nPeak density\n0e+00\n2e−04\n4e−04\n−2000−1000 0 1000 2000\n1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n0e+00\n1e−04\n2e−04\n3e−04\n−2000−1000 0 1000 2000\n+3.4Pe1\n0e+00\n1e−04\n2e−04\n3e−04\n4e−04\n−2000−1000 0 1000 2000\n1on−variable genes\nVariable genes\n+3.4Pe3\nPosition relative to TSS\nPeak density\n increasing gene expression level\nPosition relative to TSS Position relative to TSS\n+3.4Pe1 +3.4Pe3\nA.\nB.\nPeak density\nPeak density\n0\n100\n200\n300\n400\n500\n600\n700\nStochatic genes Cell cycle genes\nCell cycle observed\nExpected\nStochastic observed\nNumber of genes\n with H3K4Me1 mark\n+3.4Pe1\n***\n***\nC.\n0\n100\n200\n300\n400\n500\n600\n700\n800\n900\n+3.4Pe3\n*** NS\nCell cycle observed\nExpected\nStochastic observed\nNumber of genes\n with H3K4Me3 mark\nStochatic genes Cell cycle genes\nFigure 4\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n0.0\n2.5\n5.0\n7.5\n10.0\n−\u001b −4 0 4\u001b\nLog2 Fold change\n−log10 padj\nA. B. C.\nD. E.\ncell cycle stochastic\nactual expected actual expected\n0\n30\n60\n90\n0\n20\n40\n60\ngene number\nWT Set1- WT Set1-\n0.0\n0.5\n1.0\n1.5\n2.0Log 10 Average Expression\nWT Set1- WT Set1-\n0.0\n0.5\n1.0\n1.5\n2.0\n2.5\nLog10 CV2\n0\n1\n2\n3\n0.05 0.01 0.001 0.0001\nFold enrichemnt\nFDR\n* *\nup-regulated up-regulateddown-regulated down-regulated\n+3.4Pe1 17\n$x4 Set1\n-\n+3.4Pe2 17\n+3.4Pe3 17\nH3 17\nћ-actin 42\n0\n0.1\n0.2\n0.3\n0.4\n0.5\n0.6\n0\n0.005\n0.01\n0.015\n0.02\n0.025\n0.03\n0.035\n0.04\n0.045\n0.05\nWT Set1 - WT Set1 -\nNormalised expresion\nNormalised expresion\nF.\nactin15 gtaU hspF-2\n0\n0.1\n0.2\n0.3\n0.4\n0.5\n0.6\n0.7\n0.\u001b\n0.9\n1\n0\n2\n4\n6\n\u001b\n10\n12\n14\n0\n2\n4\n6\n\u001b\n10\n12\nWT Set1 - WT Set1 - WT Set1 -\nCV2\nCV2\nCV2\nactin15 gtaU hspF-2\nG. H.\n0\n0.1\n0.2\n0.3\n0.4\n0.5\n0.6\n0.7\nWT Set1 -\nNormalised expresion\n*** ***\nFigure 5\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nWT act-RFP Set1- act-RFP\nWTset1-\ni ii\niii iv\nA.\npsp pstO\npstA\npstB\nB.\nWT GFP up-regulated Set1- RFP up-regulated\nWT act-GFP / Set1- act-RFP  \nC.\nWT\nWT ecmO-LacZ Set1- ecmO-LacZi.WT\nWT ecmB-LacZ Set1- ecmB-LacZ\nii.WT\nWT pspA-RFP Set1- pspA-RFP\niii.\nD.\nSet1-WT\necmO-LacZecmB-LacZ\nE.\n17\n17\n17\n17\nH3K4me1\nH3K4me2\nH3K4me3\nH3\nWT ash2\n-\nset1\n-\n17\n17\n17\nH3K9Ac\nH3K14Ac\nH3\nWT gcn5\n-\nF. G.\nWT act-RFP gcn5 - act-RFP\nWTgcn5 -\nWT act-RFP ash2 - act-RFP\nWTash2 -\nH. I.\nii\niii iv\ni ii\niii iv\nCell type index Cell type index\n***\n0\n20\n40\n60\n80\n0.00 0.25 0.50 0.75 1.00\nCount\n***\n0\n10\n20\n30\n0.00 0.25 0.50 0.75 1.00\nCount\nsamplesample\nAX4 slug up Set1- slug up\nsamplesample\ni\nFigure 6\n***\n0\n5\n10\n15\n0.00 0.25 0.50 0.75 1.00\n***\n0\n5\n10\n15\n0.00 0.25 0.50 0.75 1.00\nCount\nCell type index Cell type index\nCount\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nWT Set1-\nG2 MS G2 MS\n0\n20\n40\n60Percentage of cells in cell cycle phases 6\n8\n10\nWT Set1-\nDivision time (hours)\nA. B.\nC. D.\n0.00\n0.05\n0.10\nactin gtaU hspf\nAverage Expression (Abritrary Units)\n0\n1\n2\n3\n4\nactin gtaU hspf\nCoefficient of variation (CV )\n11 degrees\n22 degrees\n 2\n11 degrees\n22 degrees\nFigure 7 .CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n11°C\nWT Set1-\n22°C\nB.\nA.\n0\n250\n500\n750\n1000\n123 4\nLog10 ecmA GFP fluorescence\nNumber of cells\n0\n250\n500\n750\n1000\n1234\n0\n500\n1000\n123 4\nNumber of cells\n0\n500\n1000\n1234\n0nm 0.1nm 1nm 10nm 50nm\nWT Set1-\nWT Set1-\nLog10 ecmA GFP fluorescence\nLog10 ecmB GFP fluorescence Log10 ecmB GFP fluorescence\nDIF-1 concentration\nC.\nD.\n0\n10\n20\n30\n40\n50\n60\n70\n80\n90\n100\n0 - 1 1 - 2 2 - 3 3 - 4 4 - 5 5 - 6 6 - 7\nwild type\nset1-\nHours after division\n% stalk cells\nFigure 8\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nTime (hours) \nStalk proportion\nTime (hours) \nExponential decay model Stochastic-deterministic model\n(gamma distribution)\nSupplementary Figure 1\nA.\nB.\nGaussian noise Gamma noise\n Distribution of CCIF values → \nProbability density\nC.\nCCAF value\nTime (hours) \ni.\ni.\ni.\nStochastic-deterministic model\n(exponential decay)\nii. iii.\nii.\nii. iii.\nTime (hours) \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n5529 genes\n5274 genes AUC > 0.65\np-value < 0.05\n0.00\n0.25\n0.50\n0.75\n1.00\n0.5 0.6 0.7 0.8 0.9 1.0\nArea Under the Curve (AUC)\nS−value\n0.00\n0.25\n0.50\n0.75\n1.00\n0.5 0.6 0.7 0.8 0.9 1.0\nArea Under the Curve (AUC)\nS−value\n3 Cell Groups 2 Cell Groups\n4393 genes AUC > 0.65\np-value < 0.05\n1057 451310161136 2554138\n2073\nvariable genes\n5529\ncell cycle genes\ncell cycle \nvariable genes\nstochastic\nvariable genes\nSupplementary figure 2\n3 Cell Groups 2 Cell GroupsA.\nB. C.\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n−1\n0\n1\n2\n0 1 234\n−1\n0\n1\n2\n0 1 234\n−1\n0\n1\n2\n0 1 234\n−1\n0\n1\n2\n0 1 234\n−1\n0\n1\n2\n0 1 234\n−1\n0\n1\n2\n0 1 234\n−1\n0\n1\n2\n0 1 234\n−1\n0\n1\n2\n0 1 234\n−1\n0\n1\n2\n0 1 234\nMS\nG2-1\nG2-2\nCells\nGenes\n95 genes100 genes\n237 genes\n100\ngenes\n247 \ngenes\n669\ngenes\n662 genes\n247 genes\n669 genes\n100 genes\n242 genes\n667 genes\n25 cells31 cells25 cells\nSupplementary figure 3\nMS G2-1 G2-2\nLog10 observed variability (CV2) / \nexpected variability (CV2)\nLog10 observed variability (CV2) / \nexpected variability (CV2)\nLog10 observed variability (CV2) / \nexpected variability (CV2)\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n0.0\n0.2\n0.4\n0.6\n01234\nLog10 average expression\ndensity\nNon Variable\nVariable\n0.0\n0.2\n0.4\n0.6\n01234\nLog10 average expression\ndensity\nVariable\nsample_1\nsample_2\nsample_3\nsample_4\nsample_5\nsample_6\nsample_7\nsample_8\nsample_9\nsample_10\n10 samples expression \n0.0\n0.2\n0.4\n0.6\n01234\nLog10 average expression\ndensity\nGenes with H3K4me1 − Rep A\n0.0\n0.2\n0.4\n0.6\n01234\nLog10 average expression\ndensity\nGenes with H3K4me1 − Rep B\n0.0\n0.2\n0.4\n0.6\n01234\nLog10 average expression\ndensity\nGenes with H3K4me3 − Rep A \n0.0\n0.2\n0.4\n0.6\n01234\nLog10 average expression\ndensity\nGenes with H3K4me3 − Rep B\nReplicate_A\n780.9Ten Samples of Non Variable Genes\nwith H3K4me1− Averaged\nReplicate_B\n1121Variable Genes\nwith H3K4me1\n1005.4\n1349\nReplicate_A\n1370.5Ten Samples of Non Variable Genes\nwith H3K4me3− Averaged\nReplicate_B\n1233Variable Genes\nwith H3K4me3\n1630.1\n1505\nsample_1\nsample_2\nsample_3\nsample_4\nsample_5\nsample_6\nsample_7\nsample_8\nsample_9\nsample_10\nVariable\nsample_1\nsample_2\nsample_3\nsample_4\nsample_5\nsample_6\nsample_7\nsample_8\nsample_9\nsample_10\nVariable\nsample_1\nsample_2\nsample_3\nsample_4\nsample_5\nsample_6\nsample_7\nsample_8\nsample_9\nsample_10\nVariable\nsample_1\nsample_2\nsample_3\nsample_4\nsample_5\nsample_6\nsample_7\nsample_8\nsample_9\nsample_10\nVariable\nA. B.\nC. D.\nE. F.\nG. H.\nSupplementary figure 4\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nSupplementary figure 5\n−1\n0\n1\n2\n0 1 2 3 4\nLog10\u0003avgeragH\u0003expression\nLog10\u0003−\u00032EVHrYHG\u0003vDULDELOLW\\\u0003\u000b&92\f\u0003\u0012\n([SHFWHG\u0003vDULDELOLW\\\u0003\u000b&92) 0.05\n0.01\n0.001\n0.0001\nFDR\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nXhoI\nBglII\nactin15 / gtaU / hspF-2 promoter\nactin15 promoter\nNgoMIV\nRFP\nGFP\nNgoMIV\nactin8\nterminator\nactin8\nterminator\nSupplementary figure 6\nA. B.\n−1\n0\n1\n2\n0 1 2 3 4\nLog10 avgerage expression\nLog10 observed variability (CV2) /\nexpected variabilty (CV2)\nDDB_G0280853 - gtaU\nDDB_G0273561 - hspF-2\nDDB_G0272520 - actin15\n0\n25\n50\n75\n2345\nact::RFP fluoresence\nDensity\n0\n30\n60\n90\n120\n01234\ngtaU::RFP fluoresence\nDensity\n0\n25\n50\n75\n024\nhspf2::RFP fluoresence\nDensity\nWT\nSet1-\nWT\nSet1-\nWT\nSet1-\ni ii iii\nC.\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n8 hrs\n15 hrs\n22 hrs\n29 hrs\nAx4 Set1\n-\n1.5 hrs\n8.5 hrs\n9.5 hrs\n2.5 hrs\n3.5 hrs 4.5 hrs 5.5 hrs\n6.5 hrs 7.5 hrs 8.5 hrs\n9.5 hrs 10.5 hrs 11.5 hrs\n12.5 hrs 13.5 hrs 14.5 hrs\n15.5 hrs 16.5 hrs 17.5 hrs\nAx4 Set1-\n0.5 hrs\nSupplementary figure 7\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nA.\nax4set1-\nax4 act-RFP set1- act-RFP\nB.\nSupplementary figure 8\nstalk (ecmA) \nupper cupO (ecmO) \nspore (pspA) \nbasal disc (ecmB) \n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nH3K4me1\nH3K4me2\nH3K4me3\nћactin\nWT set1-FLAGset1 \n-\n17\n17\n17\n42\nSupplementary figure 9\nA. B.\nAx4 Set1-\nAx4::act-RFP\nAx4\nSet1-FLAG \n0\n0.5\n1\n1.5\n2\n H3K4me1 H3K4me2 H3K4me3\nWild type\nSet1-\nSet1-FLAG\nRelative level compared \nto wild type\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\n−20\n−10\n0\n10\n20\n−10 0 10 20\ntSNE_1\ntSNE_2\n0.5\n1.0\n1.5\n2.0\n2.5\nWT Fluidigm\n0\n5\n10\n15\n0.5 1.0 1.5 2.0 2.5\nM/S:G2 − normalised ratio\ncount\nG2\nMS\nWT Fluidigm\n0\n10\n20\n30\n0.4 0.8 1.2 1.6\nM/S:G2 − normalised ratio\ncount\nG2\nMS\nWT icell8\n0\n10\n20\n0.8 1.2 1.6\nM/S:G2 − normalised ratio\ncount\nG2\nMS\nSet1- icell8\nM/S:G2 \nnormalised \nratio\nA. i. ii.\nii.B. i.\nSupplementary figure 10\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nBSRGFP\necmA / ecmB\nGFP\n+ cre recombinase\nA\nB Ax4 ecmA-GFP-KI\n+ cre recombinase\nAx4 ecmB-GFP-KI\n+ cre recombinase\nloxP site loxP site\nset1 knock out \nSet1 ecmA-GFP-KI\n+ cre recombinase\nSet1 ecmB-GFP-KI\n+ cre recombinase\nSupplementary figure 11\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint \n\nA. B.\nProportion of cell cycle \ncompleted\nStalk proportion\nCell cycle bias\nRelative change in \nstalk proportion\nσ σ\n0 0½ 1 -¾ +¾-⅝ +⅝\nSupplementary figure 12\n.CC-BY 4.0 International licensemade available under a \n(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is \nThe copyright holder for this preprintthis version posted January 8, 2025. ; https://doi.org/10.1101/2025.01.08.631908doi: bioRxiv preprint","source_license":"CC-BY-4.0","license_restricted":false}