A Practical Guide to Compositional Data Analysis in Tissue Stereology and Blood Cell Profile

Research Article

Diogo B. Provete, Marcos R. Severgnini, Carlos E. Fernandes, Lilian Franco-Belussi

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9590298/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1.

Abstract

Histologic quantification and tissue stereology inherently generate compositional data, typically expressed as proportions, percentages, or volumetric densities constrained to a constant sum. Applying standard statistical methods directly to these closed data disregards their geometric constraints, which can generate spurious correlations and misleading biological inferences. Here, we provide a practical guide to help morphologists transition from conventional statistics to Compositional Data Analysis (CoDA).
The mathematical framework for mapping data from the restricted Aitchison simplex to an unconstrained Euclidean space is detailed via log-ratio transformations (clr and ilr). Using splenic structural density, leukocyte profiles, renal architecture, and erythrocyte nuclear abnormalities as case studies from the tropical frog Leptodactylus podicipinus sampled along an urbanization gradient, we contrast the classical approach with CoDA. We demonstrate how standard Euclidean analysis can generate false positives and mask physiological trade-offs through sign reversals, whereas CoDA and its multivariate interpretation based on biplots can reveal the magnitude of tissue variances and ecophysiological associations that would otherwise remain hidden. To facilitate adoption and reduce the barriers posed by programming proficiency, the guide is accompanied by structured code and an interactive open-source, web-based graphical interface (CoDa Stereo). We argue that adopting CoDA for tissue stereology and related quantitative pathology workflows is important for ensuring analytical integrity and reproducibility.

Keywords: Closure effect, Log-ratio transformations, Quantitative microscopy, Simplex, MANOVA, Shiny application

Introduction

The quantification of 3D microstructures from 2D histological sections is a cornerstone of modern quantitative morphology. Since the formalization of the term "stereology" by Hans Elias in 1961, the discipline has transitioned from a purely geometric pursuit to a rigorous mathematical framework indispensable across life and material sciences (Underwood, 1969; Elias & Hyde, 1980).
This technique allows the reconstruction of the volumetric reality of tissues from opaque, planar slices (Weibel et al., 1966; West, 2012; Geuna & Herrera-Rincon, 2015) by counting intersections on a structured point grid or by planimetric delimitation of digital areas, permitting researchers to estimate vital parameters, such as volume (V_V), surface (S_V), and numerical (N_V) densities. Stereology is traditionally celebrated for its design-based approach, which provides unbiased and efficient estimates without requiring assumptions about the shape or orientation of the objects being measured (Mayhew, 1991; West, 2012).

However, a fundamental statistical vulnerability remains pervasive in the interpretation of these data. Stereological and cell count profiles are, by definition, parts of a whole, whether expressed as volume fractions, area percentages, or relative frequencies. Mathematically, the simultaneous quantification of these proportions from the same reference space produces a strictly compositional dataset. Consequently, these data carry only relative information, in which the fractions are interrelated and subject to a constant-sum constraint (typically summing to 1 or 100%). Therefore, an increase in the fraction of one tissue compartment must be accompanied by a decrease in the fraction of at least one other compartment to maintain the total volume (Russ & Dehoff, 2000). During any morphological change, such as the expansion of a cellular compartment due to hypertrophy or inflammatory recruitment in a matrix, the proportion of adjacent compartments is arithmetically compressed in the sample space, characterizing the classic constant-sum problem. This constant-sum constraint, known as the "closure effect," dictates that stereological data reside in a restricted sample space, called the simplex, rather than the unconstrained D-dimensional Euclidean space (Aitchison, 1982, 1986).
Applying traditional parametric or non-parametric statistics directly to these percentages can lead to spurious correlations and false associations, potentially resulting in erroneous biological conclusions, as Pearson (1897) warned more than a century ago. This practice ignores the inherent negative correlational structure of closed data and violates fundamental assumptions of independence and multivariate normality, so that the geometry of the data itself may be mistaken for physiological change (Tilves et al., 2021; Pawlowsky-Glahn et al., 2015). Historically, the literature has attempted to circumvent these assumption violations through data transformations, with the arcsine square root transformation being the most common. However, such transformations are insufficient, as they only attenuate the behavior of variances near the bounds of the scale (0 and 100%) and fail to address the covariance structure imposed by the geometric constraint. The direct application of standard Euclidean statistics, such as Pearson correlations, ordinary least squares, and other linear models, remains the norm in morphological research.

A requirement for any morphological analysis is subcompositional coherence. This principle demands that inferences made about a subset of tissue components (a subcomposition) should remain consistent regardless of whether other components are included in the analysis (Aitchison, 1986). Traditional methods based on raw proportions fail this test, as the inclusion of a highly variable component (e.g., spleen white pulp in inflammatory states) will artificially alter the relative percentages of all other healthy tissues, creating an illusion of generalized atrophy (Aitchison & Greenacre, 2002; Greenacre, 2021). To resolve these geometric vulnerabilities, Compositional Data Analysis (CoDA) proposes a paradigm shift toward log-ratio transformations (Aitchison, 1986).
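The subcompositional coherence requirement described above can be checked with a few lines of arithmetic. The sketch below is written in Python for illustration only (the workflows accompanying this guide are in R); the point counts and the helper name `close` are hypothetical and are not taken from the study data. It shows that dropping one part changes every raw proportion but leaves the ratio between the remaining parts untouched.

```python
def close(parts):
    """Rescale positive parts so that they sum to 1 (the closure operation)."""
    total = sum(parts.values())
    return {name: value / total for name, value in parts.items()}

# Hypothetical point counts for one spleen image (illustrative values only).
counts = {"red_pulp": 60, "white_pulp": 30, "mmc": 6, "vessels": 24}

# Full four-part composition.
full = close(counts)

# Subcomposition: drop the vessels and re-close the remaining three parts.
sub = close({name: value for name, value in counts.items() if name != "vessels"})

# The raw proportion of white pulp depends on which parts are included.
print(full["white_pulp"], sub["white_pulp"])

# The white:red pulp ratio is the same in both cases.
ratio_full = full["white_pulp"] / full["red_pulp"]
ratio_sub = sub["white_pulp"] / sub["red_pulp"]
print(ratio_full, ratio_sub)
```

Any statistic computed on the raw proportions therefore depends on an arbitrary choice of which compartments were counted, whereas a statistic computed on ratios does not.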
The central principle of CoDA posits that the biologically relevant information resides in the relative ratios between tissue components, not in their isolated fractions. By mapping data from the simplex into an orthonormal Euclidean space using the Isometric Log-Ratio (ilr) transformation, researchers can ensure that statistical models are both mathematically valid and biologically interpretable (Egozcue et al., 2003; Pawlowsky-Glahn & Egozcue, 2011). Within this framework, the focus shifts from individual parts to "balances" between groups of components. Focusing on the ratio between parts (e.g., white:red pulp in the spleen) ensures subcompositional coherence: the geometric relationship between two compartments remains identical regardless of the inclusion or exclusion of other tissues in the total count. Applying the logarithm linearizes these ratios, enabling the transposition of data from the restricted space to the real and continuous Euclidean space.

For practical application in stereology, the workflow requires two fundamental transformations: the centered log-ratio (clr) for multivariate exploration and the isometric log-ratio (ilr) for modeling and inference (Aitchison, 1986). The clr transformation expresses each tissue component relative to the geometric mean of the entire morphometric composition of the sample. This is the required transformation for multivariate analysis based on covariance (Pawlowsky-Glahn et al., 2015), forming the mathematical basis for biplots. Biplots built on clr coordinates facilitate understanding systemic remodeling, enabling pathologists and morphologists to distinguish between localized cellular recruitment and generalized parenchymal degeneration (Pawlowsky-Glahn et al., 2015; Filzmoser et al., 2018).
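For reference, the two transformations named above can be written out explicitly. These are the standard textbook definitions rather than formulas quoted from this guide; the symbols g(x) for the geometric mean, V for a matrix whose columns form an orthonormal basis, and r and s for the number of parts in the two groups of a balance are notation introduced here.

```latex
% A composition with D strictly positive parts and its geometric mean
\mathbf{x} = (x_1, \dots, x_D), \qquad
g(\mathbf{x}) = \Bigl( \prod_{i=1}^{D} x_i \Bigr)^{1/D}

% Centered log-ratio: D coordinates that sum to zero
\operatorname{clr}(\mathbf{x}) =
  \Bigl( \ln\frac{x_1}{g(\mathbf{x})}, \dots, \ln\frac{x_D}{g(\mathbf{x})} \Bigr),
\qquad \sum_{i=1}^{D} \operatorname{clr}(\mathbf{x})_i = 0

% Isometric log-ratio: D - 1 coordinates in an orthonormal basis
\operatorname{ilr}(\mathbf{x}) = \mathbf{V}^{\top} \operatorname{clr}(\mathbf{x}),
\qquad \mathbf{V}^{\top}\mathbf{V} = \mathbf{I}_{D-1},
\qquad \mathbf{V}^{\top}\mathbf{1}_{D} = \mathbf{0}

% A balance between a group of r parts (x_+) and a group of s parts (x_-)
b = \sqrt{\frac{r\,s}{r+s}} \; \ln\frac{g(\mathbf{x}_{+})}{g(\mathbf{x}_{-})}
```

Multiplying every part by the same constant leaves all of these quantities unchanged, which is why it does not matter whether the data are entered as counts, proportions, or percentages.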
The main advantage of clr is the preservation of the dimensionality of the original system (a composition of D parts results in D transformed variables), allowing direct correlation with the original anatomical compartments (Aitchison, 1986). However, the clr coordinates of each sample sum to zero by definition. This perfect collinearity results in a singular covariance matrix, which makes the direct use of clr in linear models and traditional hypothesis tests unfeasible (Greenacre, 2021).

To ensure the validity of statistical inferences and linear modeling (e.g., t-tests, ANOVA, MANOVA), the ilr transformation is necessary. The ilr maps the D-part simplex isometrically onto a (D − 1)-dimensional Euclidean space with an orthonormal basis (Pawlowsky-Glahn et al., 2015). In clinical and histological practice, the contrast matrix V can be constructed via sequential binary partitioning (balances). This allows the researcher to create transformed variables with direct biological meaning. For example, a balance can be established that contrasts the entire stroma with the structural parenchyma, or healthy tissue with pathological infiltrate. The Euclidean variables generated by the ilr do not present mathematical redundancy (Filzmoser et al., 2018), ensuring that the morphometric dataset meets the assumptions required by conventional statistics and supports the reproducibility of the identified associations.

Despite the theoretical framework described above being well established in geochemistry, food science, and microbiome research (Greenacre, 2021), the integration of CoDA into the quantitative microscopy community is a recent development.
As recently highlighted in related fields of tissue analysis (Tilves et al., 2021), treating proportional data as if they belonged to an unconstrained Euclidean space is not a minor statistical oversight, but a fundamental geometric violation that can lead to spurious correlations and even sign reversals in biological trends (Aitchison, 1986; Brown, 2017). Despite the call for statistical rigor (Altunkaynak et al., 2012; Brown, 2017), CoDA has only recently begun to enter quantitative microscopy. The first application of this framework to analyze complex morphophysiological responses was recently demonstrated by Franco-Belussi et al. (2024), who utilized CoDA to resolve tissue-level shifts in frogs across environmental gradients. This work established that CoDA could reveal insights, such as systemic vascular remodeling and specific parenchymal trade-offs, that traditional statistics would have obscured. Currently, there is a lack of accessible frameworks that integrate the rigorous sampling of stereology with the simplicial geometry of CoDA, leaving researchers vulnerable to artifacts driven by the closure effect.

Here, we provide a practical guide for the integration of CoDA into tissue stereology and histomorphometry. Using empirical case studies of splenic density, leukocyte profiles, renal architecture, and erythrocyte nuclear abnormalities from the tropical frog Leptodactylus podicipinus sampled along an urbanization gradient, we demonstrate how the use of isometric balances and specialized graphical tools (variation matrices, ray biplots, ternary plots, and CoDa dendrograms) allows researchers to isolate true biological signals from mathematical artifacts. By providing reproducible R-based workflows and an interactive Shiny web application (CoDa Stereo), we aim to facilitate the adoption of CoDA as a standard for the quantification and interpretation of changes in tissue compartments.
Material and Methods

Study system and sampling design

Briefly, the empirical datasets utilized here encompass splenic and renal structural density, and hematological parameters (leukocyte differential counts and erythrocyte nuclear abnormalities) of the Pointedbelly Frog (Leptodactylus podicipinus). The Pointedbelly Frog is a medium-sized terrestrial frog (33–45 mm snout–vent length) widely distributed across South American lowlands (Frost, 2026). It is frequently found in anthropized habitats and is a suitable model for ecotoxicological studies (Franco-Belussi et al., 2024). A total of 45 adult males were collected from three sites along an urbanization gradient in central Brazil: (i) an ecological station in the Pantanal wetland (16°50′S, 57°30′W; n = 10), considered the pristine baseline; (ii) an anthropized rural area (20°52′59″S, 55°11′40″W; n = 18) adjacent to the city; and (iii) an urban site (20°28′51.6″S, 54°34′38.5″W; n = 17). All specimens were euthanized following approved protocols, and the spleen, kidneys, and blood were processed for histological and hematological analysis. A comprehensive description of the experimental design, euthanasia procedures, and histological processing is presented in Alves (2026). A full description of the dataset, along with further results and detailed biological interpretation, will be published elsewhere.

Stereological quantification

For splenic structural density, 10 transverse sections per animal (3 µm thick, H&E stained) were photographed at 100× magnification (RGB, 4096 × 3286 pixels). A 120-intersection grid was superimposed on three images per animal, and intersections falling on red pulp, white pulp, melano-macrophage centers (MMCs), and blood vessels (including sinusoids) were counted. The structural volume density for each compartment was computed following commonly used formulae (see Russ & Dehoff, 2000).
Sinusoids and blood vessels were pooled into a single "Total vessels" part prior to compositional analysis, yielding a four-part composition (D = 4): {Red pulp, White pulp, MMCs, Total vessels}. For renal structural density, PAS-stained sections were analyzed at 100× magnification, with five images per animal and a 120-intersection grid. The five-part composition (D = 5) comprised: {Proximal tubules, Distal tubules, Renal corpuscles, Collecting ducts, Interstitial space}. For leukocyte differential counts, 100 leukocytes per animal in a blood smear were classified into lymphocytes, monocytes, neutrophils, and eosinophils (May–Grünwald–Giemsa–Wright stain). Because basophils were absent in all specimens and eosinophils occurred at low and relatively constant frequencies across sites, neutrophils and eosinophils were amalgamated into a single "Granulocytes" part for the compositional analysis, yielding a three-part composition (D = 3): {Lymphocytes, Monocytes, Granulocytes}. This amalgamation is biologically motivated by the shared innate-immune function of the granulocyte lineage (Abbas et al., 2021; Mescher, 2021) and follows the recommendation to reduce dimensionality when parts carry redundant information (Pawlowsky-Glahn et al., 2015). For erythrocyte nuclear abnormalities, 1,000 erythrocytes per animal in the same blood smear were screened for abnormal nuclei (budded, lobulated, notched, reniform, binucleated, anucleated). The counts were aggregated into a two-part composition (D = 2): {Normal erythrocytes, Total abnormalities}.

The Compositional Nature of Stereological Data

All stereological and leukocyte profile data were treated as compositional data. A composition is a vector of D positive components that carry only relative information (Aitchison, 1986).
In tissue stereology, volume fractions (V_V) and cellular proportions are intrinsically constrained by a constant sum (e.g., 100% of the organ's volume), which restricts the data to a sample space known as the (D − 1)-simplex (Fig. 1). Because the simplex space is not Euclidean, standard multivariate techniques cannot be applied directly to raw percentages without inducing spurious correlations and subcompositional incoherence (Pearson, 1897; Pawlowsky-Glahn et al., 2015). To adhere to the principles of scale invariance and permutation invariance, all datasets were transformed into log-ratio coordinates before statistical inference (Fig. 1).

Data Transformation: Centered and Isometric Log-Ratios

To handle the multivariate structure of tissue compartments, we utilized two primary transformation workflows (Fig. 1):

Centered Log-Ratio (clr): Used primarily for exploratory analysis and biplot visualization. The clr transformation (Egozcue et al., 2003) normalizes each component by the geometric mean of the entire composition.

Isometric Log-Ratio (ilr): Used for hypothesis testing and linear modeling. The ilr transformation projects the D-part composition onto an orthonormal basis of D − 1 coordinates in Euclidean space, removing the mathematical redundancy inherent in closed data (Egozcue et al., 2003; Filzmoser et al., 2018).

For organs with complex functional hierarchies (i.e., spleen and kidney), the ilr coordinates were constructed via Sequential Binary Partitioning (SBP). This process involves the creation of a signary matrix (entries +1, −1, or 0), in which components are grouped into "balances" based on biological relevance (e.g., parenchyma vs. stroma, or lymphoid vs. hematopoietic tissues). Each balance represents a specific log-ratio contrast between groups of parts, providing a coordinate that is both mathematically robust and biologically interpretable (Pawlowsky-Glahn & Egozcue, 2011).
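To make the construction of balances from a signary matrix concrete, the following Python sketch computes ilr coordinates for a single four-part sample. It is an illustration of the general technique, not the code used in this study (which relies on the R package compositions). The function names, the example proportions, and the orientation of each balance (which group sits in the numerator) are assumptions made for the example; the partition itself mirrors the spleen scheme listed in the next subsection.

```python
import math

def geometric_mean(values):
    """Geometric mean of a list of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def clr(composition):
    """Centered log-ratio coordinates of one composition."""
    g = geometric_mean(composition)
    return [math.log(x / g) for x in composition]

def balance(composition, sign_row):
    """One ilr balance from one row of a signary matrix (+1, -1, 0)."""
    plus = [x for x, s in zip(composition, sign_row) if s > 0]
    minus = [x for x, s in zip(composition, sign_row) if s < 0]
    r, s = len(plus), len(minus)
    scale = math.sqrt(r * s / (r + s))
    return scale * math.log(geometric_mean(plus) / geometric_mean(minus))

def ilr_from_sbp(composition, sbp):
    """All D - 1 balances defined by a sequential binary partition."""
    return [balance(composition, row) for row in sbp]

# Column order of the parts (assumed for this example).
PARTS = ["Red pulp", "White pulp", "MMCs", "Total vessels"]

# Signary matrix: one row per balance, one column per part.
SBP = [
    [-1, -1, +1, -1],  # Balance 1: MMCs vs. all other parts
    [-1, -1, 0, +1],   # Balance 2: Total vessels vs. the two pulps
    [+1, -1, 0, 0],    # Balance 3: Red pulp vs. White pulp
]

# Hypothetical volume fractions for one animal (not study data).
x = [0.50, 0.30, 0.05, 0.15]

coords = ilr_from_sbp(x, SBP)

# Scale invariance: percentages give the same coordinates as proportions.
scaled = ilr_from_sbp([100 * v for v in x], SBP)

for row, value in zip(SBP, coords):
    print(row, round(value, 4))
```

Because a valid sequential binary partition defines an orthonormal basis, the squared length of the ilr vector equals that of the clr vector for the same sample, which is a convenient sanity check when writing a partition by hand.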
Notably, the choice of SBP is not unique; different partitioning schemes will produce different individual balance coordinates, although the overall multivariate test (MANOVA) is invariant to this choice (Filzmoser et al., 2018).

Sequential binary partitions (SBP) for each case study

Spleen (D = 4; 3 balances).
Balance 1: MMCs vs. {Red pulp + White pulp + Total vessels}; isolates the inflammatory signal (Abbas et al., 2021).
Balance 2: Total vessels vs. {Red pulp + White pulp}; captures vascular remodeling independently of parenchymal composition.
Balance 3: Red pulp vs. White pulp; contrasts hematopoietic and lymphoid functions.

Leukocytes (D = 3; 2 balances).
Balance 1: Lymphocytes vs. Granulocytes; the compositional analogue of the L:G stress ratio.
Balance 2: Monocytes vs. {Lymphocytes + Granulocytes}; isolates the monocyte recruitment signal (Abbas et al., 2021).

Kidney (D = 5; 4 balances). A default ilr basis was used (Egozcue et al., 2003), as no single physiologically motivated binary partition covers all five renal compartments. Global inference via MANOVA is invariant to the choice of basis.

Erythrocytes (D = 2; 1 balance). A single ilr coordinate contrasting Normal vs. Abnormal erythrocytes.

Handling Zeros in Rare Morphological Events

In the erythrocyte abnormality dataset, "rounded zeros" (count zeros) occurred in animals from the ecological station, in which no abnormal cells were detected among the 1,000 screened. Following Martín-Fernández et al. (2003), a Count Zero Multiplicative (CZM) replacement was applied using a Bayesian Dirichlet prior via the zCompositions package (Palarea-Albaladejo & Martín-Fernández, 2015). This imputation assigns a small pseudocount to the zero compartment, while proportionally adjusting the remaining parts to preserve the unit-sum constraint. For the splenic dataset, MMC counts of zero occurred in some specimens from the ecological station.
Because these represent rounded zeros (MMCs exist but fall below the detection threshold of the 120-point grid), we applied the same CZM imputation prior to ilr transformation. It is important to note that the choice of imputation method introduces a degree of analyst subjectivity. For count data (point-counting stereology), CZM is the recommended approach; for percentage data or data with below-detection-limit values, methods such as multRepl or lrEM may be more appropriate (Palarea-Albaladejo & Martín-Fernández, 2015). The CoDa Stereo application offers all three methods.

Statistical inference

All statistical analyses were performed in R v. 4.5.2 (R Core Team, 2025). Compositional transformations and visualizations, including clr-PCA biplots and CoDa dendrograms, were implemented using the compositions (van den Boogaart et al., 2025) and zCompositions (Palarea-Albaladejo & Martín-Fernández, 2015) packages. To evaluate the effect of the environment on tissue architecture, we used permutational multivariate analysis of variance (permutational MANOVA) implemented in the RRPP package (Collyer & Adams, 2018). This approach allows for robust linear modeling of ilr coordinates, accommodating the non-normal distributions often found in biological count data while providing exact P-values through randomized residual permutations (999 iterations). A Quarto dynamic document, along with data, is available at a Zenodo repository (Provete et al., 2026).

Results

Case Study 1: Systemic Splenic Remodeling Across an Urbanization Gradient

The application of CoDA to empirical splenic structural density data reveals a biological reality more complex than predicted by standard models of tissue stability.
While conventional statistics frequently attribute the reduction of healthy tissue compartments solely to the "closure effect" caused by the expansion of a single component, such as MMCs in response to inflammation, our results suggest that the environment induces a multifaceted reorganization of the splenic architecture. Because volume fraction data inherently reside within the simplex and are strictly codependent (Aitchison, 1986; Russ & Dehoff, 2000), we applied a permutational MANOVA to the ilr coordinates. Because significance is assessed by permutation, the result is robust to deviations from multivariate normality, a common issue in such data (Egozcue et al., 2003; Franco-Belussi et al., 2024). Decomposing the global variation via the SBP described in Material and Methods allowed us to isolate three distinct biological processes (see Provete et al., 2026):

Balance 1 (MMCs vs. Remaining Parenchyma): The balance isolating MMCs confirmed that environmental stress in urban areas induces a disproportionate increase in these centers, effectively decoupling the inflammatory signal from the organ's total functional mass and supporting the use of MMCs as biomarkers of chronic stress (Fig. 2).

Balance 2 (Vessels vs. Pulps): This balance revealed that a substantial proportion of the variation in the relationship between the vascular system and the splenic pulps is explained by sampling site. This suggests that the environment is not merely triggering a localized immune response, but may be forcing vascular remodeling, potentially associated with venous congestion or ischemia, that classical statistics would fail to distinguish from simple compressive atrophy.

Balance 3 (Red vs. White Pulp): The alteration in the log-ratio between the red and white pulps indicates that environmental stress disrupts the baseline homeostasis between hematopoietic function and the lymphoid immune response.
The distinction between these concurrent processes was visualized through a ray biplot (clr-PCA) bounded by convex hulls. This visualization delineates the morphological space occupied by specimens from each environment, without the restrictive distributional assumptions inherent to traditional confidence ellipses. In the biplot, the spatial divergence between the rays of the Red and White Pulps, combined with the substantial length of the Total Vessels ray, illustrates that the ratios among these functional tissues are not constant. By mapping stereological data from the simplex into Euclidean space via ilr coordinates (Filzmoser et al., 2018), we show that urbanization is associated with a disruption in parenchymal integrity (Dulak & Płytycz, 1989; Christin et al., 2003). This level of analytical resolution demonstrates the value of CoDA for modern quantitative pathology, allowing researchers to disentangle localized adaptive responses from broader architectural changes.

Case Study 2: Leukocyte Profile Shifts and the Aitchison Variation Matrix

The analysis of leukocyte profiles represents a classic challenge in eco-physiology. In amphibians, the ratio of lymphocytes to neutrophils (or granulocytes) is widely used as a proxy for chronic stress mediated by glucocorticoids. However, treating these counts as independent variables or simple ratios fails to account for the inherent codependence of the immune cell repertoire (Abbas et al., 2021). To resolve this, we applied the CoDA framework (Fig. 3) to the three-part leukocyte composition {Lymphocytes, Monocytes, Granulocytes}. Raw proportions are restricted to the simplex and lack the algebraic properties required for standard linear modeling (Aitchison, 1986).
Conventional ratios (e.g., lymphocytes:granulocytes, L:G) are asymmetric and non-normal: starting from a ratio of 1, a doubling of lymphocytes yields 2.0, while a halving yields 0.5, creating a skewed distribution that compromises ordinary least squares regression (Pawlowsky-Glahn et al., 2015). On the logarithmic scale the same two changes become +ln 2 and −ln 2, which are symmetric around zero. By transforming the amalgamated granulocyte counts, together with the lymphocyte and monocyte counts, into ilr coordinates, we mapped the immune profile into an unconstrained Euclidean space (Fig. 3). This transformation ensures that the resulting "Leukocyte Balances" are symmetric, scale-invariant, and subcompositionally coherent (Egozcue et al., 2003; Filzmoser et al., 2018).

The MANOVA on the ilr coordinates revealed a significant effect of the environment on the global leukocyte profile. The use of targeted balances (see Provete et al., 2026) allowed us to identify the specific cellular drivers of this response:

Balance 1 (Lymphocytes vs. Granulocytes): This balance, the compositional equivalent of the traditional L:G ratio, showed a significant downward shift in urban environments. In the Aitchison geometry, a value of 0 means that the two cell types are equally abundant (Fig. 4). The marked move toward negative values in urban specimens provides a rigorous confirmation of granulocyte dominance, a hallmark of chronic environmental stress (Franco-Belussi et al., 2024).

Balance 2 (Monocytes vs. [Lymphocytes + Granulocytes]): This balance isolated the role of monocytes relative to the rest of the profile. The significant increase in the monocyte-driven coordinate in disturbed sites suggests a transition from a baseline surveillance state to an active innate immune response (Rollins-Smith, 2017), likely associated with increased pathogen or pollutant load in urbanized habitats.

The ternary plot (Fig. 3) illustrates why the CoDA approach is important for reproducible quantitative pathology.
By analyzing the "balance" rather than the "fraction", we show that the observed lymphopenia in urban frogs is not a standalone reduction, but a proportional trade-off against a proliferating granulocyte lineage (see also Provete et al., 2026 for a visualization using a CoDa dendrogram). The dynamic document (Provete et al., 2026) includes CoDa dendrograms per site illustrating the balance structure, pairwise Euclidean distances in ilr space, and residual diagnostics for all four case studies. As cautioned by Pearson (1897) and reiterated in modern compositional theory (Tilves et al., 2021), only by respecting the geometry of the simplex can we accurately validate leukocyte profiles as reliable histological biomarkers of environmental health.

The Aitchison Variation Matrix

In classical eco-immunology, researchers frequently attempt to identify interactions between leukocyte lineages (Davis et al., 2008) using standard Pearson or Spearman correlation matrices. However, because leukocyte counts are subject to a constant-sum constraint, any increase in the proportion of one cell type mathematically forces a decrease in another. This closure effect generates spurious negative correlations (Pearson, 1897), rendering standard correlation matrices biologically uninterpretable within the simplex. To accurately quantify the dynamic relationships between immune cells, we used the Aitchison Variation Matrix (Pawlowsky-Glahn et al., 2015). Instead of measuring correlations between raw proportions, the variation matrix computes the variance of the log-ratio between all possible pairs of components. A variance close to zero indicates that two components maintain a stable proportional relationship (near-perfect proportionality), whereas a high variance indicates an active biological trade-off or independent fluctuation. The highest variance within the leukocyte profile occurred at the intersection of Lymphocytes and Granulocytes (Fig. 4).
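The variation matrix described above is simple enough to compute by hand. The Python sketch below is a minimal illustration under stated assumptions: the five leukocyte profiles are invented for the example (they are not the study data), the function names are hypothetical, and the sample variance with an n − 1 denominator is used. The study itself used the R package compositions.

```python
import math
from itertools import combinations

def log_ratio_variance(a, b):
    """Sample variance of ln(a_i / b_i) across samples."""
    logs = [math.log(x / y) for x, y in zip(a, b)]
    mean = sum(logs) / len(logs)
    return sum((v - mean) ** 2 for v in logs) / (len(logs) - 1)

def variation_matrix(data):
    """Symmetric matrix of pairwise log-ratio variances.

    data maps each part name to its values across samples.
    """
    names = list(data)
    matrix = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        v = log_ratio_variance(data[a], data[b])
        matrix[a][b] = v
        matrix[b][a] = v
    return matrix

# Hypothetical differential counts (per 100 leukocytes) for five animals.
leukocytes = {
    "Lymphocytes": [70, 62, 55, 48, 40],
    "Monocytes": [5, 6, 5, 7, 6],
    "Granulocytes": [25, 32, 40, 45, 54],
}

T = variation_matrix(leukocytes)

# Pair of parts whose log-ratio varies the most across animals.
names = list(leukocytes)
most_variable = max(combinations(names, 2), key=lambda p: T[p[0]][p[1]])

for a, b in combinations(names, 2):
    print(a, "vs", b, round(T[a][b], 3))
print("largest log-ratio variance:", most_variable)
```

Unlike a correlation matrix, every entry here depends only on the two parts involved, so adding or removing a third cell type leaves the remaining entries unchanged.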
This high variance corroborates the hypothesis that these two lineages do not merely fluctuate independently, but are engaged in an active trade-off for the constrained spatial and energetic resources of the immune repertoire under environmental stress (Davis et al., 2008; Rollins-Smith, 2017). By replacing the standard correlation matrix with the compositional variation matrix, we provide an unbiased, mathematically coherent tool to validate the L:G trade-off as a responsive biomarker in eco-physiological biomonitoring.

Case Study 3: Spurious Correlations and Sign Reversals in Eco-Immunology and Renal Architecture

3a. False Positive: Granulocyte inflation driven by lymphocyte decline

To illustrate the insidious nature of the closure effect, we analyzed the leukocyte profiles (lymphocytes, monocytes, and granulocytes) across the urbanization gradient. While Case Study 2 demonstrated the value of CoDA for identifying real trade-offs, this analysis provides a textbook example of a Type I error (false positive) driven entirely by the constant-sum constraint. When using the classical Euclidean approach on raw cellular percentages, the linear model indicated a significant recruitment of granulocytes in the urban environment (estimated increase of +10.2%, P < 0.05), with the 95% confidence interval excluding zero (Fig. 5, left panel). A conventional ecophysiological interpretation would erroneously conclude that environmental stress stimulates the innate immune system, triggering granulocytic proliferation.

However, this pattern is a mathematical illusion. In a three-part compositional system bounded to 100%, the parts are not free to vary independently. The raw data reveal that urban frogs experience a severe decline in lymphocytes, coupled with an increase in monocytes. Because the total must sum to 100%, the reduction in the dominant fraction pushes the granulocyte fraction upward, regardless of its actual biological density.
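The mechanism behind this false positive can be reproduced with a toy calculation. The Python sketch below uses invented absolute cell densities (not the study data) in which granulocytes are held exactly constant while lymphocytes fall and monocytes rise; the dictionary names and helper functions are hypothetical. It compares the change in the granulocyte percentage with the change in its clr coordinate.

```python
import math

def percentages(absolute):
    """Close absolute amounts to percentages summing to 100."""
    total = sum(absolute.values())
    return {name: 100 * value / total for name, value in absolute.items()}

def clr_of(part, composition):
    """clr coordinate of one part: log of the part over the geometric mean."""
    mean_log = sum(math.log(v) for v in composition.values()) / len(composition)
    return math.log(composition[part]) - mean_log

# Hypothetical absolute densities (cells per microliter); illustrative only.
reference = {"Lymphocytes": 7000, "Monocytes": 500, "Granulocytes": 2500}
urban = {"Lymphocytes": 4000, "Monocytes": 800, "Granulocytes": 2500}

ref_pct = percentages(reference)
urb_pct = percentages(urban)

# Granulocyte percentage rises although the absolute density is unchanged.
pct_shift = urb_pct["Granulocytes"] - ref_pct["Granulocytes"]

# The clr coordinate (scale-invariant, so percentages can be used) barely moves.
clr_shift = clr_of("Granulocytes", urb_pct) - clr_of("Granulocytes", ref_pct)

print("shift in percentage points:", round(pct_shift, 2))
print("shift in clr coordinate:", round(clr_shift, 3))
```

In this constructed example the percentage rises by roughly nine points while the clr coordinate changes very little. Note that clr expresses a part relative to the geometric mean of all parts, so it is not a measure of absolute density either; the values were chosen so that the opposing lymphocyte and monocyte changes nearly cancel in that mean.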
When the data are transposed to the unconstrained Aitchison geometry via the clr transformation, the true biological signal is isolated. The clr coordinate reveals that the effect size collapses toward zero (Fig. 5, right panel), with the 95% confidence interval crossing the zero threshold. The granulocyte lineage remains structurally stable; the classical approach was deceived by the redistribution of the other cell types within the closed total.

3b. Sign Reversal: Artificial atrophy of the distal tubule

To demonstrate how the closure effect can not only generate false significance, but also completely invert the biological signal, we analyzed the renal structural density across the urbanization gradient. When analyzed through classical Euclidean statistics, the relative volume of the distal tubule appeared to exhibit a negative trend in urban frogs compared to the pristine baseline (Fig. 6, left panel). A traditional interpretation would suggest that the distal tubule is undergoing mild atrophy. However, this negative trajectory is an artifact driven by the closure effect: the pathological expansion of the interstitial space mechanically compresses the raw percentages of all other tissues. When transposed to the Aitchison geometry via the clr transformation, the clr coordinate for the distal tubule revealed a positive shift in the effect size (Fig. 6, right panel). Although the 95% confidence intervals for both models cross zero, indicating high variance and a lack of statistical significance, the change from a negative to a positive estimate is nonetheless important. Furthermore, the difference in the models' explanatory power (R²) highlights how raw percentages introduce geometric noise (see Provete et al., 2026 for detailed results). This case study demonstrates that relying on raw fractions in cell density analysis not only obscures the magnitude of physiological responses, but can also invert the direction of the biological signal.
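A sign reversal of this kind is easy to construct numerically. In the Python sketch below, the two five-part renal compositions are hypothetical values chosen to mimic an expanding interstitial space; they are not estimates from this study, and the variable and function names are illustrative. The distal tubule fraction falls slightly, yet its clr coordinate rises because the other parenchymal compartments shrink proportionally more.

```python
import math

PARTS = [
    "Proximal tubules",
    "Distal tubules",
    "Renal corpuscles",
    "Collecting ducts",
    "Interstitial space",
]

def clr(composition):
    """Centered log-ratio coordinates of one composition."""
    mean_log = sum(math.log(x) for x in composition) / len(composition)
    return [math.log(x) - mean_log for x in composition]

# Hypothetical volume fractions (each vector sums to 1); illustrative only.
pristine = [0.40, 0.20, 0.10, 0.10, 0.20]
urban = [0.32, 0.19, 0.06, 0.06, 0.37]

distal = PARTS.index("Distal tubules")

# Change on the raw scale: negative (apparent atrophy).
raw_change = urban[distal] - pristine[distal]

# Change on the clr scale: positive (relative gain against the geometric mean).
clr_change = clr(urban)[distal] - clr(pristine)[distal]

print("raw change:", round(raw_change, 3))
print("clr change:", round(clr_change, 3))
```

The raw fraction answers "what share of the section is distal tubule?", while the clr coordinate answers "how does the distal tubule fare relative to the typical compartment?". The two questions can have opposite answers when one compartment expands strongly.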
Publishing spurious correlations generated by raw fractions (Tilves et al., 2021) can bias future meta-analyses (Marks-Anglin & Chen, 2020), confound the identification of reliable biomarkers of environmental impact, and subvert the mechanistic understanding of how organisms manage spatial and physiological trade-offs in altered environments.

Case Study 4: The "Zero Problem" and Rare Erythrocyte Abnormalities

The quantification of genotoxic or structural damage in circulating erythrocytes (such as nuclear buds, lobed nuclei, and notched cells) is a staple of environmental biomonitoring (Duan et al., 2026). However, these anomalies are exceedingly rare: in a typical count effort of 1,000 cells per individual, specimens from pristine environments frequently present with a count of exactly zero abnormal cells. Within the Euclidean framework, a zero count is usually modelled via non-parametric rank tests. In CoDA, however, log-ratio transformations are undefined for zero values, because ln(x) tends to −∞ as x approaches 0. This mathematical barrier, widely known as "The Zero Problem" (Aitchison, 1986; Filzmoser et al., 2018), must be carefully addressed by differentiating between structural zeros (the biological impossibility of a part existing) and rounded zeros (count zeros resulting from limited sampling effort).

Erythrocyte abnormalities in frogs represent count-based rounded zeros: the damage exists biologically at a baseline level, but falls below the detection threshold of the 1,000-cell probe. To resolve this without distorting the sample space, we applied Count Zero Multiplicative (CZM) replacement using a Bayesian Dirichlet prior (Martín-Fernández et al., 2003; Filzmoser et al., 2018), as described in Material and Methods. Following imputation, the two-part composition was mapped into a one-dimensional Euclidean space using a single ilr balance contrasting Normal against Abnormal cells.
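The two steps above (zero replacement, then a single balance) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in the study: it assumes the commonly described count-zero multiplicative rule in which a zero proportion is replaced by delta × threshold / n (with n the total count) and the non-zero proportions are rescaled to preserve the unit sum. The values delta = 0.65 and threshold = 0.5, the function names, and the example counts are all assumptions made for the example.

```python
import math


def czm_replace(counts, delta=0.65, threshold=0.5):
    """Replace count zeros multiplicatively; return proportions summing to 1.

    Assumed rule: each zero becomes delta * threshold / n (n = total count),
    and the non-zero proportions are shrunk by a common factor so that the
    composition still sums to 1.
    """
    n = sum(counts)
    zero_value = delta * threshold / n
    n_zeros = sum(1 for c in counts if c == 0)
    shrink = 1.0 - n_zeros * zero_value
    return [zero_value if c == 0 else (c / n) * shrink for c in counts]


def ilr_two_part(normal, abnormal):
    """Single ilr balance of a two-part composition: sqrt(1/2) * ln(normal / abnormal)."""
    return math.sqrt(0.5) * math.log(normal / abnormal)


# Hypothetical [normal, abnormal] erythrocyte counts out of 1,000 cells per frog.
frogs = {"station_frog": [1000, 0], "urban_frog": [988, 12]}

balances = {}
for name, counts in frogs.items():
    normal, abnormal = czm_replace(counts)
    balances[name] = ilr_two_part(normal, abnormal)
    print(f"{name}: abnormal proportion = {abnormal:.6f}, ilr balance = {balances[name]:.2f}")
```

In this toy example the zero count becomes a proportion of 0.000325, so the log-ratio is defined; the balance is about 5.68 for the zero-count frog and about 3.12 for the frog with 12 abnormal cells. A lower balance therefore indicates a larger relative share of abnormal erythrocytes.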
The permutational linear model (RRPP-MANOVA) applied to this coordinate revealed a significant erosion of erythrocyte integrity along the urbanization gradient (Fig. 7). While frogs from the ecological station maintained a high log-ratio (indicating overwhelming dominance of healthy erythrocytes), the urban population exhibited a significant shift towards lower values of the coordinate (see Provete et al., 2026 for full quantitative results). By applying CoDA to rare count data, we mitigate the heteroscedasticity that plagues traditional fractional analyses of rare events. The ilr stabilizes the variance of the abnormality index, showing that even under the severe mathematical constraints of near-zero frequencies, compositional geometry provides a sensitive biomarker of environmental genotoxicity.

Software Implementation: The CoDa Stereo Web Application

Despite the robustness of the Aitchison geometry, the widespread adoption of CoDA in quantitative microscopy has been hindered by a steep computational learning curve. To bridge this gap, we developed CoDa Stereo (v2.0), a fully interactive, open-source web application built in R using the Shiny framework (Chang et al., 2025) and structured as an R package using the golem framework (Fay et al., 2024). The application follows the PPDAC (Problem, Plan, Data, Analysis, Conclusion) investigative cycle (Wild & Pfannkuch, 1999) and incorporates the following analytical modules:

Data Upload and Parts Selection. The application supports CSV and Excel (.xlsx/.xls) files with automatic sheet selection. A smart parts selector allows users to specify which numeric columns represent the compositional parts, auto-excluding columns whose names match common ID patterns (e.g., Sample_ID, Animal_ID). Non-compositional covariates (body weight, biochemical assays) can be manually deselected. The ability to include continuous predictors in the multivariate model (MANCOVA) will be implemented in future versions of the program.

Data Quality and Missingness.
Integration with the naniar package (Tierney & Cook, 2023) provides vis_miss plots of missing data. Additionally, a custom three-state heatmap distinguishes observed values, structural zeros, and true missing values (NA), a distinction that standard missingness visualizations do not make. Missing values are handled via listwise deletion or geometric-mean imputation.

The Zero Problem Resolution. Three imputation methods are offered, selectable via the interface: Count Zero Multiplicative (CZM) replacement for count zeros, multiplicative replacement (multRepl) for rounded zeros (below detection limit), and log-ratio EM (lrEM) for rounded zeros when sample sizes are moderate to large (Palarea-Albaladejo & Martín-Fernández, 2015). Column-specific detection limits are computed automatically.

Compositional Visualization. Users can dynamically generate ternary plots to explore the empirical niche of three-part subcompositions within the simplex. When the composition has more than three parts, an amalgamation mode allows users to define three groups of parts, summing them into a subcomposition for ternary display. The ternary plot is implemented in pure ggplot2 (Wickham, 2016) to ensure cross-platform rendering compatibility. The tool also performs clr transformations to render Compositional Ray Biplots (clr-PCA) via FactoMineR (Lê et al., 2008) and factoextra (Kassambara & Mundt, 2020).

Robust Statistical Inference. Users can build factorial MANOVA models on ilr coordinates via RRPP::lm.rrpp, with dynamic selection of multiple grouping factors and a toggle for additive vs. interaction designs. The formula is previewed before fitting for pedagogical transparency. Pairwise post-hoc comparisons are available via RRPP::pairwise(), and residual diagnostics (residuals vs. fitted, PC rotation) are generated automatically.

Downloads. Ternary plots and biplots can be exported as vector PDF files.
Imputed data, ANOVA tables, and pairwise comparison tables are downloadable as CSV.

To facilitate training and reproducibility, CoDa Stereo includes a built-in simulated dataset that mimics the splenic structural density scenario discussed in Case Study 1. The source code and instructions for deployment, both local and on Posit Connect Cloud (https://sxzbf8.short.gy/CoDaStereo), are provided at https://github.com/diogoprov/CoDaStereo.

Limitations

Several limitations of the CoDA framework and the present study should be acknowledged. First, the choice of Sequential Binary Partition (SBP) for ilr construction is not unique: different partitioning schemes produce different individual balance coordinates, even though the global MANOVA test is invariant to this choice. Researchers should select biologically motivated partitions and report them transparently. Second, the imputation of zeros introduces a degree of analyst subjectivity, particularly regarding the distinction between structural zeros (a part that genuinely cannot exist) and rounded zeros (below detection threshold). In stereology and hematology, most zeros are rounded, but edge cases (e.g., a tissue type absent in a particular species) require careful consideration. Third, when the number of parts (D) is large relative to the sample size (n), log-ratio transformations can amplify noise in parts with very small values, potentially inflating variance in the ilr space. Fourth, the present case studies are based on a single species (L. podicipinus) and a specific urbanization gradient; the generalizability of the specific biological findings to other taxa and stressors remains to be tested, although the statistical framework is universal.

Conclusion

The transition from classical Euclidean geometry to the Aitchison simplex is an important step toward greater rigor in quantitative morphology (Aitchison, 1986; Pawlowsky-Glahn et al., 2015).
As demonstrated throughout our case studies, the reliance on raw percentages for stereological data can compromise inferential power through false positives and generate severe distortions, such as the inversion of biological signals under environmental stress. By anchoring the analysis in the principle of subcompositional coherence and using robust log-ratio transformations (clr and ilr), CoDA reveals the mathematical artifacts created by the closure effect (Pearson, 1897; Tilves et al., 2021), uncovering the structural trajectories and physiological trade-offs of biological tissues. The introduction of the CoDa Stereo web application addresses the computational barriers that have historically hindered the widespread adoption of this methodology among practitioners, offering an intuitive interface for the entire analytical cycle. We encourage the stereological community to adopt this geometric framework to ensure that microscopic quantifications accurately reflect biological reality.

Declarations

CONFLICT OF INTEREST STATEMENT

The authors have no competing interests to declare that are relevant to the content of this article.

AI DISCLOSURE STATEMENT

During the preparation of this manuscript, the authors used generative artificial intelligence (Claude, Anthropic; Gemini, Google) to support specific research, software development, and writing workflows. In accordance with the AIdIT framework (Drobniak et al., 2026) and the COPE guidelines (https://publicationethics.org/guidance/cope-position/authorship-and-ai-tools), the applications of AI are categorized as follows:

Human Oversight Statement

The authors maintained continuous, active oversight over all AI-assisted processes. All generated R scripts and the Shiny application were locally executed, independently verified, and validated against the raw empirical data to ensure mathematical correctness and functional integrity.
All AI-generated text was critically reviewed, extensively edited, and evaluated for conceptual accuracy. The authors assume full responsibility for the analyses, interpretations, and final content of this publication.

FUNDING STATEMENT

DBP receives a research fellowship (grant #83//027.032/2024) and LFB received a grant (#174/2023) from the Foundation to Support the Development of Education, Science, and Technology of the State of Mato Grosso do Sul (FUNDECT). CEF receives a research fellowship (#309358/2023-0) from the Brazilian Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq).

Author Contribution

DBP: Conceptualization (lead); Data curation (lead); Formal analysis (lead); Funding Acquisition (contributing); Investigation (contributing); Methodology (contributing); Project Administration (lead); Resources (contributing); Software (lead); Supervision (contributing); Validation (equal); Visualization (lead); Writing – original draft (lead); Writing – review and editing (equal).
MRS: Investigation (equal); Methodology (contributing); Writing – review and editing (equal).
LFB: Investigation (equal); Methodology (equal); Supervision (contributing); Validation (equal); Writing – review and editing (equal); Funding Acquisition (contributing).
CEF: Funding Acquisition (contributing); Investigation (equal); Methodology (equal); Resources (lead); Supervision (contributing); Validation (equal); Writing – review and editing (equal).

Acknowledgement

Clara F. B. Alves helped with data collection. ICMBio provided collecting permits (#63297-1 to LFB and #80075-1 to MRS). Animal handling and euthanasia followed the National Institutes of Health Guide for the Care and Use of Laboratory Animals and were approved by our university's IACUC (CEUA #997/2018; #1.203/2021).
Data Availability

All data and R code are available in the Zenodo repository at https://doi.org/10.5281/zenodo.19928375

References

Abbas AK, Lichtman AH, Pillai S (2021) Cellular and Molecular Immunology, 10th edn. Elsevier, Philadelphia
Aitchison J (1982) The statistical analysis of compositional data. J R Stat Soc Ser B 44:139–177
Aitchison J (1986) The statistical analysis of compositional data. Chapman and Hall, London
Aitchison J, Greenacre M (2002) Biplots of compositional data. Appl Stat 51:375–392
Altunkaynak BZ, Önger ME, Altunkaynak ME, Ayranci E, Canan S (2012) A brief introduction to stereology and sampling strategies: basic concepts of stereology. NeuroQuantology 10:31–43
Alves CFB (2026) Efeitos da urbanização sobre o sistema hematopoiético de Leptodactylus podicipinus (Anura, Leptodactylidae). Master's Thesis, Universidade Federal de Mato Grosso do Sul, Campo Grande, MS, Brazil
Brown DL (2017) Practical stereology: applications for the pathologist. Vet Pathol 54:358–368
Chang W, Cheng J, Allaire J et al (2025) shiny: web application framework for R. R package version 1.12.1. https://CRAN.R-project.org/package=shiny
Christin M-S, Gendron AD, Brousseau P et al (2003) Effects of agricultural pesticides on the immune system of Rana pipiens and on its resistance to parasitic infection. Environ Toxicol Chem 22:1127–1133. https://doi.org/10.1002/etc.5620220522
Collyer ML, Adams DC (2018) RRPP: an R package for fitting linear models to high-dimensional data using residual randomization. Methods Ecol Evol 9:1772–1779
Davis AK, Maney DL, Maerz JC (2008) The use of leukocyte profiles to measure stress in vertebrates: a review for ecologists. Funct Ecol 22:590–601
Drobniak SM et al (2026) A systematic map of generative AI guidelines and reporting in ecology and evolutionary biology. Preprint, Research Square. https://doi.org/10.21203/rs.3.rs-9160721/v1
Duan H, Peng X, Qin S et al (2026) Micronuclei: origins, assays, mechanisms, diseases and treatments. Sig Transduct Target Ther 11:114. https://doi.org/10.1038/s41392-025-02538-8
Dulak J, Płytycz B (1989) The effect of laboratory environment on the morphology of the spleen and the thymus in the yellow-bellied toad, Bombina variegata (L.). Dev Comp Immunol 13:49–55. https://doi.org/10.1016/0145-305X(89)90016-5
Egozcue JJ, Pawlowsky-Glahn V, Mateu-Figueras G, Barceló-Vidal C (2003) Isometric logratio transformations for compositional data analysis. Math Geol 35:279–300
Elias H, Hyde DM (1980) An elementary introduction to stereology (quantitative microscopy). Am J Anat 159:411–446
Fay C, Guyader V, Rochette S, Girard C (2024) golem: a framework for robust Shiny applications. R package version 0.5.1. https://CRAN.R-project.org/package=golem. https://doi.org/10.32614/CRAN.package.golem
Filzmoser P, Hron K, Templ M (2018) Applied compositional data analysis: with worked examples in R. Springer, Cham
Franco-Belussi L, De Oliveira Júnior JG, Goldberg J, De Oliveira C, Fernandes CE, Provete DB (2024) Multiple morphophysiological responses of a tropical frog to urbanization conform to the pace-of-life syndrome. Conserv Physiol 12:coad106
Frost DR (2026) Amphibian species of the world: an online reference. Version 6.2 (30 April 2026). Electronic Database. American Museum of Natural History, New York. https://doi.org/10.5531/db.vz.0001
Geuna S, Herrera-Rincon C (2015) Update on stereology for light microscopy. Cell Tissue Res 360:5–12
Greenacre M (2021) Compositional data analysis. Annu Rev Stat Appl 8:271–299
Gundersen HJG et al (1988) Some new, simple and efficient stereological methods and their use in pathological research and diagnosis. APMIS 96:379–394
Kassambara A, Mundt F (2020) factoextra: extract and visualize the results of multivariate data analyses. R package version 1.0.7. https://CRAN.R-project.org/package=factoextra. https://doi.org/10.32614/CRAN.package.factoextra
Lê S, Josse J, Husson F (2008) FactoMineR: an R package for multivariate analysis. J Stat Softw 25:1–18. https://doi.org/10.18637/jss.v025.i01
Marks-Anglin A, Chen Y (2020) A historical review of publication bias. Res Synth Methods 11:725–742
Martín-Fernández JA, Barceló-Vidal C, Pawlowsky-Glahn V (2003) Dealing with zeros and missing values in compositional data sets using nonparametric imputation. Math Geol 35:253–278
Mayhew TM (1991) The new stereological methods for interpreting functional morphology from slices of cells and organs. Exp Physiol 76:639–665
Mescher AL (2021) Junqueira's Basic Histology: Text and Atlas, 16th edn. McGraw Hill, New York
Palarea-Albaladejo J, Martín-Fernández JA (2015) zCompositions: R package for multivariate imputation of left-censored data under a compositional approach. Chemom Intell Lab Syst 143:85–96
Pawlowsky-Glahn V, Egozcue JJ (2011) Exploring compositional data with the CoDa-dendrogram. Austrian J Stat 40:103–113
Pawlowsky-Glahn V, Egozcue JJ, Tolosana-Delgado R (2015) Modeling and analysis of compositional data. Wiley, Chichester
Pearson K (1897) Mathematical contributions to the theory of evolution: on a form of spurious correlation which may arise when indices are used in the measurement of organs. Proc R Soc Lond 60:489–498
R Core Team (2025) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org/
Rollins-Smith LA (2017) Amphibian immunity-stress, disease, and climate change. Dev Comp Immunol 66:3–11
Russ JC, Dehoff RT (2000) Practical stereology, 2nd edn. Kluwer Academic/Plenum, New York
Tierney N, Cook D (2023) Expanding tidy data principles to facilitate missing data exploration, visualization and assessment of imputations. J Stat Softw 105:1–31
Tilves C, Peddada S, Miljkovic I (2021) Body composition analyses require compositional data analytic (CoDA) methods. Obesity 29:1930–1931
Underwood EE (1969) Stereology, or the quantitative evaluation of microstructures. J Microsc 89:161–180
Van den Boogaart KG, Tolosana-Delgado R, Bren M (2025) compositions: compositional data analysis. R package. https://CRAN.R-project.org/package=compositions
Weibel ER, Kistler GS, Scherle WF (1966) Practical stereological methods for morphometric cytology. J Cell Biol 30:23–38
West MJ (2012) Introduction to stereology. Cold Spring Harb Protoc 2012:pdb.top070623
Wickham H (2016) ggplot2: elegant graphics for data analysis. Springer-Verlag, New York
Wild CJ, Pfannkuch M (1999) Statistical thinking in empirical enquiry. Int Stat Rev 67:223–248

Additional Declarations

No competing interests reported.

© Research Square 2026 | ISSN 2693-5015 (online)
Author affiliations

Diogo B. Provete (corresponding author): Universidade Federal de Mato Grosso do Sul
Marcos R. Severgnini: Unidade Universitária de Campo Grande, Universidade Estadual de Mato Grosso do Sul
Carlos E. Fernandes: Universidade Federal de Mato Grosso do Sul
Lilian Franco-Belussi: Universidade Federal de Mato Grosso do Sul

Figure 1. Methodological workflow for Compositional Data Analysis (CoDA) in tissue stereology and blood cell profile. Raw stereological metrics (e.g., cell counts, volume fractions) are inherently subject to a constant-sum constraint, restricting the sample space to the Aitchison simplex and inducing the closure effect. To perform valid and subcompositionally coherent statistical inference within an unconstrained Euclidean space, data must undergo log-ratio transformations. This flowchart delineates the required decision pathway: addressing structural and rounded zeros via amalgamation or Bayesian multiplicative imputation (e.g., CZM), defining biologically meaningful balances through Sequential Binary Partitions (SBP), and ultimately applying isometric (ilr) or centered (clr) log-ratios for robust permutational modeling (RRPP-MANOVA) and unbiased multivariate visualization (compositional ray biplots).

Figure 2. Compositional ray biplot (clr-PCA) illustrating the systemic structural remodeling of the amphibian spleen across an urbanization gradient. The multivariate variation of splenic compartments (Red Pulp, White Pulp, Melano-Macrophage Centers [MMCs], and Total Vessels) is mapped into an unconstrained Euclidean space following a centered log-ratio (clr) transformation. In this geometry, the lengths of the rays (arrows) and the distances between their tips (links) are proportional to the log-ratio variances, highlighting the severe architectural disruption caused by environmental stress. Convex hulls delineate the morphological niche occupied by specimens from each sampling site. The directional shift of the urban population’s hull towards the MMC and Total Vessels rays provides geometric confirmation of the significant isometric balances (RRPP-MANOVA), effectively decoupling active inflammatory recruitment from profound vascular reorganization.

Figure 3. Ternary diagram illustrating the eco-immunological niche of leukocyte profiles within the Aitchison simplex. The three-part immune composition (Lymphocytes, Granulocytes, and Monocytes) is plotted in its natural, constant-sum geometric space, requiring no dimensional reduction. Density contours represent the population's center of gravity, effectively mapping the immunological niche of the amphibian populations. In the pristine baseline (ecological station), the structural niche is heavily anchored toward the lymphocyte vertex, reflecting a homeostatic state of adaptive immune surveillance. Conversely, environmental stress from urbanization induces a directional migration of the entire population's density towards the granulocyte-monocyte axis. This topographical shift visually proves the active biological trade-off: the apparent lymphopenia is not an isolated decline, but a proportional exhaustion coupled with the expansion of the innate immune response.

Figure 4. Aitchison variation matrix heatmap illustrating the subcompositional stability and trade-offs within the leukocyte profile. Unlike standard correlation matrices, which are heavily biased by the constant-sum constraint (closure effect) and produce spurious negative correlations, the variation matrix quantifies the true proportional relationship between pairs of components. Cell colors and numerical values represent the variance of the log-ratio between two given lineages, var(ln(x_i/x_j)). Low values (darker tiles) indicate strict proportional stability and high subcompositional coherence between cell types. Conversely, the highest variance (brightest tile) is at the intersection of Lymphocytes and Granulocytes. This metric provides geometric confirmation of an active biological trade-off, demonstrating that these lineages compete for spatial and energetic resources within the constrained immune repertoire in response to environmental stress.

Figure 5. Effect size estimation illustrating a "False Positive" (spurious correlation) in leukocyte profiles induced by the constant-sum constraint. The plots display the estimated mean difference and 95% Confidence Intervals (95% CI) for the granulocyte lineage in urban frogs compared to the baseline (ecological station). The classical approach (left), using raw percentages, produces a 95% CI that excludes zero, falsely suggesting a significant active recruitment of granulocytes (Type I error). This artifact is driven by the severe decline of lymphocytes in urban frogs, which artificially inflates the relative percentage of the remaining cell types within the 100% boundary. The CoDA approach (right), using centered log-ratio (clr) coordinates, neutralizes this closure effect. The clr estimation reveals that the effect size drops to zero, correctly identifying that the granulocyte compartment remains functionally stable across environments.

Figure 6. Effect size estimation illustrating a sign reversal induced by the constant-sum constraint. The plots display the estimated mean difference and 95% Confidence Intervals (95% CI) for the distal tubule in urban frogs compared to the baseline (ecological station). The classical approach (left), using raw percentages, produces a negative coefficient, suggesting structural atrophy due to the spatial expansion of other renal compartments compressing the tubule's percentage. Conversely, the CoDA approach (right), using centered log-ratio (clr) coordinates, properly accounts for the closure effect. The clr estimation corrects this artifact, revealing a (positive) shift in the effect size, correctly identifying the true biological trajectory of the distal tubule in response to environmental stress.

Figure 7. Isometric log-ratio (ilr) balance of erythrocyte nuclear abnormalities across an urbanization gradient. To address the "zero problem" inherent in rare morphological events, a Bayesian multiplicative replacement (CZM) was applied to the two-part composition (Normal vs. Abnormal erythrocytes) prior to isometric transformation. The y-axis represents the ilr balance coordinate, which stabilizes the variance and ensures subcompositional coherence. The dashed red line indicates the natural genotoxic baseline calculated from the mean value of the ecological station (pristine environment). The significant shift in the urban population’s balance reflects a systemic erosion of erythrocyte genomic integrity. This visualization demonstrates that even with near-zero frequency events, compositional geometry provides a highly sensitive and statistically robust biomarker for environmental genotoxicity.
This technique allows the reconstruction of the volumetric reality of tissues from opaque, planar slices (Weibel et al., \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e1966\u003c/span\u003e; West, \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e2012\u003c/span\u003e; Geuna \u0026amp; Herrera-Rincon, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2015\u003c/span\u003e) by counting intersections on a structured point grid or by planimetric delimitation of digital areas, permitting researchers to estimate vital parameters, such as volume (V\u003csub\u003eV\u003c/sub\u003e), surface (S\u003csub\u003eV\u003c/sub\u003e), and numerical (N\u003csub\u003eV\u003c/sub\u003e) densities.\u003c/p\u003e \u003cp\u003eStereology is traditionally celebrated for its design-based approach, which provides unbiased and efficient estimates without requiring assumptions about the shape or orientation of the objects being measured (Mayhew, \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e1991\u003c/span\u003e; West, \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e2012\u003c/span\u003e). However, a fundamental statistical vulnerability remains pervasive in the interpretation of these data. Stereological and cell count profiles are, by definition, parts of a whole\u0026mdash;whether expressed as volume fractions, area percentages, or relative frequencies. Mathematically, the simultaneous quantification of these proportions from the same reference space produces a strictly compositional dataset. Consequently, these data carry only relative information, in which the fractions are interrelated and subject to a constant-sum constraint (typically summing to 1 or 100%). 
Therefore, a unit increase in one tissue compartment must be accompanied by a decrease in at least one other compartment to maintain the total volume (Russ and Dehoff, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2000\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eDuring any morphological change\u0026mdash;such as the expansion of a cellular compartment due to hypertrophy, or inflammatory recruitment in a matrix\u0026mdash;the proportion of adjacent compartments is arithmetically compressed in the sample space, characterizing the classic constant sum problem. This constant-sum constraint, known as the \"closure effect,\" dictates that stereological data reside in a restricted sample space, called the simplex, rather than the unconstrained \u003cem\u003eD\u003c/em\u003e-dimensional Euclidean space (Aitchison, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e1982\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e1986\u003c/span\u003e). Applying traditional parametric or non-parametric statistics directly to these percentages can lead to spurious correlations and false associations, potentially resulting in erroneous biological conclusions, as warned more than a century ago by Pearson (\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e1897\u003c/span\u003e). This practice ignores the inherent negative correlational structure of closed data, violating fundamental assumptions of independence and multivariate normality, in which the geometry of the data itself may be mistaken for physiological change (Tilves et al., \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Pawlowsky-Glahn et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). Historically, the literature has attempted to circumvent these assumption violations through standardization methods, with the arcsine square root transformation being the most common. 
However, such transformations are insufficient, as they focus only on attenuating the behavior of variances at the extremes of the probability distribution (near 0 and 100%), failing to address the covariance structure imposed by the geometric constraint. The direct application of standard Euclidean statistics\u0026mdash;such as Pearson correlations, ordinary least squares, and other linear models\u0026mdash;remains the norm in morphological research.\u003c/p\u003e \u003cp\u003eA requirement for any morphological analysis is subcompositional coherence. This principle demands that inferences made about a subset of tissue components (a subcomposition) should remain consistent regardless of whether other components are included in the analysis (Aitchison \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e1986\u003c/span\u003e). Traditional methods based on raw proportions fail this test, as the inclusion of a highly variable component (e.g., spleen white pulp in inflammatory states) will artificially alter the relative percentages of all other healthy tissues, creating an illusion of generalized atrophy (Aitchison \u0026amp; Greenacre, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2002\u003c/span\u003e; Greenacre, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2021\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eTo resolve these geometric vulnerabilities, Compositional Data Analysis (CoDA) proposes a paradigm shift toward log-ratio transformations (Aitchison \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e1986\u003c/span\u003e). The central principle of CoDA posits that the biologically relevant information resides in the relative ratios between tissue components, not in their isolated fractions. 
By mapping data from the simplex into an orthonormal Euclidean space using the Isometric Log-Ratio (\u003cem\u003eilr\u003c/em\u003e) transformation, researchers can ensure that statistical models are both mathematically valid and biologically interpretable (Egozcue et al., \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2003\u003c/span\u003e; Pawlowsky-Glahn \u0026amp; Egozcue, \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e2011\u003c/span\u003e). Within this framework, the focus shifts from individual parts to \"balances\" between groups of components. Focusing on the ratio between parts (e.g., white:red pulp in the spleen) ensures subcompositional coherence: the geometric relationship between two compartments remains identical regardless of the inclusion or exclusion of other tissues in the total count. Applying the logarithm linearizes these ratios, enabling the transposition of data from the restricted space to the real and continuous Euclidean space. For practical application in stereology, the workflow requires two fundamental transformations: the centered log-ratio (\u003cem\u003eclr\u003c/em\u003e) for multivariate exploration and the isometric log-ratio (\u003cem\u003eilr\u003c/em\u003e) for modeling and inference (Aitchison, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e1986\u003c/span\u003e). The \u003cem\u003eclr\u003c/em\u003e transformation maps the relative weight of each tissue component to the geometric mean of the entire morphometric composition of the sample. This is the required transformation for multivariate analysis based on covariance (Pawlowsky-Glahn et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2015\u003c/span\u003e), forming the mathematical basis for biplots. 
This facilitates understanding systemic remodeling, enabling pathologists and morphologists to distinguish between localized cellular recruitment and generalized parenchymal degeneration (Pawlowsky-Glahn et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Filzmoser et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2018\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThe main advantage of \u003cem\u003eclr\u003c/em\u003e is the preservation of the dimensionality of the original system (a composition of \u003cem\u003eD\u003c/em\u003e parts results in \u003cem\u003eD\u003c/em\u003e transformed variables), so that each transformed variable corresponds directly to one of the original anatomical compartments (Aitchison, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e1986\u003c/span\u003e). However, the \u003cem\u003eclr\u003c/em\u003e coordinates of each sample sum to zero by definition. This perfect collinearity results in a singular covariance matrix, which makes the direct use of the full set of \u003cem\u003eclr\u003c/em\u003e variables in linear models and traditional hypothesis tests unfeasible (Greenacre, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). To ensure the validity of statistical inferences and linear modeling (e.g., \u003cem\u003et\u003c/em\u003e-tests, ANOVA, MANOVA), the \u003cem\u003eilr\u003c/em\u003e transformation is necessary. The \u003cem\u003eilr\u003c/em\u003e isometrically maps the simplex of \u003cem\u003eD\u003c/em\u003e-part compositions onto a (\u003cem\u003eD\u003c/em\u003e\u0026thinsp;\u0026minus;\u0026thinsp;1)-dimensional Euclidean space with orthonormal coordinates (Pawlowsky-Glahn et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). In clinical and histological practice, the contrast matrix \u003cb\u003eV\u003c/b\u003e can be constructed via sequential binary partitioning (balances). This allows the researcher to create transformed variables with direct biological meaning. 
For example, a balance can be established that contrasts the entire stroma with the structural parenchyma, or healthy tissue with pathological infiltrate. The Euclidean variables generated by the \u003cem\u003eilr\u003c/em\u003e do not present mathematical redundancy (Filzmoser et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2018\u003c/span\u003e), so the morphometric dataset is free of the sum constraint and can be analyzed with conventional statistics, which supports the reproducibility of the identified associations.\u003c/p\u003e \u003cp\u003eAlthough the theoretical framework described above is well established in geochemistry, food science, and microbiome research (Greenacre, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2021\u003c/span\u003e), the integration of CoDA into the quantitative microscopy community is a recent development. As recently highlighted in related fields of tissue analysis (Tilves et al., \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e2021\u003c/span\u003e), treating proportional data as if they belonged to an unconstrained Euclidean space is not a minor statistical oversight, but a fundamental geometric violation that can lead to spurious correlations and even sign reversals in biological trends (Aitchison, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e1986\u003c/span\u003e; Brown, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). These concerns add to earlier calls for statistical rigor (Altunkaynak et al., \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2012\u003c/span\u003e; Brown, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). The first application of this framework to analyze complex morphophysiological responses was recently demonstrated by Franco-Belussi et al. 
(\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2024\u003c/span\u003e), who utilized CoDA to resolve tissue-level shifts in frogs across environmental gradients. This work established that CoDA could reveal insights\u0026mdash;such as systemic vascular remodeling and specific parenchymal trade-offs\u0026mdash;that traditional statistics would have obscured. Currently, there is a lack of accessible frameworks that integrate the rigorous sampling of stereology with the simplicial geometry of CoDA, leaving researchers vulnerable to artifacts driven by the closure effect.\u003c/p\u003e \u003cp\u003eHere, we provide a practical guide for the integration of CoDA into tissue stereology and histomorphometry. Using empirical case studies of splenic density, leukocyte profiles, renal architecture, and erythrocyte nuclear abnormalities from the tropical frog \u003cem\u003eLeptodactylus podicipinus\u003c/em\u003e sampled along an urbanization gradient, we demonstrate how the use of isometric balances and specialized graphical tools\u0026mdash;such as variation matrices, ray biplots, ternary plots, and CoDa dendrograms\u0026mdash;allows researchers to isolate true biological signals from mathematical artifacts. By providing reproducible R-based workflows and an interactive Shiny web application (CoDa Stereo), we aim to facilitate the adoption of CoDA as a standard for the quantification and interpretation of changes in tissue compartments.\u003c/p\u003e"},{"header":"Material and Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eStudy system and sampling design\u003c/h2\u003e \u003cp\u003eBriefly, the empirical datasets utilized here encompass splenic and renal structural density, and hematological parameters (leukocyte differential counts and erythrocyte nuclear abnormalities) of the Pointedbelly Frog (\u003cem\u003eLeptodactylus podicipinus\u003c/em\u003e). 
The Pointedbelly Frog is a medium-sized terrestrial frog (33\u0026ndash;45 mm snout\u0026ndash;vent length) that is widely distributed across South American lowlands (Frost, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2026\u003c/span\u003e), is frequently found in anthropized habitats, and is a suitable model for ecotoxicological studies (Franco-Belussi et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eA total of 45 adult males were collected from three sites along an urbanization gradient in central Brazil: (i) an ecological station in the Pantanal wetland (16\u0026deg;50\u0026prime;S, 57\u0026deg;30\u0026prime;W; \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;10), considered the pristine baseline; (ii) an anthropized rural area (20\u0026deg;52\u0026prime;59\u0026Prime;S, 55\u0026deg;11\u0026prime;40\u0026Prime;W; \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;18) adjacent to the city; and (iii) an urban site (20\u0026deg;28\u0026prime;51.6\u0026Prime;S, 54\u0026deg;34\u0026prime;38.5\u0026Prime;W; \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;17). All specimens were euthanized following approved protocols, and the spleen, kidneys, and blood were processed for histological and hematological analysis. A comprehensive description of the experimental design, euthanasia procedures, and histological processing is presented in Alves (\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2026\u003c/span\u003e). A full description of the dataset, along with further results and detailed biological interpretation, will be published elsewhere.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eStereological quantification\u003c/h3\u003e\n\u003cp\u003eFor splenic structural density, 10 transverse sections per animal (3 \u0026micro;m thick, H\u0026amp;E stained) were photographed at 100\u0026times; magnification (RGB 4096 \u0026times; 3286 pixels). 
A 120-intersection grid was superimposed on three images per animal, and intersections falling on red pulp, white pulp, melano-macrophage centers (MMCs), and blood vessels (including sinusoids) were counted. The structural volume density for each compartment was computed following commonly used formulae (see Russ \u0026amp; Dehoff, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2000\u003c/span\u003e). Sinusoids and blood vessels were pooled into a single \"Total vessels\" part prior to compositional analysis, yielding a four-part composition (\u003cem\u003eD\u003c/em\u003e\u0026thinsp;=\u0026thinsp;4): {Red pulp, White pulp, MMCs, Total vessels}.\u003c/p\u003e \u003cp\u003eFor renal structural density, PAS-stained sections were analyzed at 100\u0026times; magnification, with five images per animal and a 120-intersection grid. The five-part composition (\u003cem\u003eD\u003c/em\u003e\u0026thinsp;=\u0026thinsp;5) comprised: {Proximal tubules, Distal tubules, Renal corpuscles, Collecting ducts, Interstitial space}.\u003c/p\u003e \u003cp\u003eFor leukocyte differential counts, 100 leukocytes per animal in blood smear were classified into lymphocytes, monocytes, neutrophils, and eosinophils (May\u0026ndash;Gr\u0026uuml;nwald\u0026ndash;Giemsa\u0026ndash;Wright stain). Because basophils were absent in all specimens and eosinophils occurred at low and relatively constant frequencies across sites, neutrophils and eosinophils were amalgamated into a single \"Granulocytes\" part for the compositional analysis, yielding a three-part composition (\u003cem\u003eD\u003c/em\u003e\u0026thinsp;=\u0026thinsp;3): {Lymphocytes, Monocytes, Granulocytes}. This amalgamation is biologically motivated by the shared innate-immune function of the granulocyte lineage (Abbas et al. 
\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2021\u003c/span\u003e, Mescher \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) and follows the recommendation to reduce dimensionality when parts carry redundant information (Pawlowsky-Glahn et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2015\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eFor erythrocyte nuclear abnormalities, 1,000 erythrocytes per animal in the same blood smear were screened for abnormal nuclei (budded, lobulated, notched, reniform, binucleated, anucleated). The counts were aggregated into a two-part composition (\u003cem\u003eD\u003c/em\u003e\u0026thinsp;=\u0026thinsp;2): {Normal erythrocytes, Total abnormalities}.\u003c/p\u003e\n\u003ch3\u003eThe Compositional Nature of Stereological Data\u003c/h3\u003e\n\u003cp\u003eAll stereological and blood cell profile data were treated as compositional data. A composition is a vector of \u003cem\u003eD\u003c/em\u003e positive components that carry only relative information (Aitchison \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e1986\u003c/span\u003e). In tissue stereology, volume fractions (\u003cem\u003eV\u003c/em\u003e\u003csub\u003e\u003cem\u003eV\u003c/em\u003e\u003c/sub\u003e) and cellular proportions are intrinsically constrained by a constant sum (e.g., 100% of the organ's volume), which restricts the data to a sample space known as the (\u003cem\u003eD\u003c/em\u003e\u0026thinsp;\u0026minus;\u0026thinsp;1)-simplex (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). 
Because the simplex space is not Euclidean, standard multivariate techniques cannot be applied directly to raw percentages without inducing spurious correlations and subcompositional incoherence (Pearson, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e1897\u003c/span\u003e; Pawlowsky-Glahn et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). To adhere to the principles of scale invariance and permutation invariance, all datasets were transformed into log-ratio coordinates before statistical inference (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e\n\u003ch3\u003eData Transformation: Centered and Isometric Log-Ratios\u003c/h3\u003e\n\u003cp\u003eTo handle the multivariate structure of tissue compartments, we utilized two primary transformation workflows (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e):\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eCentered Log-Ratio (\u003c/b\u003e \u003cb\u003eclr\u003c/b\u003e \u003cb\u003e)\u003c/b\u003e: Used primarily for exploratory analysis and biplot visualization. The \u003cem\u003eclr\u003c/em\u003e transformation (Egozcue et al., \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2003\u003c/span\u003e) normalizes each component by the geometric mean of the entire composition.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eIsometric Log-Ratio (\u003c/b\u003e \u003cb\u003eilr\u003c/b\u003e \u003cb\u003e)\u003c/b\u003e: Used for hypothesis testing and linear modeling. 
The \u003cem\u003eilr\u003c/em\u003e transformation projects the \u003cem\u003eD\u003c/em\u003e-part composition onto an orthonormal basis of \u003cem\u003eD\u003c/em\u003e\u0026thinsp;\u0026minus;\u0026thinsp;1 coordinates in Euclidean space, removing the mathematical redundancy inherent in closed data (Egozcue et al., \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2003\u003c/span\u003e; Filzmoser et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2018\u003c/span\u003e).\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003eFor organs with complex functional hierarchies (i.e., spleen and kidney), the \u003cem\u003eilr\u003c/em\u003e coordinates were constructed via Sequential Binary Partitioning (SBP). This process involves the creation of a signary matrix, in which components are grouped into \"balances\" based on biological relevance (e.g., parenchyma vs. stroma, or lymphoid vs. hematopoietic tissues). Each balance represents a specific log-ratio contrast between groups of parts, providing a coordinate that is both mathematically robust and biologically interpretable (Pawlowsky-Glahn \u0026amp; Egozcue, \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e2011\u003c/span\u003e). Notably, the choice of SBP is not unique; different partitioning schemes will produce different individual balance coordinates, although the overall multivariate test (MANOVA) is invariant to this choice (Filzmoser et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2018\u003c/span\u003e).\u003c/p\u003e\n\u003ch3\u003eSequential binary partitions (SBP) for each case study\u003c/h3\u003e\n\u003cp\u003e \u003cb\u003eSpleen (\u003c/b\u003e \u003cb\u003eD\u003c/b\u003e\u0026thinsp;\u003cb\u003e=\u0026thinsp;4; 3 balances).\u003c/b\u003e Balance 1: MMCs vs. {Red pulp\u0026thinsp;+\u0026thinsp;White pulp\u0026thinsp;+\u0026thinsp;Total vessels} \u0026mdash; isolates the inflammatory signal (Abbas et al. 
\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). Balance 2: Total vessels vs. {Red pulp\u0026thinsp;+\u0026thinsp;White pulp} \u0026mdash; captures vascular remodeling independently of parenchymal composition. Balance 3: Red pulp vs. White pulp \u0026mdash; contrasts hematopoietic and lymphoid functions.\u003c/p\u003e \u003cp\u003e \u003cb\u003eLeukocytes (\u003c/b\u003e \u003cb\u003eD\u003c/b\u003e\u0026thinsp;\u003cb\u003e=\u0026thinsp;3; 2 balances).\u003c/b\u003e Balance 1: Lymphocytes vs. Granulocytes \u0026mdash; the compositional analogue of the L:G stress ratio. Balance 2: Monocytes vs. {Lymphocytes\u0026thinsp;+\u0026thinsp;Granulocytes} \u0026mdash; isolates the monocyte recruitment signal (Abbas et al. \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2021\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cb\u003eKidney (\u003c/b\u003e \u003cb\u003eD\u003c/b\u003e\u0026thinsp;\u003cb\u003e=\u0026thinsp;5; 4 balances).\u003c/b\u003e A default \u003cem\u003eilr\u003c/em\u003e basis was used (Egozcue et al., \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2003\u003c/span\u003e), as no single physiologically motivated binary partition covers all five renal compartments. Global inference via MANOVA is invariant to the choice of basis.\u003c/p\u003e \u003cp\u003e \u003cb\u003eErythrocytes (\u003c/b\u003e \u003cb\u003eD\u003c/b\u003e\u0026thinsp;\u003cb\u003e=\u0026thinsp;2; 1 balance).\u003c/b\u003e A single \u003cem\u003eilr\u003c/em\u003e coordinate contrasting Normal vs. Abnormal erythrocytes.\u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eHandling Zeros in Rare Morphological Events\u003c/h2\u003e \u003cp\u003eIn the erythrocyte abnormality dataset, \"rounded zeros\" (count zeros) occurred in animals from the ecological station, in which no abnormal cells were detected among the 1,000 screened. Following Mart\u0026iacute;n-Fern\u0026aacute;ndez et al. 
(\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2003\u003c/span\u003e), a Count Zero Multiplicative (CZM) replacement was applied using a Bayesian Dirichlet prior via the \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003ezCompositions\u003c/span\u003e package (Palarea-Albaladejo \u0026amp; Mart\u0026iacute;n-Fern\u0026aacute;ndez, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). This imputation assigns a small pseudocount to the zero compartment, while proportionally adjusting the remaining parts to preserve the unit-sum constraint. For the splenic dataset, MMC counts of zero occurred in some specimens from the ecological station. Because these represent rounded zeros (MMCs exist but fall below the detection threshold of the 120-point grid), we applied the same CZM imputation prior to \u003cem\u003eilr\u003c/em\u003e transformation. It is important to note that the choice of imputation method introduces a degree of analyst subjectivity. For count data (point-counting stereology), CZM is the recommended approach; for percentage data or data with below-detection-limit values, methods such as multRepl or lrEM may be more appropriate (Palarea-Albaladejo \u0026amp; Mart\u0026iacute;n-Fern\u0026aacute;ndez, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). The CoDa Stereo application offers all three methods.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eStatistical inference\u003c/h3\u003e\n\u003cp\u003eAll statistical analyses were performed in R v. 4.5.2 (R Core Team, \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e2025\u003c/span\u003e). 
Compositional transformations and visualizations, including \u003cem\u003eclr\u003c/em\u003e-PCA biplots and CoDa dendrograms, were implemented using the \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003ecompositions\u003c/span\u003e (van den Boogaart et al., \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e2025\u003c/span\u003e) and \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003ezCompositions\u003c/span\u003e (Palarea-Albaladejo \u0026amp; Mart\u0026iacute;n-Fern\u0026aacute;ndez, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2015\u003c/span\u003e) packages. To evaluate the effect of the environment on tissue architecture, we used a permutational multivariate analysis of variance (hereafter MANOVA) implemented in the \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003eRRPP\u003c/span\u003e package (Collyer \u0026amp; Adams, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). This approach allows for robust linear modeling of \u003cem\u003eilr\u003c/em\u003e coordinates, accommodating the non-normal distributions often found in biological count data while providing empirical \u003cem\u003eP\u003c/em\u003e-values from the randomization of residuals (999 permutations). A Quarto dynamic document, along with the data, is available in a Zenodo repository (Provete et al. 2026).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eCase Study 1: Systemic Splenic Remodeling Across an Urbanization Gradient\u003c/h2\u003e \u003cp\u003eThe application of CoDA to empirical splenic structural density data reveals a biological reality more complex than predicted by standard models of tissue stability. 
While a reduction of healthy tissue compartments could be attributed solely to the \"closure effect\" caused by the expansion of a single component (such as MMCs in response to inflammation), our results suggest that the environment induces a multifaceted reorganization of the splenic architecture.\u003c/p\u003e \u003cp\u003eBecause volume fraction data inherently reside within the simplex and are strictly codependent (Aitchison, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e1986\u003c/span\u003e; Russ \u0026amp; Dehoff, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2000\u003c/span\u003e), we applied a permutational MANOVA on \u003cem\u003eilr\u003c/em\u003e coordinates. The permutation procedure makes the identified statistical significance robust to deviations from multivariate normality, a common issue in such data (Egozcue et al., \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2003\u003c/span\u003e; Franco-Belussi et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eDecomposing the global variation via the SBP described in Material and Methods allowed us to isolate three orthogonal contrasts, each corresponding to a distinct biological process (see Provete et al. 2026):\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eBalance 1 (MMCs vs. Remaining Parenchyma)\u003c/b\u003e: The balance isolating MMCs confirmed that environmental stress in urban areas induces a disproportionate increase in these centers, effectively decoupling the inflammatory signal from the organ's total functional mass and supporting the use of MMCs as biomarkers of chronic stress (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eBalance 2 (Vessels vs. 
Pulps)\u003c/b\u003e: This balance revealed that a substantial proportion of the variation in the relationship between the vascular system and the splenic pulps is explained by sampling site. This suggests that the environment is not merely triggering a localized immune response, but may be forcing vascular remodeling (potentially associated with venous congestion or ischemia) that classical statistics would fail to distinguish from simple compressive atrophy.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eBalance 3 (Red vs. White Pulp)\u003c/b\u003e: The alteration in the log-ratio between the red and white pulps indicates that environmental stress disrupts the baseline homeostasis between hematopoietic function and the lymphoid immune response.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe distinction between these concurrent processes was visualized through a ray biplot (\u003cem\u003eclr\u003c/em\u003e-PCA) with convex hulls drawn around the specimens from each site. This visualization delineates the morphological space occupied by specimens from each environment, without the restrictive distributional assumptions inherent to traditional confidence ellipses. In the biplot, the divergence between the rays of the Red and White Pulps, combined with the substantial length of the Total Vessels ray, illustrates that the ratios among these functional tissues are not constant. By mapping stereological data from the simplex into Euclidean space via \u003cem\u003eilr\u003c/em\u003e coordinates (Filzmoser et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2018\u003c/span\u003e), we show that urbanization is associated with a disruption in parenchymal integrity (Dulak \u0026amp; Płytycz \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e1989\u003c/span\u003e, Christin et al. \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2003\u003c/span\u003e). 
This level of analytical resolution demonstrates the value of CoDA for modern quantitative pathology, allowing researchers to disentangle localized adaptive responses from broader architectural changes.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eCase Study 2: Leukocyte Profile Shifts and the Aitchison Variation Matrix\u003c/h2\u003e \u003cp\u003eThe analysis of leukocyte profiles represents a classic challenge in eco-physiology. In amphibians, the ratio of lymphocytes to neutrophils (or granulocytes) is widely used as a proxy for chronic stress mediated by glucocorticoids. However, treating these counts as independent variables or simple ratios fails to account for the inherent codependence of the immune cell repertoire (Abbas et al. \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). To resolve this, we applied the CoDA framework (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e) to the three-part leukocyte composition {Lymphocytes, Monocytes, Granulocytes}.\u003c/p\u003e \u003cp\u003eRaw proportions are restricted to the simplex and lack the algebraic properties required for standard linear modeling (Aitchison, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e1986\u003c/span\u003e). Conventional ratios (e.g., lymphocytes:granulocytes, L:G) are asymmetric and typically non-normal: starting from a ratio of 1, a doubling of lymphocytes gives 2.0, whereas a halving gives 0.5, so equal relative changes lie at unequal distances from 1 (on the log scale they are symmetric, at +log 2 and \u0026minus;log 2). The resulting skewed distribution compromises ordinary least squares regression (Pawlowsky-Glahn et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2015\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eBy transforming the lymphocyte, monocyte, and amalgamated granulocyte counts into \u003cem\u003eilr\u003c/em\u003e coordinates, we mapped the immune profile into an unconstrained Euclidean space (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). 
This transformation ensures that the resulting \"Leukocyte Balances\" are symmetric, scale-invariant, and subcompositionally coherent (Egozcue et al., \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2003\u003c/span\u003e; Filzmoser et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2018\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThe MANOVA on the \u003cem\u003eilr\u003c/em\u003e coordinates revealed a significant effect of the environment on the global leukocyte profile. The use of targeted balances (see Provete et al. 2026) allowed us to identify the specific cellular drivers of this response:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eBalance 1 (Lymphocytes vs. Granulocytes)\u003c/b\u003e: This balance, the compositional equivalent of the traditional L:G ratio, showed a significant downward shift in urban environments. In the Aitchison geometry, a value of 0 represents a perfect geometric equilibrium between the two cell types (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). The marked move toward negative values in urban specimens provides a rigorous confirmation of granulocyte dominance\u0026mdash;a hallmark of chronic environmental stress (Franco-Belussi et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eBalance 2 (Monocytes vs. [Lymphocytes\u0026thinsp;+\u0026thinsp;Granulocytes])\u003c/b\u003e: This balance isolated the role of monocytes relative to the rest of the profile. 
The significant increase in the monocyte-driven coordinate in disturbed sites suggests a transition from a baseline surveillance state to an active innate immune response (Rollins-Smith \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e2017\u003c/span\u003e), likely associated with increased pathogen or pollutant load in urbanized habitats.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe ternary plot (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e) illustrates why the CoDA approach is important for reproducible quantitative pathology. By analyzing the \"balance\" rather than the \"fraction\", we show that the observed lymphopenia in urban frogs is not a standalone reduction, but a proportional trade-off against a proliferating granulocyte lineage (see also Provete et al. 2026 for a visualization using a CoDa Dendrogram). The dynamic document (Provete et al., 2026) includes CoDa dendrograms per site illustrating the balance structure, pairwise Euclidean distances in \u003cem\u003eilr\u003c/em\u003e space, and residual diagnostics for all four case studies. As cautioned by Pearson (\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e1897\u003c/span\u003e) and reiterated in modern compositional theory (Tilves et al., \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e2021\u003c/span\u003e), only by respecting the geometry of the simplex can we accurately validate leukocyte profiles as reliable histological biomarkers of environmental health.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eThe Aitchison Variation Matrix\u003c/h2\u003e \u003cp\u003eIn classical eco-immunology, researchers frequently attempt to identify interactions between leukocyte lineages (Davis et al. \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2008\u003c/span\u003e) using standard Pearson or Spearman correlation matrices. 
However, because leukocyte counts are subject to a constant-sum constraint, any increase in the proportion of one cell type mathematically forces a decrease in the proportion of at least one other. This closure effect generates spurious negative correlations (Pearson, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e1897\u003c/span\u003e), rendering standard correlation matrices biologically uninterpretable within the simplex.\u003c/p\u003e \u003cp\u003eTo accurately quantify the dynamic relationships between immune cells, we used the Aitchison Variation Matrix (Pawlowsky-Glahn et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). Instead of measuring absolute correlations, the variation matrix computes the variance of the log-ratio between all possible pairs of components. A variance close to zero indicates that two components maintain a strictly stable proportional relationship (near-proportionality), whereas a high variance indicates an active biological trade-off or independent fluctuation.\u003c/p\u003e \u003cp\u003eThe highest variance within the leukocyte profile occurred at the intersection of Lymphocytes and Granulocytes (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). This corroborates the hypothesis that these two lineages do not merely fluctuate independently, but are engaged in an active trade-off for the constrained spatial and energetic resources of the immune repertoire under environmental stress (Davis et al. \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2008\u003c/span\u003e, Rollins-Smith 2017). 
By replacing the standard correlation matrix with the compositional variation matrix, we provide an unbiased, mathematically coherent tool to validate the L:G trade-off as a responsive biomarker in eco-physiological biomonitoring.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eCase Study 3: Spurious Correlations and Sign Reversals in Eco-Immunology and Renal Architecture\u003c/h2\u003e \u003cp\u003e \u003cb\u003e3a. False Positive: Granulocyte inflation driven by lymphocyte decline\u003c/b\u003e \u003c/p\u003e \u003cp\u003eTo illustrate the insidious nature of the closure effect, we analyzed the leukocyte profiles (lymphocytes, monocytes, and granulocytes) across the urbanization gradient. While Case Study 2 demonstrated the value of CoDA for identifying real trade-offs, this analysis provides a textbook example of a Type I error (False Positive) driven entirely by the constant-sum constraint.\u003c/p\u003e \u003cp\u003eWhen using the classical Euclidean approach on raw cellular percentages, the linear model indicated a significant recruitment of granulocytes in the urban environment (estimated increase of +\u0026thinsp;10.2%, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.05), with the 95% confidence interval excluding zero (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e, left panel). A conventional ecophysiological interpretation would erroneously conclude that environmental stress stimulates the innate immune system, triggering granulocytic proliferation.\u003c/p\u003e \u003cp\u003eHowever, this pattern is a mathematical illusion. In a three-part compositional system bounded to 100%, the parts are not free to vary independently. The raw data reveal that urban frogs experience a severe decline in lymphocytes, coupled with an increase in monocytes. 
Because the total must sum to 100%, the reduction in the dominant fraction pushes the granulocyte fraction upward, regardless of its actual biological density. When the data are mapped to the unconstrained Aitchison geometry via the \u003cem\u003eclr\u003c/em\u003e transformation, the true biological signal is isolated. The \u003cem\u003eclr\u003c/em\u003e-coordinate reveals that the effect size collapses toward zero (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e, right panel), with the 95% confidence interval crossing the zero threshold. The granulocyte lineage remains structurally stable; the classical approach was deceived by the spatial exhaustion caused by the other cell types.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003e3b. Sign Reversal: Artificial atrophy of the distal tubule\u003c/b\u003e \u003c/p\u003e \u003cp\u003eTo demonstrate how the closure effect can not only generate false significance, but also completely invert the biological signal, we analyzed the renal structural density across the urbanization gradient. When analyzed through classical Euclidean statistics, the relative volume of the distal tubule appeared to exhibit a negative trend in urban frogs compared to the pristine baseline (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e, left panel). A traditional interpretation would suggest that the distal tubule is undergoing mild atrophy. 
However, this negative trajectory is an artifact driven by the closure effect: the pathological expansion of the interstitial space mechanically compresses the raw percentages of all other tissues.\u003c/p\u003e \u003cp\u003eWhen mapped to the Aitchison geometry via the \u003cem\u003eclr\u003c/em\u003e transformation, the \u003cem\u003eclr\u003c/em\u003e-coordinate for the distal tubule revealed a positive shift in the effect size (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e, right panel). While the 95% confidence intervals for both models cross zero\u0026mdash;indicating high variance and a lack of statistical significance\u0026mdash;the transition from a negative to a positive estimate is fundamental. Furthermore, the difference in the models' explanatory power (\u003cem\u003eR\u003c/em\u003e\u0026sup2;) highlights how raw percentages introduce geometric noise (see Provete et al. 2026 for detailed results).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThis case study demonstrates that relying on raw fractions in cell density analysis not only obscures the magnitude of physiological responses, but can also invert the direction of the biological signal. 
Publishing spurious correlations generated by raw fractions (Tilves et al., \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) can bias future meta-analyses (Marks-Anglin \u0026amp; Chen, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e2020\u003c/span\u003e), confound the identification of reliable biomarkers of environmental impact, and subvert the mechanistic understanding of how organisms manage spatial and physiological trade-offs in altered environments.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eCase Study 4: The \"Zero Problem\" and Rare Erythrocyte Abnormalities\u003c/h2\u003e \u003cp\u003eThe quantification of genotoxic or structural damage in circulating erythrocytes\u0026mdash;such as nuclear buds, lobed nuclei, and notched cells\u0026mdash;is a staple of environmental biomonitoring (Duan et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2026\u003c/span\u003e). However, these anomalies are exceedingly rare: in a typical count effort of 1,000 cells per individual, specimens from pristine environments frequently present with a count of exactly zero abnormal cells.\u003c/p\u003e \u003cp\u003eWithin the Euclidean framework, a zero count is usually modelled via non-parametric rank tests. However, in CoDA, log-ratio transformations are undefined for zero values (ln(0) = \u0026minus;\u0026infin;). This mathematical barrier, widely known as \"The Zero Problem\" (Aitchison, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e1986\u003c/span\u003e; Filzmoser et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2018\u003c/span\u003e), must be carefully addressed to differentiate between structural zeros (the biological impossibility of a part existing) and rounded zeros (count zeros resulting from sampling exhaustion). 
Erythrocyte abnormalities in frogs represent count-based rounded zeros: the damage exists biologically at a baseline level, but falls below the detection threshold of the 1,000-cell probe. To resolve this without distorting the sample space, we applied CZM replacement using a Bayesian Dirichlet prior (Mart\u0026iacute;n-Fern\u0026aacute;ndez et al., \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2003\u003c/span\u003e; Filzmoser et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2018\u003c/span\u003e), as described in Material and Methods.\u003c/p\u003e \u003cp\u003eFollowing imputation, the two-part composition was mapped into a one-dimensional Euclidean space using a single \u003cem\u003eilr\u003c/em\u003e balance contrasting Normal against Abnormal cells. The permutational linear model (RRPP-MANOVA) applied to this coordinate revealed a significant erosion of erythrocyte integrity along the urbanization gradient (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e). While frogs from the ecological station maintained a high log-ratio (indicating overwhelming dominance of healthy erythrocytes), the urban population exhibited a significant shift towards the lower bound of the coordinate (see Provete et al. 2026 for full quantitative results).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eBy applying CoDA to rare count data, we eliminate the heteroscedasticity that plagues traditional fractional analyses of rare events. 
The \u003cem\u003eilr\u003c/em\u003e stabilizes the variance of the abnormality index, showing that even under the severe mathematical constraints of near-zero frequencies, compositional geometry provides a sensitive biomarker of environmental genotoxicity.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eSoftware Implementation: The CoDa Stereo Web Application\u003c/h2\u003e \u003cp\u003eDespite the robustness of the Aitchison geometry, the widespread adoption of CoDA in quantitative microscopy has been hindered by a steep computational learning curve. To bridge this gap, we developed CoDa Stereo (v2.0), a fully interactive, open-source web application built in R using the Shiny framework (Chang et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2025\u003c/span\u003e) and structured as an R package using the \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003egolem\u003c/span\u003e framework (Fay et al. \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). The application follows the PPDAC investigative cycle (Wild \u0026amp; Pfannkuch, \u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e1999\u003c/span\u003e) and incorporates the following analytical modules:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eData Upload and Parts Selection.\u003c/b\u003e The application supports CSV and Excel (.xlsx/.xls) files with automatic sheet selection. A smart parts selector allows users to specify which numeric columns represent the compositional parts, auto-excluding columns whose names match common ID patterns (e.g., Sample_ID, Animal_ID). Non-compositional covariates (body weight, biochemical assays) can be manually deselected. 
The ability to include continuous predictors in the multivariate model (MANCOVA) will be implemented in future versions of the program.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eData Quality and Missingness.\u003c/b\u003e Integration with the \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003enaniar\u003c/span\u003e package (Tierney \u0026amp; Cook, \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e2023\u003c/span\u003e) provides \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003evis_miss\u003c/span\u003e plots of missing data. Additionally, a custom three-state heatmap distinguishes observed values, structural zeros, and true missing values (NA)\u0026mdash;a distinction that standard missingness visualizations do not make. Missing values are handled via listwise deletion or geometric-mean imputation.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eThe Zero Problem Resolution.\u003c/b\u003e Three imputation methods are offered, selectable via the interface: Count Zero Multiplicative (CZM) replacement for count zeros, multiplicative replacement ( \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003emultRepl\u003c/span\u003e ) for rounded zeros (below detection limit), and log-ratio EM (lrEM) for rounded zeros when sample sizes are moderate to large (Palarea-Albaladejo \u0026amp; Mart\u0026iacute;n-Fern\u0026aacute;ndez, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). Column-specific detection limits are computed automatically.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eCompositional Visualization.\u003c/b\u003e Users can dynamically generate ternary plots to explore the empirical niche of three-part subcompositions within the simplex. 
When the composition has more than three parts, an amalgamation mode allows users to define three groups of parts, summing each group to form a three-part amalgamated composition for ternary display. The ternary plot is implemented in pure \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003eggplot2\u003c/span\u003e (Wickham \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e2016\u003c/span\u003e) to ensure cross-platform rendering compatibility. The tool also performs \u003cem\u003eclr\u003c/em\u003e transformations to render Compositional Ray Biplots (\u003cem\u003eclr\u003c/em\u003e-PCA) via \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003eFactoMineR\u003c/span\u003e (L\u0026ecirc; et al. 2008) and \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003efactoextra\u003c/span\u003e (Kassambara \u0026amp; Mundt \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2020\u003c/span\u003e).\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eRobust Statistical Inference.\u003c/b\u003e Users can build factorial MANOVA models on \u003cem\u003eilr\u003c/em\u003e coordinates via \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003eRRPP::lm.rrpp\u003c/span\u003e, with dynamic selection of multiple grouping factors and a toggle for additive vs. interaction designs. The formula is previewed before fitting for pedagogical transparency. Pairwise post-hoc comparisons are available via \u003cspan fontcategory=\"NonProportional\" class=\"\" name=\"Emphasis\"\u003eRRPP::pairwise()\u003c/span\u003e, and residual diagnostics (residuals vs. fitted, PC rotation) are generated automatically.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eDownloads.\u003c/b\u003e Ternary plots and biplots can be exported as vector PDF files. 
Imputed data, ANOVA tables, and pairwise comparison tables are downloadable as CSV.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003cp\u003eTo facilitate training and reproducibility, CoDa Stereo includes a built-in simulated dataset that mimics the splenic structural density scenario discussed in Case Study 1. The source code and instructions for deployment \u0026mdash; both local and on Posit Connect Cloud (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://sxzbf8.short.gy/CoDaStereo\u003c/span\u003e\u003cspan address=\"https://sxzbf8.short.gy/CoDaStereo\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e \u0026mdash; are provided at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/diogoprov/CoDaStereo\u003c/span\u003e\u003cspan address=\"https://github.com/diogoprov/CoDaStereo\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eLimitations\u003c/h2\u003e \u003cp\u003eSeveral limitations of the CoDA framework and the present study should be acknowledged. First, the choice of Sequential Binary Partition (SBP) for \u003cem\u003eilr\u003c/em\u003e construction is not unique: different partitioning schemes produce different individual balance coordinates, even though the global MANOVA test is invariant to this choice. Researchers should select biologically motivated partitions and report them transparently. Second, the imputation of zeros introduces a degree of analyst subjectivity, particularly regarding the distinction between structural zeros (a part that genuinely cannot exist) and rounded zeros (below detection threshold). 
In stereology and hematology, most zeros are rounded, but edge cases (e.g., a tissue type absent in a particular species) require careful consideration. Third, when the number of parts (\u003cem\u003eD\u003c/em\u003e) is large relative to the sample size (\u003cem\u003en\u003c/em\u003e), log-ratio transformations can amplify noise in parts with very small values, potentially inflating variance in the \u003cem\u003eilr\u003c/em\u003e space. Fourth, the present case studies are based on a single species (\u003cem\u003eL. podicipinus\u003c/em\u003e) and a specific urbanization gradient; the generalizability of the specific biological findings to other taxa and stressors remains to be tested, although the statistical framework is universal.\u003c/p\u003e \u003c/div\u003e"},{"header":"Conclusion","content":"\u003cp\u003eThe transition from classical Euclidean geometry to the Aitchison simplex is an important step toward greater rigor in quantitative morphology (Aitchison, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e1986\u003c/span\u003e; Pawlowsky-Glahn et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). As demonstrated throughout our case studies, the reliance on raw percentages for stereological data can compromise inferential power through false positives and generate severe distortions, such as the inversion of biological signals under environmental stress. By anchoring the analysis in the principle of subcompositional coherence and using robust log-ratio transformations (\u003cem\u003eclr\u003c/em\u003e and \u003cem\u003eilr\u003c/em\u003e), CoDA reveals the mathematical artifacts created by the closure effect (Pearson, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e1897\u003c/span\u003e; Tilves et al., \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e2021\u003c/span\u003e), uncovering the structural trajectories and physiological trade-offs of biological tissues. 
The introduction of the CoDa Stereo web application addresses the computational barriers that have historically hindered the widespread adoption of this methodology among practitioners, offering an intuitive interface for the entire analytical cycle. We encourage the stereological community to adopt this geometric framework to ensure that microscopic quantifications accurately reflect biological reality.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e \u003ch2\u003eCONFLICT OF INTEREST STATEMENT\u003c/h2\u003e \u003cp\u003eThe authors have no competing interests to declare that are relevant to the content of this article.\u003c/p\u003e \u003c/p\u003e\u003cp\u003e \u003ch2\u003eAI DISCLOSURE STATEMENT\u003c/h2\u003e \u003cp\u003eDuring the preparation of this manuscript, the authors used generative artificial intelligence (Claude, Anthropic; Gemini, Google) to support specific research, software development, and writing workflows. In accordance with the AIdIT framework (Drobniak et al., \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2026\u003c/span\u003e) and the COPE guidelines (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://publicationethics.org/guidance/cope-position/authorship-and-ai-tools\u003c/span\u003e\u003cspan address=\"https://publicationethics.org/guidance/cope-position/authorship-and-ai-tools\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e, the applications of AI are categorized as follows:\u003c/p\u003e \u003c/p\u003e\u003cp\u003e \u003ch2\u003eHuman Oversight Statement\u003c/h2\u003e \u003cp\u003eThe authors maintained continuous, active oversight over all AI-assisted processes. 
All generated R scripts and the Shiny application were locally executed, independently verified, and validated against the raw empirical data to ensure mathematical correctness and functional integrity. All AI-generated text was critically reviewed, extensively edited, and evaluated for conceptual accuracy. The authors assume full responsibility for the analyses, interpretations, and final content of this publication.\u003c/p\u003e \u003c/p\u003e\u003ch2\u003eFUNDING STATEMENT\u003c/h2\u003e \u003cp\u003eDBP receives a research fellowship (grant #83//027.032/2024) and LFB received a grant (#174/2023) from the Foundation to Support the Development of Education, Science, and Technology of the State of Mato Grosso do Sul (FUNDECT). CEF receives a research fellowship (#309358/2023-0) from the Brazilian Conselho Nacional de Desenvolvimento Cient\u0026iacute;fico e Tecnol\u0026oacute;gico (CNPq).\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eDBP: Conceptualization (lead); Data curation (lead); Formal analysis (lead); Funding Acquisition (contributing); Investigation (contributing); Methodology (contributing); Project Administration (lead); Resources (contributing); Software (lead); Supervision (contributing); Validation (equal); Visualization (lead); Writing \u0026ndash; original draft (lead); Writing \u0026ndash; review and editing (equal). MRS: Investigation (equal); Methodology (contributing); Writing \u0026ndash; review and editing (equal). LFB: Investigation (equal); Methodology (equal); Supervision (contributing); Validation (equal); Writing \u0026ndash; review and editing (equal); Funding Acquisition (contributing). CEF: Funding Acquisition (contributing); Investigation (equal); Methodology (equal); Resources (lead); Supervision (contributing); Validation (equal); Writing \u0026ndash; review and editing (equal).\u003c/p\u003e\u003ch2\u003eAcknowledgement\u003c/h2\u003e\u003cp\u003eClara F. B. 
Alves helped with data collection. ICMBio provided collecting permits (#63297-1 to LFB and #80075-1 to MRS). Animal handling and euthanasia followed the National Institutes of Health Guide for the Care and Use of Laboratory Animals and were approved by our university\u0026rsquo;s IACUC (CEUA #997/2018; #1.203/2021).\u003c/p\u003e\u003ch2\u003eData Availability\u003c/h2\u003e\u003cp\u003eAll data and R code are available in the Zenodo repository at https://doi.org/10.5281/zenodo.19928375\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eAbbas AK, Lichtman AH, Pillai S (2021) Cellular and Molecular Immunology, 10th edn. Elsevier, Philadelphia\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAitchison J (1982) The statistical analysis of compositional data. J R Stat Soc Ser B 44:139\u0026ndash;177\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAitchison J (1986) The statistical analysis of compositional data. Chapman and Hall, London\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAitchison J, Greenacre M (2002) Biplots of compositional data. Appl Stat 51:375\u0026ndash;392\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAltunkaynak BZ, \u0026Ouml;nger ME, Altunkaynak ME, Ayranci E, Canan S (2012) A brief introduction to stereology and sampling strategies: basic concepts of stereology. NeuroQuantology 10:31\u0026ndash;43\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlves CFB (2026) Efeitos da urbaniza\u0026ccedil;\u0026atilde;o sobre o sistema hematopoi\u0026eacute;tico de \u003cem\u003eLeptodactylus podicipinus\u003c/em\u003e (Anura, Leptodactylidae). Master\u0026rsquo;s Thesis, Universidade Federal de Mato Grosso do Sul, Campo Grande, MS, Brazil\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBrown DL (2017) Practical stereology: applications for the pathologist. 


License: CC-BY-4.0