A long-term refined genomic analysis of tuberculosis clusters to discriminate between ongoing transmission, reactivations or diagnostic delays

Research Square · Research Article

Cristina Rodríguez-Grande, Silvia Vallejo-Godoy, Miguel Martínez-Lirola, and 11 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6057121/v1
This work is licensed under a CC BY 4.0 License
Status: Posted (Version 1)

Abstract

Introduction
Tuberculosis (TB) clusters are interpreted as ongoing transmission events, which demand control interventions. Our aim is to perform a refined genomic analysis in Almería, Spain, to evaluate whether reasons other than ongoing transmission could be behind the incorporation of new cases into pre-existing or new clusters, so that each new clustered case can be managed more appropriately and control resources can be optimized.

Methods
Illumina WGS was performed following standard procedures. First, genomic data were analyzed quantitatively to identify clustered cases (< 12 SNPs). Then, a refined evolutionary analysis was performed, positioning the clustered cases in genomic networks based on the distribution of SNPs. The location of each new clustered case in relation to the cases preceding it in the cluster was considered to interpret the most likely reasons behind the growth of each cluster, and these interpretations were supported by epidemiological and clinical data.

Results
We identified 106 genomic clusters during the years 2003–2024, including a total of 537 cases (2–25 cases/cluster). 106 (34.6%) of the cases diagnosed in the last four years (2021–2024) were included in 53 clusters; 22 were new clusters, while the remaining were growing clusters already identified before 2021. New entrances in clusters were due to ongoing transmission (new cases connected in the genomic network with a recently diagnosed case at 0–2 SNPs) in only 29% of the growing clusters (1–11 cases entering pre-existing clusters) and in 63.6% of the new clusters (2–6 cases/cluster). For new clustered cases that were not the result of ongoing transmission, the analysis of the genomic networks allowed us to identify clusters involving i) reactivations of past exposures (new case close to another case diagnosed > 4 years before), ii) prolonged diagnostic delays or subclinical periods (new case positioned in branches with a high number of SNPs preceding it, suggesting persistent bacterial viability), or iii) multifactorial growth, by reactivations, diagnostic delays and/or ongoing transmission.

Conclusion
A genomic evolutionary analysis is required for a precise interpretation of growing clusters. Only one-third of the growing clusters in Almería correspond to ongoing transmission. Reactivations of past exposures, prolonged diagnostic delays or subclinical TB also had a role in growing clusters.
The precise identification of the reasons behind growing clusters allows the specific management of each new clustered case.

Keywords: tuberculosis, genomics, transmission, epidemiology

Introduction
Whole genome sequencing (WGS) has revolutionized how we can study the dynamics of tuberculosis (TB) transmission. Since the definition of the genomic thresholds used to define clusters (pair-wise distance between two strains < 12 SNPs) (1) and to identify among them those more likely related to recent transmission (< 5 SNPs) (2), these criteria have been applied in many settings worldwide (3–5). The general purpose behind the genomic identification of clusters is to find the epidemiological contexts where ongoing transmission is occurring and to target them with control interventions, in order to stop transmission and the subsequent emergence of new secondary cases. However, SNP data can also be exploited to extract additional valuable epidemiological information. The transmission chronology, or the most likely person-to-person transmission relationships among the cases sharing a cluster, can be inferred from a more detailed analysis of how the differential SNPs are distributed between those cases (6). Going beyond the mere description of whether or not cases are part of the same cluster is especially necessary in socio-epidemiologically complex populations. In these settings, transmission dynamics are especially complex, and interventions need to be tailored to each situation to increase the success of the control measures. This is the case of Almería, a population in Southeast Spain where 67.1% of all TB cases involve migrants from 51 different nationalities (2003–2024) and where incidence rates are 4–5 fold higher than in the rest of Spain.
We have obtained the genomic data for TB clusters in Almería for the last 21 years (2003–2024). From these data, we have differentiated clusters that are currently active from those that have not caused new secondary cases in the last 3.5 years. Then, we analyzed the genomic networks of the active clusters to interpret the most likely reasons behind their growth and to propose tailored interventions according to the nature of each cluster. In addition to clusters growing due to ongoing transmission, we have identified a proportion with other reasons behind them, such as reactivations of old exposures, diagnostic delays, or subclinical disease. This allows us to intervene specifically, optimizing the limited control resources that should be devoted to controlling transmission hot-spots.

Methods

General analytical scheme
The analysis in Almería is based on a two-step molecular/genomic analysis. For every stain-positive TB case diagnosed, MIRU-VNTR analysis is performed directly on clinical samples, which allows us to rule out non-clustered (orphan) cases. Then, the isolates involved in MIRU clusters are sequenced by standard Illumina approaches. All new cases confirmed to be part of a cluster after genomic analysis are discussed weekly in online meetings involving the microbiologists responsible for the diagnostic tasks, epidemiologists, and the personnel responsible for the genomic analysis, with the support, when required, of the clinicians managing the cases.

MIRU-VNTR analysis
DNA for MIRU-VNTR was purified from the clinical sample after heat inactivation, using the GXT NA Extraction Kit (Hain Lifescience, Nehren, Germany), following the manufacturer's instructions. 24-locus MIRU-VNTR typing was done as described previously (7) from DNA purified from clinical specimens, except for the final PCR volume (20 µl for each reaction) and the number of PCR cycles (45 cycles). Briefly, PCR was performed using eight triplex PCR reactions.
The standard protocol was followed using the multiplex PCR kit (Qiagen) for mixes 1, 2, 3, 5, 6 and 8 (8). For mixes 4 and 7, PuReTaq Ready-To-Go PCR beads (GE Healthcare, Chicago, USA) were used, adding 0.5 mM of each primer and 3% dimethyl sulfoxide (DMSO). When amplification was deficient for some locus, simplex PCR was applied. PCR fragment sizing was performed using capillary electrophoresis in a 3500 Genetic Analyzer (Thermo Fisher Scientific, Waltham, USA). MIRU-VNTR alleles were assigned with GeneMapper v5 (Thermo Fisher Scientific, Waltham, USA).

Whole genome sequencing
DNA for sequencing was purified from Mycobacteria Growth Indicator Tube (MGIT) subcultures using the QIAamp DNA mini kit (Qiagen, Courtaboeuf, France) following the manufacturer's instructions, after boiling for 10 minutes and pre-incubation with proteinase K (20 mg/ml) overnight at 56°C. Libraries for confirming clusters by WGS were prepared using the Nextera XT kit (Illumina, San Diego, USA) following the manufacturer's instructions, and pooled for running on a MiSeq or NextSeq device (2 × 151 bp). Sequence analysis was done using an in-house pipeline deposited in GitHub: https://github.com/MG-IiSGM/autosnippy . This pipeline's workflow follows the same steps described previously (9), using a hypothetical Mycobacterium tuberculosis (MTB) ancestral genome (10) as reference. Finally, genomic distances between sequences were calculated using Jaccard similarity and Hamming distance metrics, generating distance matrices. Alignments and SNP variants were visualized and checked with the Integrative Genomics Viewer (IGV) program.

Analysis of clusters

Genomic analysis
First, isolates were considered clustered when < 12 SNPs were identified between them (1). Then, an evolutionary analysis was performed by obtaining genomic networks, based on the distribution of SNPs between the clustered isolates, using Network (5.0, 10.0) or PopART 1.7.
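
The quantitative step described above (pairwise Hamming distances followed by the < 12 SNP threshold) can be illustrated with a minimal Python sketch. It is not the autosnippy pipeline: the function names, the representation of each isolate as a string of alleles at shared variable positions, and the single-linkage grouping rule are all assumptions made for illustration.

```python
from itertools import combinations

SNP_THRESHOLD = 12  # isolates are clustered when they differ by < 12 SNPs (from the text)


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length allele strings differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


def cluster_isolates(profiles: dict[str, str], threshold: int = SNP_THRESHOLD) -> list[set[str]]:
    """Group isolates connected by at least one pairwise distance below threshold.

    Single-linkage grouping is an assumption of this sketch, not a rule
    stated in the text.
    """
    parent = {name: name for name in profiles}

    def find(x: str) -> str:
        # Follow parent pointers to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if hamming(profiles[a], profiles[b]) < threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups: dict[str, set[str]] = {}
    for name in profiles:
        groups.setdefault(find(name), set()).add(name)
    # Isolates with no close neighbour are orphans, not clusters.
    return [members for members in groups.values() if len(members) > 1]
```

Called on a dictionary of hypothetical isolate names mapped to allele strings, `cluster_isolates` returns only groups of two or more isolates; an isolate at exactly 12 SNPs from every other isolate is left out, because the threshold is strict.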

Clinical and epidemiological analysis
New entrances in clusters that the genomic evolutionary analysis flagged as candidates for likely reactivations or prolonged diagnostic delays/subclinical TB were reviewed against the clinical/epidemiological data of the corresponding cases.

Results

General analysis of clusters
In the study period (January 2003–June 2024), 1886 culture-positive TB cases were diagnosed in Almería, and MIRU-VNTR was performed in 90.6% of them. 984 of the cases (57.6%) were involved in 126 clusters according to MIRU-VNTR. Genomic data were available for 84.1% of these clusters (including a total of 615 cases; 537 confirmed clustered cases in 106 genomic clusters, 2–25 cases/cluster). Our first aim was to analyze the reasons behind the clustering of the cases in the most recent study period, namely 2021–June 2024. 106 of the cases diagnosed in this period (34.6% of the total cases in this period) were included in 53 clusters (< 12 SNPs) (Supplementary Figure). 46 of these cases (43.4%) constituted 22 clusters (2–6 cases/cluster) considered new clusters, identified for the first time in the last 3.5 years. The remaining 60 (56.6%) corresponded to cases entering 31 clusters that had already been identified in the 2003–2020 period; these were considered growing clusters, now incorporating 1–11 new cases. Once the recently clustered cases had been identified, we obtained the genomic network for each of the involved clusters by locating the cases in the same network according to the differential SNPs identified between their sequences. The genomic networks were analyzed to propose the most likely interpretation of the reasons behind the new entrances of cases in clusters, and therefore to tailor and recommend the most suitable interventions for each case.

New clustered cases due to ongoing transmission
Considering the rate of acquisition of SNPs in MTB (1 SNP every 2–3 years), we considered ongoing transmission to be the reason behind a new case entering a cluster when the new case was located in the genomic network close (at 0–2 SNPs) to a recently diagnosed case (0–3 years before; Fig. 1A). This criterion was fulfilled for 47.2% of the new clustered cases and 43.4% of the total clusters. Focusing on the new clusters, most of them (14; 63.6%) (Fig. 1A, Clusters 3113 and 3084) grew due to ongoing transmission. One of these clusters corresponded to an extensive transmission involving six new cases (Fig. 1A, Cluster 3113), while the remaining were more limited: one included 4 new clustered cases and the others involved 1–2 new cases (Supplementary Figure). When we focused on the clusters that had already been defined before 2021 but were still growing in the last 3.5 years, ongoing transmission was identified as the reason behind their growth in only 9 (29%) of them (Fig. 1A, Clusters 789 and 786). Only one of these clusters involved a high number (eight) of new cases, one included 3 new cases, and the remaining clusters included only one new clustered case. For the cases in which the interpretation of the genomic network indicated cluster growth due to ongoing transmission, the recommended epidemiological intervention was to study the context of the patient, looking for other potential active, non-diagnosed TB cases. In the 2021–June 2024 period, we identified 18 secondary cases from 8 different clusters that had not yet been diagnosed before this intervention. In another 11 new clusters, genomic data led to the re-orientation and intensification of the surveillance for new cases and of the epidemiological research.
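
The ongoing-transmission criterion above (a new case at 0–2 SNPs from a case diagnosed 0–3 years earlier) can be written as a simple rule. The sketch below is only an illustration of that rule: the `Link` record and the function name are hypothetical, and in the study the network positions were interpreted by the team rather than by an automated filter.

```python
from dataclasses import dataclass

MAX_SNPS = 2   # new case located at 0-2 SNPs from a preceding case (from the text)
MAX_YEARS = 3  # preceding case diagnosed 0-3 years before the new case (from the text)


@dataclass
class Link:
    """Hypothetical record of one network connection of a new clustered case."""
    snps: int            # SNPs separating the new case from the preceding case
    years_before: float  # years between the two diagnoses


def is_ongoing_transmission(links: list[Link]) -> bool:
    """Return True if any connection meets both the SNP and the time criterion."""
    return any(
        0 <= link.snps <= MAX_SNPS and 0 <= link.years_before <= MAX_YEARS
        for link in links
    )
```

A new case linked at 1 SNP to a case diagnosed two years earlier meets the criterion, whereas a case at 0 SNPs from one diagnosed five years earlier does not and would be examined for other explanations.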

Impact of COVID-19 pandemic on TB ongoing transmission
Among the clusters for which the genomic network analysis indicated ongoing transmission, we must highlight one example whose peculiarities helped us to evaluate the impact of the COVID-19 pandemic on TB transmission. The cluster (Fig. 1A, Cluster 789) involved a case (Case 2534) with two TB episodes (one in 2019, before the pandemic, and the other in 2022) due to a lack of adherence to treatment. The strains isolated from the first and second episodes differed by 1 SNP. This SNP was used as a marker to determine the number of new secondary cases due to exposures to Case 2534 either before the pandemic (lacking the differential SNP) or after it (harbouring the SNP acquired in the second episode of Case 2534). Only one new case (Case 2816), diagnosed in 2021, was found to share the sequence of the first episode of Case 2534 (missing the SNP), while six cases diagnosed in 2022–2023 were identical to the second episode of Case 2534 (harbouring the SNP). Most of the post-pandemic secondary cases had at some point shared the same household with Case 2534.

Reasons other than ongoing transmission behind new clustered cases
For the new clustered cases entering 36.4% of the new clusters and 71% of the growing clusters, their location in the genomic networks was not consistent with ongoing transmission. Moreover, their positions in the network, together with the general topology of the networks, allowed us to propose alternative explanations for the entrance of a proportion of them in the corresponding clusters. Below we present a detailed analysis of representatives of these alternative explanations, without differentiating between cases entering new or growing clusters.

Reactivations of past exposures
Ten of the cases, entering 10 clusters, were located in their networks close (0–1 SNP) to other cases diagnosed ≥ 4 years before.
These findings could indicate the involvement in the cluster of cases with TB due to reactivation of a past exposure (Fig. 2A, Clusters 1482 and 15). Additionally, we identified another 6 clusters with new cases also entering the network after a case diagnosed ≥ 4 years before, which again indicates the involvement of reactivations in these clusters. However, unlike the previous examples, here we detected some more SNPs (2–3 SNPs) with respect to the case preceding them in the network (Fig. 2B, Clusters 1330 and 1348). For these cases, one possible explanation for the intermediate SNPs is a certain diagnostic delay. This delay would have allowed a period of bacterial viability before the diagnosis of the clustered case, during which within-host evolution would have led to the acquisition of SNP diversity. However, although the location of the new cases entering these networks suggests the involvement of reactivation of past exposures in the cluster, we cannot fully ensure that the very cases identified as entering these clusters are those due to reactivation +/- a certain diagnostic delay. Alternatively, reactivation +/- delay could have occurred in other cases involved in the transmission event, who might have preceded our cases but are not drawn in the network because they correspond to non-diagnosed (missed) cases and therefore cannot be represented. In this alternative scenario, a missing case would be the one with reactivation +/- delay, who subsequently transmitted the infection to the new cases identified in our study as new entrances in the cluster (Fig. 2B). Although the new entrances identified in these clusters are merely candidates for reactivation, we looked for data supporting this hypothesis in their epidemiological and clinical charts and discussed the findings within the interdisciplinary team. We found reasons to justify reactivation in 12 of them (80%).
Regarding the cases with a certain diagnostic delay, findings consistent with it were also found in five (83.3%).

Prolonged diagnostic delay/Subclinical tuberculosis
In eight additional cases, the number of intermediate SNPs between them and their preceding cases in the network was higher (4–8 SNPs). In addition, the number of intermediate SNPs was close to that expected from the rate of acquisition of SNPs in MTB. This may suggest that MTB acquired SNP diversity during the whole period between the newly diagnosed case and the case preceding it. We considered these cases candidates for a likely prolonged diagnostic delay or subclinical TB (Fig. 3). As in the previous category, prolonged diagnostic delay could have occurred in the new cases identified in our study as entering the network or, alternatively, in missed cases preceding them. Therefore, our candidates were subjected to a clinical/epidemiological review within our interdisciplinary team, to evaluate whether we could confirm our hypothesis in some of them. In 7 of the cases (87.5%), evidence was found supporting prolonged diagnostic delay/subclinical TB.

Monoresistance as a proxy to suspect diagnostic delay
Among the candidates for prolonged diagnostic delay, we identified 2 cases with mono-resistance to fluoroquinolones (mono-FQ-R) (one of them is shown in Fig. 3B, Cluster 15). This new finding reinforced our interpretation of diagnostic delay for the involved cases, as both had received this drug to treat infections other than TB before the diagnosis of TB. Receiving this drug when TB had not yet been diagnosed could have amounted to monotherapy, leading to the acquisition of resistance. Therefore, mono-resistance could be used as a proxy to suspect diagnostic delay.

Impact of diagnostic delay on secondary cases
Prolonged diagnostic delay could be responsible for unknown exposures of other cases.
Among our selection of candidate cases for prolonged diagnostic delay, we identified three representatives that allowed us to assess a progressive impact of the delays on the number of secondary cases, leading to i) no secondary cases, ii) limited within-household exposures and iii) a higher impact on secondary cases in the community setting. In the first example (Fig. 3, Cluster 30), the diagnosis of Case 3219 in 2023 was fortuitous, as the radiological study showing findings compatible with TB was taken because of a car accident, not because of respiratory symptoms. In addition, once the signatures suggesting prolonged diagnostic delay had been identified in the network, a review of his clinical chart also revealed findings compatible with TB in a previous radiological study, taken five years before his diagnosis. This study was requested following the standard protocol before a surgical procedure; however, it was not checked by anyone, as the intervention was minor and the study had therefore been requested unnecessarily. In this case, despite the prolonged diagnostic delay, no secondary cases were caused, as deduced from the absence of other cases in the network along his branch, that is, along the period of within-host evolution of the strain before diagnosis. A review of the case within the interdisciplinary team indicated that the individual had an extremely low degree of social interaction, which may explain the lack of exposure of other cases. The second example (Fig. 4, Cluster 15) corresponded to a prolonged diagnostic delay causing two sequential exposures of the same family member, who shared a household. Case 3034 accumulated two features supporting prolonged diagnostic delay: i) several SNPs before his diagnosis in the genomic network branch in which he was located, and ii) FQ-mono-R at his diagnosis. He had previously received this drug several times to treat a long-term renal/urinary disease.
From the analysis of the network, the most likely explanation is that Case 2427 was exposed twice to Case 3034: once recently, explaining the FQ-mono-R in the 2023 isolate, and once at an earlier stage of his prolonged diagnostic delay, causing the infection of Case 2427 in 2018, before the strain had acquired FQ resistance. The interpretation of the impact of the last example of prolonged diagnostic delay on secondary cases was much more challenging, as it did not arise from the observation of a case with several SNPs preceding his diagnosis but from his clinical findings and the epidemiological analysis of the history of the case (Fig. 4, Cluster 789). Case 3292 presented at diagnosis with a clinical status suggestive of long-evolved TB, namely severe weight loss over the last years and radiology with extensive damage. He had been the roommate of Case 2418 when the latter was diagnosed with TB six years ago (2018), but he refused to be studied. Since then, Case 3292 had not attended any health care centre (he is a social outsider) until his recent diagnosis, in 2024. On the basis of all these clinical and epidemiological data, and despite Case 3292 being located at the end of a branch, the most likely interpretation is that he is not the result of exposures to the cases preceding him in the branch but that he acquired TB from Case 2418. Since then, the strain in Case 3292 evolved throughout his prolonged diagnostic delay. This means that cases 2536, 2570, 2678, and 3204 could be considered secondary cases of direct/indirect exposures to Case 3292 while the strain was progressively acquiring the 3 SNPs identified along the branch. An epidemiological investigation based on this hypothesis demonstrated that all these cases lived in the same neighbourhood and were part of the same drug-dealing chain.

Clusters growing by the coincidence of several factors
In 6 clusters, several new clustered cases entered during the period 2021–2024, each a candidate for a different explanation among those presented above as reasons for cluster growth, i.e. ongoing transmission, reactivations, or diagnostic delays (Fig. 4).

Discussion
WGS provides a better delineation of transmission clusters in MTB than traditional molecular methods. Previous studies noted the higher accuracy of this tool to infer transmission, meaning that WGS is currently a key tool for epidemiological research in TB (13). The first deliverable in any genomic epidemiology study in TB involving standard populations is to assess the proportion of clustered cases, based on the SNP thresholds already validated to consider transmission, or even very recent transmission (1). This allows us to offer a snapshot of the TB transmission patterns and dynamics in a population and to evaluate the efficiency of control programs. Among many others, previous studies in Brazil identified rates of very recent transmission of 43.7% (3, 14). However, in socio-epidemiologically complex populations, such as those with a high proportion of migrants, we must go further, as the reasons involved in TB prevalence are multifactorial. This is why we need more in-depth knowledge of the reasons behind each new TB diagnosis. One example of a complex population is Almería, a province in south-east Spain. The high rate of TB in migrant cases (80% in 2023) leads to a concentration of TB cases in these vulnerable groups, as described in many other similar settings (15). Previous findings of our team (4) demonstrate the overlapping of imported TB cases with others acquiring TB after arrival, the existence of transmission between autochthonous and migrant cases, and, finally, reactivations triggered by the substandard living conditions of the migrant population.
All these factors explain why we need a more exhaustive analysis, beyond the assessment of the proportion of cases that are clustered. In this study, we decided to exploit the genomic data beyond the quantitative determination of the number of SNPs between the cases, which is applied to rule clustering in or out, by also paying attention to the evolutionary findings in each cluster. This means analyzing the distribution of the differential SNPs identified between the cases in a cluster, after positioning the cases in a genomic network. This more refined analysis does not exclude that our first aim was to determine which cases in Almería were clustered; nevertheless, as the philosophy of our program is to intervene to control transmission, we were especially interested in analyzing the current situation in detail, which is why we focused on the last 3.5 years. We found that 106 of the cases diagnosed in the last 3.5 years were involved in clusters. As we have genomic data from the clustered cases for the last 21 years, we could determine that 56.6% of these cases fed clusters that had already been identified before this period, while the remaining 43.4% were involved in new clusters, not identified previously. The first interpretation was that we were facing a deficiency in the control of historical clusters, which were still growing due to ongoing transmission. This was the first question to be answered by our evolutionary analysis of genomic networks, by evaluating whether the cases incorporated into pre-existing clusters were consistent with the network topology expected for ongoing transmission. Compared to the findings in new clusters, where most of the new entrances are due to ongoing transmission, only one-third of the old growing clusters grew for this reason. This suggests that control efforts for previously identified clusters are appropriate.
Moreover, in a large proportion of the old clusters including new cases we detected a potential involvement of reactivations of past exposures. Other studies have also considered as potential reactivations those cases linked to an index case in the remote past with few SNP differences between them (16). This means that other factors beyond the epidemiological control of recent transmission, among them social factors, are responsible for the growth of certain clusters. This highlights the current need to increase social intervention, and not to focus efforts exclusively on minimizing ongoing transmission. Several socio-clinical factors were finally found in most of these candidate cases, reinforcing the appropriateness of having classified them as reactivations from the analysis of their genomic networks. If social factors emerge as relevant additional triggers for TB, we cannot avoid considering the impact that the COVID-19 pandemic could have had on TB dynamics. It might also influence the proportion of clustered cases in our study that are considered candidates for reactivation. Some studies reported an association between COVID-19-related immune suppression and the reactivation of TB, suggesting that COVID-19 might accelerate the progression from latent to active TB (17). In addition, we also expect a more obvious and direct effect of the pandemic on TB transmission. Cluster 789 (Fig. 1) constitutes an excellent example of a markedly asymmetric number of secondary cases caused by the same index case and the same strain before and after the pandemic. Another relevant interpretation extracted from our analysis of the genomic networks is the identification of clusters with one branch connecting two cases diagnosed several years apart, with several SNPs in between, indicative of the involvement of cases with bacterial viability/evolution, and therefore acquisition of diversity, before diagnosis.
As an alternative to our interpretation that the intermediate SNPs could indicate diagnostic delay, it has been proposed that diversity could arise during latency with a mutation rate similar to that in active TB (18, 19), although these findings are controversial, and other studies estimate that the replication rate is lower during latent disease (20). Recent studies still highlight the discrepancies in determining the mutation rate under latency (16). In our study, we have considered as the clearest candidates for diagnostic delay those clusters in which the number of SNPs in the branch connecting a new case with the preceding case in the network corresponded to that expected for a mutation rate of 1 SNP/2–3 years (21). This implies a prolonged diagnostic delay, covering the whole period between the diagnoses of the two cases connected by that branch. However, this might underestimate the true proportion of diagnostic delays, because some of them might be less prolonged and therefore lead to a lower number of intermediate SNPs in the involved branch. In this sense, we observed examples of these possible shorter diagnostic delays in some of the clustered cases interpreted as candidates for reactivation, in which we detected some SNPs (2–3 SNPs) in the branches connecting them with their preceding cases. In fact, further investigation revealed findings consistent with diagnostic delay not only for the more prolonged examples but also for most of the shorter diagnostic delays. It is also possible that the bacterial evolution leading to the higher than expected diversity identified in some of these clusters occurred in intermediate cases that participated in the cluster but remain undiagnosed.
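
The comparison between observed branch SNPs and those expected for a rate of 1 SNP every 2–3 years can be made explicit with a small calculation. The Python sketch below is only an illustration: the function names are hypothetical, and rounding the expected range outwards to whole SNPs is an assumption standing in for the approximate agreement described in the text.

```python
import math


def expected_snp_range(years: float, years_per_snp: tuple[float, float] = (2.0, 3.0)) -> tuple[float, float]:
    """Expected number of SNPs over an interval, at 1 SNP every 2-3 years."""
    fastest, slowest = years_per_snp
    return years / slowest, years / fastest


def consistent_with_prolonged_delay(branch_snps: int, years_between_diagnoses: float) -> bool:
    """Check whether the branch SNPs match those expected for the whole interval.

    Rounding the bounds outwards to whole SNPs is an assumption of this sketch.
    """
    low, high = expected_snp_range(years_between_diagnoses)
    return math.floor(low) <= branch_snps <= math.ceil(high)
```

For two cases diagnosed 12 years apart, the expected range is 4 to 6 SNPs, so a branch with 5 SNPs would be consistent with evolution over the whole interval, while a branch with 1 SNP would not.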
On the other hand, these likely interpretations, and the fact that different possibilities could be involved in the same case or cluster, highlight the challenge of classifying these events. When reviewing clinical charts for clues of non-diagnosed TB in the cases proposed as candidates for diagnostic delay/subclinical TB, we found that mono-FQ-R may also be used as a proxy to consider them, especially when it is identified in only one member of a cluster. TB patients with an FQ prescription prior to TB diagnosis, usually to treat community-acquired pneumonia, have a three-fold higher risk of having FQ-resistant TB (22). Multiple FQ prescriptions, FQ prescription more than 60 days prior to TB diagnosis, and FQ prescription for more than 10 days are associated with FQ-resistant TB (23, 24). Several studies have highlighted the role of subclinical TB in transmission to secondary cases (13, 25). We also evaluated it and found that the epidemiological circumstances of each case with prolonged diagnostic delay/subclinical TB are determinant, and may lead to a wide range of impacts on the emergence of secondary cases. We must acknowledge that the interpretations inferred from the study of the cluster genomic networks led us only to propose the involvement of reactivations or diagnostic delays/subclinical TB in those clusters. This does not mean that these features are associated with the new cases entering the cluster; they could also correspond to cases preceding them that have recently transmitted the infection to the new cases in our study. It is possible that these potential preceding cases were not included in the network because they are missed cases in our study, having not been sequenced or even diagnosed. It is true that ours is a program running since 2003 which covers the whole population in Almería, and the percentages of diagnosed TB cases with culture available and sequence available are high (XXXXX).
However, in a population enriched in migrants, such as ours, the interterritorial mobility of these individuals obliges us to acknowledge the possibility of missed cases. For this reason, the cases in our study are considered only as candidates for reactivations/delays, and we always need additional clinical/epidemiological data before validating them. For this, it is essential to couple our refined genomic analysis with an equally refined analysis of the clinical, social, and epidemiological data from each new clustered case. This would not be possible without our multidisciplinary team discussing every new case together and re-interviewing the cases, guided by the genomic data, in search of additional data to support our hypotheses. Identifying the involvement of reactivation or diagnostic delay in our clusters, especially when candidates are finally validated, allows us to design and propose specific intervention strategies according to the true nature of each clustered case, in addition to the standard control measures systematically applied to every cluster. For example, when reactivations were validated, the additional epidemiological intervention recommended was aimed at minimizing the future emergence of secondary cases by optimizing preventive treatment of latent tuberculosis infection in the cases' contexts. In the case of diagnostic delays, the epidemiological intervention should also include a more active search for potential secondary cases that may have arisen while the patient was undiagnosed. In some clusters we identified independent entrances in different branches/locations of the network, each with a different possible explanation behind it; this led us to recommend interventions that are not systematic for a cluster but case-specific.
Now that the usefulness of this interpretative approach to exploiting genome data is understood, the next steps involve accelerating the availability of this analysis so that it can be integrated with the epidemiological investigation in real time. The general approach in TB genomic analysis is to follow a high-throughput scheme, accumulating a high number of isolates in the same run to lower costs. We are applying a faster approach based on single-sample nanopore sequencing from the primary culture of each new incident case (26). One-to-one analysis offers an alternative scheme adapted to the early coupling of genomic, epidemiological, and clinical data until the fastest approach, sequencing directly from respiratory specimens, which remains challenging (27, 28), becomes feasible. Understanding the singularities behind each new clustered case by means of an evolutionary analysis of the differential SNPs observed in each cluster will help us to grasp the true complexity behind them. This will allow us to better understand the reasons behind TB persistence in our populations and to correctly differentiate new cases due to ongoing transmission from those resulting from reactivation of past exposures. The same analysis will shed light on the hidden burden of TB due to prolonged diagnostic delays or subclinical TB. All this information may be of paramount importance in tailoring specific interventions to maximize success in controlling TB.

Declarations

Ethical Statement

This study was approved by the Junta de Andalucía Ethical Committee (References 60/2017 and 98/2023). All sequences were encrypted to anonymize any associated personal information. Written informed consent was obtained from the participants.
Acknowledgements

This study was funded by ISCIII [PI21/01823, PI19/00331, Miguel Servet Contract (CPII20/00001) to LPL; PFIS contracts to CRG (FI20/00129) and SBS (FI21/00145)], IiSGM (2021-II-PI-01 to DGV), SEPAR 2023 (1401/2023), Junta de Andalucía (AP-0062-2021-C2-F2), and co-financed by ERDF: “A way of making Europe”. This study is based upon work from COST Action Advance TB (CA21164), supported by COST (European Cooperation in Science and Technology).

References

1. Walker TM, Ip CLC, Harrell RH, Evans JT, Kapatai G, Dedicoat MJ, et al. Whole-genome sequencing to delineate Mycobacterium tuberculosis outbreaks: A retrospective observational study. Lancet Infect Dis. 2013 Feb;13(2):137–46.
2. Meehan CJ, Goig GA, Kohl TA, Verboven L, Dippenaar A, Ezewudo M, et al. Whole genome sequencing of Mycobacterium tuberculosis: current standards and open issues. Vol. 17, Nature Reviews Microbiology. Nature Publishing Group; 2019. p. 533–45.
3. Salvato RS, Reis AJ, Schiefelbein SH, Gómez MAA, Salvato SS, da Silva LV, et al. Genomic-based surveillance reveals high ongoing transmission of multi-drug-resistant Mycobacterium tuberculosis in Southern Brazil. Int J Antimicrob Agents. 2021 Oct 1;58(4).
4. Abascal E, Pérez-Lago L, Martínez-Lirola M, Chiner-Oms Á, Herranz M, Chaoui I, et al. Whole genome sequencing-based analysis of tuberculosis (TB) in migrants: Rapid tools for crossborder surveillance and to distinguish between recent transmission in the host country and new importations. Eurosurveillance. 2019 Jan 24;24(4).
5. Guthrie JL, Strudwick L, Roberts B, Allen M, McFadzen J, Roth D, et al. Comparison of routine field epidemiology and whole genome sequencing to identify tuberculosis transmission in a remote setting. Epidemiol Infect. 2020.
6. Walker TM, Monk P, Smith EG, Peto TEA. Contact investigations for outbreaks of Mycobacterium tuberculosis: Advances through whole genome sequencing. Vol. 19, Clinical Microbiology and Infection. Blackwell Publishing Ltd; 2013. p. 796–802.
7. Alonso M, Herranz M, Lirola MM, González-Rivera M, Bouza E, De Viedma DG. Real-time molecular epidemiology of tuberculosis by direct genotyping of smear-positive clinical specimens. J Clin Microbiol. 2012 May;50(5):1755–7.
8. Supply P, Allix C, Lesjean S, Cardoso-Oelemann M, Rüsch-Gerdes S, Willery E, et al. Proposal for standardization of optimized mycobacterial interspersed repetitive unit-variable-number tandem repeat typing of Mycobacterium tuberculosis. J Clin Microbiol. 2006 Dec;44(12):4498–510.
9. Martínez-Lirola M, Herranz M, Serrano SB, Rodríguez-Grande C, Inarra ED, Garrido-Cárdenas JA, et al. A One Health approach revealed the long-term role of Mycobacterium caprae as the hidden cause of human tuberculosis in a region of Spain, 2003 to 2022. Eurosurveillance. 2023;28(12):1–11. doi: 10.2807/1560-7917.ES.2023.28.12.2200852.
10. Comas I, Chakravartti J, Small PM, Galagan J, Niemann S, Kremer K, et al. Human T cell epitopes of Mycobacterium tuberculosis are evolutionarily hyperconserved. Nat Genet. 2010;42(6):498–503.
11. Ai JW, Ruan QL, Liu QH, Zhang WH. Updates on the risk factors for latent tuberculosis reactivation and their managements. Emerg Microbes Infect. 2016;5(November 2015):e10.
12. Grupo de trabajo Plan Prevención y Control de la Tuberculosis. Plan para la prevencion y control de la Tuberculosis en España. Comisión de Salud Pública el Consejo Interterritorial del Sistema Nacional de Salud. Ministerio de Sanidad, Consumo y Bienestar Social [Internet]. 2019. Available from: https://www.mscbs.gob.es/profesionales/saludPublica/prevPromocion/PlanTuberculosis/docs/PlanTB2019.
13. Xu Y, Cancino-Munoz I, Torres-Puente M, Villamayor LM, Borrás R, Borrás-Máñez M, et al. High-resolution mapping of tuberculosis transmission: Whole genome sequencing and phylogenetic modelling of a cohort from Valencia Region, Spain. PLoS Med. 2019;16(10).
14. Verza M, Scheffer MC, Salvato RS, Schorner MA, Barazzetti FH, Machado H de M, et al. Genomic epidemiology of Mycobacterium tuberculosis in Santa Catarina, Southern Brazil. Sci Rep. 2020 Dec 1;10(1).
15. Litvinjenko S, Magwood O, Wu S, Wei X. Burden of tuberculosis among vulnerable populations worldwide: an overview of systematic reviews. Lancet Infect Dis. 2023 Dec 1;23(12):1395–407.
16. Nelson KN, Talarico S, Poonja S, McDaniel CJ, Cilnis M, Chang AH, et al. Mutation of Mycobacterium tuberculosis and Implications for Using Whole-Genome Sequencing for Investigating Recent Tuberculosis Transmission. Front Public Heal. 2022 Jan 13;9.
17. Almatrafi MA, Awad K, Alsahaf N, Tayeb S, Alharthi A, Rabie N, et al. Disseminated Tuberculosis Post COVID-19 Infection: A Case Report. Cureus. 2022 Nov 14.
18. Ford CB, Lin PL, Chase MR, Shah RR, Iartchouk O, Galagan J, et al. Use of whole genome sequencing to estimate the mutation rate of Mycobacterium tuberculosis during latent infection. Nat Genet. 2011 May;43(5):482–8.
19. Lillebaek T, Norman A, Rasmussen EM, Marvig RL, Folkvardsen DB, Andersen ÅB, et al. Substantial molecular evolution and mutation rates in prolonged latent Mycobacterium tuberculosis infection in humans. Int J Med Microbiol. 2016 Nov 1;306(7):580–5.
20. Colangeli R, Arcus VL, Cursons RT, Ruthe A, Karalus N, Coley K, et al. Whole genome sequencing of Mycobacterium tuberculosis reveals slow growth and low mutation rates during latent infections in humans. PLoS One. 2014 Mar 11;9(3).
21. Bryant JM, Schürch AC, van Deutekom H, Harris SR, de Beer JL, de Jager V, et al. Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data. BMC Infect Dis. 2013 Feb 27;13(1).
22. Migliori GB, Langendam MW, D’Ambrosio L, Centis R, Blasi F, Huitric E, et al. Protecting the tuberculosis drug pipeline: Stating the case for the rational use of fluoroquinolones. Vol. 40, European Respiratory Journal. 2012. p. 814–22.
23. Devasia RA, Blackman A, Gebretsadik T, Griffin M, Shintani A, May C, et al. Fluoroquinolone resistance in Mycobacterium tuberculosis: The effect of duration and timing of fluoroquinolone exposure. Am J Respir Crit Care Med. 2009 Aug 15;180(4):365–70.
24. Long R, Chong H, Hoeppner V, Shanmuganathan H, Kowalewska-Grochowska K, Shandro C, et al. Empirical treatment of community-acquired pneumonia and the development of fluoroquinolone-resistant tuberculosis. Clin Infect Dis. 2009 May 15;48(10):1354–60.
25. Yin J, Yan G, Qin L, Fan J, Zhu C, Li Y, et al. Genomic investigation of bone tuberculosis highlighted the role of subclinical pulmonary tuberculosis in transmission. Tuberculosis. 2023, Sep 148. doi: 10.1016/j.tube.2024.102534.
26. Sanz-Pérez A, Rodríguez-Grande C, Buenestado-Serrano S, Martínez-Lirola M, Herranz-Martín M, Peñas-Utrilla D, et al. Reducing delays in the genomic epidemiology of tuberculosis: a flexible and decentralized analysis of each incident case. Unpublished. 2024.
27. Goig GA, Cancino-Muñoz I, Torres-Puente M, Villamayor LM, Navarro D, Borrás R, et al. Whole-genome sequencing of Mycobacterium tuberculosis directly from clinical samples for high-resolution genomic epidemiology and drug resistance surveillance: an observational study. The Lancet Microbe. 2020 Aug 1;1(4):e175–83.
28. Nilgiriwala K, Rabodoarivelo MS, Hall MB, Patel G, Mandal A, Mishra S, et al. Genomic Sequencing from Sputum for Tuberculosis Disease Diagnosis, Lineage Determination, and Drug Susceptibility Prediction. J Clin Microbiol. 2023 Mar 23;61(3):e0157822.

Additional Declarations

The authors declare no competing interests.

Supplementary Files

SupplNetworksALMFINAL.pdf (Supplementary figure)
Figure 1. Examples of clusters representative of ongoing transmissions (those diagnosed in 2021–24 are highlighted as shaded). The first number corresponds to the case and the second one to the diagnosis year. In the genomic networks, black dots represent SNPs, and cases within the same box correspond to identical strains (0 SNPs between them). As indicated by the dashed lines, clusters 789 and 786 represent only a section of a larger pre-existing cluster.

Figure 2. A) Examples of clusters representative of involvement of reactivations (clusters 1482 and 15) and reactivations with a certain period of diagnostic delay (clusters 1330 and 1348) (those diagnosed in 2021–24 are highlighted as shaded). The first number corresponds to the case and the second one to the diagnosis year. In the genomic networks, black dots represent SNPs, and cases within the same box correspond to identical strains (0 SNPs between them). As indicated by the dashed lines, clusters 1482, 15 and 1330 represent only a section of a larger pre-existing cluster. B) Hypothetical example of a missing case responsible for the reactivation +/- diagnostic delay features identified in the clusters in A), who preceded and therefore infected the newly diagnosed case entering the cluster.

Figure 3. Examples of clusters including cases candidates for prolonged diagnostic delay or subclinical TB (highlighted as shaded). As indicated by the dashed lines, cluster 30 represents only a section of a larger pre-existing cluster.

Figure 4. Examples of clusters including cases candidates for prolonged diagnostic delay or subclinical TB (highlighted as shaded) causing different impacts on resulting secondary cases: A) limited sequential household exposures; B) higher number of community secondary cases. The first number corresponds to the case and the second one to the year of diagnosis. In the genomic networks, black dots represent SNPs, and cases within the same box correspond to identical strains (0 SNPs between them). As indicated by the dashed lines, clusters 15 and 789 represent only a section of a larger pre-existing cluster.

Figure 5. Examples of multifactorial clusters, including new clustered cases (highlighted as shaded) representing ongoing transmission, reactivation, or diagnostic delay in the same cluster. The first number corresponds to the case and the second one to the year of diagnosis. In the genomic networks, black dots represent SNPs, and cases within the same box correspond to identical strains (0 SNPs between them).
Since the definition of the genomic thresholds to be applied to define clusters (pair-wise distance between two strains\u0026thinsp;\u0026lt;\u0026thinsp;12 SNPs) (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e) and identify among them those more likely related to recent transmission (\u0026lt;\u0026thinsp;5 SNPs) (\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e). These criteria have been applied in many settings worldwide (\u003cspan additionalcitationids=\"CR4\" citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThe general purpose behind the identification of clusters supported in genomics is to identify the epidemiological contexts where ongoing transmission is occurring, to target them by control interventions to finally control transmission and the subsequent emergence of new secondary cases. However, the SNPs data can also be exploited to extract additional epidemiological valuable information. The transmission chronology or the most likely person-to-person transmission relationships, among the cases sharing a cluster, could be inferred from a more detailed analysis of how the differential SNPs are distributed between the cases sharing the same cluster (\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eTo obtain additional information to the mere description of whether cases are part of the same cluster or not is especially required in socio epidemiologically complex populations. In these settings, transmission dynamics are especially complex, and tailored interventions need to be adapted to each situation, to increase the success of the control measured. 
This is the case of Almeria, a population in Southeast Spain with 67.1% of the total cases of TB involving migrants from 51 different nationalities (2003\u0026ndash;2024), where the incident rates are 4\u0026ndash;5 fold higher than in the rest of Spain.\u003c/p\u003e \u003cp\u003eWe have obtained the genomic data for TB clusters in Almeria for the last 21 years (2003-24). From these data, we have differentiated between clusters that are currently active, from those that have not caused new secondary cases in the last 3.5 years. Then, we analyzed the genomic networks of the active clusters to interpret the most likely reasons behind their growth and propose tailored interventions according to the nature of each cluster. In addition to clusters growing due to ongoing transmission, we have identified a proportion with other reasons behind them, such as reactivations of old exposures, diagnostic delays, or subclinical disease, which allows us to intervene specifically, optimizing the limited control resources that should be devoted to control transmission hot-spots.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eGeneral analytical scheme\u003c/h2\u003e \u003cp\u003eThe analysis in Almer\u0026iacute;a is based on a two-step molecular/genomic analysis. For every stain-positive TB case diagnosed, MIRU-VNTR analysis is performed directly on clinical samples, which allows us to rule out non-clustered, orphan, cases. 
Then, the isolates involved in MIRU-clusters are sequenced by Illumina standard approaches.\u003c/p\u003e \u003cp\u003eAll new cases confirmed to be part of a cluster after genomic analysis are discussed weekly in online meetings involving the microbiologists responsible for the diagnostics tasks, epidemiologists, and the personnel responsible for the genomic analysis, with the periodical support, when required, of the clinicians managing the cases.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eMIRU-VNTR analysis\u003c/h3\u003e\n\u003cp\u003eDNA for MIRU-VNTR was purified from the clinical sample after heating inactivation, using the GXT NA Extraction Kit (Hain Lifescience, Nehren, Germany), following the manufacturer's instructions.\u003c/p\u003e \u003cp\u003e24-MIRU-VNTR typing was done as described previously (\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e) from purified DNA from clinical specimens, except for the PCR final volume (20 \u0026micro;l for each reaction) and the number of cycles in PCR (45 cycles). Briefly, PCR was performed using eight triplex PCR reactions. The standard protocol followed using the multiplex PCR kit (Qiagen) for mixes 1, 2, 3, 5, 6 and 8 (\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e). For mixes 4 and 7, PuReTaq Ready-To-Go PCR beads (GE Healthcare, Chicago, USA) were used adding 0.5 mM of each primer and 3% dimethyl sulfoxide (DMSO). When amplification was deficient for some locus, simplex PCR was applied. PCR fragment sizing was performed using capillary electrophoresis in a 3500 Genetic Analyzer (Thermo Fisher Scientific, Waltham, USA). 
MIRU-VNTR alleles were assigned with GeneMapper v5 (Thermo Fisher Scientific, Waltham, USA).\u003c/p\u003e\n\u003ch3\u003eWhole genome sequencing\u003c/h3\u003e\n\u003cp\u003eDNA for sequencing was purified from Mycobacteria Growth Indicator Tube (MGIT) subcultures using the QIAamp DNA mini kit (Qiagen, Courtaboeuf, France) following the manufacturer's instructions after boiling for 10 minutes and pre-incubation with proteinase K (20 mg/ml) overnight at 56\u0026deg;C.\u003c/p\u003e \u003cp\u003eLibraries for confirming clusters by WGS were prepared using the Nextera XT kit (Illumina, San Diego, USA) following the manufacturer\u0026rsquo;s instructions, and pooled for running on a MiSeq or Nextseq device (2x151bp).\u003c/p\u003e \u003cp\u003eSequence analysis was done using an in-house pipeline deposited in Git-Hub: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/MG-IiSGM/autosnippy\u003c/span\u003e\u003cspan address=\"https://github.com/MG-IiSGM/autosnippy\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. This pipeline\u0026rsquo;s workflow follows the same steps described previously (\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e) using a hypothetical \u003cem\u003eMycobacterium tuberculosis\u003c/em\u003e (MTB) ancestral genome (\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e) as reference. Finally, genomic distances between sequences were calculated using Jaccard similarity and Hamming distance metrics generating distance matrices. 
Alignments and SNP variants were visualized and checked with the IGV (Integrative Genomics Viewer) program.\u003c/p\u003e\n\u003ch3\u003eAnalysis of clusters\u003c/h3\u003e\n\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eGenomic analysis\u003c/h2\u003e \u003cp\u003eFirstly, isolates were considered clustered when \u0026lt;\u0026thinsp;12 SNPs were identified between them (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e). Then, an evolutionary analysis was performed, by obtaining genomic networks, based on the distribution of SNPs between the clustered isolates, by using Network (5.0, 10.0), or PopART 1.7.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eClinical and epidemiological analysis\u003c/h2\u003e \u003cp\u003eThe new entrances in clusters considered, from the genomic evolutionary analysis, as candidates for likely reactivations or prolonged diagnostic delays/subclinical TB were reviewed according to the clinical/epidemiological data from the corresponding cases.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eGeneral analysis of clusters\u003c/h2\u003e \u003cp\u003eIn the study period (January 2003-June 2024), 1886 culture-positive TB cases were diagnosed in Almer\u0026iacute;a, and MIRU-VNTR was performed in 90.6% of them. 984 of the cases (57.6%) were involved in 126 clusters according to MIRU-VNTR. Genomic data were available for 84.1% of these clusters (including a total of 615 cases, 537 confirmed clustered cases in 106 genomic clusters, 2\u0026ndash;25 cases/cluster).\u003c/p\u003e \u003cp\u003eOur first aim was to analyze the reasons behind the clustering of the cases in the most recent study period, namely 2021-June 2024. 106 of the cases diagnosed in this period (34.6% of the total cases in this period) were included in 53 clusters (\u0026lt;\u0026thinsp;12 SNPs) (Supplementary Figure). 
46 of these cases (43.4%) constituted 22 new clusters (2\u0026ndash;6 cases/cluster), identified for the first time in the last 3.5 years, while the remaining 60 (56.6%) entered 31 clusters that had already been identified in the 2003\u0026ndash;2020 period; these were considered growing clusters, each now incorporating 1\u0026ndash;11 new cases.\u003c/p\u003e \u003cp\u003eOnce the recently clustered cases had been identified, we obtained the genomic network for each of the involved clusters by positioning its cases in the same network according to the differential SNPs identified between their sequences. The genomic networks were analyzed to propose the most likely interpretation of the reasons behind the new entrances of cases in clusters, and thereby to tailor and recommend the most suitable interventions for each case.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eNew clustered cases due to ongoing transmission\u003c/h2\u003e \u003cp\u003eConsidering the rate of acquisition of SNPs in MTB (1 SNP every 2\u0026ndash;3 years), we considered ongoing transmission to be the reason behind a new case entering a cluster when the new case was located in the genomic network close (at 0\u0026ndash;2 SNPs) to a recently diagnosed case (0\u0026ndash;3 years before; Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA). This criterion was fulfilled for 47.2% of the new clustered cases and 43.4% of the total clusters.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFocusing on the new clusters, most of them (14; 63.6%) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA, Clusters 3113 and 3084) grew due to ongoing transmission. 
One of these clusters corresponded to an extensive transmission involving six new cases (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA, Cluster 3113), while the remaining clusters were more limited: one included 4 new clustered cases and the rest involved 1\u0026ndash;2 new cases (Supplementary Figure). When we focused on the clusters that had already been defined before 2021 but were still growing in the last 3.5 years, ongoing transmission was identified as the reason behind their growth in only 9 (29%) of them (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA, Clusters 789 and 786). Only one of these clusters involved a high number of new cases (eight), another included 3 new cases, and the remaining clusters included only one new clustered case each.\u003c/p\u003e \u003cp\u003eFor the cases in which the interpretation of the genomic network indicated cluster growth due to ongoing transmission, the recommended epidemiological intervention was to study the context of the patient, looking for other potential active, undiagnosed TB cases. In the 2021-June 2024 period, we identified 18 secondary cases from 8 different clusters that had not yet been diagnosed before this intervention. In another 11 new clusters, genomic data led to the re-orientation and intensification of the surveillance for new cases and of the epidemiological research.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eImpact of COVID-19 pandemic on TB ongoing transmission\u003c/h2\u003e \u003cp\u003eAmong the clusters for which the genomic network analysis indicated ongoing transmission, we must highlight one example whose peculiarities helped us to evaluate the impact of the COVID-19 pandemic on TB transmission. 
The cluster (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA, Cluster 789) involved a case (Case 2534) with two TB episodes (one in 2019, before the pandemic, and the other in 2022) due to a lack of adherence to treatment. The isolates from the first and second episodes differed by 1 SNP. This SNP was used as a marker to determine the number of new secondary cases due to exposures to Case 2534 either before (lacking the differential SNP) or after the pandemic (harboring the SNP acquired in the second episode of Case 2534). Only one new case (Case 2816), diagnosed in 2021, shared the sequence of the first episode of Case 2534 (lacking the SNP), while six cases diagnosed in 2022-23 were identical to the second episode of Case 2534 (harboring the SNP). Most of the post-pandemic secondary cases had at some point shared a household with Case 2534.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eReasons other than ongoing transmission behind new clustered cases\u003c/h2\u003e \u003cp\u003eFor the new clustered cases entering 36.4% of the new clusters and 71% of the growing clusters, their location in the genomic networks was not consistent with ongoing transmission. Moreover, their positions in the network, together with the general topology of the networks, allowed us to propose, for a proportion of them, alternative explanations for their entrance into the corresponding clusters. 
Below we present a detailed analysis of representative examples of these alternative explanations, without differentiating between cases entering new and growing clusters.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eReactivations of past exposures\u003c/h2\u003e \u003cp\u003eTen cases, entering 10 clusters, were located in their networks close (0\u0026ndash;1 SNP) to other cases diagnosed\u0026thinsp;\u0026ge;\u0026thinsp;4 years before. These findings could indicate that these clusters involve cases with TB due to reactivation of a past exposure (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA, Clusters 1482 and 15).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eAdditionally, we identified another 6 clusters with new cases also entering the network after a case diagnosed\u0026thinsp;\u0026ge;\u0026thinsp;4 years before, again suggesting the involvement of reactivations in these clusters. However, unlike the previous examples, here we detected more SNPs (2\u0026ndash;3 SNPs) with respect to the case preceding them in the network (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB, Clusters 1330 and 1348). For these cases, one possible explanation for the intermediate SNPs is the involvement of some diagnostic delay. This delay would have allowed a period of bacterial viability before the diagnosis of the clustered case, during which within-host evolution likely led to the acquisition of SNP diversity.\u003c/p\u003e \u003cp\u003eHowever, although the location of the new cases entering these networks suggests that the clusters involve reactivations of past exposures, we cannot be fully certain that the very cases identified as entering these clusters are those due to reactivation +/- some diagnostic delay. 
Alternatively, reactivation +/- delay could have occurred in other cases involved in the transmission event, who might have preceded our cases but are not drawn in the network because they were never diagnosed (missed cases) and therefore cannot be represented. In this alternative scenario, a missed case would be the one with reactivation +/- delay, who subsequently transmitted the infection to the new cases identified in our study as new entrances in the cluster (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB).\u003c/p\u003e \u003cp\u003eAlthough the new entrances identified in these clusters are merely candidates for reactivation, we looked for data supporting this hypothesis in their epidemiological and clinical charts and discussed the findings within the interdisciplinary team. We found reasons to justify reactivation in 12 of them (80%). Among the cases with some diagnostic delay, findings consistent with delay were also found in five (83.3%).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eProlonged diagnostic delay/Subclinical tuberculosis\u003c/h2\u003e \u003cp\u003eIn eight additional cases, the number of intermediate SNPs between them and their preceding cases in the network was higher (4\u0026ndash;8 SNPs). The number of intermediate SNPs was also close to that expected (considering the rate of acquisition of SNPs in MTB). This may suggest that MTB acquired SNP diversity throughout the whole period between the newly diagnosed case and the case preceding it. 
We considered them candidates for prolonged diagnostic delay or subclinical TB (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eAs in the previous category, prolonged diagnostic delay could have occurred in the new cases identified in our study as entering the network or, alternatively, in missed cases preceding them. Therefore, our candidates underwent a clinical/epidemiological review within our interdisciplinary team, to evaluate whether our hypothesis could be confirmed for some of them. In 7 of the cases (87.5%), evidence supporting prolonged diagnostic delay/subclinical TB was found.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eMonoresistance as a proxy to suspect diagnostic delay\u003c/h2\u003e \u003cp\u003eAmong the candidates for prolonged diagnostic delay, we identified 2 cases with mono-resistance to fluoroquinolones (mono-FQ-R) (one of them is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB, Cluster 15). This new finding reinforced our interpretation of diagnostic delay for the involved cases, as both had received this drug to treat infections other than TB, before the diagnosis of TB. Because they received this drug when TB had not yet been diagnosed, the undiagnosed TB would in effect have been treated with monotherapy, which could have led to the acquisition of resistance. Therefore, mono-resistance could be used as a proxy to suspect diagnostic delay.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eImpact of diagnostic delay on secondary cases\u003c/h2\u003e \u003cp\u003eProlonged diagnostic delay could be responsible for unrecognized exposures of other individuals. 
Among our selection of candidate cases for prolonged diagnostic delay, we identified three representative examples that illustrate a progressively greater impact of the delay on the number of secondary cases: i) no secondary cases, ii) limited within-household exposures and iii) a higher impact on secondary cases in the community setting.\u003c/p\u003e \u003cp\u003eIn the first example (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Cluster 30), the diagnosis of Case 3219 in 2023 was fortuitous, as the radiograph showing findings compatible with TB was taken because of a car accident, not because of respiratory symptoms. In addition, once the signatures suggesting prolonged diagnostic delay had been identified in the network, a review of his clinical chart also revealed findings compatible with TB in a previous radiograph, taken five years before his diagnosis. This radiograph was requested following the standard protocol before a surgical procedure; however, it was not checked by anyone, as the intervention was minor and the radiograph had therefore been requested unnecessarily. In this case, despite his prolonged diagnostic delay, no secondary cases were caused, as deduced from the absence of other cases in the network along his branch, that is, along the period of within-host evolution of the strain before diagnosis. A review of the case within the interdisciplinary team indicated that the individual had an extremely low degree of social interaction, which may explain why no other individuals were exposed.\u003c/p\u003e \u003cp\u003eThe second example (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, Cluster 15) corresponded to a prolonged diagnostic delay causing two sequential exposures of the same family member, who shared the household. 
Case 3034 accumulated two features supporting prolonged diagnostic delay: i) several SNPs preceding his diagnosis in the branch of the genomic network in which he was located, and ii) mono-FQ-R at diagnosis. He had previously received this drug several times to treat a long-term renal/urinary disease. From the analysis of the network, the most likely explanation is that Case 2427 was exposed twice to Case 3034: once recently, explaining the mono-FQ-R in the 2023 isolate, and once at an earlier stage of the prolonged diagnostic delay of Case 3034, causing the infection of Case 2427 in 2018, before the strain had acquired FQ resistance.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe interpretation of the impact of the last example of prolonged diagnostic delay on secondary cases was much more challenging, as it did not arise from the observation of a case with several SNPs preceding his diagnosis but from his clinical findings and the epidemiological analysis of the history of the case (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, Cluster 789). At diagnosis, Case 3292 presented with a clinical status suggestive of long-standing TB, namely severe weight loss over recent years and radiology showing extensive damage. He had been the roommate of Case 2418 when the latter was diagnosed with TB six years earlier (2018), but he refused to be studied. Since then, Case 3292 had not attended any health care centre (he is a social outsider) until his recent diagnosis, in 2024. Based on all these clinical and epidemiological data, and despite Case 3292 being located at the end of a branch, the most likely interpretation is that he did not result from exposure to the cases preceding him in the branch but instead acquired TB from Case 2418. Since then, the strain in Case 3292 evolved throughout his prolonged diagnostic delay. 
This means that Cases 2536, 2570, 2678, and 3204 could be considered secondary cases resulting from direct/indirect exposure to Case 3292 while the strain was progressively acquiring the 3 SNPs identified along the branch. An epidemiological investigation based on this hypothesis demonstrated that all these cases lived in the same neighbourhood and were part of the same drug-dealing chain.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003eClusters growing by the coincidence of several factors\u003c/h2\u003e \u003cp\u003eIn 6 clusters, several new clustered cases entered during the 2021-24 period, each a candidate for a different one of the explanations presented above for cluster growth, i.e. ongoing transmission, reactivations, or diagnostic delays (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eWGS provides a better delineation of transmission clusters in MTB than traditional molecular methods. Previous studies noted the higher accuracy of this tool to infer transmission, meaning that, currently, WGS is a key tool for epidemiological research in TB (\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThe first deliverable in any genomic epidemiology study in TB involving standard populations is to assess the proportion of clustered cases, based on the SNP thresholds already validated to consider transmission, or even very recent transmission (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e). This allows us to offer a snapshot of the TB transmission patterns and dynamics in a population and evaluate the efficiency of control programs. 
Among many other studies, previous studies in Brazil identified rates of very recent transmission of 43.7% (\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e). However, in socio-epidemiologically complex populations, such as those with a high proportion of migrants, we must go further, as the reasons involved in TB prevalence are multifactorial. This justifies the need for more in-depth knowledge of the reasons behind each new TB diagnosis.\u003c/p\u003e \u003cp\u003eOne example of a complex population is Almer\u0026iacute;a, a province in south-east Spain. The high rate of TB in migrant cases (80% in 2023) leads to a concentration of TB cases in these vulnerable groups, as described in many other similar settings (\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e). Previous findings of our team (\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e) demonstrate the overlapping of imported TB cases with others acquiring TB after arrival, the existence of transmission between autochthonous and migrant cases, and finally, reactivations, triggered by the substandard living conditions of the migrant population. All these factors explain why we need a more exhaustive analysis, beyond assessing the proportion of cases that are clustered.\u003c/p\u003e \u003cp\u003eIn this study, we decided to exploit the genomic data beyond the quantitative determination of the number of SNPs between the cases, which is applied to rule clustering in or out, by also paying attention to the evolutionary findings in each cluster. 
This means analyzing the distribution of the differential SNPs identified between the cases in a cluster, after positioning the cases in a genomic network.\u003c/p\u003e \u003cp\u003eThis more refined analysis does not exclude that our first aim was to determine the cases in Almer\u0026iacute;a that were clustered; nevertheless, as the philosophy of our program is to intervene to control transmission, we were especially interested in analyzing in detail the current situation, which justifies why we focused on the last 3.5 years. We found that 106 of the cases diagnosed in the last 3.5 years were involved in clusters. As we have genomic data from the clustered cases for the last 21 years, we could determine that 56.6% of these cases fed clusters that had already been identified before the last 3.5 years, while the remaining 43.4% corresponded to those involved in new clusters, not identified before this period.\u003c/p\u003e \u003cp\u003eThe first interpretation was that we were facing a failure to control historical clusters, which were still growing due to ongoing transmission. This was the first question to be answered by our evolutionary analysis of genomic networks, by evaluating whether the cases incorporated into pre-existing clusters were consistent with the topology in the network expected for ongoing transmission. Compared to the findings in new clusters, where most of the new entrances are due to ongoing transmission, only about one-third of the old growing clusters grew for this reason. This suggests that control efforts for previously identified clusters are appropriate.\u003c/p\u003e \u003cp\u003eMoreover, in a large proportion of the old clusters including new cases we detected a potential involvement of reactivations of past exposures. 
Other studies also considered as potential reactivations those cases linked to an index case in the remote past with few SNP differences between them (\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e). This means that other factors beyond the epidemiological control of recent transmission, among them social factors, are responsible for the growth of certain clusters. This highlights the current need to increase social intervention, and not exclusively make efforts to minimize ongoing transmission. Several socio-clinical factors were finally found in most of these candidate cases, reinforcing the appropriateness of having classified them as reactivations from the analysis of their genomic networks.\u003c/p\u003e \u003cp\u003eIf social factors emerge as relevant additional triggers for TB, we cannot avoid considering the impact that the COVID-19 pandemic could have had on TB dynamics. The pandemic might also have influenced the proportion of clustered cases in our study that are considered candidates for reactivation. Some studies reported an association between COVID-19-related immune suppression and the reactivation of TB, suggesting that COVID-19 might accelerate the progression from latent to active TB (\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e). In addition, we also expect a more obvious and direct effect of the pandemic on TB transmission. 
Cluster 789 (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e) constitutes an excellent example to illustrate a markedly asymmetric number of secondary cases caused by the same index case and the same strain either before or after the pandemic.\u003c/p\u003e \u003cp\u003eAnother relevant interpretation extracted from our analysis of the genomic networks is the identification of clusters with one branch connecting two cases diagnosed several years apart, with several SNPs in between, indicative of cases in which bacterial viability/evolution, and therefore the acquisition of diversity, occurred before diagnosis. As an alternative to our interpretation that the intermediate SNPs could indicate diagnostic delay, it has been proposed that diversity could arise during latency, with a mutation rate similar to that in active TB (\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e, \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e), although these findings are controversial, and other studies estimate that the replication rate is lower during latent disease (\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e). Recent studies still highlighted the discrepancies in determining the mutation rate under latency (\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e). In our study, we considered the clearest candidate clusters for diagnostic delay to be those in which the number of SNPs in the branch connecting a new case with the preceding case in the network corresponded to that expected for a mutation rate of 1 SNP/2\u0026ndash;3 years (\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e). This means a prolonged diagnostic delay, covering the whole period between the diagnoses of the two cases connected by that branch. 
However, this might underestimate the true proportion of diagnostic delays, because some of them might be less prolonged and therefore lead to a lower number of intermediate SNPs in the involved branch. In this sense, we observed examples of these possible shorter diagnostic delays in some of the clustered cases interpreted as candidates for reactivation, in which we detected some SNPs (2\u0026ndash;3 SNPs) in the branches connecting them with their preceding cases. In fact, further investigation revealed findings consistent with diagnostic delay not only for the more prolonged examples but also for most of the shorter ones. It is also possible that the bacterial evolution leading to the higher-than-expected diversity identified in some of these clusters occurred in intermediate cases that participated in the cluster but remain undiagnosed. These alternative interpretations, and the fact that different possibilities could be involved in the same case or cluster, highlight the challenge of classifying these events.\u003c/p\u003e \u003cp\u003eIn the review of clinical charts looking for clues of undiagnosed TB in the cases proposed as candidates for diagnostic delay/subclinical TB, we found that mono-FQ-R may also be used as a proxy to suspect them, especially when it is identified in only one member of a cluster. TB patients with an FQ prescription prior to TB diagnosis, usually to treat community-acquired pneumonia, have a three-fold higher risk of having FQ-resistant TB (\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e). 
Multiple FQ prescriptions, FQ prescription more than 60 days prior to TB diagnosis, and FQ prescription for more than 10 days are associated with FQ-resistant TB (\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e, \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eSeveral studies highlighted the role of subclinical TB in transmission to secondary cases (\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e). We also evaluated it and found that the epidemiological circumstances of each case with prolonged diagnostic delay/subclinical TB are decisive, and may lead to a wide range of impacts on the emergence of secondary cases.\u003c/p\u003e \u003cp\u003eWe must acknowledge that all the interpretations inferred from the study of the cluster genomic networks led us only to propose the involvement of reactivations or diagnostic delays/subclinical TB in those clusters. This does not mean that these features are necessarily associated with the new cases entering the cluster; they could also correspond to preceding cases that subsequently transmitted the infection to the new cases in our study. These potential preceding cases may not have been included in the network because they were missed in our study, having not been sequenced or even diagnosed. It is true that our program has been running since 2003 and covers the whole population of Almer\u0026iacute;a, and that the percentages of diagnosed TB cases with culture available and sequence available are high (XXXXX). However, in a population enriched in migrants, such as ours, the interterritorial mobility of these individuals leads us to acknowledge the possibility of missed cases. 
For this reason, the cases in our study are considered only as candidates for reactivations/delays, and we always need additional clinical/epidemiological data before validating them. For this, it is essential to couple our refined genomic analysis with an equally refined analysis of the clinical, social, and epidemiological data from each new clustered case. This would not be possible without our multidisciplinary team discussing every new case together and re-interviewing the cases, guided by the genomic data, in search of additional data to support our hypothesis.\u003c/p\u003e \u003cp\u003eThe identification of the involvement of reactivation or diagnostic delay in our clusters, especially when candidates are finally validated, allows us to design and propose specific intervention strategies according to the true nature of each clustered case, to be added to the standard control measures systematically applied to every cluster. For example, when reactivations were validated, the additional epidemiological intervention recommended was aimed at minimizing the future emergence of secondary cases, by optimizing preventive treatment of latent tuberculosis infection in the contexts of the cases. In the case of diagnostic delays, the epidemiological intervention should also include a more active search for potential secondary cases that may have occurred during the time the patient was undiagnosed. In some clusters we identified independent entrances in different branches/locations in the network, each with a different possible explanation behind it; this led us to recommend interventions that are not systematic for a cluster but instead are case-specific.\u003c/p\u003e \u003cp\u003eNow that the usefulness of this interpretative approach to exploiting genomic data is understood, the next steps involve accelerating the availability of this analysis so that it can be integrated with the epidemiological investigation in real time. 
The general approach applied in genomic analysis in TB is to follow a high-throughput scheme, accumulating a high number of isolates in the same run to lower costs. We are applying a faster approach based on nanopore single sequencing from the primary culture of each new incident case (\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e). This one-to-one analysis offers an alternative scheme adapted to the early coupling of genomic, epidemiological, and clinical data, until the fastest approach, sequencing directly from respiratory specimens, which remains challenging (\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e, \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e), becomes feasible.\u003c/p\u003e \u003cp\u003eUnderstanding the singularities behind each new clustered case by means of an evolutionary analysis of the differential SNPs observed in each cluster will help us to grasp the true complexity behind them. This will allow us to better understand the reasons behind TB persistence in our populations and to correctly differentiate the new cases due to ongoing transmission from those resulting from reactivations of past exposures. This same analysis will shed light on the hidden burden of TB due to prolonged diagnostic delays or subclinical TB. All this information may be of paramount importance in tailoring specific interventions to maximize success in our aim of controlling TB.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eEthical Statement This study was approved by Junta de Andaluc\u0026iacute;a Ethical Committee (References 60/2017 and 98/2023). All sequences were encrypted to anonymize any associated personal information. 
Written informed consent was obtained from the participants.\u003c/p\u003e\n\u003ch2\u003eAcknowledgements\u003c/h2\u003e \u003cp\u003eThis study was funded by ISCIII [PI21/01823, PI19/00331, Miguel Servet Contract (CPII20/00001) to LPL; PFIS contracts to CRG (FI20/00129) and SBS (FI21/00145)], IiSGM (2021-II-PI-01 to DGV), SEPAR 2023 (1401/2023), Junta de Andaluc\u0026iacute;a (AP-0062-2021-C2-F2), and was co-financed by ERDF: \u0026ldquo;A way of making Europe\u0026rdquo;. This study is based upon work from COST Action Advance TB (CA21164), supported by COST (European Cooperation in Science and Technology).\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eWalker TM, Ip CLC, Harrell RH, Evans JT, Kapatai G, Dedicoat MJ, et al. Whole-genome sequencing to delineate Mycobacterium tuberculosis outbreaks: A retrospective observational study. Lancet Infect Dis. 2013 Feb;13(2):137\u0026ndash;46. \u003c/li\u003e\n\u003cli\u003eMeehan CJ, Goig GA, Kohl TA, Verboven L, Dippenaar A, Ezewudo M, et al. Whole genome sequencing of Mycobacterium tuberculosis: current standards and open issues. Vol. 17, Nature Reviews Microbiology. Nature Publishing Group; 2019. p. 533\u0026ndash;45. \u003c/li\u003e\n\u003cli\u003eSalvato RS, Reis AJ, Schiefelbein SH, G\u0026oacute;mez MAA, Salvato SS, da Silva LV, et al. Genomic-based surveillance reveals high ongoing transmission of multi-drug-resistant Mycobacterium tuberculosis in Southern Brazil. Int J Antimicrob Agents. 2021 Oct 1;58(4). \u003c/li\u003e\n\u003cli\u003eAbascal E, P\u0026eacute;rez-Lago L, Mart\u0026iacute;nez-Lirola M, Chiner-Oms \u0026Aacute;, Herranz M, Chaoui I, et al. Whole genome sequencing-based analysis of tuberculosis (TB) in migrants: Rapid tools for crossborder surveillance and to distinguish between recent transmission in the host country and new importations. Eurosurveillance. 2019 Jan 24;24(4). 
\u003c/li\u003e\n\u003cli\u003eGuthrie JL, Strudwick L, Roberts B, Allen M, McFadzen J, Roth D, et al. Comparison of routine field epidemiology and whole genome sequencing to identify tuberculosis transmission in a remote setting. Epidemiol Infect. 2020. \u003c/li\u003e\n\u003cli\u003eWalker TM, Monk P, Smith EG, Peto TEA. Contact investigations for outbreaks of Mycobacterium tuberculosis: Advances through whole genome sequencing. Vol. 19, Clinical Microbiology and Infection. Blackwell Publishing Ltd; 2013. p. 796\u0026ndash;802. \u003c/li\u003e\n\u003cli\u003eAlonso M, Herranz M, Lirola MM, González-Rivera M, Bouza E, De Viedma DG. Real-time molecular epidemiology of tuberculosis by direct genotyping of smear-positive clinical specimens. J Clin Microbiol. 2012 May;50(5):1755\u0026ndash;7. \u003c/li\u003e\n\u003cli\u003eSupply P, Allix C, Lesjean S, Cardoso-Oelemann M, R\u0026uuml;sch-Gerdes S, Willery E, et al. Proposal for standardization of optimized mycobacterial interspersed repetitive unit-variable-number tandem repeat typing of Mycobacterium tuberculosis. J Clin Microbiol. 2006 Dec;44(12):4498\u0026ndash;510. \u003c/li\u003e\n\u003cli\u003eMart\u0026iacute;nez-Lirola M, Herranz M, Serrano SB, Rodr\u0026iacute;guez-Grande C, Inarra ED, Garrido-C\u0026aacute;rdenas JA, et al. A One Health approach revealed the long-term role of Mycobacterium caprae as the hidden cause of human tuberculosis in a region of Spain, 2003 to 2022. Eurosurveillance. 2023;28(12):1\u0026ndash;11. doi: 10.2807/1560-7917.ES.2023.28.12.2200852.\u003c/li\u003e\n\u003cli\u003eComas I, Chakravartti J, Small PM, Galagan J, Niemann S, Kremer K, et al. Human T cell epitopes of Mycobacterium tuberculosis are evolutionarily hyperconserved. Nat Genet. 2010;42(6):498\u0026ndash;503. \u003c/li\u003e\n\u003cli\u003eAi JW, Ruan QL, Liu QH, Zhang WH. Updates on the risk factors for latent tuberculosis reactivation and their managements. Emerg Microbes Infect. 2016;5(November 2015):e10. 
\u003c/li\u003e\n\u003cli\u003eGrupo de trabajo Plan Prevenci\u0026oacute;n y Control de la Tuberculosis. Plan para la prevencion y control de la Tuberculosis en Espa\u0026ntilde;a. Comisi\u0026oacute;n de Salud P\u0026uacute;blica el Consejo Interterritorial del Sistema Nacional de Salud. Ministerio de Sanidad, Consumo y Bienestar Social [Internet]. 2019. Available from: https://www.mscbs.gob.es/profesionales/saludPublica/prevPromocion/PlanTuberculosis/docs/PlanTB2019.\u003c/li\u003e\n\u003cli\u003eXu Y, Cancino-Munoz I, Torres-Puente M, Villamayor LM, Borr\u0026aacute;s R, Borr\u0026aacute;s-M\u0026aacute;\u0026ntilde;ez M, et al. High-resolution mapping of tuberculosis transmission: Whole genome sequencing and phylogenetic modelling of a cohort from Valencia Region, Spain. PLoS Med. 2019;16(10). \u003c/li\u003e\n\u003cli\u003eVerza M, Scheffer MC, Salvato RS, Schorner MA, Barazzetti FH, Machado H de M, et al. Genomic epidemiology of Mycobacterium tuberculosis in Santa Catarina, Southern Brazil. Sci Rep. 2020 Dec 1;10(1). \u003c/li\u003e\n\u003cli\u003eLitvinjenko S, Magwood O, Wu S, Wei X. Burden of tuberculosis among vulnerable populations worldwide: an overview of systematic reviews. Lancet Infect Dis. 2023 Dec 1;23(12):1395\u0026ndash;407. \u003c/li\u003e\n\u003cli\u003eNelson KN, Talarico S, Poonja S, McDaniel CJ, Cilnis M, Chang AH, et al. Mutation of Mycobacterium tuberculosis and Implications for Using Whole-Genome Sequencing for Investigating Recent Tuberculosis Transmission. Front Public Heal. 2022 Jan 13;9. \u003c/li\u003e\n\u003cli\u003eAlmatrafi MA, Awad K, Alsahaf N, Tayeb S, Alharthi A, Rabie N, et al. Disseminated Tuberculosis Post COVID-19 Infection: A Case Report. Cureus. 2022 Nov 14. \u003c/li\u003e\n\u003cli\u003eFord CB, Lin PL, Chase MR, Shah RR, Iartchouk O, Galagan J, et al. Use of whole genome sequencing to estimate the mutation rate of Mycobacterium tuberculosis during latent infection. Nat Genet. 2011 May;43(5):482\u0026ndash;8. 
\u003c/li\u003e\n\u003cli\u003eLillebaek T, Norman A, Rasmussen EM, Marvig RL, Folkvardsen DB, Andersen \u0026Aring;B, et al. Substantial molecular evolution and mutation rates in prolonged latent Mycobacterium tuberculosis infection in humans. Int J Med Microbiol. 2016 Nov 1;306(7):580\u0026ndash;5. \u003c/li\u003e\n\u003cli\u003eColangeli R, Arcus VL, Cursons RT, Ruthe A, Karalus N, Coley K, et al. Whole genome sequencing of Mycobacterium tuberculosis reveals slow growth and low mutation rates during latent infections in humans. PLoS One. 2014 Mar 11;9(3). \u003c/li\u003e\n\u003cli\u003eBryant JM, Sch\u0026uuml;rch AC, van Deutekom H, Harris SR, de Beer JL, de Jager V, et al. Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data. BMC Infect Dis. 2013 Feb 27;13(1). \u003c/li\u003e\n\u003cli\u003eMigliori GB, Langendam MW, D\u0026rsquo;Ambrosio L, Centis R, Blasi F, Huitric E, et al. Protecting the tuberculosis drug pipeline: Stating the case for the rational use of fluoroquinolones. Vol. 40, European Respiratory Journal. 2012. p. 814\u0026ndash;22. \u003c/li\u003e\n\u003cli\u003eDevasia RA, Blackman A, Gebretsadik T, Griffin M, Shintani A, May C, et al. Fluoroquinolone resistance in Mycobacterium tuberculosis: The effect of duration and timing of fluoroquinolone exposure. Am J Respir Crit Care Med. 2009 Aug 15;180(4):365\u0026ndash;70. \u003c/li\u003e\n\u003cli\u003eLong R, Chong H, Hoeppner V, Shanmuganathan H, Kowalewska-Grochowska K, Shandro C, et al. Empirical treatment of community-acquired pneumonia and the development of fluoroquinolone-resistant tuberculosis. Clin Infect Dis. 2009 May 15;48(10):1354\u0026ndash;60. \u003c/li\u003e\n\u003cli\u003eYin J, Yan G, Qin L, Fan J, Zhu C, Li Y, et al. Genomic investigation of bone tuberculosis highlighted the role of subclinical pulmonary tuberculosis in transmission. Tuberculosis. 2023, Sep 148. 
doi: 10.1016/j.tube.2024.102534.\u003c/li\u003e\n\u003cli\u003eSanz-P\u0026eacute;rez A, Rodr\u0026iacute;guez-Grande C, Buenestado-Serrano S, Mart\u0026iacute;nez-Lirola M, Herranz-Mart\u0026iacute;n M, Pe\u0026ntilde;as-Utrilla D, et al. Reducing delays in the genomic epidemiology of tuberculosis: a flexible and decentralized analysis of each incident case. Unpublished. 2024.\u003c/li\u003e\n\u003cli\u003eGoig GA, Cancino-Mu\u0026ntilde;oz I, Torres-Puente M, Villamayor LM, Navarro D, Borr\u0026aacute;s R, et al. Whole-genome sequencing of Mycobacterium tuberculosis directly from clinical samples for high-resolution genomic epidemiology and drug resistance surveillance: an observational study. The Lancet Microbe. 2020 Aug 1;1(4):e175\u0026ndash;83. \u003c/li\u003e\n\u003cli\u003eNilgiriwala K, Rabodoarivelo MS, Hall MB, Patel G, Mandal A, Mishra S, et al. Genomic Sequencing from Sputum for Tuberculosis Disease Diagnosis, Lineage Determination, and Drug Susceptibility Prediction. J Clin Microbiol. 2023 Mar 23;61(3):e0157822.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"Servicio de Microbiología Clínica y Enfermedades Infecciosas, Hospital General Universitario Gregorio Marañón. 
Instituto de Investigación Sanitaria Gregorio Marañón (IiSGM), Madrid, Spain","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"tuberculosis, genomics, transmission, epidemiology ","lastPublishedDoi":"10.21203/rs.3.rs-6057121/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6057121/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eIntroduction\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTuberculosis (TB) clusters are interpreted as ongoing transmission events, which demand control interventions. Our aim is to perform a refined genomic analysis in Almería, Spain, to evaluate whether reasons other than ongoing transmission could be behind the incorporation of new cases to pre-existing or new clusters, to manage more properly each new clustered case and optimizing control resources.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eIllumina WGS was performed following standard procedures. First, genomic data were analyzed quantitatively, to identify clustered cases (\u0026lt; 12 SNPs). Then, a refined evolutionary analysis was performed, positioning the clustered cases in genomic networks, based on the distribution of SNPs. 
The location of the new clustered cases in relation to the cases preceding it in the cluster was considered to interpret the most likely reasons behind the growth of each cluster, supporting them by epidemiological and clinical data.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe identified 106 genomic clusters during the years 2003–2024, including a total of 537 cases (2–25 cases/cluster). 106 (34.6%) of the diagnosed cases in the last four years (2021–2024) were included in 53 clusters; 22 were new clusters, while the remaining were growing clusters, already identified before 2021. New entrances in clusters were due to ongoing transmission (new cases connected in the genomic network with a recently diagnosed case at 0–2 SNPs) in only 29% of the growing clusters (1–11 cases entering in pre-existing clusters) and in 63.6% of the new clusters (2–6 cases/cluster). For new clustered cases who were not the result of ongoing transmission, the analysis of the genomic networks allowed us to identify clusters with the involvement of i) reactivations of past exposures (new case close to another case diagnosed \u0026gt; 4 years before), ii) prolonged diagnostic delays or subclinical periods (new case positioned in branches with a high number of SNPs preceding them, suggesting persistent bacterial viability), or to iii) multifactorial clusters, growing by reactivations, diagnostic delays and/or ongoing transmission.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusion\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA genomic evolutionary analysis is required for a precise interpretation of growing clusters. Only one-third of the growing clusters in Almería correspond to ongoing transmissions. Reactivations of past exposures, prolonged diagnostic delays or subclinical TB had also a role in growing clusters. 
The precise identification of the reasons behind growing clusters allows the specific management of each new clustered case.\u003c/p\u003e","manuscriptTitle":"A long-term refined genomic analysis of tuberculosis clusters to discriminate between ongoing transmission, reactivations or diagnostic delays","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-03-04 06:29:18","doi":"10.21203/rs.3.rs-6057121/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"ad360727-3027-4c82-af49-3de141ac1ead","owner":[],"postedDate":"March 4th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-03-04T06:29:18+00:00","versionOfRecord":[],"versionCreatedAt":"2025-03-04 06:29:18","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6057121","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6057121","identity":"rs-6057121","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
