Fetal Gestational Age Estimation Using AI on Simple Ultrasound Images and Video

preprint OA: gold CC-BY-4.0
Martin Benson, Sacha Walton, Tom Hartley, Simon Meagher, Suresh Seshadri, and 2 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-5907990/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 20 Nov, 2025 in npj Digital Medicine.
Version 1.

Abstract

Background: Accurate gestational age (GA) estimation is essential for prenatal care, guiding fetal growth assessment and medical interventions. Ultrasound-based biometric measurements, though more reliable than last menstrual period, require high sonographic expertise and are time-consuming, posing challenges, especially in resource-limited settings.
This study aimed to develop an Artificial Intelligence (AI) model that estimates GA from any fetal ultrasound image, regardless of orientation or standard plane, reducing reliance on sonographic skill and so increasing accessibility.

Methods: We trained a deep learning model on a large, diverse dataset of over 2 million ultrasound images from three continents (Australia, India, and the UK). The model was trained to estimate GA from ultrasound images without requiring specific biometric planes or measurements. It outputs a GA estimate alongside an uncertainty level, based on image quality and type. Validation was performed using independent datasets of ultrasound images and videos, with comparison against standard biometric measurements across all trimesters.

Findings: The AI model consistently produced GA estimates that were at least as accurate as those derived from traditional biometry, with a mean absolute error (MAE) significantly lower than biometry in the second trimester (p < 0.001) and comparable in the third trimester. Subanalysis by country and maternal BMI demonstrated the model’s robustness across different sub-populations. The model also accurately estimated GA on video datasets, producing a confident estimate after a median of 24 seconds of video.

Interpretation: This AI-based GA estimation method, trained from retrospective clinical data, is at least as accurate as gold-standard fetal biometry. By significantly reducing the skill level required of sonologists, this approach holds potential to improve prenatal care in resource-limited settings and democratize access to ultrasound-based GA estimation globally.
Subject areas: Biological sciences/Computational biology and bioinformatics; Biological sciences/Physiology/Reproductive biology; Health sciences/Health care/Medical imaging

Research in Context

Evidence before this study

Prior to conducting our research, we updated our previous three systematic reviews to thoroughly understand current methods for estimating gestational age (GA). Comprehensive database searches were conducted across MEDLINE, Embase, CINAHL, LILACS, the Cochrane Database of Systematic Reviews, and the Science Citation Index, covering studies published from January 1970 up to the present. These searches assessed studies on prenatal and postnatal liquid biomarkers for GA estimation; dating by crown-rump length (CRL) in the first trimester; and ultrasound biometry and symphysis-fundal height measurements in the second and third trimesters. Our findings highlighted the lack of effective prenatal liquid biomarkers for accurately predicting GA throughout pregnancy, necessitating reliance on ultrasound biometry. We also noted considerable methodological heterogeneity and biases in the CRL measurement equations for first trimester GA estimation, potentially affecting their accuracy. Furthermore, the high operator dependency of expert fetal biometry in later trimesters underscores the need for standardised techniques, which is especially critical for late-presenting pregnancies in resource-limited settings. Collectively, these reviews underscore the imperative for a reliable, universally applicable, and less operator-dependent method for GA estimation.

Added value of this study

Our study introduces an AI model that uses ultrasound images to estimate GA with minimal operator input. This approach addresses the limitations identified in the systematic reviews by providing a standardized, accessible, and scalable solution for accurate GA estimation across diverse clinical settings.
Implications of all the available evidence

Integrating findings from previous systematic reviews with our current study suggests that our AI-enhanced ultrasound approach could revolutionize prenatal care, especially in under-resourced areas. This technology supports the global health goal of improving prenatal screenings and maternal-fetal outcomes. Future research should validate this AI model in various clinical settings and continue to refine its capabilities through broader data integration.

Introduction

Accurate gestational age (GA) estimation is fundamental to prenatal care, guiding critical decisions that impact both maternal and fetal health. Knowledge of GA is essential as it influences the interpretation of fetal growth and well-being, the timing of medical interventions, and the planning and timing of birth [1][2][3]. Clinically, GA has been estimated using the last menstrual period (LMP) as a proxy for the time since conception. However, this method is associated with large error due to assumptions about menstrual regularity and ovulation timing, coupled with the potential for inaccurate recall [1][4][5][6][7]. These limitations have driven the adoption of GA assessment using ultrasound-based fetal measurement.

Ultrasound is a safe, widely used, low-cost diagnostic technique and is a foundational aspect of prenatal care globally. Fetal ultrasound measurement is now considered the gold standard for GA assessment, particularly when performed early in the first trimester of gestation [8]. During this period, crown-rump length (CRL) measurement predicts GA with a precision of 3–7 days [1][2][3][9][10][11][12]. Due to fetal curling with advancing pregnancy, the fetal CRL cannot be effectively measured after 14 weeks, and in the second and third trimesters a combination of other fetal measurements is used to estimate gestational age.
The accuracy of GA determination by fetal biometry reduces with advancing gestation: it is ± 7–10 days until 24 weeks and decreases to ± 10–14 days between 24 and 28 weeks [1][3][13][14]. In the third trimester (greater than 28 weeks gestation), ultrasound estimation is even less accurate, with previous studies reporting accuracy of ± 21 to 30 days [1][3][13][14]. These large errors, combined with the need for significant expertise in performing ultrasound, have led to research into alternative methods. However, thus far no approach has matched ultrasound for GA estimation across the full spectrum of pregnancy. Biomarkers, including human chorionic gonadotropin (hCG) and various metabolomic profiles, have shown some promise in early gestation but are hindered by inconsistencies, wide reference ranges, and limited windows of accuracy [15]. In low-resource settings, where late presentation to antenatal care is common, these limitations are particularly problematic.

Consequently, ultrasound remains the preferred modality for GA determination, and the World Health Organization (WHO) recommends that all pregnant women receive at least one ultrasound scan before 24 weeks of gestation. However, this recommendation is aspirational in many low- and middle-income countries (LMICs), where accurate GA estimation through ultrasound requires not only availability of the technology but also the expertise to obtain and interpret the necessary measurements. Obtaining precise biometric measurements demands considerable operator skill, time, and fetal cooperation, making it a burdensome task even in well-equipped settings [16]. This is compounded by the fact that in many LMICs the first antenatal visit occurs late in pregnancy, which diminishes the accuracy of GA estimation.
It is therefore desirable to develop methods for estimating GA which are easier, quicker and require less operator skill to perform, and which can be applied at all gestations.

To address these challenges, we introduce a novel approach that trains a deep learning model on a very large and diverse dataset of ultrasound images. The data, stored during routine obstetric examinations along with corresponding gestational age data, are much larger and more diverse than those used previously: they span 3 continents and are more than an order of magnitude larger than any dataset used to develop similar models. Importantly, the model does not require images from biometry planes (as used in most approaches defined by imaging protocols [2][17][18]) and can provide accurate GA estimates with minimal operator input. In other words, we sought to develop a model designed to work with images obtained without the need for specialized sonographic techniques, making it accessible to users with low levels of training. The model also includes an estimate of uncertainty, allowing for greater confidence in the results, particularly when images are suboptimal. This property of the model enables its potential use on ultrasound data obtained by a wider population of users who have not been specially trained in sonography.

Our approach has the potential to democratize access to accurate GA estimation, particularly in settings where skilled sonographers are scarce. By enabling novice users to obtain reliable GA assessments, this technology could significantly enhance prenatal care in underserved regions, aligning with global health goals to improve maternal and child outcomes. In this paper, we present the development and validation of this AI-based model, demonstrating its superior performance compared to traditional biometry-based methods across a wide range of gestational ages and maternal characteristics.
We also explore its potential for broader application, including its use in low-cost, portable ultrasound devices that could bring this critical diagnostic capability to even the most remote and resource-limited settings.

Methods

We trained a Deep Learning model that estimates fetal GA directly from ultrasound images, without requiring measurement information or the acquisition of specific views of the fetus. The model outputs not just an estimate of the GA, but also the level of uncertainty that is inherent in the estimate. The level of uncertainty that it reports varies from image to image and depends upon the following factors:
(i) The amount of useful information within the image (some images, for example when the probe is not pointing at the fetus, are entirely unsuitable for GA estimation).
(ii) The extent to which the image is of a type that is well represented in the training data. The further out-of-distribution an image is, the greater the uncertainty in the estimate will be.

Model training was undertaken using fetal ultrasound image data collected from multiple centres in Australia, India, and the UK, creating a diverse dataset. A stratified sampling strategy was applied to the training data based on GA bands and ultrasound probe types, to ensure robust performance across various clinical scenarios.

Model validation was conducted on several independent retrospective datasets, assessing the accuracy of the approach in comparison to GA estimates derived from clinical biometric measurements such as biparietal diameter and head circumference, to evaluate the model’s efficacy throughout pregnancy and across different maternal BMI categories. Both training and validation datasets were annotated with gold standard GA estimates, taken during routine clinical practice.
Gold Standard for GA

Throughout the development and validation analysis, the “ground-truth” GA value that the model output was compared against was computed from a previous CRL measurement taken between 9 + 0 and 13 + 6 weeks. More specifically:
(i) At all centres, CRL measurements were taken by expert, quality-controlled sonologists.
(ii) That measurement was converted to a GA estimate as per [9]. This GA estimate related to the day on which the CRL measurement was taken.
(iii) The GA estimate was then adjusted forward to the date of the ultrasound examination (when the relevant image was captured, in many cases in a subsequent examination to the one in which the CRL measurement was performed), by adding the number of days elapsed since the CRL measurement was taken.

Datasets

For model development we collated a large, multicentre, anonymised dataset of ultrasound imaging data from 6 centres in Australia, India and the UK. These datasets (Table 1) contain imaging records obtained during routine clinical scanning, together with associated information on gestational age and a limited amount of maternal demographic information.

To ensure robust model performance (including mitigating the possible impact of confounding effects), and in support of an efficient training process, a stratified sample of the available data was used to develop the model. The strata used when sampling were:
(i) Data source
(ii) GA (in 4-week bands)
(iii) Ultrasound probe type (TA/TV)

The data over those strata were balanced by randomly sampling varying numbers of images per subject. No exclusions were made from the development data. It therefore contains (in approximately their natural frequencies) singleton and multiple pregnancies, as well as congenital abnormalities. This a priori decision was considered to be useful from the perspective of external validity and model performance under a wide range of circumstances.
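The ground-truth adjustment described under Gold Standard for GA amounts to simple date arithmetic, and a minimal sketch may make it concrete. The function names below are hypothetical and are not taken from the study's code; the CRL-to-GA conversion of [9] is not reproduced here, so the GA at the time of the CRL scan is treated as a given input.

```python
from datetime import date


def ground_truth_ga_days(ga_at_crl_days: int, crl_date: date, exam_date: date) -> int:
    """Carry a CRL-based GA estimate forward to a later examination date.

    ga_at_crl_days is the GA (in days) on the day of the CRL measurement,
    assumed to have been derived beforehand from the CRL value.
    """
    elapsed_days = (exam_date - crl_date).days
    if elapsed_days < 0:
        raise ValueError("examination date precedes the CRL measurement date")
    return ga_at_crl_days + elapsed_days


def format_ga(ga_days: int) -> str:
    """Render a GA in days using the 'weeks + days' notation of the text."""
    weeks, days = divmod(ga_days, 7)
    return f"{weeks} + {days}"


# Illustrative dates only: a CRL scan giving 12 + 3 weeks, then a later examination.
ga = ground_truth_ga_days(12 * 7 + 3, date(2024, 1, 10), date(2024, 3, 6))
print(format_ga(ga))
```

Because the same elapsed-day offset is applied to every image from a given examination, all images from that examination share one ground-truth label.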
The sampled data were split, using computerised randomisation, between training and holdout sets in a 90:10 proportion, in order to yield a robust holdout volume while retaining as much data for model training as possible. The unit of randomisation was the individual woman, so that all images pertaining to any given subject were assigned either to the training set or to the holdout set. The holdout set was used during the model development process to optimise hyperparameters and to confirm the model’s ability to generalise. Following sampling and assignment, the data volumes used to develop the model were as detailed in Table 2. Note that the total number of images (but not the number of data subjects) used is lower than the total available because the sampling process described above selected only a random subset of the images for each subject.

Deep Learning Model

A Deep Learning [20] model, trained via supervised learning, was used to produce GA estimates from ultrasound images. Supervised learning is a process by which the model is optimised to generate estimates that are as close as possible to those provided in the training dataset. Deep Learning has demonstrated state-of-the-art performance in computer vision tasks since 2012 [21] and now dominates the field, making it a natural approach for this task.

One challenge in producing GA estimates based on ultrasound images taken by novices is that images may be unsuitable for this purpose (as an extreme example, images may not show a fetus at all). To overcome this problem, we designed the model to report the level of uncertainty in GA prediction. To do this we follow the approach described by Stirn et al. [22] to construct a neural network comprising the following elements:
(i) A trunk network \(f_{trunk}: (X, \Theta_{z}) \to \mathbb{R}^{d}\) which, given parameters \(\theta_{z} \in \Theta_{z}\), maps an image \(x \in X\) to a \(d\)-dimensional representation vector. We employed a ConvNeXt [23] architecture for \(f_{trunk}\) with \(d = 768\).
(ii) A mean prediction head \(\mu: (\mathbb{R}^{d}, \Theta_{\mu}) \to \mathbb{R}\) which, given parameters \(\theta_{\mu} \in \Theta_{\mu}\), maps a representation vector \(v \in \mathbb{R}^{d}\) to an estimate of the mean of a normal distribution describing the target variable.
(iii) A variance prediction head \(\sigma: (\mathbb{R}^{d}, \Theta_{\sigma}) \to \mathbb{R}^{+}\) which, given parameters \(\theta_{\sigma} \in \Theta_{\sigma}\), maps a representation vector \(v \in \mathbb{R}^{d}\) to an estimate of the sigma parameter of a normal distribution describing the target variable.

Since GA values are non-negative, we modelled them in log space. The parameter optimisation process sought to minimise

$$\mathcal{L} := \sum_{(x,y) \in \mathcal{D}} \frac{\left| y - \mu(x) \right|_{2}^{2}}{2} - \ln \mathcal{N}\left( y \mid \lfloor \mu(x) \rfloor, \sigma\left( \lfloor f_{trunk}(x) \rfloor \right) \right)$$

where:
(i) \(\mathcal{D}\) denotes the dataset of image \(x\) and label \(y\) pairs
(ii) \(\left| \bullet \right|_{2}\) denotes the \(l^{2}\)-norm
(iii) \(\mathcal{N}\) denotes the pdf of a Normal distribution with the given mean and sigma parameters
(iv) \(\lfloor \bullet \rfloor\) denotes a stop-gradient operation

Despite the stop-gradient operations applied in this approach, we found that phased model training delivers superior results:
Phase 1: Freeze the sigma head, optimising only parameters in the trunk and mean prediction head.
Phase 2: Finetune from the Phase 1 model, with all parameters unfrozen.

Each phase was trained for 100 epochs, using the Adam optimiser [24] with \(\epsilon = 10^{-8}\), \(\beta_{1} = 0.9\) and \(\beta_{2} = 0.999\).
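To make the training objective above concrete, the following is a minimal numerical sketch of the per-sample loss value, written in plain Python rather than a deep learning framework. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the example numbers are invented, and the targets are assumed to be the natural logarithm of GA in days. A stop-gradient leaves the forward value unchanged, so it appears here only as a comment; in a real framework it would prevent the likelihood term from updating the trunk and the mean head.

```python
import math


def normal_log_pdf(y: float, mean: float, sigma: float) -> float:
    """Log density of a Normal distribution with the given mean and sigma."""
    z = (y - mean) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def per_sample_loss(y_log: float, mu: float, sigma: float) -> float:
    """Squared-error term plus Gaussian negative log-likelihood term.

    In training, gradients of the first term reach the trunk and mean head,
    while the second term (with mu and the trunk output under stop-gradient)
    would only update the sigma head. Forward values are unaffected.
    """
    squared_error = 0.5 * (y_log - mu) ** 2
    negative_log_likelihood = -normal_log_pdf(y_log, mu, sigma)
    return squared_error + negative_log_likelihood


def dataset_loss(samples) -> float:
    """Sum the per-sample loss over (y_log, mu, sigma) triples."""
    return sum(per_sample_loss(y, mu, sigma) for y, mu, sigma in samples)


# Hypothetical example: true GA 140 days, predicted mean exp(mu) = 143 days.
example = per_sample_loss(math.log(140.0), math.log(143.0), 0.05)
print(round(example, 4))
```

One consequence visible in the sketch is that the likelihood term rewards a small sigma only when the prediction is close to the target, which is what lets the sigma head act as an image-level uncertainty signal.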
The learning rate followed a 2-phase schedule: a linear warm-up phase increasing it from 0 to a maximum value over the first 10 epochs, followed by a decrease back to 0 on a cosine annealing schedule over the following 90 epochs. The maximum learning rate that was applied was derived via the process described in [25].

Training data were augmented during mini-batch preparation to enhance diversity and encourage better model generalisation. A variant of the RandAugment [26] approach was taken, by which a random set of transformations was selected per image. The list of available transformations was tailored towards being appropriate for ultrasound images, and is as follows: rotation, re-scaling, horizontal flip, blur, brightness & contrast jitter, pixel-wise multiplicative noise and grid distortion.

Applying to Video Data

Typically, several images of the fetus are obtained during an ultrasound examination, and in real-world imaging this is in the form of a real-time video. In order to ensure applicability to video clips, our method was designed to generate more precise estimates when multiple images of a given fetus are available, by applying a static 1D Kalman Filter to the estimates (and corresponding uncertainties) that are obtained by applying the Deep Learning model to the individual images. The Kalman Filter results are continuously updated as new frames become available, producing values for the mean and standard deviation of the estimate based on all of the frames analysed up to that point. At the point that the resultant standard deviation drops below a threshold value, indicating sufficient precision in the estimate, the process is stopped and the mean value reported.

Validation

Validation was undertaken by analysis of retrospective data that were wholly independent of the data used in model development.
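The frame-by-frame fusion described under Applying to Video Data can be sketched as a static one-dimensional Kalman filter, that is, one with a constant state and no process noise. This is a sketch under assumptions rather than the study's implementation: the function name, the initialisation from the first frame, the example numbers and the threshold value are all hypothetical, and the separate step that discards frames without useful information is not modelled.

```python
import math
from typing import Iterable, Optional, Tuple


def fuse_frames(
    frames: Iterable[Tuple[float, float]], sd_threshold: float
) -> Tuple[Optional[float], Optional[float], int]:
    """Fuse per-frame (estimate, standard deviation) pairs until confident.

    Returns the fused mean, its standard deviation, and the number of
    frames consumed. Stops as soon as the fused SD drops below sd_threshold.
    """
    mean: Optional[float] = None
    variance: Optional[float] = None
    frames_used = 0
    for estimate, sd in frames:
        measurement_variance = sd * sd
        frames_used += 1
        if mean is None or variance is None:
            # Assumption: initialise the state from the first frame.
            mean, variance = estimate, measurement_variance
        else:
            # Standard Kalman update for a constant (static) state.
            gain = variance / (variance + measurement_variance)
            mean = mean + gain * (estimate - mean)
            variance = (1.0 - gain) * variance
        if math.sqrt(variance) < sd_threshold:
            break
    fused_sd = math.sqrt(variance) if variance is not None else None
    return mean, fused_sd, frames_used


# Hypothetical per-frame outputs (GA in days, SD in days).
mean, sd, used = fuse_frames([(140.0, 6.0), (146.0, 6.0), (143.0, 3.0)], sd_threshold=2.5)
print(mean, sd, used)
```

With no process noise, this update is equivalent to inverse-variance weighting, so a frame with a large reported uncertainty contributes little to the fused estimate while a confident frame can bring the process to an early stop.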
GA estimates calculated by the AI model were compared to GA estimates derived from fetal biometric measurement at the same ultrasound examination, which represents clinical best practice. This comparison was made using Mean Absolute Error (MAE). Two datasets were used for validation:
(i) Retrospective image data: Comprising a large sample of image sets stored during routine ultrasound examinations. This dataset was large enough to support subset analyses by gestational age, maternal BMI, and country of scanning.
(ii) Retrospective video data: Comprising a smaller sample of ultrasound videos that were created by splicing together small random sub-segments of full-length fetal scans in a random order to approximate scans conducted in an undirected manner.

Validation Using Images

This component of the validation compared the error, relative to the gold standard, of model estimates obtained from sets of images stored in patient records during a routine ultrasound examination with that of estimates obtained from biometric measurement. The dataset covers a wide range of the period of fetal gestation, a range of maternal BMIs and countries, and is summarised in Table 4. The sample size was estimated to enable detection of a 1-day difference in MAE at 90% confidence level for all subsets of interest. No data from any of the patients included in the validation dataset were used at any point during the development process.

Validation Using Videos

This component of the validation aims to confirm that the model estimates are accurate when calculated from videos of scans that were obtained in an undirected fashion. It further aims to establish that the results obtained from the model do not depend heavily on the particular order in which fetal anatomy happens to be scanned. A dataset was created that consists of videos of ultrasound scanning, designed to simulate undirected scanning (for example scanning without reference to the ultrasound images).
To achieve this, we used 99 full-length videos of routine ultrasound examinations for which the actual GA was known (based on a previous CRL measurement, the gold standard). From these 99 videos we created 99 3-minute videos via the following process:
1. For each scan, randomly select 36 non-overlapping 5-second subsegments of the video.
2. For each scan, randomly shuffle the order of the subsegments.
3. Concatenate the shuffled subsegments into a 3-minute clip.

For each of these videos, we also repeated steps 2 and 3 to produce another video having the same content but with the subsegments in a different order.

During the process of video analysis, the model assesses each frame it sees to determine whether it contains useful information, and the model only utilizes frames that contain such information. Because the model outputs an estimate as soon as it reaches a sufficiently confident result and ignores the rest of the video, the re-shuffling process may result in different parts of fetal anatomy being presented to the model during the period that it analyses.

We have also analysed the time taken by the algorithm to produce an estimate on each of these videos, and report the cumulative distribution.

Comparison to Biometric Estimates

GA estimates by biometry were computed from measurements of the fetal biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) associated with the scans, using the following formula [14]:

$$GA_{days} = 7 \times \left( 10.85 + \left( 0.0006 \times HC \times FL \right) + \left( 0.067 \times BPD \right) + \left( 0.0168 \times AC \right) \right)$$

For scans that were conducted prior to week 14, the clinical gold-standard approach to estimating GA is via CRL measurement. This means that the biometric estimates are, by definition, correct, and a comparison of model estimates against them is therefore not feasible.
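As a worked illustration of the biometric formula given under Comparison to Biometric Estimates, the sketch below evaluates it directly. The function name and the example measurements are hypothetical. The source does not state measurement units; the coefficients appear consistent with all four measurements being in millimetres, and that is treated here as an assumption.

```python
def biometry_ga_days(bpd_mm: float, hc_mm: float, ac_mm: float, fl_mm: float) -> float:
    """Evaluate the four-parameter biometric GA formula, returning days.

    Assumption: inputs are in millimetres (units are not stated in the text).
    """
    ga_weeks = (
        10.85
        + 0.0006 * hc_mm * fl_mm
        + 0.067 * bpd_mm
        + 0.0168 * ac_mm
    )
    return 7.0 * ga_weeks


# Hypothetical mid-pregnancy measurements, in millimetres.
ga_days = biometry_ga_days(bpd_mm=50.0, hc_mm=180.0, ac_mm=160.0, fl_mm=35.0)
weeks, days = divmod(round(ga_days), 7)
print(f"{ga_days:.1f} days, approximately {weeks} + {days} weeks")
```

The product term HC × FL means that measurement error in either quantity is scaled by the other, which is one reason biometric estimates become less precise as the fetus grows.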
For these early scans, we instead compare model estimates to an accuracy benchmark [12], which quantifies the error in CRL-based estimates relative to GAs calculated from conception dates that are known with absolute certainty.

Results

As outlined above, the model was applied to two types of data: static images obtained from routine ultrasound examinations, and videos that simulate undirected ultrasound scanning.

Model validation on images

The MAE of the model and biometry-based estimates (“Biometry Measured”, as described in Comparison to Biometric Estimates) are summarised in Table 4. The AI-model based estimates were consistently more accurate than those obtained from biometry over the whole range of GAs from 10 to 36 weeks (Table 4). The superiority of the model’s GA estimates over those obtained by biometry was strongly statistically significant from 14 to 24 weeks, and moderately significant over 24 weeks. The MAE expected for biometry-based estimates reported in the literature (“Biometry Literature Benchmark”) for each GA band is also provided in Table 4.

Subanalysis of the MAE of the AI-model and biometry-based estimates by country demonstrates that model estimates were at least as accurate as those obtained from biometry in all scanning countries (Table 5), with results statistically more accurate for the UK and Australia. The lack of significance among scans from India may be due to the smaller sample size, or to a lower than expected error of the biometry-based estimates obtained during weeks 18–24.

Subanalysis of MAE by maternal BMI shows that model estimates are at least as accurate as those obtained from biometry for all bands of maternal BMI (Table 6).

The relationship between the predicted gestational age generated by the model and the corresponding ground truth GA values is shown in Fig. 1. Visual assessment of the accuracy, precision and potential biases of the model’s predictions shows that data points are clustered around the diagonal line representing perfect prediction.
The Bland-Altman plot in Fig. 2 offers a complementary perspective on the relationship between predicted gestational age and corresponding actual GA values. It confirms a high degree of correspondence between the predicted and actual values, and also shows, in common with other approaches to estimating GA, that error magnitudes increase with GA.

Model validation with video

The MAE of the model estimates has been calculated within bands of (actual) GA and is reported in Table 7. The results show that the model estimates are more accurate than would be expected from biometric measurement in all trimesters. A scatter plot and Bland-Altman plot of the relationship between predicted and actual GA values can be found in the Supplementary Materials.

The amount of time needed to generate a sufficiently confident GA estimate varies based on the level of useful information encountered within the frames of each video. Figure 3 shows the cumulative distribution of the time needed to produce predictions on the validation set, demonstrating that in 95% of cases less than a minute of video is needed. The median time to produce an estimate is 24 seconds.

To assess the impact that re-shuffling the 5-second video snippets within the videos has on model predictions, a GA estimate was produced for the shuffled and re-shuffled versions of the same video. Re-shuffling presented frames to the model in a different order and also, as noted above, presented some frames that the model had not analysed in the original shuffled video. The difference in the estimates obtained from each pair of videos was less than 3 days in around 90% of cases. This is a relatively small movement in relation to the average magnitude of error in the estimates, suggesting that the accuracy of model estimates was not strongly dependent on the order in which fetal anatomy is scanned.
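For readers who wish to reproduce the style of analysis reported in the Results on their own data, the following sketch computes the MAE and the summary statistics that underlie a Bland-Altman plot. The function names and example values are hypothetical, and the use of 1.96 standard deviations for the limits of agreement is the conventional choice rather than something the source specifies.

```python
import statistics
from typing import Sequence, Tuple


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean of the absolute differences between paired predictions and truths."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return statistics.fmean(abs(p - a) for p, a in zip(predicted, actual))


def bland_altman_summary(
    predicted: Sequence[float], actual: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) of agreement.

    Bias is the mean signed difference; limits are bias +/- 1.96 sample SDs
    (a conventional choice, assumed here).
    """
    differences = [p - a for p, a in zip(predicted, actual)]
    bias = statistics.fmean(differences)
    spread = 1.96 * statistics.stdev(differences)
    return bias, bias - spread, bias + spread


# Hypothetical predicted and ground-truth GA values, in days.
predicted = [100.0, 150.0, 203.0, 248.0]
actual = [98.0, 153.0, 200.0, 250.0]
print(mean_absolute_error(predicted, actual))
print(bland_altman_summary(predicted, actual))
```

MAE summarises the typical size of the error, while the Bland-Altman bias and limits separate systematic offset from scatter, which is why the two views are complementary.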
Discussion

The main contributions of this paper are to: (i) Introduce a method of training a Deep Learning model from very large retrospective ultrasound image datasets that is capable of outputting both a GA estimate and a well-calibrated estimate of the level of uncertainty inherent in that estimate, (ii) Introduce a method for applying such a model to a succession of ultrasound images obtained from a video sequence or collection of static images, by applying a static 1D Kalman filter to its outputs and (iii) Train such a model on a large, very diverse dataset of ultrasound images, containing data on over 75k fetuses, and sourced from scanning centres spanning 3 continents.

Our results show that the accuracy of the outputs of this model is superior to estimates obtained via current standard clinical practice (biometric measurement), both at an overall level and in all key subset analyses that were performed.

While direct comparisons to other results presented in the literature are made difficult by differences in data, scanning approach, and methods of analysis, we consider that the results that we have presented are competitive with those of Gomes et al. [27], who developed a deep learning neural network model trained on a dataset of 1968 subjects to estimate GA from short ultrasound videos, some of which were acquired via a blind sweep scanning protocol, and some of which were “fly-to” videos targeting biometry planes. Their model achieved MAE values of approximately 1.92 days (14–19 weeks), 2.78 days (20–25 weeks), 2.97 days (26–31 weeks) and 3.07 days (32–37 weeks) when measured on videos obtained by trained sonographers following a blind sweep protocol. They further demonstrated that accuracy was non-inferior to biometry-based estimates when applied to videos obtained by novices following the blind sweep protocol (though with lower accuracy than when performed by a trained sonographer).
The methods we describe in this paper have the benefit of being readily applicable to large retrospective image archives, rather than requiring video data for model training.

Maraci et al. [28] describe a method for estimating GA by applying deep learning to images of the trans-cerebellar view of the fetal head to produce an automated TCD measurement that can then be converted to a GA estimate via standard reference charts. Their model was trained on a dataset of 500 ultrasound images of fetuses between 16 and 26 weeks GA and yielded MAE values of approximately 5.4 days, though with respect to a GA estimate obtained through manual TCD measurement, which is not the gold-standard approach.

Finally, Lee et al. [29] describe a slightly more complex approach which uses deep learning to analyse images from multiple biometric planes (TV, AC, FL) to estimate GA directly. They utilized a dataset of 3809 subjects and reported MAE values of 2.6 days (14–19 weeks), 3.1 days (20–25 weeks), 3.3 days (26–31 weeks), and 4.6 days (32–37 weeks). A drawback of both of the latter two approaches is that they require the operator to acquire images from particular planes, meaning that they are not applicable to scanning by novices.

A limitation of the analysis presented is that it does not quantify how accurate the model is when applied directly to videos of scans obtained by novice users. A natural next step in enhancing our understanding of the performance of our method in a real-world setting will be to perform a prospective study in which novice users obtain GA estimates using it, and then to analyse the results.

We consider that the methods presented represent progress towards democratising the use of ultrasound in clinical decision making, through developing automated systems that are able to interpret ultrasound data that is acquired without requiring sonographic training.
Such systems could be operated by a wider set of medical practitioners, including, for example, General Practitioners and midwives. This could drive improvements in clinical care, especially in LMICs, and may also enhance efficiency and patient outcomes in developed countries.
Declarations
Declaration of Interests
MB, SW, TH and NS are permanent employees of Intelligent Ultrasound. AP is a scientific advisor for Intelligent Ultrasound. SM and SS are directors of medical institutions which have contributed data to the research.
Funding: This study was funded by Intelligent Ultrasound. ATP was part-funded by the NIHR Oxford Biomedical Research Centre.
Author Contribution
MB: Conceptualisation of the study and methodology design. Oversaw data management, model development, and validation. Drafted and reviewed the manuscript.
SW: Contributed to data preparation and pre-processing for deep learning model training. Assisted with statistical analysis and interpretation of results. Drafted and reviewed the manuscript.
TH: Development and optimisation of the deep learning model. Performed validation experiments and uncertainty estimation. Contributed to manuscript writing, particularly in the methods section.
SM and SS: Provided clinical data and guidance on ultrasound image acquisition protocols. Contributed to the interpretation of clinical relevance of findings. Reviewed the manuscript critically for intellectual content.
NS: Conceptualisation of the study and methodology design. Contributed to the study’s technical design and software development. Provided input on AI model architecture and implementation. Coordinated the collaboration among the institutions. Co-wrote, reviewed and approved the final manuscript.
ATP: Conceptualisation and overall guidance on study design and clinical translation. Coordinated the collaboration among the institutions. Co-wrote, edited and approved the manuscript as the senior author.
Data Availability
Data used in this analysis cannot be shared due to ethical committee and participant privacy constraints.
References
1. Hall, M. H., & Carr-Hill, R. A. (1985). The significance of uncertain gestation for obstetric outcome. BJOG: An International Journal of Obstetrics & Gynaecology, 92(5), 452–460.
2. Bilardo, C. M., Chaoui, R., Hyett, J. A., Kagan, K. O., Karim, J. N., Papageorghiou, A. T., … Nicolaides, K. H. (2023). ISUOG Practice Guidelines (updated): performance of 11–14-week ultrasound scan. Ultrasound in Obstetrics and Gynecology, 61(1).
3. Kalish, R. B., & Chervenak, F. A. (2005). Sonographic determination of gestational age. The Ultrasound Review of Obstetrics and Gynecology, 5(4), 254–258.
4. Savitz, D. A., Terry Jr, J. W., Dole, N., Thorp Jr, J. M., Siega-Riz, A. M., & Herring, A. H. (2002). Comparison of pregnancy dating by last menstrual period, ultrasound scanning, and their combination. American Journal of Obstetrics and Gynecology, 187(6), 1660–1666.
5. Wegienka, G., & Baird, D. D. (2005). A comparison of recalled date of last menstrual period with prospectively recorded dates. Journal of Women's Health, 14(3), 248–252.
6. Chiazze, L., Brayer, F. T., Macisco, J. J., Parker, M. P., & Duffy, B. J. (1968). The length and variability of the human menstrual cycle. JAMA, 203(6), 377–380.
7. Creinin, M. D., Keverline, S., & Meyn, L. A. (2004). How regular is regular? An analysis of menstrual cycle regularity. Contraception, 70(4), 289–292.
8. Pettker, C. M., Goldberg, J. D., El-Sayed, Y. Y., & Copel, J. A. (2017).. Obstetrics and Gynecology, 129(5), E150-E154.
9. Robinson, H. P., & Fleming, J. E. E. (1975). A critical evaluation of sonar “crown-rump length” measurements. BJOG: An International Journal of Obstetrics & Gynaecology, 82(9), 702–710.
10. Ohuma, E. O., Papageorghiou, A. T., Villar, J., & Altman, D. G. (2013). Estimation of gestational age in early pregnancy from crown-rump length when gestational age range is truncated: the case study of the INTERGROWTH-21st Project. BMC Medical Research Methodology, 13, 1–14.
11. Geirsson, R. T. (1991). Ultrasound instead of last menstrual period as the basis of gestational age assignment. Ultrasound in Obstetrics and Gynecology: The Official Journal of the International Society of Ultrasound in Obstetrics and Gynecology, 1(3), 212–219.
12. Papageorghiou, A. T., Kennedy, S. H., Salomon, L. J., Ohuma, E. O., Cheikh Ismail, L., Barros, F. C., … International Fetal and Newborn Growth Consortium for the 21st Century (INTERGROWTH-21st). (2014). International standards for early fetal size and pregnancy dating based on ultrasound measurement of crown–rump length in the first trimester of pregnancy. Ultrasound in Obstetrics & Gynecology, 44(6), 641–648.
13. Benson, C. B., & Doubilet, P. M. (1991). Sonographic prediction of gestational age: accuracy of second- and third-trimester fetal measurements. AJR. American Journal of Roentgenology, 157(6), 1275–1277.
14. Hadlock, F. P., Deter, R. L., Harrist, R. B., & Park, S. K. (1984). Estimating fetal age: computer-assisted analysis of multiple fetal growth parameters. Radiology, 152(2), 497–501.
15. Bradburn, E., Conde-Agudelo, A., Roberts, N. W., Villar, J., & Papageorghiou, A. T. (2024). Accuracy of prenatal and postnatal biomarkers for estimating gestational age: a systematic review and meta-analysis. EClinicalMedicine, 70, 102498. doi: 10.1016/j.eclinm.2024.102498
16. Drukker, L., Yasrab, R., Noble, J. A., & Papageorghiou, A. T. (2021). VP18.07: First trimester scans: how much time does it take to acquire the CRL and NT? Ultrasound in Obstetrics & Gynecology, 58.
17. Salomon, L. J., Alfirevic, Z., Berghella, V., Bilardo, C. M., Chalouhi, G. E., Costa, F. D. S., … Lee, W. (2022). ISUOG Practice Guidelines (updated): performance of the routine mid-trimester fetal ultrasound scan. Ultrasound in Obstetrics and Gynecology, 59(6), 840–856.
18. Khalil, A., Sotiriadis, A., D'Antonio, F., Da Silva Costa, F., Odibo, A., Prefumo, F., … Salomon, L. J. (2024). ISUOG Practice Guidelines: performance of third-trimester obstetric ultrasound scan. Ultrasound in Obstetrics & Gynecology, 63(1), 131–147.
19. Pexsters, A., Daemen, A., Bottomley, C., Van Schoubroeck, D., De Catte, L., De Moor, B., … Bourne, T. (2010). New crown–rump length curve based on over 3500 pregnancies. Ultrasound in Obstetrics and Gynecology: The Official Journal of the International Society of Ultrasound in Obstetrics and Gynecology, 35(6), 650–655.
20. LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.
21. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25.
22. Stirn, A., Wessels, H., Schertzer, M., Pereira, L., Sanjana, N., & Knowles, D. (2023, April). Faithful heteroscedastic regression with neural networks. In International Conference on Artificial Intelligence and Statistics (pp. 5593–5613). PMLR.
23. Liu, Z., Mao, H., Wu, C. Y., Feichtenhofer, C., Darrell, T., & Xie, S. (2022). A convnet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 11976–11986).
24. Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
25. Smith, L. N. (2017, March). Cyclical learning rates for training neural networks. In 2017 IEEE Winter Conference on Applications of Computer Vision (WACV) (pp. 464–472). IEEE.
26. Cubuk, E. D., Zoph, B., Shlens, J., & Le, Q. V. (2020). Randaugment: Practical automated data augmentation with a reduced search space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 702–703).
27. Gomes, R. G., Vwalika, B., Lee, C., Willis, A., Sieniek, M., Price, J. T., … Shetty, S. (2022). A mobile-optimized artificial intelligence system for gestational age and fetal malpresentation assessment. Communications Medicine, 2(1), 128.
28. Maraci, M. A., Yaqub, M., Craik, R., Beriwal, S., Self, A., Von Dadelszen, P., … Noble, J. A. (2020). Toward point-of-care ultrasound estimation of fetal gestational age from the trans-cerebellar diameter using CNN-based ultrasound image analysis. Journal of Medical Imaging, 7(1), 014501–014501.
29. Lee, L. H., Bradburn, E., Craik, R., Yaqub, M., Norris, S. A., Ismail, L. C., … Papageorghiou, A. T. (2023). Machine learning for accurate estimation of fetal gestational age based on ultrasound images. NPJ Digital Medicine, 6(1), 36.

Tables

Table 1. Summary of retrospective ultrasound image data obtained during routine clinical scanning collated from multiple centres in India, Australia and the UK.

| Data Source | Images (n) | Dataset Description | Route | |
|---|---|---|---|---|
| Australia | 9,104,090 | Imaging centers, Melbourne, Australia. | TA and TV | GE Voluson E6, E8 and E10. |
| India | 1,028,534 | Hospital based, Chennai, India. | TA and TV | GE Voluson E6, E7, E10, P8, S6 and Mindray Resona 7. |
| UK | 639,618 | Imaging center, London, UK. | TA | GE Voluson E10 and Expert 22. |

Table 2. The data volumes used for developing the model

| Data Source | Training Set: Number of Images | Training Set: Number of Subjects | Holdout Set: Number of Images | Holdout Set: Number of Subjects |
|---|---|---|---|---|
| India | 610,208 | 29,094 | 67,386 | 3,215 |
| Australia | 1,149,244 | 41,391 | 127,627 | 4,607 |
| UK | 240,877 | 8,046 | 25,258 | 896 |
| Total | 2,000,329 | 78,531 | 220,271 | 8,718 |

Table 3. Summary of retrospective image data for model validation

| GA Band | Number of Data Subjects |
|---|---|
| (10,14] | 193 |
| (14,18] | 98 |
| (18,22] | 184 |
| (22,26] | 100 |
| (26,30] | 84 |
| (30,34] | 83 |
| Total | 742 |

Table 4.
Comparison of MAEs by GA Band

| GA Band (weeks) | Number of Scans | Biometry Literature Benchmark MAE (days) | Biometry Measured MAE (days) | IU ScanNav FetalCheck MAE (days) | MAE Superiority to Literature Benchmark [1] (-ve = improvement to benchmark) | MAE Superiority to Measured (-ve = improvement to benchmark) | Superiority to Measured p-value [2] |
|---|---|---|---|---|---|---|---|
| 10-14 | 190 | 2.8 [3] | - [4] | 1.3 | -1.5 (-54%) | - | - |
| 14-18 | 100 | 3.0 [5] | 3.4 | 1.7 | -1.3 (-43%) | -1.7 (-50%) | <0.001 |
| 18-24 | 278 | 3.9 [5] | 3.6 | 2.8 | -1.1 (-28%) | -0.8 (-28%) | 0.001 |
| 24-30 | 92 | 5.0 [5] | 6.1 | 5.0 | 0.0 (0%) | -1.1 (-18%) | 0.06 |
| 30-36 | 82 | 6.8 [5] | 5.7 | 4.7 | -2.1 (-31%) | -1.0 (-18%) | 0.08 |

Table 5. Comparison of MAEs by Scanning Country

| Scanning Country | Number of Scans | Biometry Measured MAE (days) | IU ScanNav FetalCheck MAE (days) | MAE Superiority to Measured | Superiority to Measured p-value [6] |
|---|---|---|---|---|---|
| UK | 98 | 2.4 | 1.5 | -0.9 (-38%) | <0.001 |
| India | 344 | 5.4 | 3.5 | -1.9 (-35%) | 0.14 |
| Australia | 300 | 3.7 | 2.2 | -1.5 (-41%) | <0.001 |

Table 6. Comparison of MAEs by Maternal BMI

| Maternal BMI | Number of Scans | Biometry Measured MAE (days) | IU ScanNav FetalCheck MAE (days) | MAE Superiority to Measured | Superiority to Measured p-value [6] |
|---|---|---|---|---|---|
| 30 | 108 | 3.8 | 3.2 | -0.6 (-15%) | 0.22 |

Table 7. MAEs of estimates obtained from video

| GA Band | Number of Scans | Expected Biometry MAE (days) | IU ScanNav FetalCheck MAE (days) | MAE Superiority to Expected |
|---|---|---|---|---|
| 10-14 Weeks | 36 | 2.8 | 2.5 | -0.3 (-11%) |
| 14-27 Weeks | 58 | 3.9 | 3.7 | -0.2 (-5%) |
| 27-34 Weeks | 5 | 5.4 | 2.6 | -2.8 (-52%) |

[1] Absolute difference is reported, followed by relative difference in brackets.
[2] Calculated via Wilcoxon Signed-Rank Test
[3] Derived from error standard deviation reported in [10], assuming CRL=55mm.
[4] Since CRL-based GA is by definition correct in this band, the measured value is not reported.
[5] Derived from error standard deviations reported in [15].
[6] Calculated via Wilcoxon Signed-Rank Test

Additional Declarations
No competing interests reported.
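The "MAE Superiority" columns in Tables 4 to 7 report an absolute difference in days followed by a relative difference in brackets (footnote [1]). The sketch below shows one way that arithmetic could be carried out; the function names are hypothetical, and it works from the rounded MAE values printed in the tables, so it may not match every published cell exactly if the authors computed from unrounded values.

```python
from typing import Tuple


def mae_superiority(model_mae: float, reference_mae: float) -> Tuple[float, int]:
    """Return (absolute difference in days, relative difference in whole percent).

    Negative values mean the model's MAE is lower than the reference MAE.
    """
    absolute = round(model_mae - reference_mae, 1)
    relative = round(100.0 * (model_mae - reference_mae) / reference_mae)
    return absolute, relative


def format_superiority(model_mae: float, reference_mae: float) -> str:
    """Format the difference in the style used in the tables, e.g. '-1.5 (-54%)'."""
    absolute, relative = mae_superiority(model_mae, reference_mae)
    return f"{absolute:.1f} ({relative}%)"


if __name__ == "__main__":
    # Table 4, 10-14 week band: model MAE 1.3 days vs literature benchmark 2.8 days.
    print(format_superiority(1.3, 2.8))
```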
Editorial decision: Revision requested 01 Mar, 2025
Reviews received at journal 01 Mar, 2025
Reviews received at journal 28 Feb, 2025
Reviewers agreed at journal 28 Feb, 2025
Reviewers agreed at journal 28 Feb, 2025
Reviewers invited by journal 28 Feb, 2025
Editor assigned by journal 28 Jan, 2025
Submission checks completed at journal 28 Jan, 2025
First submitted to journal 26 Jan, 2025
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-5907990","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":408515933,"identity":"49440054-fd0f-4bcb-907d-62124cd0a451","order_by":0,"name":"Martin Benson","email":"","orcid":"","institution":"Intelligent Ultrasound","correspondingAuthor":false,"prefix":"","firstName":"Martin","middleName":"","lastName":"Benson","suffix":""},{"id":408515934,"identity":"4b6b5ef1-8488-4a74-abd8-d97a3a71b852","order_by":1,"name":"Sacha Walton","email":"","orcid":"","institution":"Intelligent Ultrasound","correspondingAuthor":false,"prefix":"","firstName":"Sacha","middleName":"","lastName":"Walton","suffix":""},{"id":408515935,"identity":"57afd7de-6a82-424b-a918-58fc1277eb80","order_by":2,"name":"Tom Hartley","email":"","orcid":"","institution":"Intelligent Ultrasound","correspondingAuthor":false,"prefix":"","firstName":"Tom","middleName":"","lastName":"Hartley","suffix":""},{"id":408515936,"identity":"5be9c21a-92b9-4260-8a42-d92ed2a88e1e","order_by":3,"name":"Simon Meagher","email":"","orcid":"","institution":"Monash Ultrasound for Women","correspondingAuthor":false,"prefix":"","firstName":"Simon","middleName":"","lastName":"Meagher","suffix":""},{"id":408515937,"identity":"0b1b9e6d-cbf7-44c4-806f-91bb13bc714c","order_by":4,"name":"Suresh Seshadri","email":"","orcid":"","institution":"Mediscan Systems","correspondingAuthor":false,"prefix":"","firstName":"Suresh","middleName":"","lastName":"Seshadri","suffix":""},{"id":408515938,"identity":"0510c6c0-6035-45dc-a1d9-10677ec92034","order_by":5,"name":"Nicholas 
Sleep","email":"","orcid":"","institution":"Intelligent Ultrasound","correspondingAuthor":false,"prefix":"","firstName":"Nicholas","middleName":"","lastName":"Sleep","suffix":""},{"id":408515939,"identity":"f05654a5-6d3f-46af-b0a0-7e2900dac95c","order_by":6,"name":"Aris T Papageorghiou","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABQklEQVRIie2SPUvDQBiA7wgky0HWO5T4FxICdRH9KzkCTmmXgFQcvFA4l2BXwcHBP6BLyRjJ0OXarBFcQiBzXIJKB6+lSG3ixyiYBw7u6+F9730PgI6OPwlkm6sYAaDEm5vObxTV+UH5TCwHMr9V9OtkVMHocKCPbx+K4dvTro69euc0AoYZK3kF+fG2gueUYyhcH2ela80uS0Su+hMyE8AmTLUx5F4jEwE5gFyhLBM9EoQJMrP+5DHggI4B6Mmj4baxJ6BMjJ/Tmyyt14pXrhQOtLpNMQVkMnpC79JQJexlpajrKGgZpZGYJRPDVEx9K1NtK2AJImG5/xpwbJMR8rEzbzzfEFpRPUdnAyNN8pwtkiNdcwsr4AeGOb24r6oTt60XgDJZfSwbIEvxUUr5C9q6sgYuFT2Ws8UXNzo6Ojr+Ne+omnYOAt6yDgAAAABJRU5ErkJggg==","orcid":"","institution":"Intelligent Ultrasound","correspondingAuthor":true,"prefix":"","firstName":"Aris","middleName":"T","lastName":"Papageorghiou","suffix":""}],"badges":[],"createdAt":"2025-01-26 18:53:15","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-5907990/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-5907990/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1038/s41746-025-02024-z","type":"published","date":"2025-11-20T15:57:52+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":75087204,"identity":"18effd9a-6a36-4efb-a653-debe9323b990","added_by":"auto","created_at":"2025-01-30 10:19:28","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":40968,"visible":true,"origin":"","legend":"\u003cp\u003eScatter plot of predicted and actual gestational age 
values\u003c/p\u003e","description":"","filename":"image1.png","url":"https://assets-eu.researchsquare.com/files/rs-5907990/v1/83744fff379f207bfcfdb3c4.png"},{"id":75087205,"identity":"fd00bccc-69c1-4445-83f9-e35b26e830ee","added_by":"auto","created_at":"2025-01-30 10:19:28","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":54280,"visible":true,"origin":"","legend":"\u003cp\u003ePredicted vs Actual GA Bland-Altman Plot\u003c/p\u003e","description":"","filename":"image2.png","url":"https://assets-eu.researchsquare.com/files/rs-5907990/v1/e2d10832ce6149cdf73c3283.png"},{"id":75087208,"identity":"6b8312cc-5f97-4f08-9276-68ddf728d6dd","added_by":"auto","created_at":"2025-01-30 10:19:28","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":11716,"visible":true,"origin":"","legend":"\u003cp\u003eCumulative Distribution of Time Taken to Generate GA Prediction\u003c/p\u003e","description":"","filename":"image3.png","url":"https://assets-eu.researchsquare.com/files/rs-5907990/v1/84a3bf2370ac1f624c012a44.png"},{"id":96650884,"identity":"29517a62-6f31-43da-b879-b3215fc9079a","added_by":"auto","created_at":"2025-11-24 16:12:33","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1181898,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-5907990/v1/4c1e6c92-b277-47ce-ae0d-a42cbe0fd579.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Fetal Gestational Age Estimation Using AI on Simple Ultrasound Images and Video","fulltext":[{"header":"Research in Context","content":"\u003ch2\u003eEvidence before this study\u003c/h2\u003e\n\u003cp\u003ePrior to conducting our research, we updated our previous three systematic reviews to thoroughly understand current methods for estimating gestational age (GA). 
Comprehensive database searches were conducted across MEDLINE, Embase, CINAHL, LILACS, the Cochrane Database of Systematic Reviews, and the Science Citation Index, covering studies published from January 1970 up to the present. These searches assessed studies on prenatal and postnatal liquid biomarkers for GA estimation; dating by Crown-rump length (CRL) in the first trimester; and ultrasound biometry and symphysis-fundal height measurements in the second and third trimesters. Our findings highlighted the lack of effective prenatal liquid biomarkers for accurately predicting GA throughout pregnancy, necessitating reliance on ultrasound biometry. We also noted considerable methodological heterogeneity and biases in the CRL measurement equations for first trimester GA estimation, potentially affecting their accuracy. Furthermore, the high operator dependency required for expert fetal biometry in later trimesters underscores the need for standardised techniques, especially critical for late-presenting pregnancies in resource-limited settings. Collectively, these reviews underscore the imperative for a reliable, universally applicable, and less operator-dependent method for GA estimation.\u003c/p\u003e\n\u003ch2\u003eAdded value of this study\u003c/h2\u003e\n\u003cp\u003eOur study introduces an AI model that uses ultrasound images to estimate GA with minimal operator input. This approach addresses the limitations identified in the systematic reviews by providing a standardized, accessible, and scalable solution for accurate GA estimation across diverse clinical settings.\u003c/p\u003e\n\u003ch2\u003eImplications of all the available evidence\u003c/h2\u003e\n\u003cp\u003eIntegrating findings from previous systematic reviews with our current study suggests that our AI-enhanced ultrasound approach could revolutionize prenatal care, especially in under-resourced areas. 
This technology supports the global health goal of improving prenatal screenings and maternal-fetal outcomes. Future research should validate this AI model in various clinical settings and continue to refine its capabilities through broader data integration.\u003c/p\u003e"},{"header":"Introduction","content":"\u003cp\u003eAccurate gestational age (GA) estimation is fundamental to prenatal care, guiding critical decisions that impact both maternal and fetal health. Knowledge of GA is essential as it influences the interpretation of fetal growth and well-being, the timing of medical interventions, and the planning and timing of birth. [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e][\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e][\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eClinically GA has been estimated using the last menstrual period (LMP) as a proxy for the time since conception. However, this method is associated with large error due to assumptions about menstrual regularity and ovulation timing, coupled with the potential for inaccurate recall [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e][\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e][\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e][\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e][\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. These limitations have driven the adoption of GA assessment using ultrasound-based fetal measurement. Ultrasound is a safe and widely used low-cost diagnostic technique and is a foundational aspect of pre-natal care globally. 
Fetal ultrasound measurement is now considered the gold standard for GA assessment, particularly when performed early in the first trimester of gestation [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. During this period, crown-rump length (CRL) measurement predicts GA with a precision of 3–7 days [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e][\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e][\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e][\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e][\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e][\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e][\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. Due to fetal curling with advancing pregnancy, the fetal CRL cannot be effectively measured after 14 weeks, and in the second and third trimesters a combination of other fetal measurements is used to determine estimated gestational age. The accuracy of GA determination by fetal biometry reduces with advancing gestation – it is ± 7–10 days until 24 weeks and decreases to ± 10–14 days between 24 and 28 weeks [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e][\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e][\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e][\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. 
In the third trimester (greater than 28 weeks gestation), ultrasound estimation is even less accurate, with previous studies reporting accuracy of ± 21 to 30 days [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e][\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e][\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e][\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThese large errors, combined with the need for significant expertise in performing ultrasound, have led to research into alternative methods. However, thus far no approach has matched ultrasound for GA estimation across the full spectrum of pregnancy. Biomarkers, including human chorionic gonadotropin (hCG) and various metabolomic profiles, have shown some promise in early gestation but are hindered by inconsistencies, wide reference ranges, and limited windows of accuracy [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. In low-resource settings, where late presentation to antenatal care is common, these limitations are particularly problematic.\u003c/p\u003e \u003cp\u003eConsequently, ultrasound remains the preferred modality for GA determination, and the World Health Organization (WHO) recommends that all pregnant women receive at least one ultrasound scan before 24 weeks of gestation. However, this recommendation is aspirational in many low- and middle-income countries (LMICs): here the challenges center around the fact that accurate GA estimation through ultrasound requires not only availability of the technology, but also the expertise to obtain and interpret the necessary measurements. Obtaining precise biometric measurements demands considerable operator skill, time, and fetal cooperation, making it a burdensome task even in well-equipped settings [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. 
This is compounded by the fact that in many LMICs the first antenatal visit occurs late in pregnancy, which diminishes the accuracy of GA estimation. It is therefore desirable to develop methods for estimating GA which are easier, quicker and require less operator skill to perform, and which can be applied at all gestations.\u003c/p\u003e \u003cp\u003eTo address these challenges, we introduce a novel approach that trains a deep learning model on a very large and diverse dataset of ultrasound images. The dataset, stored during routine obstetric examinations along with corresponding gestational age data, is much larger and more diverse than those used previously: it spans 3 continents and is more than an order of magnitude larger than any used to develop similar models. Importantly, the model does not require images from biometry planes as used in most approaches defined by imaging protocols [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e][\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e][\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e] and can provide accurate GA estimates with minimal operator input. In other words, we sought to develop a model designed to work with images obtained without the need for specialized sonographic techniques, making it accessible to users with low levels of training. The model also includes an estimate of uncertainty, allowing for greater confidence in the results, particularly when images are suboptimal. This property of the model enables its potential use on ultrasound data obtained by a wider population, who have not been specially trained in sonography.\u003c/p\u003e \u003cp\u003eOur approach has the potential to democratize access to accurate GA estimation, particularly in settings where skilled sonographers are scarce. 
By enabling novice users to obtain reliable GA assessments, this technology could significantly enhance prenatal care in underserved regions, aligning with global health goals to improve maternal and child outcomes. In this paper, we present the development and validation of this AI-based model, demonstrating its superior performance compared to traditional biometry-based methods across a wide range of gestational ages and maternal characteristics. We also explore its potential for broader application, including its use in low-cost, portable ultrasound devices that could bring this critical diagnostic capability to even the most remote and resource-limited settings.\u003c/p\u003e "},{"header":"Methods","content":"\u003cp\u003eWe trained a Deep Learning model that estimates fetal GA directly from ultrasound images, without requiring measurement information or the acquisition of specific views of the fetus. The model outputs not just an estimate of the GA, but also the level of uncertainty that is inherent in the estimate. The level of uncertainty that it reports varies from image to image and depends upon the following factors:\u003c/p\u003e\n\u003col style=\"list-style-type:lower-roman;\"\u003e\n \u003cli\u003e\n \u003cp\u003eThe amount of useful information within the image (some images, for example when the probe is not pointing at the fetus, are entirely unsuitable for GA estimation).\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eThe extent to which the image is of a type that is well represented in the training data. The further out-of-distribution an image is, the greater the uncertainty in the estimate will be.\u003c/p\u003e\n \u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eModel training was undertaken using fetal ultrasound image data which was collected from multiple centres in Australia, India, and the UK, creating a diverse dataset. 
A stratified sampling strategy was applied to the training data based on GA bands and ultrasound probe types, to ensure robust performance across various clinical scenarios.\u003c/p\u003e\n\u003cp\u003eModel validation was conducted on several independent retrospective datasets, assessing accuracy of the approach in comparison to GA estimates derived from clinical biometric measurements like biparietal diameter and head circumference, to evaluate the model\u0026rsquo;s efficacy throughout pregnancy and across different maternal BMI categories.\u003c/p\u003e\n\u003cp\u003eBoth training and validation datasets were annotated with gold standard GA estimates, taken during routine clinical practice.\u003c/p\u003e\n\u003cp\u003eGold Standard for GA\u003c/p\u003e\n\u003cp\u003eThroughout the development and validation analysis, the \u0026ldquo;ground-truth\u0026rdquo; GA value that the model output was compared against was computed from a previous CRL measurement taken between 9\u0026thinsp;+\u0026thinsp;0 and 13\u0026thinsp;+\u0026thinsp;6 weeks. More specifically:\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\n \u003cp\u003eAt all centres, CRL measurements were taken by expert and quality controlled sonologists.\u003c/p\u003e\n \u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eThat measurement was converted to a GA estimate as per [\u003cspan class=\"CitationRef\"\u003e9\u003c/span\u003e]. 
This GA estimate related to the day on which the CRL measurement was taken.\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\n \u003cp\u003eThe GA estimate was then adjusted forward to the date of the ultrasound examination (when the relevant image was captured, in many cases in a subsequent examination to the one in which CRL measurement was performed), by adding the number of days elapsed since the CRL measurement was taken.\u003c/p\u003e\n \u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eDatasets\u003c/p\u003e\n\u003cp\u003eFor model development we collated a large, multicenter anonymised dataset of ultrasound imaging data from 6 centres in Australia, India and the UK. These datasets (Table \u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e) contain imaging records obtained during routine clinical scanning, together with associated information on gestational age and a limited amount of maternal demographic information.\u003c/p\u003e\n\u003cp\u003eTo ensure robust model performance (including mitigating the possible impact of confounding effects), and in support of an efficient training process, a stratified sample of the available data was used to develop the model. The strata used when sampling were:\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\n \u003cp\u003eData Source\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eGA (in 4-week bands)\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eUltrasound probe type (TA/TV)\u003c/p\u003e\n \u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eThe data over those strata were balanced by randomly sampling varying numbers of images per subject. No exclusions were made from the development data. It therefore contains (in approximately their natural frequencies) singleton and multiple pregnancies, as well as congenital abnormalities. 
This a priori decision was considered to be useful from the perspective of external validity and model performance under a wide range of circumstances.\u003c/p\u003e\n\u003cp\u003eThe sampled data were split, using computerised randomisation, between training and holdout sets in a 90:10 proportion in order to yield a robust holdout volume while retaining as much data for model training as possible. The unit of randomisation was the individual woman, so that all images pertaining to any given subject were assigned to either the training set or the holdout set. The holdout set was used during the model development process to optimise hyperparameters and to confirm the model\u0026rsquo;s ability to generalise.\u003c/p\u003e\n\u003cp\u003eFollowing sampling and assignment, the data volumes used to develop the model were as detailed in Table \u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e. Note that the total number of images (but not the number of data subjects) used is lower than the total available due to the sampling process described above, which drew only a random subset of the images for each subject.\u003c/p\u003e\n\u003cp\u003eDeep Learning Model\u003c/p\u003e\n\u003cp\u003eA Deep Learning [\u003cspan class=\"CitationRef\"\u003e20\u003c/span\u003e] model was used to produce GA estimates from ultrasound images. The model was trained via supervised learning \u0026ndash; a process by which the model is optimised to generate estimates that are as close as possible to those provided in the training dataset. Deep Learning has demonstrated state-of-the-art performance in computer vision tasks since 2012 [\u003cspan class=\"CitationRef\"\u003e21\u003c/span\u003e] and now dominates the field, making it a natural approach for this task.\u003c/p\u003e\n\u003cp\u003eOne challenge in producing GA estimates based on ultrasound images taken by novices is that images may be unsuitable for this purpose (as an extreme example, images may not show a fetus at all). 
To overcome this problem, we designed the model to report the level of uncertainty in GA prediction. To do this we followed the approach described by Stirn et al. [\u003cspan class=\"CitationRef\"\u003e22\u003c/span\u003e] to construct a neural network comprising the following elements:\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\n \u003cp\u003eA trunk network \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({f}_{trunk}:\\left(X,{\\Theta}_{z}\\right)\\to{\\mathbb{R}}^{d}\\)\u003c/span\u003e\u003c/span\u003e which, given parameters \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({\\theta}_{z}\\in{\\Theta}_{z}\\)\u003c/span\u003e\u003c/span\u003e, maps an image \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(x\\in X\\)\u003c/span\u003e\u003c/span\u003e to a \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(d\\)\u003c/span\u003e\u003c/span\u003e-dimensional representation vector. 
We employed a ConvNeXt [\u003cspan class=\"CitationRef\"\u003e23\u003c/span\u003e] architecture for \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({f}_{trunk}\\)\u003c/span\u003e\u003c/span\u003e with \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(d=768\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eA mean prediction head \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\mu:\\left({\\mathbb{R}}^{d},{\\Theta}_{\\mu}\\right)\\to\\mathbb{R}\\)\u003c/span\u003e\u003c/span\u003e which, given parameters \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({\\theta}_{\\mu}\\in{\\Theta}_{\\mu}\\)\u003c/span\u003e\u003c/span\u003e, maps a representation vector \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(v\\in{\\mathbb{R}}^{d}\\)\u003c/span\u003e\u003c/span\u003e to an estimate of the mean of a normal distribution describing the target variable\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eA variance prediction head \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\sigma:\\left({\\mathbb{R}}^{d},{\\Theta}_{\\sigma}\\right)\\to{\\mathbb{R}}^{+}\\)\u003c/span\u003e\u003c/span\u003e which, given parameters \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({\\theta}_{\\sigma}\\in{\\Theta}_{\\sigma}\\)\u003c/span\u003e\u003c/span\u003e, maps a representation vector \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(v\\in{\\mathbb{R}}^{d}\\)\u003c/span\u003e\u003c/span\u003e to an estimate of the sigma parameter of a normal distribution describing the target variable\u003c/p\u003e\n \u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eSince they are non-negative, 
we modelled the GA values in log space.\u003c/p\u003e\n\u003cp\u003eThe parameter optimisation process sought to minimise\u003c/p\u003e\n\u003cdiv id=\"Equa\" class=\"Equation\"\u003e\n \u003cdiv class=\"mathdisplay\" id=\"FileID_Equa\" name=\"EquationSource\"\u003e$$\\mathcal{L}:=\\sum_{\\left(x,y\\right)\\in\\mathcal{D}}\\frac{{\\left|y-\\mu\\left(x\\right)\\right|}_{2}^{2}}{2}-\\ln\\mathcal{N}\\left(y\\,|\\,\\lfloor\\mu\\left(x\\right)\\rfloor,\\sigma\\left(\\lfloor{f}_{trunk}\\left(x\\right)\\rfloor\\right)\\right)$$\u003c/div\u003e\n\u003c/div\u003e\n\u003cp\u003eWhere\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\n \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u0026nbsp;\u003cspan class=\"mathinline\"\u003e\\(\\mathcal{D}\\)\u003c/span\u003e\u0026nbsp;\u003c/span\u003e denotes the dataset of image \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(x\\)\u003c/span\u003e\u003c/span\u003e and label \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(y\\)\u003c/span\u003e\u003c/span\u003e pairs\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u0026nbsp;\u003cspan class=\"mathinline\"\u003e\\({\\left|\\bullet\\right|}_{2}\\)\u003c/span\u003e\u0026nbsp;\u003c/span\u003e denotes the \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({l}^{2}\\)\u003c/span\u003e\u003c/span\u003e-norm\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u0026nbsp;\u003cspan class=\"mathinline\"\u003e\\(\\mathcal{N}\\)\u003c/span\u003e\u0026nbsp;\u003c/span\u003e denotes the pdf of a Normal distribution with the given mean and sigma parameters\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u0026nbsp;\u003cspan 
class=\"mathinline\"\u003e\\(\\lfloor\\bullet\\rfloor\\)\u003c/span\u003e\u0026nbsp;\u003c/span\u003e denotes a stop-gradient operation\u003c/p\u003e\n \u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eDespite the stop-gradient operations applied in this approach, we found that phased model training delivers superior results:\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\n \u003cp\u003ePhase 1 \u0026ndash; Freeze the sigma head, optimising only parameters in the trunk and mean prediction head.\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003ePhase 2 \u0026ndash; Fine-tune from the Phase 1 model, with all parameters unfrozen.\u003c/p\u003e\n \u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eEach phase was trained for 100 epochs, using the Adam optimiser [\u003cspan class=\"CitationRef\"\u003e24\u003c/span\u003e] with \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\epsilon={10}^{-8}\\)\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({\\beta}_{1}=0.9\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({\\beta}_{2}=0.999\\)\u003c/span\u003e\u003c/span\u003e. Within each phase, the learning rate followed a two-stage schedule: a linear warm-up increasing it from 0 to a maximum value over the first 10 epochs, followed by cosine annealing back to 0 over the remaining 90 epochs. The maximum learning rate was derived via the process described in [\u003cspan class=\"CitationRef\"\u003e25\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eTraining data were augmented during mini-batch preparation to enhance diversity and encourage better model generalisation. A variant of the RandAugment [\u003cspan class=\"CitationRef\"\u003e26\u003c/span\u003e] approach was used, in which a random set of transformations was selected per image. 
The list of available transformations was tailored to ultrasound images and comprised: rotation, re-scaling, horizontal flip, blur, brightness \u0026amp; contrast jitter, pixel-wise multiplicative noise and grid distortion.\u003c/p\u003e\n\u003cp\u003eApplying to Video Data\u003c/p\u003e\n\u003cp\u003eTypically, several images of the fetus are obtained during an ultrasound examination, and in real-world imaging this is in the form of a real-time video. In order to ensure applicability to video clips, our method was designed to generate more precise estimates when multiple images of a given fetus are available, by applying a static 1D Kalman Filter to the estimates (and corresponding uncertainties) that are obtained by applying the Deep Learning model to the individual images. The Kalman Filter results are continuously updated as new frames become available, producing values for the mean and standard deviation of the estimate based on all of the frames analysed up to that point. Once the resultant standard deviation drops below a threshold value, indicating sufficient precision in the estimate, the process is stopped and the mean value is reported.\u003c/p\u003e\n\u003cp\u003eValidation\u003c/p\u003e\n\u003cp\u003eValidation was undertaken by analysis of retrospective data that were wholly independent of the data used in model development. GA estimates calculated by the AI model were compared to GA estimates derived from fetal biometric measurement at the same ultrasound examination, which represents clinical best practice. This comparison was made using Mean Absolute Error (MAE).\u003c/p\u003e\n\u003cp\u003eTwo datasets were used for validation:\u003c/p\u003e\n\u003col\u003e\n \u003cli\u003e\u003cspan\u003e\n \u003cp\u003eRetrospective image data: Comprising a large sample of image sets stored during routine ultrasound examinations. 
This dataset was large enough to support subset analyses by gestational age, maternal BMI, and country of scanning.\u003c/p\u003e\n \u003c/span\u003e\u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eRetrospective video data: Comprising a smaller sample of ultrasound videos that were created by splicing together small random sub-segments of full-length fetal scans in a random order to approximate scans conducted in an undirected manner.\u003c/p\u003e\n \u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eValidation Using Images\u003c/p\u003e\n\u003cp\u003eThis component of the validation compared the accuracy, relative to the gold standard, of model estimates obtained from sets of images stored in patient records during a routine ultrasound examination with that of estimates obtained from biometric measurement. The dataset covers a wide range of the period of fetal gestation, a range of maternal BMIs and countries, and is summarised in Table \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e. The sample size was estimated to enable detection of a 1-day difference in MAE at a 90% confidence level for all subsets of interest. No data from any of the patients included in the validation dataset were used during the development process.\u003c/p\u003e\n\u003cp\u003eValidation Using Videos\u003c/p\u003e\n\u003cp\u003eThis component of the validation aims to confirm that the model estimates are accurate when calculated from videos of scans that were obtained in an undirected fashion. It further aims to establish that the results obtained from the model do not depend heavily on the particular order in which fetal anatomy happens to be scanned. A dataset was created that consists of videos of ultrasound scanning, designed to simulate undirected scanning (for example, scanning without reference to the ultrasound images). To achieve this, 99 full-length videos of routine ultrasound examinations for which the actual GA was known (based on a previous CRL measurement, the gold standard) were used. 
From these 99 videos we created 99 3-minute videos via the following process:\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\n \u003cp\u003eFor each scan, randomly select 36 non-overlapping 5-second subsegments of the video\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eFor each scan, randomly shuffle the order of the subsegments\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eConcatenate the shuffled subsegments into a 3-minute clip\u003c/p\u003e\n \u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eFor each of these videos, we also repeated the shuffling and concatenation steps to produce another video having the same content but with the subsegments in a different order.\u003c/p\u003e\n\u003cp\u003eDuring the process of video analysis, the model assesses each frame it sees to determine whether it contains useful information\u003csup\u003e7\u003c/sup\u003e, and uses only the frames that do. Because the model outputs an estimate once it reaches a sufficiently confident result and then ignores the rest of the video, the re-shuffling process may result in different parts of the fetal anatomy being presented to the model during the period that it analyses.\u003c/p\u003e\n\u003cp\u003eWe have also analysed the time taken by the algorithm to produce an estimate on each of these videos, and report the cumulative distribution.\u003c/p\u003e\n\u003cp\u003eComparison to Biometric Estimates\u003c/p\u003e\n\u003cp\u003eGA estimates by biometry were derived from measurements of the fetal biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) associated with the scans, using the following formula [\u003cspan class=\"CitationRef\"\u003e14\u003c/span\u003e]:\u003c/p\u003e\n\u003cdiv id=\"Equb\" class=\"Equation\"\u003e\n \u003cdiv class=\"mathdisplay\" id=\"FileID_Equb\" 
name=\"EquationSource\"\u003e$${GA}_{days}=7\\times\\left(10.85+\\left(0.0006\\times HC\\times FL\\right)+\\left(0.067\\times BPD\\right)+\\left(0.0168\\times AC\\right)\\right)$$\u003c/div\u003e\n\u003c/div\u003e\n\u003cp\u003eFor scans that were conducted prior to week 14, the clinical gold-standard approach to estimating GA is via CRL measurement. This means that the biometric estimates are, by definition, correct, and comparison of model estimates with them is therefore not feasible. Instead, we compare to an accuracy benchmark [\u003cspan class=\"CitationRef\"\u003e12\u003c/span\u003e], which quantifies the error in CRL-based estimates relative to GAs calculated from conception dates that are known with absolute certainty.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003eAs outlined above, the model was applied to two types of data: static images obtained from routine ultrasound examinations, and videos that simulate undirected ultrasound scanning.\u003c/p\u003e\n\u003cp\u003eModel validation on images\u003c/p\u003e\n\u003cp\u003eThe MAE values of the model and biometry-based estimates (\u0026ldquo;Biometry Measured\u0026rdquo;, as described in Comparison to Biometric Estimates) are summarised in Table \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e. The AI-model-based estimates were consistently more accurate than those obtained from biometry over the whole range of GAs from 10 to 36 weeks (Table \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e). The superiority of model-based GA estimation over biometry was strongly statistically significant from 14 to 24 weeks, and moderately significant over 24 weeks. 
The MAE values expected for biometry-based estimates, as reported in the literature (\u0026ldquo;Biometry Literature Benchmark\u0026rdquo;), are also provided for the corresponding bands in Table \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e.\u003c/p\u003e\n\u003cp\u003eSubanalysis of the MAE of the AI-model and biometry-based estimates by country demonstrates that model estimates were at least as accurate as those obtained from biometry in all scanning countries (Table \u003cspan class=\"InternalRef\"\u003e5\u003c/span\u003e), with model estimates statistically significantly more accurate for the UK and Australia. The lack of significance among scans from India may be due to the smaller sample size, or to a lower-than-expected error of the biometry-based estimates obtained during weeks 18\u0026ndash;24.\u003c/p\u003e\n\u003cp\u003eSubanalysis of MAE by maternal BMI shows that model estimates are at least as accurate as those obtained from biometry for all bands of maternal BMI (Table \u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e).\u003c/p\u003e\n\u003cp\u003eThe relationship between the predicted gestational age generated by the model and the corresponding ground-truth GA values is shown in Fig. \u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e. Visual assessment of the accuracy, precision and potential biases of the model\u0026rsquo;s predictions shows that data points are clustered around the diagonal line representing perfect prediction. The Bland-Altman plot in Fig. \u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e offers a complementary perspective on the relationship between predicted gestational age and corresponding actual GA values. 
It confirms a high degree of correspondence between the predicted and actual values, and also shows, in common with other approaches to estimating GA, that error magnitudes increase with GA.\u003c/p\u003e\n\u003cp\u003eModel validation with video\u003c/p\u003e\n\u003cp\u003eThe MAE values of the model estimates were calculated within bands of (actual) GA and are reported in Table \u003cspan class=\"InternalRef\"\u003e7\u003c/span\u003e. They show that the model estimates are more accurate than would be expected to be obtained from biometric measurement in all trimesters.\u003c/p\u003e\n\u003cp\u003eA scatter plot and Bland-Altman plot of the relationship between predicted and actual GA values can be found in the Supplementary Materials.\u003c/p\u003e\n\u003cp\u003eThe amount of time needed to generate a sufficiently confident GA estimate varies with the level of useful information encountered within the frames of each video. Figure \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e shows the cumulative distribution of the time needed to produce predictions on the validation set, demonstrating that in 95% of cases less than a minute of video is needed. The median time to produce an estimate is 24 seconds.\u003c/p\u003e\n\u003cp\u003eTo assess the impact that re-shuffling the 5-second video snippets within the videos has on model predictions, a GA estimate was produced for shuffled and re-shuffled versions of the same video. As a result, the re-shuffled videos presented frames to the model in a different order and also, as noted above, presented some frames that differed from those seen in the original shuffled video. The difference in the estimates obtained from each pair of videos was less than 3 days in around 90% of cases. 
This is a relatively small movement in relation to the average magnitude of error in the estimates, suggesting that the accuracy of model estimates was not strongly dependent on the order in which fetal anatomy is scanned.\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eThe main contributions of this paper are to: (i) introduce a method of training, from very large retrospective ultrasound image datasets, a Deep Learning model capable of outputting both a GA estimate and a well-calibrated estimate of the level of uncertainty inherent in that estimate; (ii) introduce a method for applying such a model to a succession of ultrasound images obtained from a video sequence or collection of static images, by applying a static 1D Kalman filter to its outputs; and (iii) train such a model on a large, very diverse dataset of ultrasound images, containing data on over 75k fetuses and sourced from scanning centres spanning 3 continents.\u003c/p\u003e \u003cp\u003eOur results show that the accuracy of the outputs of this model is superior to that of estimates obtained via current standard clinical practice (biometric measurement), both at an overall level and in all key subset analyses that were performed.\u003c/p\u003e \u003cp\u003eWhile direct comparisons to other results presented in the literature are made difficult by differences in data, scanning approach, and methods of analysis, we consider that the results that we have presented are competitive with those of Gomes \u003cem\u003eet al.\u003c/em\u003e [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e], who developed a deep learning neural network model trained on a dataset of 1968 subjects to estimate GA from short ultrasound videos, some of which were acquired via a blind sweep scanning protocol, and some of which were \u0026ldquo;fly-to\u0026rdquo; videos targeting biometry planes. 
Their model achieved MAE values of approximately 1.92 days (14\u0026ndash;19 weeks), 2.78 days (20\u0026ndash;25 weeks), 2.97 days (26\u0026ndash;31 weeks) and 3.07 days (32\u0026ndash;37 weeks) when measured on videos obtained from trained sonographers following a blind sweep protocol. They further demonstrated that accuracy was non-inferior to biometry-based estimates when applied to videos obtained from novices following the blind sweep protocol (though with lower accuracy than that achieved on videos from trained sonographers). The methods we describe in this paper have the benefit of being readily applicable to large retrospective image archives, rather than requiring video data for model training.\u003c/p\u003e \u003cp\u003eMaraci \u003cem\u003eet al.\u003c/em\u003e [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e] describe a method for estimating GA by applying deep learning to images of the trans-cerebellar view of the fetal head to produce an automated TCD measurement that can then be converted to a GA estimate via standard reference charts. Their model was trained on a dataset of 500 ultrasound images of fetuses of between 16 and 26 weeks GA and yielded MAE values of approximately 5.4 days, though with respect to a GA estimate obtained through manual TCD measurement, which is not the gold-standard approach. Finally, Lee \u003cem\u003eet al.\u003c/em\u003e [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e] describe a slightly more complex approach which uses deep learning to analyse images from multiple biometric planes (TV, AC, FL) to estimate GA directly. They used a dataset of 3809 subjects and reported MAE values of 2.6 days (14\u0026ndash;19 weeks), 3.1 days (20\u0026ndash;25 weeks), 3.3 days (26\u0026ndash;31 weeks), and 4.6 days (32\u0026ndash;37 weeks). 
A drawback of the latter two approaches is that they require the operator to acquire images from particular planes, meaning that they are not applicable to scanning by novices.\u003c/p\u003e \u003cp\u003eA limitation of the analysis presented is that it does not quantify how accurate the model is when applied directly to videos of scans obtained by novice users. A natural next step in enhancing our understanding of the performance of our method in a real-world setting will be to perform a prospective study in which novice users obtain GA estimates using it, and then to analyse the results.\u003c/p\u003e \u003cp\u003eWe consider that the methods presented represent progress towards democratising the use of ultrasound in clinical decision making, through developing automated systems that are able to interpret ultrasound data acquired by operators without sonographic training. Such systems could be operated by a wider set of medical practitioners, including, for example, General Practitioners and midwives. This could drive improvements in clinical care in LMICs especially, but may also enhance efficiency and patient outcomes in developed countries.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eDeclaration of Interests\u003c/h2\u003e \u003cp\u003eMB, SW, TH and NS are permanent employees of Intelligent Ultrasound. AP is a scientific advisor for Intelligent Ultrasound. SM and SS are directors of medical institutions which have contributed data to the research.\u003c/p\u003e \u003ch2\u003eFunding:\u003c/h2\u003e \u003cp\u003eThis study was funded by Intelligent Ultrasound. 
ATP was part-funded by the NIHR Oxford Biomedical Research Centre.\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eMB: Conceptualisation of the study and methodology design. Oversaw data management, model development, and validation. Drafted and reviewed the manuscript. SW: Contributed to data preparation and pre-processing for deep learning model training. Assisted with statistical analysis and interpretation of results. Drafted and reviewed the manuscript. TH: Development and optimisation of the deep learning model. Performed validation experiments and uncertainty estimation. Contributed to manuscript writing, particularly in the methods section. SM and SS: Provided clinical data and guidance on ultrasound image acquisition protocols. Contributed to the interpretation of clinical relevance of findings. Reviewed the manuscript critically for intellectual content. NS: Conceptualisation of the study and methodology design. Contributed to the study\u0026rsquo;s technical design and software development. Provided input on AI model architecture and implementation. Coordinated the collaboration among the institutions. Co-wrote, reviewed and approved the final manuscript. ATP: Conceptualisation and overall guidance on study design and clinical translation. Coordinated the collaboration among the institutions. Co-wrote, edited and approved the manuscript as the senior author.\u003c/p\u003e\u003ch2\u003eData Availability\u003c/h2\u003e\u003cp\u003eData used in this analysis cannot be shared due to ethical committee and participant privacy constraints.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eHall, M. H., \u0026amp; Carr-Hill, R. A. (1985). The significance of uncertain gestation for obstetric outcome. BJOG: An International Journal of Obstetrics \u0026amp; Gynaecology, 92(5), 452\u0026ndash;460.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBilardo, C. M., Chaoui, R., Hyett, J. A., Kagan, K. O., Karim, J. 
N., Papageorghiou, A. T., \u0026hellip; Nicolaides, K. H. (2023). ISUOG Practice Guidelines (updated): performance of 11\u0026ndash;14-week ultrasound scan. Ultrasound in Obstetrics and Gynecology, 61(1).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKalish, R. B., \u0026amp; Chervenak, F. A. (2005). Sonographic determination of gestational age. The ultrasound review of obstetrics and Gynecology, 5(4), 254\u0026ndash;258.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSavitz, D. A., Terry Jr, J. W., Dole, N., Thorp Jr, J. M., Siega-Riz, A. M., \u0026amp; Herring, A. H. (2002). Comparison of pregnancy dating by last menstrual period, ultrasound scanning, and their combination. American journal of obstetrics and gynecology, 187(6), 1660\u0026ndash;1666.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWegienka, G., \u0026amp; Baird, D. D. (2005). A comparison of recalled date of last menstrual period with prospectively recorded dates. Journal of women's health, 14(3), 248\u0026ndash;252.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChiazze, L., Brayer, F. T., Macisco, J. J., Parker, M. P., \u0026amp; Duffy, B. J. (1968). The length and variability of the human menstrual cycle. Jama, 203(6), 377\u0026ndash;380.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCreinin, M. D., Keverline, S., \u0026amp; Meyn, L. A. (2004). How regular is regular? An analysis of menstrual cycle regularity. Contraception, 70(4), 289\u0026ndash;292.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePettker, C. M., Goldberg, J. D., El-Sayed, Y. Y., \u0026amp; Copel, J. A. (2017).. Obstetrics and gynecology, 129(5), E150-E154.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRobinson, H. P., \u0026amp; Fleming, J. E. E. (1975). A critical evaluation of sonar \u0026ldquo;crown-rump length\u0026rdquo; measurements. 
BJOG: An International Journal of Obstetrics \u0026amp; Gynaecology, 82(9), 702\u0026ndash;710.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOhuma, E. O., Papageorghiou, A. T., Villar, J., \u0026amp; Altman, D. G. (2013). Estimation of gestational age in early pregnancy from crown-rump length when gestational age range is truncated: the case study of the INTERGROWTH-21st Project. BMC medical research methodology, 13, 1\u0026ndash;14.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGeirsson, R. T. (1991). Ultrasound instead of last menstrual period as the basis of gestational age assignment. Ultrasound in Obstetrics and Gynecology: The Official Journal of the International Society of Ultrasound in Obstetrics and Gynecology, 1(3), 212\u0026ndash;219.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePapageorghiou, A. T., Kennedy, S. H., Salomon, L. J., Ohuma, E. O., Cheikh Ismail, L., Barros, F. C., \u0026hellip; International Fetal and Newborn Growth Consortium for the 21st Century (INTERGROWTH-21st). (2014). International standards for early fetal size and pregnancy dating based on ultrasound measurement of crown\u0026ndash;rump length in the first trimester of pregnancy. Ultrasound in Obstetrics \u0026amp; Gynecology, 44(6), 641\u0026ndash;648.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBenson, C. B., \u0026amp; Doubilet, P. M. (1991). Sonographic prediction of gestational age: accuracy of second- and third-trimester fetal measurements. AJR. American journal of roentgenology, 157(6), 1275\u0026ndash;1277.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHadlock, F. P., Deter, R. L., Harrist, R. B., \u0026amp; Park, S. K. (1984). Estimating fetal age: computer-assisted analysis of multiple fetal growth parameters. Radiology, 152(2), 497\u0026ndash;501.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBradburn E, Conde-Agudelo A, Roberts NW, Villar J, Papageorghiou AT. 
Accuracy of prenatal and postnatal biomarkers for estimating gestational age: a systematic review and meta-analysis. EClinicalMedicine. 2024;70:102498. doi: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.eclinm.2024.102498\u003c/span\u003e\u003cspan address=\"10.1016/j.eclinm.2024.102498\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDrukker, L., Yasrab, R., Noble, J. A., \u0026amp; Papageorghiou, A. T. (2021). Vp18. 07: First trimester scans: how much time does it take to acquire the crl and nt?. Ultrasound in Obstetrics \u0026amp; Gynecology, 58.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSalomon, L. J., Alfirevic, Z., Berghella, V., Bilardo, C. M., Chalouhi, G. E., Costa, F. D. S., \u0026hellip; Lee, W. (2022). ISUOG Practice Guidelines (updated): performance of the routine mid-trimester fetal ultrasound scan. Ultrasound in Obstetrics and Gynecology, 59(6), 840\u0026ndash;856.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhalil, A., Sotiriadis, A., D'Antonio, F., Da Silva Costa, F., Odibo, A., Prefumo, F., \u0026hellip; Salomon, L. J. (2024). ISUOG Practice Guidelines: performance of third-trimester obstetric ultrasound scan. Ultrasound in Obstetrics \u0026amp; Gynecology, 63(1), 131\u0026ndash;147.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePexsters, A., Daemen, A., Bottomley, C., Van Schoubroeck, D., De Catte, L., De Moor, B., \u0026hellip; Bourne, T. (2010). New crown\u0026ndash;rump length curve based on over 3500 pregnancies. Ultrasound in Obstetrics and Gynecology: The Official Journal of the International Society of Ultrasound in Obstetrics and Gynecology, 35(6), 650\u0026ndash;655.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLeCun, Y., Bengio, Y., \u0026amp; Hinton, G. (2015). Deep learning. 
Nature, 521(7553), 436\u0026ndash;444.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKrizhevsky, A., Sutskever, I., \u0026amp; Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStirn, A., Wessels, H., Schertzer, M., Pereira, L., Sanjana, N., \u0026amp; Knowles, D. (2023, April). Faithful heteroscedastic regression with neural networks. In International Conference on Artificial Intelligence and Statistics (pp. 5593\u0026ndash;5613). PMLR.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu, Z., Mao, H., Wu, C. Y., Feichtenhofer, C., Darrell, T., \u0026amp; Xie, S. (2022). A ConvNet for the 2020s. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 11976\u0026ndash;11986).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKingma, D. P., \u0026amp; Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSmith, L. N. (2017, March). Cyclical learning rates for training neural networks. In 2017 IEEE winter conference on applications of computer vision (WACV) (pp. 464\u0026ndash;472). IEEE.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCubuk, E. D., Zoph, B., Shlens, J., \u0026amp; Le, Q. V. (2020). RandAugment: Practical automated data augmentation with a reduced search space. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp. 702\u0026ndash;703).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGomes, R. G., Vwalika, B., Lee, C., Willis, A., Sieniek, M., Price, J. T., \u0026hellip; Shetty, S. (2022). A mobile-optimized artificial intelligence system for gestational age and fetal malpresentation assessment. 
Communications Medicine, 2(1), 128.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMaraci, M. A., Yaqub, M., Craik, R., Beriwal, S., Self, A., Von Dadelszen, P., \u0026hellip; Noble, J. A. (2020). Toward point-of-care ultrasound estimation of fetal gestational age from the trans-cerebellar diameter using CNN-based ultrasound image analysis. Journal of Medical Imaging, 7(1), 014501\u0026ndash;014501.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee, L. H., Bradburn, E., Craik, R., Yaqub, M., Norris, S. A., Ismail, L. C., \u0026hellip; Papageorghiou, A. T. (2023). Machine learning for accurate estimation of fetal gestational age based on ultrasound images. npj Digital Medicine, 6(1), 36.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"},{"header":"Tables","content":"\u003cp\u003eTable 1. Summary of retrospective ultrasound image data obtained during routine clinical scanning collated from multiple centres in India, Australia and the UK.\u0026nbsp;\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"595\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 15.9396%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eData Source\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 14.2617%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eImages (n)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 33.8926%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDataset Description\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13.7584%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eRoute\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 22.1477%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eUltrasound Systems\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 15.9396%;\"\u003e\n 
\u003cp\u003eAustralia\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 14.2617%;\"\u003e\n \u003cp\u003e9,104,090\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 33.8926%;\"\u003e\n \u003cp\u003eImaging centers, Melbourne, Australia.\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13.7584%;\"\u003e\n \u003cp\u003eTA and TV\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 22.1477%;\"\u003e\n \u003cp\u003eGE Voluson E6, E8 and E10.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 15.9396%;\"\u003e\n \u003cp\u003eIndia\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 14.2617%;\"\u003e\n \u003cp\u003e1,028,534\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 33.8926%;\"\u003e\n \u003cp\u003eHospital based, Chennai, India.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13.7584%;\"\u003e\n \u003cp\u003eTA and TV\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 22.1477%;\"\u003e\n \u003cp\u003eGE Voluson E6, E7, E10, P8, S6 and \u0026nbsp;Mindray Resona 7.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 15.9396%;\"\u003e\n \u003cp\u003eUK\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 14.2617%;\"\u003e\n \u003cp\u003e639,618\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 33.8926%;\"\u003e\n \u003cp\u003eImaging center, London, UK.\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13.7584%;\"\u003e\n \u003cp\u003eTA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 22.1477%;\"\u003e\n \u003cp\u003eGE Voluson E10 and Expert 22.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eTable 2. 
The data volumes used for developing the model\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"99%\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"2\" valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eData Source\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 41px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTraining Set\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\" style=\"width: 40px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eHoldout Set\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNumber of Images\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNumber of Subjects\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNumber of Images\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNumber of Subjects\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003eIndia\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 20px;\"\u003e\n \u003cp\u003e610,208\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 20px;\"\u003e\n \u003cp\u003e29,094\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 20px;\"\u003e\n \u003cp\u003e67,386\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003e3,215\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 
18px;\"\u003e\n \u003cp\u003eAustralia\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 20px;\"\u003e\n \u003cp\u003e1,149,244\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 20px;\"\u003e\n \u003cp\u003e41,391\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 20px;\"\u003e\n \u003cp\u003e127,627\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003e4,607\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003eUK\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 20px;\"\u003e\n \u003cp\u003e240,877\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 20px;\"\u003e\n \u003cp\u003e8,046\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 20px;\"\u003e\n \u003cp\u003e25,258\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 19px;\"\u003e\n \u003cp\u003e896\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTotal\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e2,000,329\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e78,531\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e220,271\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 19px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e8,718\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u0026nbsp;Table 3. 
Summary of retrospective image data for model validation\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"248\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eGA Band (weeks)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNumber of Data Subjects\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e(10,14]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e193\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e(14,18]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e98\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e(18,22]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e184\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e(22,26]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e100\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e(26,30]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e84\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e(30,34]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e83\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTotal\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 124px;\"\u003e\n \u003cp\u003e742\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eTable 4. Comparison of MAEs by GA Band\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"646\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 65px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eGA Band (weeks)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 67px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNumber of Scans\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 87px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eBiometry Literature Benchmark MAE (days)\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 79px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eBiometry Measured MAE (days)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 84px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eIU ScanNav FetalCheck MAE (days)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 92px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMAE Superiority to Literature Benchmark\u003csup\u003e1\u003c/sup\u003e\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e-ve = improvement to benchmark\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 86px;\"\u003e\n 
\u003cp\u003e\u003cstrong\u003eMAE Superiority to Measured\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e-ve = improvement to benchmark\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 86px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSuperiority to Measured p-value\u003csup\u003e2\u003c/sup\u003e\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 65px;\"\u003e\n \u003cp\u003e10-14\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 67px;\"\u003e\n \u003cp\u003e190\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2.8\u003csup\u003e3\u003c/sup\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 79px;\"\u003e\n \u003cp\u003e-\u003csup\u003e4\u003c/sup\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 84px;\"\u003e\n \u003cp\u003e1.3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 92px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-1.5 (-54%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 86px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 86px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 65px;\"\u003e\n \u003cp\u003e14-18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 67px;\"\u003e\n \u003cp\u003e100\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e3.0\u003csup\u003e5\u003c/sup\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 79px;\"\u003e\n \u003cp\u003e3.4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 84px;\"\u003e\n \u003cp\u003e1.7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 92px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-1.3 (-43%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
style=\"width: 86px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-1.7 (-50%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 86px;\"\u003e\n \u003cp\u003e\u0026lt;0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 65px;\"\u003e\n \u003cp\u003e18-24\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 67px;\"\u003e\n \u003cp\u003e278\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e3.9\u003csup\u003e5\u003c/sup\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 79px;\"\u003e\n \u003cp\u003e3.6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 84px;\"\u003e\n \u003cp\u003e2.8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 92px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-1.1 (-28%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 86px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-0.8 (-28%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 86px;\"\u003e\n \u003cp\u003e0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 65px;\"\u003e\n \u003cp\u003e24-30\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 67px;\"\u003e\n \u003cp\u003e92\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e5.0\u003csup\u003e5\u003c/sup\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 79px;\"\u003e\n \u003cp\u003e6.1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 84px;\"\u003e\n \u003cp\u003e5.0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 92px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.0 (0%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 86px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-1.1 (-18%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 86px;\"\u003e\n \u003cp\u003e0.06\u003c/p\u003e\n 
\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 65px;\"\u003e\n \u003cp\u003e30-36\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 67px;\"\u003e\n \u003cp\u003e82\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e6.8\u003csup\u003e5\u003c/sup\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 79px;\"\u003e\n \u003cp\u003e5.7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 84px;\"\u003e\n \u003cp\u003e4.7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 92px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-2.1 (-31%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 86px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-1.0 (-18%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 86px;\"\u003e\n \u003cp\u003e0.08\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u0026nbsp;Table 5. Comparison of MAEs by Scanning Country\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"468\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 69px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eScanning Country\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 67px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNumber of Scans\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 79px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eBiometry Measured MAE (days)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 84px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eIU ScanNav FetalCheck MAE (days)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 85px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMAE Superiority to Measured\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
valign=\"top\" style=\"width: 85px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSuperiority to Measured p-value\u003csup\u003e6\u003c/sup\u003e\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 69px;\"\u003e\n \u003cp\u003eUK\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 67px;\"\u003e\n \u003cp\u003e98\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 79px;\"\u003e\n \u003cp\u003e2.4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 84px;\"\u003e\n \u003cp\u003e1.5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-0.9 (-38%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 85px;\"\u003e\n \u003cp\u003e\u0026lt;0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 69px;\"\u003e\n \u003cp\u003eIndia\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 67px;\"\u003e\n \u003cp\u003e344\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 79px;\"\u003e\n \u003cp\u003e5.4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 84px;\"\u003e\n \u003cp\u003e3.5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-1.9 (-35%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.14\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 69px;\"\u003e\n \u003cp\u003eAustralia\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 67px;\"\u003e\n \u003cp\u003e300\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 79px;\"\u003e\n \u003cp\u003e3.7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 84px;\"\u003e\n \u003cp\u003e2.2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n 
\u003cp\u003e\u003cstrong\u003e-1.5 (-41%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 85px;\"\u003e\n \u003cp\u003e\u0026lt;0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u0026nbsp;\u003cem\u003eTable\u0026nbsp;\u003c/em\u003e\u003cem\u003e6\u003c/em\u003e\u003cem\u003e.\u003c/em\u003e \u003cem\u003eComparison of MAEs by Maternal BMI\u003c/em\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"604\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 113px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMaternal BMI\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNumber of Scans\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eBiometry Measured MAE (days)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eIU ScanNav FetalCheck MAE (days)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMAE Superiority to Measured\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSuperiority to Measured p-value\u003csup\u003e6\u003c/sup\u003e\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 113px;\"\u003e\n \u003cp\u003e\u0026lt; 25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e107\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e5.9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n 
\u003cp\u003e3.5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-2.4 (-40%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e0.51\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 113px;\"\u003e\n \u003cp\u003e25 to 30\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e129\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e6.4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e3.8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-2.6 (-40%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e0.12\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 113px;\"\u003e\n \u003cp\u003e\u0026gt;30\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e108\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e3.8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e3.2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-0.6 (-15%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e0.22\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u0026nbsp;Table 7. 
MAEs of estimates obtained from video\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"506\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 113px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eGA Band\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNumber of Scans\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eExpected Biometry MAE (days)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eIU ScanNav FetalCheck MAE (days)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMAE Superiority to Expected\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 113px;\"\u003e\n \u003cp\u003e10-14 Weeks\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e36\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e2.8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e2.5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-0.3 (-11%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 113px;\"\u003e\n \u003cp\u003e14-27 Weeks\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e58\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e3.9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e3.7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n 
\u003cp\u003e\u003cstrong\u003e-0.2 (-5%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 113px;\"\u003e\n \u003cp\u003e27-34 Weeks\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e5.4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e2.6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 98px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e-2.8 (-52%)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e[1] Absolute difference is reported, followed by relative difference in brackets.\u003c/p\u003e\n\u003cp\u003e[2] Calculated via Wilcoxon Signed-Rank Test\u003c/p\u003e\n\u003cp\u003e[3] Derived from error standard deviation reported in [10], assuming CRL=55mm.\u003c/p\u003e\n\u003cp\u003e[4] Since CRL-based GA is by definition correct in this band, the measured value is not reported.\u003c/p\u003e\n\u003cp\u003e[5] Derived from error standard deviations reported in [15].\u003c/p\u003e\n \u003cp\u003e[6] Calculated via Wilcoxon Signed-Rank Test\u003c/p\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"npj-digital-medicine","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"npjdigitalmed","sideBox":"Learn more about [npj Digital 
Medicine](http://www.nature.com/npjdigitalmed/)","snPcode":"41746","submissionUrl":"https://submission.springernature.com/new-submission/41746/3","title":"npj Digital Medicine","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"NPJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-5907990/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-5907990/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eBackground\u003cbr\u003e\nAccurate gestational age (GA) estimation is essential for prenatal care, guiding fetal growth assessment and medical interventions. Ultrasound-based biometric measurements, though more reliable than last menstrual period, require high sonographic expertise and are time-consuming, posing challenges, especially in resource-limited settings. This study aimed to develop an Artificial Intelligence (AI) model that estimates GA from any fetal ultrasound images regardless of orientation or standard plane, reducing reliance on sonographic skill, to increase accessibility.\u003c/p\u003e\n\u003cp\u003eMethods:\u003cbr\u003e\nWe trained a deep learning model on a large, diverse dataset of over 2 million ultrasound images from three continents (Australia, India, and the UK). The model was trained to estimate GA from ultrasound images without requiring specific biometric planes or measurements. It outputs a GA estimate alongside an uncertainty level, based on image quality and type. 
Validation was performed using independent datasets of ultrasound images and videos, with comparison against standard biometric measurements across all trimesters.\u003c/p\u003e\n\u003cp\u003eFindings:\u003cbr\u003e\nThe AI model consistently produced GA estimates that were at least as accurate as those derived from traditional biometry, with a mean absolute error (MAE) significantly lower than biometry in the second trimester (p \u0026lt; 0.001) and comparable in the third trimester. Subanalysis by country and maternal BMI demonstrated the model's robustness across different sub-populations. The model also accurately estimated GA on video datasets, producing a confident estimate after a median of 24 seconds of video.\u003c/p\u003e\n\u003cp\u003eInterpretation:\u003cbr\u003e\nThis AI-based GA estimation method, trained from retrospective clinical data, is at least as accurate as gold-standard fetal biometry. By significantly reducing the skill level required by sonologists, this approach holds potential to improve prenatal care in resource-limited settings and democratize access to ultrasound-based GA estimation globally.\u003c/p\u003e","manuscriptTitle":"Fetal Gestational Age Estimation Using AI on Simple Ultrasound Images and Video","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-01-30 10:19:23","doi":"10.21203/rs.3.rs-5907990/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision 
requested","date":"2025-03-01T18:28:15+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-03-01T07:08:02+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-03-01T03:50:42+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"14512559093224160330853633824510090173","date":"2025-02-28T11:35:44+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"76326064267771660538467770192700005446","date":"2025-02-28T11:25:41+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-02-28T11:08:54+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-01-28T16:32:03+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-01-28T11:25:48+00:00","index":"","fulltext":""},{"type":"submitted","content":"npj Digital Medicine","date":"2025-01-26T18:48:46+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"npj-digital-medicine","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"npjdigitalmed","sideBox":"Learn more about [npj Digital Medicine](http://www.nature.com/npjdigitalmed/)","snPcode":"41746","submissionUrl":"https://submission.springernature.com/new-submission/41746/3","title":"npj Digital Medicine","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"NPJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"65a17507-cbe7-4633-b732-ff4c79ebab57","owner":[],"postedDate":"January 30th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[{"id":43558047,"name":"Biological sciences/Computational biology and bioinformatics"},{"id":43558048,"name":"Biological sciences/Physiology/Reproductive 
biology"},{"id":43558049,"name":"Health sciences/Health care/Medical imaging"}],"tags":[],"updatedAt":"2025-11-24T16:08:32+00:00","versionOfRecord":{"articleIdentity":"rs-5907990","link":"https://doi.org/10.1038/s41746-025-02024-z","journal":{"identity":"npj-digital-medicine","isVorOnly":false,"title":"npj Digital Medicine"},"publishedOn":"2025-11-20 15:57:52","publishedOnDateReadable":"November 20th, 2025"},"versionCreatedAt":"2025-01-30 10:19:23","video":"","vorDoi":"10.1038/s41746-025-02024-z","vorDoiUrl":"https://doi.org/10.1038/s41746-025-02024-z","workflowStages":[]},"version":"v1","identity":"rs-5907990","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-5907990","identity":"rs-5907990","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}



License: CC-BY-4.0