Global distribution of cattle, horses, goats, sheep and buffaloes at 1 km resolution for 2000–2022 based on subnational census data and spatiotemporal Machine Learning

Data Note | Research Square

Leandro Parente, Steffen Ehrmann, Tomislav Hengl, Steffen Fritz, and 9 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6201916/v2
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 2 posted.

Abstract

The paper describes the production and evaluation of annual livestock densities and headcounts of cattle, horses, sheep, goats and buffaloes (including 95% probability prediction intervals) at 1 km spatial resolution for the 2000–2022 period using spatiotemporal machine learning.
A compilation of subnational livestock census data was imported, harmonized and used as reference data (55,336 census polygons and 939,257 individual data entries, covering 147 countries) to build predictive models. A large stack of multi-source harmonized raster data sets (128 individual layers) was used as features. Models were fitted using the scikit-map and scikit-learn libraries, with recursive feature elimination and a Poisson criterion to represent the distribution of the target variable. Intermediate rasters estimating potential land for livestock production, based on grassland and cropland extent along with biophysical features, were used to estimate the spatial domain of livestock. The final predictions at 1 km were further adjusted to annual headcounts based on FAOSTAT national statistics to ensure consistency. Model benchmarking based on 10% test samples (with spatial blocking) shows that Random Forest outperforms Gradient Boosting Tree for predicting livestock densities, with CCC values of 0.603, 0.547, 0.598, 0.622 and 0.689, and RMSE values of 104.59, 6.06, 64.09, 67.57 and 30.37 (heads per km²) for cattle, horses, sheep, goats and buffaloes, respectively. Feature importance analysis shows that the key variables include climate and socio-economic layers, such as water vapour, aridity index, land surface temperature, travel time to the nearest cities, and religious population distribution. Further evaluation of the output layers shows similar distributions to existing global livestock products (FAO Gridded Livestock of the World, GLW, and Annual Gridded Livestock of the World, AGLW).
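As an aside on the two accuracy metrics reported above, the following is a minimal sketch of how a concordance correlation coefficient (CCC, here in Lin's standard formulation) and an RMSE can be computed for paired observed and predicted densities. It is not taken from the authors' pipeline: the function names and the toy numbers are illustrative assumptions, and the source does not state which CCC variant or implementation was used.

```python
from math import sqrt
from statistics import fmean


def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient for two equal-length sequences.

    Hypothetical helper for illustration only; population (1/n) moments are used.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(observed)
    mean_o = fmean(observed)
    mean_p = fmean(predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    denominator = var_o + var_p + (mean_o - mean_p) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined for two identical constant sequences")
    return 2 * cov / denominator


def rmse(observed, predicted):
    """Root mean squared error, in the same unit as the inputs."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    return sqrt(fmean([(o - p) ** 2 for o, p in zip(observed, predicted)]))


# Toy densities (heads per km²); these numbers are made up for the example.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [15.0, 25.0, 35.0, 45.0]
ccc_value = concordance_correlation(obs, pred)
rmse_value = rmse(obs, pred)
```

Unlike Pearson correlation, the CCC penalizes a systematic offset between predictions and observations, which is why a constant bias of 5 in the toy data lowers it below 1 even though the two series are perfectly correlated.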
The spatial domain of livestock (active grazing/forage areas) is often difficult to validate, because many countries have very specific management cultures that cannot be seamlessly represented using existing global raster layers. Modeling the distribution of livestock per country using local, country-specific features (instead of using global models) could therefore help increase accuracy, especially for regional/local applications. The modeling pipeline is open source and available on GitHub (https://github.com/wri/global-pasture-watch), with output layers (both original ML predictions and FAOSTAT-adjusted values) publicly available under a CC-BY license on Zenodo (https://doi.org/10.5281/zenodo.17491242).

Subject areas: Geographic Information Systems; Artificial Intelligence and Machine Learning

Keywords: census data, livestock, machine learning, areal regression, agriculture, random forest, gradient boosting tree, spatiotemporal modeling, global mapping, open data

Additional Declarations

The authors declare no competing interests.
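The abstract distinguishes original ML predictions from FAOSTAT-adjusted values but does not spell out the adjustment. One simple way to make gridded headcounts consistent with a national statistic is proportional rescaling, sketched below. This is an assumption made for illustration, not the authors' documented procedure; the function name and the numbers are hypothetical.

```python
def adjust_to_national_total(pixel_headcounts, national_total):
    """Rescale per-pixel headcounts so that they sum to a national total.

    Assumed approach (proportional scaling); the actual pipeline may differ.
    pixel_headcounts: predicted heads per pixel for one country and year.
    national_total: reported national headcount for the same country and year.
    """
    if national_total < 0:
        raise ValueError("national_total must be non-negative")
    predicted_total = sum(pixel_headcounts)
    if predicted_total == 0:
        if national_total == 0:
            return list(pixel_headcounts)
        # No predicted animals anywhere: scaling cannot distribute the total.
        raise ValueError("cannot rescale all-zero predictions to a positive total")
    factor = national_total / predicted_total
    return [value * factor for value in pixel_headcounts]


# Made-up example: three pixels predicted at 100 heads in total,
# while the national statistic reports 200 heads.
adjusted = adjust_to_national_total([10.0, 30.0, 60.0], 200.0)
```

Proportional scaling keeps the relative spatial pattern of the predictions and leaves zero-valued pixels at zero, so it changes only the overall magnitude within a country.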
Authors and affiliations

Leandro Parente (corresponding author), OpenGeoHub Foundation, https://orcid.org/0000-0003-1589-0467
Steffen Ehrmann, German Centre for Integrative Biodiversity Research (iDiv), https://orcid.org/0000-0002-2958-0796
Tomislav Hengl, OpenGeoHub Foundation, https://orcid.org/0000-0002-9921-5129
Steffen Fritz, International Institute for Applied Systems Analysis (IIASA), https://orcid.org/0000-0003-0420-8549
Carmelo Bonannella, OpenGeoHub Foundation, https://orcid.org/0000-0002-5391-8427
Žiga Malek, University of Ljubljana, Biotechnical Faculty, https://orcid.org/0000-0002-6981-6708
Carlos Fischer, Cornell University, Department of Global Development, https://orcid.org/0000-0002-4140-7152
Katya Perez, International Institute for Applied Systems Analysis (IIASA), https://orcid.org/0000-0001-5189-6570
Radost Stanimirova, World Resources Institute, https://orcid.org/0000-0001-9617-5830
Carsten Meyer, German Centre for Integrative Biodiversity Research (iDiv), https://orcid.org/0000-0003-3927-5856
Dominik Wisser, Food and Agriculture Organization of the United Nations (FAO), https://orcid.org/0000-0001-8368-3801
Giuseppina Cinardi, Food and Agriculture Organization of the United Nations (FAO), https://orcid.org/0000-0003-0783-6065
Lindsey Sloat, World Resources Institute, https://orcid.org/0000-0002-2986-9725

Funding

Horizon 2020 Framework Programme (funder ID 10.13039/100010661), award number 101059548
Deutsche Forschungsgemeinschaft (funder ID 10.13039/501100001659), award number DFG-FZT 118

Version history

Version 2, posted 2025-12-17: https://doi.org/10.21203/rs.3.rs-6201916/v2
Version 1, posted 2025-03-12: https://doi.org/10.21203/rs.3.rs-6201916/v1

Research Square, ISSN 2693-5015 (online)