Quantifying world geography as seen through the lens of Soviet propaganda

Article

Mikhail Tamm, Mila Oiva, Ksenia Mukhina, Mark Mets, Maximilian Schich

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-5623774/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 02 Feb, 2026 in Nature Cities. Version 1 posted; this is the latest preprint version.

Abstract

Cultural data typically contains a variety of biases. In particular, geographical locations are unequally portrayed in media, creating a distorted representation of the world. Identifying and measuring such biases is crucial for understanding both the data and the socio-cultural processes that have produced them. Here we suggest measuring geographical biases in a large historical news media corpus by studying the representation of cities. Leveraging ideas of quantitative urban science, we develop a mixed quantitative-qualitative procedure, which allows us to get robust quantitative estimates of the biases. These biases can be further qualitatively interpreted, resulting in a hermeneutic feedback loop. We apply this procedure to a corpus of the Soviet newsreel series 'Novosti Dnya' (News of the Day) and show that city representation grows super-linearly with city size and is further biased by city specialization and geographical location. This allows us to systematically identify geographical regions which are explicitly or sneakily emphasized by Soviet propaganda and to quantify their importance.

Subject areas
Scientific community and society/Social sciences/Interdisciplinary studies
Physical sciences/Mathematics and computing/Statistics
Scientific community and society/Social sciences/History
Scientific community and society/Social sciences/Culture

Additional Declarations
There is NO Competing Interest.

Supplementary Files
Geography.NC.Supplementary.pdf: Supplementary Materials for Quantifying world geography as seen through the lens of Soviet propaganda
Daily.news.outlines.by.story.csv: Dataset 1
USSR.citilist.xlsx: Dataset 2A
Foreign.citilist.xlsx: Dataset 2B
Author affiliations
Mikhail Tamm (corresponding author), Tallinn University, https://orcid.org/0000-0003-3168-1307
Mila Oiva, University of Turku
Ksenia Mukhina, Tallinn University
Mark Mets, Tallinn University
Maximilian Schich, Tallinn University

Figure 1
The workflow pipeline of the suggested procedure to extract information on media representation of cities. The black arrows correspond to the flow of data. The green arrow denotes classification of hypothetical parameters into relevant and irrelevant according to a predetermined information theoretic criterion. The red arrow signifies the feedback loop, i.e. the systematic refinement of the hypothesis based on the qualitative study of model outliers.

Figure 2
Cities of interest on the map of the USSR. Cities with population exceeding 0.03% of the USSR population and their mentions vs. expected from population-only model. Significantly (p<0.05) over- and under-represented cities, insignificantly (0.5>p>0.05) over- and under-represented cities and cities which are mentioned roughly as expected (p>0.5) are shown with cyan, red, grey-cyan, grey-pink and grey circles, respectively. Cities in Moscow and Donbas regions are shown in smaller circles to reduce their overlap.

Figure 3
Observed city mentions vs expectation from population-only and full models. City mentions vs. (A) population of the cities and (B) prediction of the full model, which accounts for population, geographical regions and city specialization for all cities with population above 0.03% of the population of the USSR. Cities mentioned zero times in the dataset are shown in black, out of scale. The red straight lines correspond to ideal correspondence with model and observation (power law regression (1) in panel (A), identity in panel (B)). Dashed and dotted lines correspond to deviations with p=0.05 (dashed) and p=0.001 (dotted). Note that number of big outliers is much smaller in the full model (cf. cities outlined with black circles).

Figure 4
Overview of models explaining Soviet city representation. Top left: seed regions used to initiate optimization. Borders of union-level republics and borders of subregions inside Kazakhstan, Russia and Ukraine are shown in blue and red, respectively. Top right: relevant regions according to the geolocation model, overmentioned regions shown in gradations of blue, underrepresented – in gradations of yellow. Bottom right: relevant city specializations. Bottom left: relevant regions according to the full model, overmentioned regions shown in gradations of green, underrepresented – in gradations of pink. See Table 2 for the values of regional and specializational boost factors.

Version 1 posted March 20th, 2025. Version of record: Nature Cities, published February 2nd, 2026, https://doi.org/10.1038/s44284-025-00380-1
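The abstract states that city representation grows super-linearly with city size, and the Figure 3 legend refers to a power law regression of mentions against population. As a rough illustration only, the sketch below fits a power law mentions ≈ a · population^β by ordinary least squares in log-log space and checks whether β exceeds 1. Everything here is an assumption made for illustration: the data are synthetic, the function name `fit_power_law` is hypothetical, and the estimation method is a generic one, not the procedure used by the authors (which this page does not describe). In particular, this sketch simply drops cities with zero mentions, which a count-based model would handle more carefully.

```python
import numpy as np


def fit_power_law(population, mentions):
    """Least-squares fit of log(mentions) = log(a) + beta * log(population).

    Hypothetical helper for illustration; returns (a, beta).
    Cities with zero mentions are dropped because log(0) is undefined.
    """
    population = np.asarray(population, dtype=float)
    mentions = np.asarray(mentions, dtype=float)
    mask = (population > 0) & (mentions > 0)
    # np.polyfit with degree 1 returns [slope, intercept]
    beta, log_a = np.polyfit(np.log(population[mask]), np.log(mentions[mask]), 1)
    return float(np.exp(log_a)), float(beta)


# Synthetic data (assumed, not from the paper): 200 cities whose
# populations are log-uniform between 10^5 and about 5 * 10^6.
rng = np.random.default_rng(0)
population = 10 ** rng.uniform(5.0, 6.7, size=200)

true_beta = 1.3  # assumed super-linear exponent used to generate the data
noise = rng.normal(0.0, 0.2, size=200)  # multiplicative log-normal noise
mentions = 1e-6 * population ** true_beta * np.exp(noise)

a_hat, beta_hat = fit_power_law(population, mentions)
print(f"estimated exponent beta = {beta_hat:.3f}")
print("super-linear" if beta_hat > 1 else "linear or sub-linear")
```

An exponent above 1 means that doubling a city's population more than doubles its expected number of mentions. Deviations of individual cities from the fitted line are what the Figure 3 legend describes as outliers.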