GraphSOS: Graph Sampling and Order Selection to Help LLMs Understand Graphs Better

Research Article, Research Square

Xu Chu, Hanlin Xue, Zhijie Tan, Bingce Wang, Tong Mo, Weiping Li

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7288034/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review (Version 1)

Abstract

The success of Large Language Models (LLMs) in various domains has led researchers to apply them to graph-related problems by converting graph data into natural language text. However, unlike graph data, natural language inherently has sequential order. We observe a counter-intuitive fact that when the order of nodes or edges in the natural language description of a graph is shuffled, despite describing the same graph, model performance fluctuates between high performance and random guessing.
Additionally, due to LLMs' limited input context length, current methods typically randomly sample neighbors of target nodes as representatives of their neighborhood, which may not always be effective for accurate reasoning. To address these gaps, we introduce GraphSOS (Graph Sampling and Order Selection). The framework features an Order Selector Module, which ensures a proper serialization order for the graph, and a Subgraph Sampling Module, which samples better-structured subgraphs to support reasoning. Furthermore, we propose Graph CoT obtained through distillation, and enhance LLMs' reasoning and zero-shot learning capabilities for graph tasks through instruction tuning. Experiments on multiple datasets for node classification and graph question-answering demonstrate that GraphSOS improves LLMs' performance and generalization ability on graph tasks.

Keywords: Graph learning, LLMs, Order sensitivity, Subgraph Sampling

Additional Declarations: No competing interests reported.

Editorial timeline
Editorial decision: Revision requested, 26 Mar, 2026
Reviews received at journal, 03 Dec, 2025
Reviews received at journal, 02 Dec, 2025
Reviewers agreed at journal, 25 Nov, 2025
Reviewers agreed at journal, 23 Nov, 2025
Reviewers invited by journal, 29 Oct, 2025
Editor assigned by journal, 12 Aug, 2025
Submission checks completed at journal, 12 Aug, 2025
First submitted to journal, 04 Aug, 2025
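The order sensitivity described in the abstract can be made concrete with a small sketch. The code below is not the authors' method and does not call an LLM; it only shows how one graph yields many different text descriptions once the edge order is shuffled. The sentence template and the function names `serialize_edges` and `shuffled_descriptions` are assumptions made for illustration.

```python
import random


def serialize_edges(edges):
    """Render an edge list as one natural-language sentence per edge."""
    return " ".join(f"Node {u} is connected to node {v}." for u, v in edges)


def shuffled_descriptions(edges, num_orders, seed=0):
    """Return `num_orders` descriptions of the same graph with shuffled edge order."""
    rng = random.Random(seed)  # fixed seed so the shuffles are reproducible
    descriptions = []
    for _ in range(num_orders):
        order = list(edges)  # copy so the caller's list is not mutated
        rng.shuffle(order)
        descriptions.append(serialize_edges(order))
    return descriptions


# A hypothetical 4-node cycle used only as toy input.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
original = serialize_edges(edges)
variants = shuffled_descriptions(edges, num_orders=3)

for text in variants:
    print(text)
```

Every variant contains the same set of sentences and therefore describes the same graph, yet each is a different token sequence from the model's point of view. Comparing a model's answers across such variants is one way to probe the fluctuation the abstract reports.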
Article details
Authors and affiliation: Xu Chu, Hanlin Xue, Zhijie Tan, Bingce Wang, Tong Mo, and Weiping Li (corresponding author), all at Peking University
Article type: Research Article
Journal under review: World Wide Web (http://link.springer.com/journal/11280)
DOI: https://doi.org/10.21203/rs.3.rs-7288034/v1
Version 1 posted: November 7th, 2025
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Platform: Research Square, ISSN 2693-5015 (online)