Identification of Trolling Memes in Kannada and Tulu - Under-resourced Dravidian Languages

doi:10.21203/rs.3.rs-4663307/v1

2024 · doi:10.21203/rs.3.rs-4663307/v1

preprint OA: closed


Research Article

Asha Hegde, Shashirekha Hosahalli Lakshmaiah

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-4663307/v1

This work is licensed under a CC BY 4.0 License.

Status: Posted. Version 1 is the latest preprint version.

Abstract

On social media platforms, information, ideas, and other forms of expression are created and shared interactively. While exchanging information, users may encounter humorous, offensive, trolling, or malicious content targeting individuals, groups, or communities. One common way of trolling is to create a meme by combining an image with text, usually a catchy phrase often obscured by humor or sarcasm, and share it on social media. Memes shared with the intention of trolling need to be filtered out because they may hurt people's sentiments and create an unhealthy atmosphere in society. The growing number of social media users and of trolls makes manual identification of trolling memes increasingly difficult. Hence, there is a demand for tools that identify trolling memes automatically. However, this task is challenging due to the unavailability of annotated data. The difficulty increases when the text is written in code-mixed, under-resourced regional languages such as Kannada or Tulu, languages of south India. To address the lack of annotated data and tools for identifying trolling memes in Kannada and Tulu, we created two datasets: i) KAmemes, a dataset of memes with embedded code-mixed Kannada text, and ii) TUmemes, a dataset of memes with embedded code-mixed Tulu text. Memes in both datasets are labeled 'Troll' or 'Not_Troll'. To benchmark these datasets, uni-modal and multi-modal models are proposed to classify a given meme as 'Troll' or 'Not_Troll'. Uni-modal approaches use only the text or only the image of a meme, whereas multi-modal approaches use both modalities. Several machine learning (ML) and deep learning (DL) baselines are implemented for the uni-modal and multi-modal models. The proposed baselines are also evaluated on the existing TamilMemes dataset to illustrate their efficacy. Among the proposed baselines, a multi-modal dual encoder model based on joint representation achieved the best macro F1 scores of 0.90, 0.78, and 0.58 on the TUmemes, KAmemes, and TamilMemes datasets, respectively.

Keywords: Trolling memes, Under-resourced languages, Dravidian languages, Multi-modal, Early fusion, Joint representation, Code-mixing

Additional Declarations: No competing interests reported.
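The abstract reports results as macro F1 scores. As a minimal sketch of how that metric is conventionally defined (this is not the authors' evaluation code), the function below computes the F1 score of each class separately and then takes their unweighted mean. The function name `macro_f1` and the toy label lists are illustrative assumptions; only the class names 'Troll' and 'Not_Troll' come from the abstract.

```python
def macro_f1(y_true, y_pred, labels=("Troll", "Not_Troll")):
    """Return the unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        # Count true positives, false positives and false negatives for this class.
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example (made-up labels, not data from the paper).
y_true = ["Troll", "Troll", "Not_Troll", "Not_Troll"]
y_pred = ["Troll", "Not_Troll", "Not_Troll", "Not_Troll"]
print(round(macro_f1(y_true, y_pred), 4))  # 0.7333
```

Because each class contributes equally to the average, a classifier cannot obtain a high macro F1 by performing well on only one of the two labels.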
Author affiliations: Asha Hegde, Mangalore University; Shashirekha Hosahalli Lakshmaiah, Mangalore University (corresponding author).

Posted: July 29th, 2024. License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/).

Preprint platform: Research Square, ISSN 2693-5015 (online).

