Introducing the Greek Humorous Dataset: A benchmark for Computational Humor Recognition

doi:10.22541/au.173988240.06858837/v1

2025 · doi:10.22541/au.173988240.06858837/v1

preprint OA: closed


This is a preprint and has not been peer reviewed. Data may be preliminary.

18 February 2025 · V1 (latest version) · Authorea

Authors: Antonios Kalloniatis and Panagiotis Adamidis

https://doi.org/10.22541/au.173988240.06858837/v1

Abstract

Computational humor recognition is considered one of the most challenging tasks in Natural Language Processing (NLP) primarily due to the intricate nature of humor as an emotion. Although most studies on humor recognition have focused on English textual sources, much work has been done in other languages as well. However, there is a notable gap in the literature concerning the Greek language. This paper introduces the first-ever Greek Humorous Dataset (GHD), specifically designed to address this void in the literature. GHD is a manually annotated balanced dataset consisting of 10,000 short text samples labeled as either humorous or non-humorous. In addition to a detailed description of the dataset, we compare the performance of ten machine learning models using text representation feature engineering techniques to establish benchmarks for future research. With the development of GHD, we aim to not only contribute to the expanding field of knowledge in computational humor recognition but also foster a positive impact on future research endeavors in Greek language processing.

Supplementary Material

File: akalloniatis.docx (102.95 KB)

Information & Authors

Version history: V1, 18 February 2025

Copyright: This work is licensed under a Non Exclusive No Reuse License.

Keywords: artificial intelligence, computational humor recognition, greek humorous dataset, natural language processing, text classification

Affiliations:
Antonios Kalloniatis, International Hellenic University
Panagiotis Adamidis, International Hellenic University

Metrics & Citations

Article usage: 246 views, 147 downloads

Citation: Antonios Kalloniatis, Panagiotis Adamidis. Introducing the Greek Humorous Dataset: A benchmark for Computational Humor Recognition. Authorea. 18 February 2025. DOI: https://doi.org/10.22541/au.173988240.06858837/v1
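The abstract describes a binary benchmark: short texts labeled humorous or non-humorous, turned into text-representation features and scored with classical machine learning models. This page does not name the ten models or the feature techniques, so the sketch below is only an illustration of that general setup, not the authors' method. It assumes a bag-of-words representation and a multinomial Naive Bayes classifier with add-one smoothing, and it uses a handful of hypothetical English toy samples in place of GHD, which is not reproduced here.

```python
import math
from collections import Counter

# Hypothetical toy samples (NOT taken from GHD); 1 = humorous, 0 = non-humorous.
TRAIN = [
    ("why did the chicken cross the road", 1),
    ("my cat thinks it pays the rent", 1),
    ("the meeting starts at nine tomorrow", 0),
    ("the road is closed for repairs", 0),
]


def tokenize(text):
    """Lowercase whitespace tokenization; a stand-in for real Greek preprocessing."""
    return text.lower().split()


def train_naive_bayes(samples):
    """Collect per-class token counts, per-class document counts, and the vocabulary."""
    token_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in samples:
        token_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    vocab = set(token_counts[0]) | set(token_counts[1])
    return token_counts, doc_counts, vocab


def predict(text, model):
    """Return the label with the highest log-posterior under add-one smoothing."""
    token_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        total_tokens = sum(token_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                smoothed = (token_counts[label][tok] + 1) / (total_tokens + len(vocab))
                score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(samples, model):
    """Fraction of samples whose predicted label matches the gold label."""
    correct = sum(predict(text, model) == label for text, label in samples)
    return correct / len(samples)


model = train_naive_bayes(TRAIN)
print("training accuracy:", accuracy(TRAIN, model))
```

Because the dataset is described as balanced, plain accuracy is a reasonable first metric in a sketch like this; a real benchmark would evaluate on held-out data instead of the training samples used here.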



Source provenance

europepmc: last seen: 2026-05-20T01:45:00.602351+00:00