BLUFF: Benchmarking in Low-resoUrce Languages for detecting Falsehoods and Fake news

Feb 1, 2026 · Jason Lucas, Matt Murtagh-White, Adaku Uchendu, Ali Al-Lawati, Michiharu Yamashita, Dominik Macko, Ivan Srba, Robert Moro, Dongwon Lee
BLUFF Framework Overview
Abstract
Multilingual falsehoods threaten information integrity worldwide, yet detection benchmarks remain confined to English or a few high-resource languages, leaving low-resource linguistic communities without robust defense tools. We introduce BLUFF (Benchmarking in Low-resoUrce Languages for detecting Falsehoods and Fake news), a comprehensive benchmark for detecting false and synthetic content, spanning 79 languages with over 202K samples, combining human-written fact-checked content (122K+ samples across 57 languages) and LLM-generated content (79K+ samples across 71 languages). BLUFF uniquely covers both high-resource “big-head” (20) and low-resource “long-tail” (59) languages, addressing critical gaps in multilingual research on detecting false and synthetic content. Our dataset features four content types (human-written, LLM-generated, LLM-translated, and hybrid human-LLM text), bidirectional translation (English↔X), 39 textual modification techniques, and varying edit intensities generated using 19 diverse LLMs. We present AXL-CoI (Adversarial Cross-Lingual Agentic Chain-of-Interactions), a novel multi-agentic framework for controlled fake/real news generation, paired with mPURIFY, a quality filtering pipeline ensuring dataset integrity. Experiments reveal state-of-the-art detectors suffer up to 25.3% F1 degradation on low-resource versus high-resource languages.
Publication: Under Review 2026, Datasets and Benchmarks Track

BLUFF is the largest multilingual fake news detection benchmark to date, spanning 79 languages (20 high-resource “big-head” + 59 low-resource “long-tail”) with over 202,000 samples. The benchmark combines human-written fact-checked content from 130 IFCN-certified organizations with LLM-generated content from 19 diverse models.

Key contributions include:

  • AXL-CoI (Adversarial Cross-Lingual Agentic Chain-of-Interactions): A multi-agentic framework using 10 fake chains and 8 real chains for controlled multilingual content generation
  • mPURIFY: A 4-stage quality filtering pipeline with 32 features across 5 dimensions, ensuring dataset integrity through asymmetric evaluation thresholds
  • Bidirectional translation: English↔X coverage across 70+ languages with 4 prompt variants
  • Comprehensive evaluation: State-of-the-art detectors suffer up to 25.3% Macro-F1 degradation on low-resource versus high-resource languages
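
The degradation figure in the last bullet compares Macro-F1 between language groups. The following is a minimal sketch of how such a gap could be computed, not the authors' evaluation code. The records, the group names used as dictionary keys, and the helper functions `macro_f1` and `macro_f1_by_group` are all hypothetical. The sketch also assumes the gap is measured in absolute Macro-F1 points; the source does not say whether the reported 25.3% is absolute or relative.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def macro_f1_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for group, true_label, predicted_label in records:
        grouped[group][0].append(true_label)
        grouped[group][1].append(predicted_label)
    return {group: macro_f1(t, p) for group, (t, p) in grouped.items()}


# Toy predictions (invented for illustration, not BLUFF data).
records = [
    ("big-head", "fake", "fake"),
    ("big-head", "real", "real"),
    ("big-head", "fake", "fake"),
    ("big-head", "real", "real"),
    ("long-tail", "fake", "fake"),
    ("long-tail", "real", "fake"),
    ("long-tail", "fake", "fake"),
    ("long-tail", "real", "real"),
]

scores = macro_f1_by_group(records)
# Assumption: degradation is the absolute difference in Macro-F1 points.
gap_points = 100 * (scores["big-head"] - scores["long-tail"])
print(f"big-head: {scores['big-head']:.3f}, long-tail: {scores['long-tail']:.3f}")
print(f"Macro-F1 gap: {gap_points:.1f} points")
```

Macro averaging weights each class equally, so a detector that over-predicts one class in a language group is penalized even when overall accuracy looks acceptable.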

Authors
Jason Lucas
Ph.D. Candidate · Incoming Assistant Professor & Director, Secure and Ethical AI Lab (SEAL) — CU Boulder (Aug 2026)

I am a PhD candidate in Informatics in the College of IST at Penn State University, where I conduct research at the PIKE Research Lab under the guidance of Dr. Dongwon Lee. Starting August 2026, I will join the Department of Information Science at the College of Media, Communication and Information (CMDI), University of Colorado Boulder, as a Tenure-Track Assistant Professor and founding Director of the Secure and Ethical AI Lab (SEAL). My research advances trustworthy and equitable AI for the world’s languages and communities — spanning multilingual NLP, low-resource and dialectal language technology, AI safety, and information integrity, with work extending across 70+ languages. I have authored 14+ peer-reviewed papers with 315+ citations in premier venues including ACL, EMNLP, NAACL, ICML, and IEEE.

My doctoral research focuses on bridging the digital language divide through transfer learning, classification (NLU), generation (NLG), adversarial attacks, and end-to-end AI pipelines built on RAG and agentic AI workflows for combating multilingual threats. Drawing on my Grenadian background and knowledge of local Creole languages, I bring a global perspective to AI challenges, working to democratize state-of-the-art AI capabilities for underserved linguistic communities worldwide. My mission is to develop robust multilingual, multimodal systems and mitigate evolving security vulnerabilities while widening access to human language technology.

As an NSF LinDiv Fellow, I conduct transdisciplinary research advancing human-AI language interaction for social good. I actively mentor 5+ research interns and teach Applied Generative AI courses. Through industry experience at Lawrence Livermore National Lab, Interaction LLC, and Coalfire, I bridge academic research with practical applications in combating evolving security threats and enhancing global AI accessibility. I see multilingual advances and interdisciplinary collaboration as a competitive advantage, not a communication challenge. Beyond research, I stay active through dance, fitness, martial arts, and community service.
