BLUFF Benchmark Featured in Slator

Mar 11, 2026 · 1 min read

Excited to share that our BLUFF benchmark is getting some attention!

Slator just covered our work: “New Benchmark Tests AI Detection Across Languages and Translation” — and it captures exactly why this research matters.

BLUFF (Benchmarking in LowresoUrce Languages for detecting Falsehoods and Fake news) is a multilingual benchmark spanning 79 languages (20 high-resource, 59 low-resource) and 200K+ samples, built to stress-test AI detection systems in the places they’re most likely to fail.

What we found

Detection models don’t just struggle with low-resource languages; they struggle systematically. Performance gaps are significant, not marginal. And when you add text transformations like AI translation or hybrid human-AI editing into the mix, even the best systems lose their footing.

This is the digital language divide in action. The communities least represented in training data are also the least protected from AI-generated disinformation.

Grateful to co-author this work with Matt Murtagh, Adaku Uchendu, Ali Al-Lawati, Michiharu Yamashita, Dominik Macko, Ivan Srba, Robert Moro, and my advisor Dr. Dongwon Lee.

The benchmark is publicly available. If you’re working on multilingual NLP, information integrity, or AI robustness, I’d love to hear your thoughts.

Jason Lucas
Ph.D. Candidate in Informatics

I am a Ph.D. candidate in Informatics in the College of IST at Penn State University, where I conduct research at the PIKE Research Lab under the guidance of Dr. Dongwon Lee. I specialize in AI/ML research focused on Information Integrity and Safe and Ethical AI, including combating harmful content across multiple languages and modalities. My research spans low-resource multilingual NLP, generative AI, and adversarial machine learning, with work extending across 79 languages. I have published 12 papers with 260+ citations in premier venues including ACL, EMNLP, IEEE, and NAACL.

My doctoral research focuses on bridging the digital language divide through transfer learning, classification (NLU), generation (NLG), adversarial attacks, and end-to-end AI pipelines using RAG and agentic AI workflows for combating multilingual threats. Drawing on my Grenadian background and knowledge of local Creole languages, I bring a global perspective to AI challenges, working to democratize state-of-the-art AI capabilities for underserved linguistic communities worldwide. My mission is to develop robust multilingual, multimodal systems that mitigate evolving security vulnerabilities while enhancing access to human language technology.

As an NSF LinDiv Fellow, I conduct transdisciplinary research advancing human-AI language interaction for social good. I actively mentor 5+ research interns and teach Applied Generative AI courses. Through industry experience at Lawrence Livermore National Lab, Interaction LLC, and Coalfire, I bridge academic research with practical applications in combating evolving security threats and enhancing global AI accessibility. I see multilingual advances and interdisciplinary collaboration as a competitive advantage, not a communication challenge. Beyond research, I stay active through dance, fitness, martial arts, and community service.