Adaku Uchendu

DIA-HARM: Harmful Content Detection Robustness Across 50 English Dialects

DIA-HARM evaluates 16 harmful content detection models across 50 English dialects using 195K+ samples, revealing 1.4–3.6% F1 drops for fine-tuned models and up to 27% for zero-shot …

Jason Lucas

BLUFF: Benchmarking in Low-resoUrce Languages for detecting Falsehoods and Fake news

BLUFF is the largest multilingual fake news detection benchmark, spanning 79 languages with 202K+ samples. It introduces AXL-CoI for adversarial generation and mPURIFY for quality …

Jason Lucas

Beyond speculation: Measuring the Growing Presence of LLM-generated texts in Multilingual Disinformation

This IEEE article provides empirical measurements of LLM-generated texts in multilingual disinformation, moving beyond speculation to analyze the growing presence and …

Dominik Macko

Beemo: Benchmark of Expert-edited Machine-generated Outputs

Beemo introduces a novel benchmark featuring 6.5k expert-edited machine-generated texts across diverse domains from creative writing to summarization. Through comprehensive …

Ekaterina Artemova

Authorship Obfuscation in Multilingual Machine-Generated Text Detection

This research from Penn State and KiNiT benchmarks the effectiveness of 10 authorship obfuscation (AO) techniques against 37 machine-generated text (MGT) detection methods across …

Dominik Macko

Fighting Fire with Fire - EMNLP 2023

The Dual Role of LLMs in Crafting and Detecting Elusive Disinformation

Jason Lucas

MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

This research from Penn State and KiNiT introduces MULTITuDE, a novel multilingual dataset for detecting machine-generated text. Comprising over 74,000 authentic and …

Dominik Macko

Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting Elusive Disinformation

This research project is a collaboration with Penn State and MIT Lincoln Lab. Our study demonstrates the dual capacity of LLMs for offensive misuse and defense detection against …

Jason Lucas