In this talk, we present Beemo, one of the first multi-author benchmarks for machine-generated text (MGT) detection to include expert-edited responses. The benchmark comprises 19.6k texts spanning human-written, machine-generated, and expert-edited content across five use cases. We evaluate 33 MGT detector configurations and show that expert editing effectively evades detection, while LLM-edited texts remain more detectable than human-written content.