SaaS

Studocu: Ensuring content integrity in the age of generative AI

We partnered with Studocu to build a custom detection system that differentiates between student-authored notes and AI-generated content, protecting the authenticity of their global knowledge-sharing platform.

Dale Wesdorp

February 27, 2026

The challenge

The problem behind the brief

Studocu’s business model relies on the authentic exchange of study materials between students. However, the rise of large language models created a significant risk: a surge in AI-generated uploads that threatened to dilute the quality and reliability of their library.

To maintain user trust, Studocu needed a way to accurately identify the origin of every document. Existing off-the-shelf tools weren't precise enough for their specific needs. They required a custom, high-performance solution that could scale with their massive volume of daily uploads.

Our approach

How we built the right thing, and built it right

Strategy

Our focus was on technical de-risking and data quality. We knew that for a model to be effective, it needed to understand the nuances of student-specific writing versus machine-generated patterns. We defined a strategy centered on high-fidelity data generation to "teach" the system what to look for before these challenges became mainstream.

Design & Data

We designed a robust data structure to generate vast amounts of AI content in bulk. This wasn't about building a "cool feature," but about creating a rigorous training environment. By building this comprehensive dataset, we could conduct precise comparisons between human and machine output.
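As an illustration of what such a dataset could look like, here is a minimal sketch in Python. The case study does not describe the actual schema, so the `Sample` record, its field names, and the `stratified_split` helper are all hypothetical; the toy strings stand in for real uploads and bulk-generated text.

```python
import random
from dataclasses import dataclass

# Labels follow the case study's terms: UGC = user-generated, AIGC = AI-generated.
UGC, AIGC = "UGC", "AIGC"


@dataclass(frozen=True)
class Sample:
    """One labeled example. All field names are hypothetical."""
    text: str
    label: str   # UGC or AIGC
    source: str  # e.g. an upload ID, or the name of the generating model


def stratified_split(samples, eval_fraction=0.2, seed=0):
    """Split into train/eval sets while keeping the label ratio equal in both."""
    rng = random.Random(seed)
    train, evaluation = [], []
    for label in (UGC, AIGC):
        group = [s for s in samples if s.label == label]
        rng.shuffle(group)
        cut = int(len(group) * eval_fraction)
        evaluation.extend(group[:cut])
        train.extend(group[cut:])
    return train, evaluation


# Toy data standing in for real student uploads and bulk-generated text.
samples = (
    [Sample(f"student note {i}", UGC, f"upload-{i}") for i in range(10)]
    + [Sample(f"generated note {i}", AIGC, "hypothetical-llm") for i in range(10)]
)
train, evaluation = stratified_split(samples)
```

Keeping the human and machine classes balanced in both splits is one simple way to make later comparisons between the two fair.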

Development

We fine-tuned the RoBERTa language model, optimizing it specifically for detection in an academic context. We handled the full technical implementation, ensuring the model could be integrated into Studocu’s existing pipeline to assess documents in real time. The resulting architecture was built for performance and accuracy, and it outperformed industry-standard alternatives.
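The case study does not say how full documents were passed to the model. Because standard RoBERTa checkpoints accept only a limited number of tokens per input, one common pattern is to split a document into chunks, score each chunk, and combine the scores. The sketch below shows that pattern in Python under those assumptions; `chunk_words`, `score_document`, and `toy_scorer` are hypothetical names, and `toy_scorer` is a stand-in for the fine-tuned model, not a real detector.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int = 200) -> List[str]:
    """Split a document into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def score_document(
    text: str,
    score_chunk: Callable[[str], float],
    max_words: int = 200,
) -> float:
    """Average the per-chunk AI probability, weighted by chunk length in words."""
    chunks = chunk_words(text, max_words)
    if not chunks:
        return 0.0  # assumption: an empty document is treated as not flagged
    weights = [len(chunk.split()) for chunk in chunks]
    total = sum(w * score_chunk(c) for c, w in zip(chunks, weights))
    return total / sum(weights)


def toy_scorer(chunk: str) -> float:
    """Placeholder for the fine-tuned model: NOT a real detector."""
    return 1.0 if "as an ai language model" in chunk.lower() else 0.0
```

In a real pipeline, `score_chunk` would wrap a call to the deployed classifier, and the document-level score would be compared against a threshold chosen on held-out data.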

Miyagami helped us develop a high-performing AI model. Seeing the final product work so effectively has given us the perfect head start.

Marnix Broer

CEO

Collaboration

Results that scale

A custom-built detection engine with twice the flagging rate of industry-standard tools.

Performance

2x flagging rate vs industry standard

Our custom-tuned model outperformed existing industry options, correctly identifying up to 98% of both AI-generated content (AIGC) and user-generated content (UGC) in documents, and around 80% of each in answers to questions.
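For readers who want to see how a per-class figure like this is typically computed, here is a minimal Python sketch. It assumes the percentages are per-class detection rates (recall), which the case study does not state explicitly; the `detection_rates` helper and the toy labels are hypothetical and are not Studocu's data.

```python
def detection_rates(y_true, y_pred, labels=("AIGC", "UGC")):
    """Return, per label, the share of items with that true label predicted correctly."""
    rates = {}
    for label in labels:
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        rates[label] = hits / total if total else float("nan")
    return rates


# Toy example: 4 AI-generated and 4 student-written items, one AIGC item missed.
y_true = ["AIGC", "AIGC", "AIGC", "AIGC", "UGC", "UGC", "UGC", "UGC"]
y_pred = ["AIGC", "AIGC", "AIGC", "UGC", "UGC", "UGC", "UGC", "UGC"]
rates = detection_rates(y_true, y_pred)
print(rates)  # {'AIGC': 0.75, 'UGC': 1.0}
```

Reporting the rate for each class separately matters here, because a detector that flags everything would catch all AIGC while misclassifying every genuine student upload.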

Trust

Verified content authenticity

By identifying the origin of documents, Studocu can now maintain a library of genuine, student-generated resources, protecting the core value of their platform.

Capability

R&D and strategic edge

Built before the public surge of generative AI, the system gave Studocu an R&D and strategic edge, allowing them to scale their operations without compromising on content integrity.