2025-04-15 | By Mariusz Jażdżyk
How I Solved the Problem of a Fatigued AI Tester
Can an intelligent agent effectively help?
While working on a recent AI project, I encountered a classic problem: the tester didn’t have enough time to check the solution thoroughly. A few conversations weren’t enough – a chatbot isn’t deterministic, so even if it works correctly today, it might make an error tomorrow. How do you ensure the system will be reliable and won’t surprise users after deployment?
The solution was a system of four agents:
🔹 Agent – the chatbot under test, which must work flawlessly.
🔹 User Agent – simulates a real user, conducting hundreds of conversations.
🔹 Supervisor Agent – analyzes conversations, tracks KPIs, and draws conclusions.
🔹 Teacher Agent – identifies weak points and improves the model.
Each night, this team conducted thousands of conversations, systematically improving quality. The tester no longer had to spend hours manually checking – all they needed to do was analyze the results.
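To make the idea concrete, here is a minimal Python sketch of how such a nightly loop could be wired together. Everything in it is an illustrative assumption: the class names, the stubbed model calls, and the random pass/fail scoring are stand-ins, not the actual implementation from the project.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Conversation:
    turns: list = field(default_factory=list)
    passed: bool = False

class ChatbotAgent:
    """The system under test."""
    def reply(self, message: str) -> str:
        # Stand-in for the real chatbot / LLM call.
        return f"answer to: {message}"

class UserAgent:
    """Simulates a real user playing out one test scenario."""
    def __init__(self, scenario: str):
        self.scenario = scenario

    def run(self, bot: ChatbotAgent) -> Conversation:
        conv = Conversation()
        for i in range(3):  # a few turns per conversation
            question = f"{self.scenario} / turn {i}"
            conv.turns.append((question, bot.reply(question)))
        return conv

class SupervisorAgent:
    """Scores each conversation and tracks the pass-rate KPI."""
    def evaluate(self, conv: Conversation) -> bool:
        # Stand-in for LLM-based grading; random here so some runs fail.
        conv.passed = random.random() > 0.1
        return conv.passed

class TeacherAgent:
    """Turns failed conversations into improvement actions."""
    def improve(self, failures: list) -> None:
        for conv in failures:
            # In practice: rewrite prompts, add few-shot examples, retrain.
            print(f"Needs work: {conv.turns[0][0]}")

def nightly_run(scenarios: list) -> None:
    bot, supervisor, teacher = ChatbotAgent(), SupervisorAgent(), TeacherAgent()
    failures = []
    for scenario in scenarios:
        conv = UserAgent(scenario).run(bot)
        if not supervisor.evaluate(conv):
            failures.append(conv)
    print(f"pass rate: {1 - len(failures) / len(scenarios):.0%}")
    teacher.improve(failures)

if __name__ == "__main__":
    nightly_run([f"scenario-{n}" for n in range(100)])
```

In a real setup, each `reply` and `evaluate` call would hit a model, scenarios would come from a curated test suite, and the Teacher’s output would feed back into prompts or fine-tuning – but the overall loop is exactly this simple.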
Is this the future of AI testing?
For now, that’s the direction we’re heading.
How does testing look in your project? 🚀
Author: Mariusz Jażdżyk
Lecturer at Kozminski University, author of the book “Chief Data Officer,” specializing in building data-driven organizations. He supports startups in the practical implementation of data strategies and AI solutions.