Tendem: A Hybrid AI+Human Platform

📅 2026-02-01
🤖 AI Summary
This work addresses the limited reliability of AI systems on complex or uncertain tasks by introducing Tendem, a hybrid platform that pairs an autonomous AI agent (with web browsing, tool use, and domain-specific reasoning capabilities) with human experts. The AI handles structured, repeatable work, and human experts step in when the models fail or when results need verification; every result passes a quality review before delivery. In in-house evaluations on 94 real-world tasks, compared against AI-only agents and human-only workflows carried out by Upwork freelancers, Tendem delivered higher-quality outputs with faster turnaround times, at an operational cost comparable to human-only execution. On third-party agentic benchmarks, the AI agent operating without human involvement performs near state-of-the-art on web browsing and tool-use tasks and shows strong results in frontier domain knowledge and reasoning.

📝 Abstract
Tendem is a hybrid system where AI handles structured, repeatable work and Human Experts step in when the models fail or to verify results. Each result undergoes a comprehensive quality review before delivery to the Client. To assess Tendem's performance, we conducted a series of in-house evaluations on 94 real-world tasks, comparing it with AI-only agents and human-only workflows carried out by Upwork freelancers. The results show that Tendem consistently delivers higher-quality outputs with faster turnaround times. At the same time, its operational costs remain comparable to human-only execution. On third-party agentic benchmarks, Tendem's AI Agent (operating autonomously, without human involvement) performs near state-of-the-art on web browsing and tool-use tasks while demonstrating strong results in frontier domain knowledge and reasoning.
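
The routing described in the abstract (AI first, human experts on failure or for verification, quality review before delivery) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `run_task`, `ai_agent`, `human_expert`, `needs_verification`, and `quality_review` are hypothetical, and the single rework pass after a failed review is a guess at one possible policy, not a description of how Tendem is actually implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Result:
    output: Optional[str] = None                     # None means no usable output yet
    handled_by: List[str] = field(default_factory=list)


def run_task(
    task: str,
    ai_agent: Callable[[str], Optional[str]],        # hypothetical: returns None on failure
    human_expert: Callable[[str, Optional[str]], str],
    needs_verification: Callable[[str, str], bool],
    quality_review: Callable[[str, str], bool],
) -> Result:
    """Route one task through a hypothetical AI-first, human-fallback pipeline."""
    result = Result()

    # Step 1: the AI agent attempts the task first.
    try:
        result.output = ai_agent(task)
    except Exception:
        result.output = None                         # treat a crash as a model failure
    result.handled_by.append("ai")

    # Step 2: a human expert steps in on failure or when verification is needed.
    if result.output is None or needs_verification(task, result.output):
        result.output = human_expert(task, result.output)
        result.handled_by.append("human")

    # Step 3: every result is reviewed before delivery.
    # Assumption: a failed review triggers one rework pass by a human expert.
    if not quality_review(task, result.output):
        result.output = human_expert(task, result.output)
        result.handled_by.append("human_rework")

    return result


# Usage: simulate an AI failure that a human expert recovers from.
recovered = run_task(
    "extract prices from a vendor page",
    ai_agent=lambda task: None,
    human_expert=lambda task, draft: "done by expert",
    needs_verification=lambda task, out: False,
    quality_review=lambda task, out: True,
)
print(recovered.handled_by)
```

Passing the stages in as callables keeps the sketch independent of any particular model, expert pool, or review procedure, none of which the abstract specifies.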
Problem

Research questions and friction points this paper is trying to address.

hybrid AI-human system
task quality
turnaround time
operational cost
real-world tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

hybrid AI-human system
quality review
autonomous agent
tool-use
human-in-the-loop
Authors
Konstantin Chernyshev
Ekaterina Artemova (Toloka.AI, ex-HSE, ex-LMU; natural language processing, benchmarking, large language models)
Viacheslav Zhukov
Maksim Nerush
Mariia Fedorova (University of Oslo; NLP)
Iryna Repik
Olga Shapovalova
Aleksey Sukhorosov
Vladimir Dobrovolskii
Natalia Mikhailova
Sergei Tilga