Who is a Better Matchmaker? Human vs. Algorithmic Judge Assignment in a High-Stakes Startup Competition

📅 2025-10-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the problem of matching judges to startups in high-stakes entrepreneurial competitions. The authors propose the Hybrid Lexical-Semantic Similarity Ensemble (HLSE) algorithm, which combines lexical and semantic text similarity with domain-expertise modeling in an ensemble. Evaluated on a large-scale real-world innovation challenge, HLSE achieves human-expert-level matching quality—average blind-review scores of 3.90 versus 3.94 (p > 0.05)—while enabling scalable deployment. Compared with conventional manual assignment, which takes roughly a week, HLSE completes matching in several hours, markedly improving both efficiency and consistency. Its core contribution is the first empirically validated, automated judge-allocation framework that jointly models semantic understanding and domain knowledge, establishing a reusable methodology and a practical exemplar for high-stakes human-AI collaborative decision-making.
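The summary does not spell out HLSE's implementation; the following is a minimal Python sketch of the general idea, assuming TF-IDF cosine similarity as the lexical signal, sentence-embedding cosine similarity as the semantic signal, and a simple weighted combination as the ensemble. The function name, the `all-MiniLM-L6-v2` encoder, and the 0.5/0.5 weights are illustrative assumptions, not the authors' published choices.

```python
# Minimal sketch of a hybrid lexical-semantic similarity score for one
# judge-venture pair. The features, models, and weights used by HLSE are
# not published here; TF-IDF, MiniLM embeddings, and the 0.5/0.5 weights
# below are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer

_encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed semantic encoder

def hybrid_score(judge_profile: str, venture_pitch: str,
                 w_lexical: float = 0.5, w_semantic: float = 0.5) -> float:
    # Lexical component: cosine similarity of TF-IDF vectors fit on the pair.
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(
        [judge_profile, venture_pitch])
    lexical = cosine_similarity(tfidf[0], tfidf[1])[0, 0]

    # Semantic component: cosine similarity of sentence embeddings.
    emb = _encoder.encode([judge_profile, venture_pitch])
    semantic = cosine_similarity(emb[0:1], emb[1:2])[0, 0]

    # Ensemble: a simple weighted combination of the two signals.
    return w_lexical * lexical + w_semantic * semantic

print(hybrid_score("fintech investor with payments and lending background",
                   "a startup building credit scoring for gig workers"))
```

In practice, domain-expertise modeling would add further signals (e.g., judges' stated industries or past portfolios) on top of the two text-similarity terms; this sketch only shows the lexical-semantic ensemble named in the algorithm's title.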

📝 Abstract
There is growing interest in applying artificial intelligence (AI) to automate and support complex decision-making tasks. However, it remains unclear how algorithms compare to human judgment in contexts requiring semantic understanding and domain expertise. We examine this question through the judge assignment problem: matching submissions to suitably qualified judges. Specifically, we tackled this problem at the Harvard President's Innovation Challenge, the university's premier venture competition awarding over $500,000 to student and alumni startups, a real-world environment where high-quality judge assignment is essential. We developed an AI-based judge-assignment algorithm, the Hybrid Lexical-Semantic Similarity Ensemble (HLSE), and deployed it at the competition. We then evaluated its performance against human expert assignments using blinded match-quality scores from judges on 309 judge-venture pairs. Using a Mann-Whitney U test, we found no statistically significant difference in assignment quality between the two approaches (AUC = 0.48, p = 0.40); on average, algorithmic matches are rated 3.90 and manual matches 3.94 on a 5-point scale, where 5 indicates an excellent match. Furthermore, assignments that previously required a full week of manual work were completed by the algorithm in several hours during deployment. These results demonstrate that HLSE achieves human-expert-level matching quality while offering greater scalability and efficiency, underscoring the potential of AI-driven solutions to support and enhance human decision-making for judge assignment in high-stakes settings.
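The statistical comparison reported in the abstract can be sketched as follows, assuming a standard two-sided Mann-Whitney U test with AUC recovered as U / (n1 · n2). The score arrays below are placeholders, not the paper's 309 blinded ratings.

```python
# Sketch of the evaluation described in the abstract: a Mann-Whitney U test
# on blinded match-quality scores for algorithmic vs. manual assignments,
# with AUC recovered as U / (n1 * n2). The data here are hypothetical.
import numpy as np
from scipy.stats import mannwhitneyu

algorithmic = np.array([4, 4, 3, 5, 4, 4, 3, 5, 4, 3])  # hypothetical 1-5 scores
manual      = np.array([4, 5, 3, 4, 4, 5, 4, 3, 4, 4])  # hypothetical 1-5 scores

u_stat, p_value = mannwhitneyu(algorithmic, manual, alternative="two-sided")
# AUC estimate: probability an algorithmic match is rated above a manual one,
# with ties counted as one half.
auc = u_stat / (len(algorithmic) * len(manual))

print(f"mean algorithmic = {algorithmic.mean():.2f}, mean manual = {manual.mean():.2f}")
print(f"AUC = {auc:.2f}, p = {p_value:.2f}")
```

An AUC near 0.5 with a large p-value, as the paper reports (0.48, 0.40), indicates judges could not systematically distinguish algorithmic from manual matches in the blinded ratings.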
Problem

Research questions and friction points this paper is trying to address.

Comparing human versus algorithmic judge assignment quality
Developing AI solution for high-stakes startup competition matching
Automating complex decision-making that requires semantic understanding and domain expertise
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid Lexical-Semantic Similarity Ensemble algorithm for judge assignment
Automated matching achieving human-expert-level quality scores (see the assignment sketch after this list)
Reduced assignment time from one week to several hours
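The summary does not state how pairwise judge-venture scores are turned into final assignments. One common approach is maximum-weight bipartite matching, sketched below on a toy score matrix; a real deployment would also enforce capacity constraints (several judges per venture, several ventures per judge), which this sketch omits.

```python
# Illustrative sketch (not the paper's published procedure) of converting a
# judge-by-venture similarity matrix into assignments via maximum-weight
# bipartite matching. Rows are judges, columns are ventures; scores are made up.
import numpy as np
from scipy.optimize import linear_sum_assignment

scores = np.array([
    [0.82, 0.31, 0.45],
    [0.40, 0.77, 0.52],
    [0.28, 0.49, 0.91],
])

# linear_sum_assignment minimizes total cost, so negate to maximize similarity.
judge_idx, venture_idx = linear_sum_assignment(-scores)
for j, v in zip(judge_idx, venture_idx):
    print(f"judge {j} -> venture {v} (score {scores[j, v]:.2f})")
```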