KISS Sorcar: A Stupidly-Simple General-Purpose and Software Engineering AI Assistant

📅 2026-04-26

🤖 AI Summary

This work addresses key challenges in leveraging large language models for software engineering assistance (context length limitations, single points of failure, infinite loops, and suboptimal code quality) by proposing a “minimalist yet hierarchically attentive” agent design paradigm. Built upon the KISS Agent Framework (~1,850 lines of code), the system is implemented as a local VS Code extension with a five-layer architecture whose layers handle, in order, execution budgeting, cross-session continuity, tool invocation, multi-turn memory, and Git isolation. It further integrates browser automation, multimodal input support, and Docker compatibility. The agent emphasizes output fidelity over response latency and is continuously stress-tested by being used to develop itself. On Terminal Bench 2.0, it achieves a 62.2% overall pass rate using Claude Opus 4.6, outperforming both Claude Code (58%) and Cursor Composer 2 (61.7%).

📝 Abstract

Large language models can generate code and call tools with remarkable fluency, yet deploying them as practical software engineering assistants still exposes stubborn gaps: finite context windows, single mistakes that derail entire sessions, agents that get stuck in dead ends, AI slop, and generated changes that are difficult to review or revert. We present KISS Sorcar, a general-purpose assistant and integrated development environment (IDE) built on top of the KISS Agent Framework, a stupidly-simple AI agent framework of roughly 1,850 lines of code. The framework addresses these gaps through a robust system prompt and a five-layer agent hierarchy in which each layer adds exactly one concern: (1) budget-tracked ReAct execution; (2) automatic continuation across sub-sessions via summarization; (3) coding and browser tools with parallel sub-agents; (4) persistent multi-turn chat with history recall; and (5) git worktree isolation, so that every task runs on its own branch. To assess the power of the KISS Agent Framework, we implemented KISS Sorcar as a free, open-source Visual Studio Code extension that runs locally, works effectively on long-horizon tasks, and supports browser automation, multimodal input, and Docker containers. In this work, we deliberately prioritize output quality over latency: giving a frontier model adequate time to validate its own output (running linters, type checkers, and tests) dramatically reduces the low-quality code that plagues faster but less thorough agents. The entire system was built using itself in 4.5 months, providing a continuous stress test in which any agent-introduced bug immediately impairs its own ability to work. On Terminal Bench 2.0, KISS Sorcar achieves a 62.2% overall pass rate with Claude Opus 4.6, comparing favorably to Claude Code (58%) and Cursor Composer 2 (61.7%).
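To make the first two layers of the hierarchy more concrete, the following is a minimal sketch of budget-tracked ReAct execution with continuation across sub-sessions via summarization. It is not taken from the KISS Agent Framework: every name here (`Budget`, `react_session`, `run_with_continuation`, `scripted_model`, `summarize`) is hypothetical, and the language model is replaced by a scripted stand-in so that the control flow can be followed without any API calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Budget:
    """Counts the model steps a single sub-session may still spend."""
    max_steps: int
    used: int = 0

    def spend(self) -> bool:
        """Consume one step; return False once the budget is exhausted."""
        if self.used >= self.max_steps:
            return False
        self.used += 1
        return True


def react_session(model: Callable, tools: Dict[str, Callable], task: str,
                  budget: Budget) -> Tuple[Optional[str], List[str]]:
    """Alternate model decisions and tool calls until finish or budget exhaustion."""
    transcript = [task]
    while budget.spend():
        decision = model(transcript)
        if decision[0] == "finish":
            return decision[1], transcript
        _, tool_name, argument = decision
        observation = tools[tool_name](argument)
        transcript.append(f"{tool_name}({argument}) -> {observation}")
    return None, transcript  # budget ran out before the model finished


def run_with_continuation(model: Callable, tools: Dict[str, Callable], task: str,
                          steps_per_session: int, max_sessions: int,
                          summarize: Callable[[List[str]], str]) -> Optional[str]:
    """Restart with a fresh budget, carrying over only a summary of prior work."""
    carry = task
    for _ in range(max_sessions):
        answer, transcript = react_session(model, tools, carry,
                                           Budget(steps_per_session))
        if answer is not None:
            return answer
        carry = summarize(transcript)
    return None


# --- Stand-ins for illustration only (assumptions, not a real model) ---
def scripted_model(transcript: List[str]) -> tuple:
    """Pretend model: calls the 'step' tool until three calls are done."""
    carried = int(transcript[0].split("done=")[1])
    done = carried + len(transcript) - 1
    if done >= 3:
        return ("finish", f"finished after {done} tool calls")
    return ("act", "step", done + 1)


def summarize(transcript: List[str]) -> str:
    """Pretend summarizer: compresses the transcript to a progress counter."""
    carried = int(transcript[0].split("done=")[1])
    return f"count to 3; done={carried + len(transcript) - 1}"


TOOLS = {"step": lambda n: "ok"}

if __name__ == "__main__":
    print(run_with_continuation(scripted_model, TOOLS, "count to 3; done=0",
                                steps_per_session=2, max_sessions=3,
                                summarize=summarize))
```

In this sketch the first sub-session exhausts its two-step budget, its transcript is compressed into a one-line summary, and a second sub-session finishes the task from that summary. A real system would presumably summarize with the model itself and track tokens or cost rather than a bare step count.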

Problem

Research questions and friction points this paper is trying to address.

software engineering AI assistant

large language models

code generation

agent reliability

development tooling

Innovation

Methods, ideas, or system contributions that make the work stand out.

KISS Agent Framework

layered agent architecture

self-validation
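
The self-validation idea described in the abstract (letting the model run linters, type checkers, and tests on its own output before accepting it) can be sketched as a simple check-and-repair loop. This is an illustrative assumption about how such a loop might look, not the paper's implementation; `run_checks`, `validate_then_accept`, and the `propose_fix` callback are hypothetical names, and only Python's standard `subprocess` module is used.

```python
import subprocess
from typing import Callable, Dict, List


def run_checks(checks: Dict[str, List[str]]) -> Dict[str, str]:
    """Run each named command; return {name: output} for the ones that fail."""
    failures = {}
    for name, command in checks.items():
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failures[name] = (result.stdout + result.stderr).strip()
    return failures


def validate_then_accept(propose_fix: Callable[[Dict[str, str]], None],
                         checks: Dict[str, List[str]],
                         max_rounds: int = 3) -> bool:
    """Accept the change only once every check passes.

    `propose_fix` stands in for the agent: it receives the failing checks'
    output and is expected to edit the working tree in response.
    """
    for _ in range(max_rounds):
        failures = run_checks(checks)
        if not failures:
            return True
        propose_fix(failures)
    # One last verification after the final repair attempt.
    return not run_checks(checks)


# Example check table a project might use (assumes these tools are installed):
# checks = {
#     "lint": ["ruff", "check", "."],
#     "types": ["mypy", "."],
#     "tests": ["pytest", "-q"],
# }
```

The design choice mirrors the trade-off stated in the abstract: each repair round costs extra latency, but a change is reported as done only after the checks actually pass.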