FLANS at SemEval-2026 Task 7: RAG with Open-Sourced Smaller LLMs for Everyday Knowledge Across Diverse Languages and Cultures

📅 2026-03-02

🤖 AI Summary

This work addresses everyday knowledge question answering, in both short-answer and multiple-choice formats, in multilingual and cross-cultural settings. The authors propose a retrieval-augmented generation (RAG) framework that combines open-source small language models deployed via Ollama, custom culture-aware knowledge bases (CulKBs) built from localized Wikipedia content, an optional live DuckDuckGo search component, and iterative prompt refinement. Evaluated on SemEval-2026 Task 7, the system covers English, Spanish, and Chinese while prioritizing user privacy and computational sustainability. All code and resources are publicly released to support research on lightweight, culturally adaptive multilingual question answering systems.

📝 Abstract

This system paper describes our participation in SemEval-2026 Task 7, "Everyday Knowledge Across Diverse Languages and Cultures". We participated in two subtasks: Track 1, Short Answer Questions (SAQ), and Track 2, Multiple-Choice Questions (MCQ). Our method is retrieval-augmented generation (RAG) with open-sourced smaller LLMs (OS-sLLMs). To better adapt to this shared task, we created our own culturally aware knowledge bases (CulKBs) by extracting Wikipedia content using keyword lists we prepared. We extracted both culturally aware wiki-text and country-specific wiki-summaries. In addition to the local CulKBs, one of our systems integrates live online search output via DuckDuckGo. For better privacy and sustainability, we aimed to deploy smaller LLMs (sLLMs) that are open-sourced on the Ollama platform. We share the prompts we developed using refinement techniques and report the learning curve of these prompts. The tested languages are English, Spanish, and Chinese for both tracks. Our resources and code are shared via https://github.com/aaronlifenghan/FLANS-2026
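
The retrieve-then-prompt flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Passage` type, `TOY_KB`, the word-overlap scoring, the prompt wording, and the `generate` callable are all hypothetical. In the actual system the passages would come from the Wikipedia-derived CulKBs (or DuckDuckGo results) and `generate` would wrap a small model served through Ollama; neither is shown here. The simple tokenizer also does not segment Chinese text.

```python
import re
import string
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Set


@dataclass(frozen=True)
class Passage:
    country: str  # country or culture the passage is about
    text: str     # passage text, e.g. a short wiki summary


# Hypothetical toy stand-in for a culturally aware knowledge base.
TOY_KB = [
    Passage("Spain", "Paella is a rice dish from the Valencia region of Spain."),
    Passage("China", "Mooncakes are traditionally eaten during the Mid-Autumn Festival in China."),
    Passage("UK", "Fish and chips is a common takeaway dish in the United Kingdom."),
]


def tokenize(text: str) -> Set[str]:
    """Lowercased word tokens (assumption: whitespace-delimited languages)."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, kb: Sequence[Passage], k: int = 2) -> List[Passage]:
    """Return up to k passages ranked by word overlap with the question."""
    q_tokens = tokenize(question)
    scored = [(len(q_tokens & tokenize(p.text)), i, p) for i, p in enumerate(kb)]
    # Drop passages with no overlap; break ties by original KB order.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [p for _, _, p in ranked[:k]]


def build_prompt(question: str, passages: Sequence[Passage],
                 options: Optional[Sequence[str]] = None) -> str:
    """Build an SAQ prompt, or an MCQ prompt when options are given."""
    context = "\n".join(f"- [{p.country}] {p.text}" for p in passages)
    lines = ["Context:", context or "- (no passage retrieved)", f"Question: {question}"]
    if options:  # Track 2: multiple-choice
        lines += [f"{letter}. {opt}" for letter, opt in zip(string.ascii_uppercase, options)]
        lines.append("Answer with the letter of the correct option only.")
    else:        # Track 1: short answer
        lines.append("Answer in a few words.")
    return "\n".join(lines)


def answer(question: str, kb: Sequence[Passage], generate: Callable[[str], str],
           options: Optional[Sequence[str]] = None, k: int = 2) -> str:
    """Retrieve context, build the prompt, and delegate to a text generator."""
    prompt = build_prompt(question, retrieve(question, kb, k), options)
    return generate(prompt).strip()
```

Keeping `generate` as a plain callable means the same pipeline can be exercised with a stub during prompt refinement and later pointed at a locally served model without changing the retrieval or prompt code.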

Problem

Research questions and friction points this paper is trying to address.

Everyday Knowledge

Diverse Languages

Cultures

Short Answer Questions

Multiple-Choice Questions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-Augmented Generation (RAG)

Open-Sourced Smaller LLMs

Culturally Aware Knowledge Base