🤖 AI Summary
General-purpose commonsense knowledge bases such as ConceptNet cover few of the tasks performed in specific industry domains. To address this gap, the paper proposes a weakly supervised framework that models the affinity between tasks and industry groups (IGs) in order to augment a commonsense knowledge base with industry-specific tasks. The method trains a neural model to learn task-IG affinity, applies clustering to select the top-k tasks per IG, and extracts triples of the form ⟨IG, is capable of, task⟩ from news text. Applied to two publicly available news datasets covering 24 industry groups, the framework extracts 2,339 triples with a precision of 0.86, which can be added directly to existing knowledge bases. The core contributions are: (1) a task-IG matching approach based on learned affinity, and (2) a weakly supervised framework for completing a commonsense knowledge base with industry-specific tasks.
📝 Abstract
Commonsense knowledge bases (KBs) are a source of specialized knowledge that is widely used to improve machine learning applications. However, even for a large KB such as ConceptNet, capturing explicit knowledge from each industry domain is challenging. For example, only a few samples of general tasks performed by various industries are available in ConceptNet. Here, a task is a well-defined knowledge-based volitional action to achieve a particular goal. In this paper, we aim to fill this gap and present a weakly-supervised framework to augment commonsense KBs with tasks carried out by various industry groups (IGs). We attempt to match each task with one or more suitable IGs by training a neural model to learn task-IG affinity, and we apply clustering to select the top-k tasks per IG. We extract a total of 2339 triples of the form $\langle IG, is~capable~of, task \rangle$ from two publicly available news datasets for 24 IGs with a precision of 0.86. This validates the reliability of the extracted task-IG pairs, which can be directly added to existing KBs.
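The selection step described above can be illustrated with a minimal sketch. It assumes affinity scores have already been produced by some model, and it uses cluster labels only to skip near-duplicate tasks within an IG; that is one possible reading of "clustering to select the top-k tasks", not necessarily the paper's procedure. The function name `select_triples`, the `min_score` threshold, the IG names, the tasks, the scores, and the cluster ids are all hypothetical.

```python
from collections import defaultdict


def select_triples(scores, cluster_of, k, min_score=0.5):
    """Pick up to k tasks per industry group (IG) and emit KB triples.

    scores     -- dict mapping (ig, task) -> affinity score (hypothetical values)
    cluster_of -- dict mapping task -> cluster id (hypothetical labels)
    min_score  -- assumed cut-off below which a task is never selected
    """
    by_ig = defaultdict(list)
    for (ig, task), score in scores.items():
        by_ig[ig].append((score, task))

    triples = []
    for ig in sorted(by_ig):
        seen_clusters = set()
        kept = 0
        # Highest affinity first; ties are broken alphabetically by task.
        for score, task in sorted(by_ig[ig], key=lambda p: (-p[0], p[1])):
            if kept == k or score < min_score:
                break
            # Tasks without a cluster label are treated as their own cluster.
            cluster = cluster_of.get(task, task)
            if cluster in seen_clusters:
                continue  # a higher-scoring task from this cluster was kept
            seen_clusters.add(cluster)
            triples.append((ig, "is capable of", task))
            kept += 1
    return triples


# Illustrative inputs only; none of these values come from the paper.
scores = {
    ("Banks", "approve loans"): 0.93,
    ("Banks", "sanction loans"): 0.90,
    ("Banks", "issue credit cards"): 0.81,
    ("Banks", "drill wells"): 0.05,
    ("Energy", "drill wells"): 0.88,
    ("Energy", "approve loans"): 0.07,
}
cluster_of = {
    "approve loans": 0,
    "sanction loans": 0,
    "issue credit cards": 1,
    "drill wells": 2,
}

triples = select_triples(scores, cluster_of, k=2)
for triple in triples:
    print(triple)
```

In this toy run, "sanction loans" is skipped for Banks because it shares a cluster with the higher-scoring "approve loans", and low-affinity pairs such as ("Energy", "approve loans") never reach the output.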