This study introduces KarGus, a novel system for multi-document question answering (MD-QA) designed for diverse domains. KarGus integrates Natural Language Processing (NLP) techniques with Knowledge Graph (KG) construction and Graph Neural Networks (GNNs) to improve retrieval performance in specialized fields.
We explore the efficacy of combining semantic similarity, TF-IDF, and Named Entity Recognition (NER) features in KG construction and information retrieval. Experimental evaluation on a corporate intelligence corpus of 30 documents (1,810 pages, 10,853 text chunks) demonstrates that KarGus outperforms traditional embedding-based methods, achieving a Recall@5 of 0.850 compared to the baseline's 0.823 (p < 0.05). The optimal configuration weighted semantic similarity most heavily (0.75), followed by keyword relevance (0.2) and entity information (0.05).
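To make the reported weighting and metric concrete, the following is a minimal sketch rather than the KarGus implementation. It assumes the three feature scores are each normalized to [0, 1] and combined linearly, and that Recall@k is the fraction of relevant chunks found in the top k results; the abstract confirms neither detail. The names `combined_score` and `recall_at_k` are hypothetical.

```python
from typing import Iterable, Mapping, Sequence

# Weights reported for the best-performing configuration.
WEIGHTS = {"semantic": 0.75, "keyword": 0.20, "entity": 0.05}


def combined_score(
    semantic: float,
    keyword: float,
    entity: float,
    weights: Mapping[str, float] = WEIGHTS,
) -> float:
    """Linear combination of three feature scores (assumed to lie in [0, 1])."""
    return (
        weights["semantic"] * semantic
        + weights["keyword"] * keyword
        + weights["entity"] * entity
    )


def recall_at_k(
    ranked_ids: Sequence[str], relevant_ids: Iterable[str], k: int = 5
) -> float:
    """Fraction of relevant items that appear among the top-k ranked items."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # Assumption: no relevant items yields a recall of 0.
    hits = len(relevant.intersection(ranked_ids[:k]))
    return hits / len(relevant)


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the study.
    print(combined_score(semantic=0.8, keyword=0.5, entity=0.0))
    print(recall_at_k(["a", "b", "c", "d", "e", "f"], ["a", "f"], k=5))
```

Under these assumptions, a candidate with strong semantic similarity but no shared entities still scores highly, which matches the small weight (0.05) given to entity information.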
Analysis of the KG structure revealed moderately well-defined community structures and efficient information traversal properties. While GNN models showed promising training results, they underperformed in the retrieval task, highlighting challenges in GNN application to MD-QA.
This research contributes to the field of information retrieval by demonstrating the efficacy of integrating NLP techniques with graph-based approaches in MD-QA. The adaptable nature of KarGus suggests potential applications across various specialized domains. Future work will focus on validating cross-domain performance and refining GNN implementations for diverse retrieval tasks.