A new research paper, "Geometric Uncertainty for Detecting and Correcting Hallucinations in LLMs," introduces a framework for improving the reliability of large language models (LLMs) by detecting and correcting hallucinations, the models' tendency to generate plausible but factually incorrect output. The paper, co-authored by Edward Phillips and five other researchers, was recently highlighted by Rohan Paul on social media.
The core of the research is a geometric framework that provides both global and local uncertainty estimates for LLM responses, a notable advance for black-box models. Whereas previous black-box methods could only assign a global uncertainty score to a whole batch of responses, the new approach uses archetypal analysis to attribute uncertainty to individual responses. This capability is particularly important for high-stakes applications such as medical question answering, where accuracy is paramount.
The methodology involves generating multiple responses to a query, converting them into embedding vectors, and applying archetypal analysis to identify extremal semantic points that bound the space of potential model outputs. From this geometry the framework quantifies "Geometric Volume" as a global uncertainty measure and "Geometric Suspicion" as a local one. Geometric Suspicion ranks responses by reliability, so the most plausible answer can be selected, which in turn reduces hallucination rates.
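As a rough illustration of this pipeline (not the authors' implementation), the sketch below embeds several sampled answers, approximates the global "Geometric Volume" by the convex-hull volume of PCA-reduced embeddings, and stands in for the archetypal-analysis-based "Geometric Suspicion" with a simpler centroid-distance proxy. The embedding model, dimensionality, and function names are illustrative assumptions.

```python
# Minimal sketch of the described pipeline, assuming sentence-transformers,
# scikit-learn, and SciPy are available. Archetypal analysis is replaced by
# simpler geometric proxies; this is not the paper's implementation.
import numpy as np
from scipy.spatial import ConvexHull
from sklearn.decomposition import PCA
from sentence_transformers import SentenceTransformer


def geometric_uncertainty(responses, n_dims=2):
    """Return (global volume score, per-response suspicion scores)."""
    model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence embedder
    embeddings = model.encode(responses)             # shape: (n_responses, d)

    # Project to a low dimension so the convex hull is well defined
    # with only a handful of sampled responses.
    reduced = PCA(n_components=n_dims).fit_transform(embeddings)

    # Global uncertainty proxy: volume (area in 2-D) of the convex hull
    # spanned by the batch of responses.
    volume = ConvexHull(reduced).volume

    # Local uncertainty proxy: distance of each response from the batch
    # centroid; answers far from the consensus region are more suspicious.
    centroid = reduced.mean(axis=0)
    suspicion = np.linalg.norm(reduced - centroid, axis=1)
    return volume, suspicion


# Usage: sample several answers to the same query, score them, and keep
# the least suspicious one.
answers = [
    "Aspirin inhibits platelet aggregation.",
    "Aspirin thins the blood by blocking platelets.",
    "Aspirin is an antibiotic used to treat infections.",
    "Aspirin reduces clotting by inhibiting platelets.",
]
volume, suspicion = geometric_uncertainty(answers)
best_answer = answers[int(np.argmin(suspicion))]
```

A wide hull (large volume) signals that the sampled answers disagree semantically, while a high per-response suspicion score flags the outlier answers most likely to be hallucinated.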
Experimental results show that the geometric framework matches or exceeds existing methods, with particularly strong performance on medical datasets such as K-QA and MedicalQA, making it a promising tool for improving the trustworthiness and practical deployment of LLMs across domains. The paper also offers theoretical justification by proving a link between convex hull volume and entropy, giving the approach a firm mathematical footing.
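The paper's exact theorem is not reproduced in this summary, but the flavor of a volume-entropy connection can be seen in a standard identity: if responses were spread uniformly over a convex body K (standing in for the hull of the response embeddings), the differential entropy equals the log of the volume.

```latex
% Illustration only, not the paper's theorem: differential entropy of a
% uniform distribution over a convex body K with volume Vol(K).
\[
  h(X) \;=\; -\int_{K} \frac{1}{\mathrm{Vol}(K)}
                \log\!\frac{1}{\mathrm{Vol}(K)} \, dx
        \;=\; \log \mathrm{Vol}(K).
\]
% Under this simple model, a larger convex hull of response embeddings
% corresponds to higher entropy, i.e. greater semantic uncertainty.
```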