
DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents

In this work, we present dialog-enabled resolving agents (DERA), a paradigm that leverages the increased conversational abilities of LLMs to provide a simple, interpretable forum in which models communicate feedback and iteratively improve their output. We test DERA on three clinically focused tasks and show significant improvement over base GPT-4 performance in both human expert preference evaluations and quantitative metrics. Additionally, we show that GPT-4’s performance (70%) on an open-ended version of the MedQA question-answering dataset is well above the passing level (60%), with DERA showing similar performance. This work represents a novel way to improve the accuracy and completeness of large language models, especially in safety-critical applications like healthcare.
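To make the paradigm concrete, here is a minimal sketch of a dialog-based resolution loop of the kind DERA describes: one agent critiques the current output, another revises it in response, and the exchange repeats until the critic has no further feedback or a turn budget is exhausted. The function names, the stopping rule, and the toy stand-ins for LLM calls below are illustrative assumptions, not the paper's exact prompts or agent roles.

```python
from typing import Callable

def dera_dialog(
    initial_output: str,
    generate: Callable[[str, str], str],  # revises the output given feedback
    critique: Callable[[str], str],       # returns feedback, or "" if satisfied
    max_turns: int = 3,
) -> str:
    """Iteratively refine an output via a feedback dialog between two agents.

    Hypothetical sketch: in a real system, `generate` and `critique` would
    each be prompted LLM calls (e.g. GPT-4); here they are plain functions.
    """
    output = initial_output
    for _ in range(max_turns):
        feedback = critique(output)
        if not feedback:  # critic is satisfied; stop the dialog early
            break
        output = generate(output, feedback)
    return output

# Toy stand-ins for the two agents' LLM calls.
def toy_critique(text: str) -> str:
    # Flags a missing detail once, then signals satisfaction.
    return "add dosage" if "dosage" not in text else ""

def toy_generate(text: str, feedback: str) -> str:
    # Pretend revision that addresses the critic's feedback.
    return text + " (dosage: 500 mg)"

print(dera_dialog("Prescribe amoxicillin.", toy_generate, toy_critique))
# → Prescribe amoxicillin. (dosage: 500 mg)
```

The dialog terminates as soon as the critic returns empty feedback, which keeps the loop interpretable: each turn is a readable exchange of critique and revision rather than an opaque single-shot completion.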