Published on:
2 May 2024
Primary Category:
Computation and Language
Paper Authors:
Sheng-Chieh Lin,
Luyu Gao,
Barlas Oguz,
Wenhan Xiong,
Jimmy Lin,
Wen-tau Yih,
Xilun Chen
Standard alignment methods may encourage language models to hallucinate more
Training on data the model is unfamiliar with introduces new knowledge, which encourages the model to fabricate claims
Reward functions that prefer very detailed responses also increase false claims
The proposed approach elicits knowledge from the model itself to reduce unfamiliar information
It uses separate rewards for factuality and instruction following to balance the trade-off
Factual language model alignment
This paper studies how to align language models to follow instructions while reducing false claims. It finds that standard alignment methods can increase hallucination by training models on unfamiliar data or by rewarding very detailed responses. The authors propose methods to make alignment more factual by eliciting knowledge from the model itself and by using separate rewards for factuality and instruction following.
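To make the reward-separation idea concrete, here is a minimal sketch under stated assumptions, not the paper's implementation: a toy factuality score and a toy instruction-following (detail) score are combined with arbitrary weights into a single reward. The functions score_factuality and score_instruction_following, the known-fact set, and the weights are all hypothetical stand-ins for learned reward models.

# Minimal sketch (illustrative only): combining separate factuality and
# instruction-following rewards into one preference signal.
# The scoring functions and weights are assumptions, not the paper's code.

def score_factuality(response: str, known_facts: set) -> float:
    """Toy proxy: fraction of sentences in the response found in a known-fact set."""
    claims = [s.strip() for s in response.split(".") if s.strip()]
    if not claims:
        return 0.0
    return sum(claim in known_facts for claim in claims) / len(claims)


def score_instruction_following(response: str, target_words: int = 20) -> float:
    """Toy proxy: reward longer, more detailed answers, capped at 1.0."""
    return min(len(response.split()) / target_words, 1.0)


def combined_reward(response: str, known_facts: set,
                    w_fact: float = 0.7, w_inst: float = 0.3) -> float:
    """Weighted sum of the two separate rewards (weights chosen arbitrarily)."""
    return (w_fact * score_factuality(response, known_facts)
            + w_inst * score_instruction_following(response))


facts = {"Paris is the capital of France"}
detailed = ("Paris is the capital of France. "
            "Paris was founded in 300 BC by settlers from Egypt.")  # second claim fabricated
short = "Paris is the capital of France."

# A detail-only reward prefers the partly fabricated answer,
# while the combined reward prefers the shorter, factual one.
print(score_instruction_following(detailed), score_instruction_following(short))  # 0.8 0.3
print(combined_reward(detailed, facts), combined_reward(short, facts))            # 0.59 0.79

The example illustrates the trade-off the authors highlight: a reward that only favors detail pushes the model toward longer, partly fabricated answers, whereas weighing factuality separately shifts the preference back toward accurate responses.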