An open-source evaluator model

Published on:

2 May 2024

Primary Category:

Computation and Language

Paper Authors:

Seungone Kim, Juyoung Suk, Shayne Longpre, Bill Yuchen Lin, Jamin Shin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo

Key Details

Outperforms existing open-source evaluators

Closely matches scores from humans and GPT-4

Performs both direct assessment and pairwise ranking

Incorporates flexible custom evaluation criteria

Models, code, and data publicly available

AI generated summary

This paper introduces Prometheus 2, an open-source language model specialized for evaluating the quality of text generated by other language models. It outperforms existing open-source evaluators, producing scores and rankings that closely match human and GPT-4 judgments. It supports both direct assessment and pairwise ranking, and it accepts custom evaluation criteria rather than judging helpfulness alone.
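To make the two evaluation formats concrete, here is a minimal sketch of how an evaluator model could be prompted for direct assessment (a single score against a criterion) and for pairwise ranking (a choice between two responses). The `Rubric` class, the prompt wording, the 1 to 5 score range, and the `Score:` / `Winner:` output conventions are all assumptions made for illustration; they are not the prompt format or API released with Prometheus 2.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rubric:
    """Hypothetical container for a custom evaluation criterion."""
    criterion: str
    min_score: int = 1  # assumed scale, not taken from the paper
    max_score: int = 5


def direct_assessment_prompt(instruction: str, response: str, rubric: Rubric) -> str:
    """Build a prompt asking the evaluator to score one response."""
    return (
        f"Instruction: {instruction}\n"
        f"Response: {response}\n"
        f"Criterion: {rubric.criterion}\n"
        f"Give feedback, then end with 'Score: N', where N is an integer "
        f"from {rubric.min_score} to {rubric.max_score}."
    )


def pairwise_ranking_prompt(
    instruction: str, response_a: str, response_b: str, rubric: Rubric
) -> str:
    """Build a prompt asking the evaluator to pick the better of two responses."""
    return (
        f"Instruction: {instruction}\n"
        f"Response A: {response_a}\n"
        f"Response B: {response_b}\n"
        f"Criterion: {rubric.criterion}\n"
        "Give feedback, then end with 'Winner: A' or 'Winner: B'."
    )


def parse_score(output: str, rubric: Rubric) -> Optional[int]:
    """Extract the trailing score; return None if missing or out of range."""
    match = re.search(r"Score:\s*(\d+)\s*$", output.strip())
    if match is None:
        return None
    score = int(match.group(1))
    return score if rubric.min_score <= score <= rubric.max_score else None


def parse_winner(output: str) -> Optional[str]:
    """Extract the trailing verdict ('A' or 'B'); return None if missing."""
    match = re.search(r"Winner:\s*([AB])\s*$", output.strip())
    return match.group(1) if match else None
```

In this sketch the same `Rubric` drives both modes, so swapping the criterion (for example, from helpfulness to factual accuracy) changes what is judged without changing the surrounding code. The prompt string would be sent to whichever evaluator model is in use, and the parsers would be applied to its text output.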
