Measuring diverse perspectives in AI safety ratings

Published on:

9 November 2023

Primary Category:

Computation and Language

Paper Authors:

Vinodkumar Prabhakaran, Christopher Homan, Lora Aroyo, Alicia Parrish, Alex Taylor, Mark Díaz, Ding Wang


Key Details

Proposes a comprehensive framework for measuring diversity in raters' perspectives

Combines metrics for in-group and cross-group rating cohesion

Applies framework to chatbot safety ratings from diverse raters

Reveals systematic disagreements tied to raters' demographics

Identifies which demographic axes matter most for safety tasks

AI generated summary

This paper proposes a framework for analyzing systematic disagreements in safety ratings of AI systems that are tied to raters' demographics. Applied to human ratings of chatbot responses, the framework reveals that perceptions of safety differ significantly across gender, race, and age groups.
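To make the idea of in-group versus cross-group cohesion concrete, here is a minimal sketch in Python. It is not the paper's metric: it assumes a simple stand-in, namely the fraction of rater pairs that give the same rating to an item they both rated. The function names (`pairwise_agreement`, `in_group_cohesion`, `cross_group_cohesion`) and the toy ratings are hypothetical and exist only for illustration.

```python
from itertools import combinations, product


def pairwise_agreement(rater_pairs):
    """Fraction of (rater pair, shared item) comparisons with identical ratings.

    Each rater is a dict mapping item id -> rating. This is an assumed,
    simplified agreement measure, not the one defined in the paper.
    """
    agree = 0
    total = 0
    for rater_a, rater_b in rater_pairs:
        # Only compare items that both raters actually rated.
        for item in rater_a.keys() & rater_b.keys():
            total += 1
            agree += rater_a[item] == rater_b[item]
    return agree / total if total else float("nan")


def in_group_cohesion(group):
    """Agreement among raters drawn from the same demographic group."""
    return pairwise_agreement(combinations(group, 2))


def cross_group_cohesion(group_a, group_b):
    """Agreement between raters drawn from two different groups."""
    return pairwise_agreement(product(group_a, group_b))


# Toy safety ratings (1 = unsafe, 0 = safe) for three chatbot responses.
group_a = [
    {"c1": 1, "c2": 0, "c3": 1},
    {"c1": 1, "c2": 0, "c3": 0},
]
group_b = [
    {"c1": 0, "c2": 0, "c3": 0},
    {"c1": 0, "c2": 0, "c3": 1},
]

print(in_group_cohesion(group_a))              # agreement within group A
print(in_group_cohesion(group_b))              # agreement within group B
print(cross_group_cohesion(group_a, group_b))  # agreement across the groups
```

In this toy data, raters agree more often within each group than across groups, which is the kind of pattern a group-level analysis would flag as a systematic, demographically linked disagreement. A real analysis would also need a significance test and a chance-corrected agreement measure, neither of which is shown here.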
