Misgendered: Limits of Large Language Models in Understanding Non-Binary Pronouns


MISGENDERED Framework: We create a dataset to evaluate the ability of large language models to correctly ‘gender’ individuals. We manually write templates, each referring to an individual and containing a blank for a pronoun to be filled in. We populate the templates with names (unisex, female, and male) and pronouns (binary, gender-neutral, and non-binary), and declare two to five pronoun forms for each individual, either explicitly or parenthetically. We then use masked and causal LMs to predict the missing pronoun in each instance with a unified constrained decoding method.
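The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the released code: the template text, the wording of the explicit declaration, the small pronoun table, and the names `build_instance`, `predict`, and `toy_score` are all assumptions made for the example. Constrained decoding is approximated by scoring only a fixed candidate set and taking the best one; `toy_score` stands in for a real masked or causal LM score.

```python
import re
from typing import Callable, Dict

# Order of grammatical forms used when declaring pronouns (e.g. "xe/xem/xyr").
FORMS = ["nom", "acc", "pos_dep", "pos_ind", "ref"]

# Small illustrative pronoun table (an assumption; the dataset covers more).
PRONOUNS: Dict[str, Dict[str, str]] = {
    "he": dict(zip(FORMS, ["he", "him", "his", "his", "himself"])),
    "she": dict(zip(FORMS, ["she", "her", "her", "hers", "herself"])),
    "they": dict(zip(FORMS, ["they", "them", "their", "theirs", "themself"])),
    "xe": dict(zip(FORMS, ["xe", "xem", "xyr", "xyrs", "xemself"])),
}

BLANK = "___"

# Hypothetical template; the blank here calls for the nominative form.
TEMPLATE = "{name} is a writer and {blank} gained some fame for a debut novel."


def build_instance(template: str, name: str, pronoun: str,
                   n_forms: int, style: str) -> str:
    """Fill a template with a name and declare the first n_forms pronoun forms."""
    if not 2 <= n_forms <= 5:
        raise ValueError("n_forms must be between 2 and 5")
    declared = "/".join(PRONOUNS[pronoun][f] for f in FORMS[:n_forms])
    if style == "explicit":
        # Assumed wording for an explicit declaration sentence.
        prefix = f"{name}'s pronouns are {declared}. "
        return prefix + template.format(name=name, blank=BLANK)
    if style == "parenthetical":
        return template.format(name=f"{name} ({declared})", blank=BLANK)
    raise ValueError(f"unknown declaration style: {style}")


def predict(text: str, form: str,
            score_fn: Callable[[str, str], float]) -> str:
    """Constrained decoding: score only the candidate pronouns, return the best."""
    candidates = [forms[form] for forms in PRONOUNS.values()]
    return max(candidates, key=lambda c: score_fn(text, c))


def toy_score(text: str, candidate: str) -> float:
    """Stand-in for an LM score: rewards candidates that appear in the context."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return 1.0 if candidate in tokens else 0.0


instance = build_instance(TEMPLATE, "Avery", "xe", n_forms=2, style="explicit")
print(instance)
print(predict(instance, "nom", toy_score))
```

With a real model, `score_fn` would return the probability (or log-likelihood) the model assigns to the candidate in the blank, so the same `predict` call can serve both masked and causal LMs.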

See for yourself how input instances are constructed as the template, declared pronoun, declaration type, and number of declared pronoun forms change, and how accuracy varies with these input parameters and the language model being evaluated.
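One simple way to see how accuracy varies with an input parameter is to group predictions by that parameter and compute the fraction that match the declared pronoun. The sketch below assumes each evaluated instance is stored as a dictionary with `gold` and `pred` fields; the function name `accuracy_by` and the records themselves are made up purely for illustration and are not results from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable


def accuracy_by(records: Iterable[dict], key: str) -> Dict[str, float]:
    """Group records by `key` and return the fraction where pred equals gold."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for r in records:
        totals[r[key]] += 1
        hits[r[key]] += int(r["pred"] == r["gold"])
    return {k: hits[k] / totals[k] for k in totals}


# Made-up records for illustration only; NOT results from the paper.
records = [
    {"declaration": "explicit", "gold": "xe", "pred": "xe"},
    {"declaration": "explicit", "gold": "xe", "pred": "he"},
    {"declaration": "parenthetical", "gold": "they", "pred": "they"},
]

print(accuracy_by(records, "declaration"))
```

The same function can group by any other field stored with each record, such as the pronoun or the number of declared forms.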

Create Evaluation Instances

[Interactive demo: select a Template, Pronoun, Declaration Type, and Declaration Number to view Examples and Accuracy.]