
Geographic Dialect Bias in Multiclass Classification Models (20233)

Sidney GJ Wong 1
  1. University of Canterbury, Christchurch, Canterbury, New Zealand

The main contribution of this paper is to highlight geographic dialect bias in the training data used to fine-tune pretrained language models (such as XLM-RoBERTa). Pretrained language models are increasingly being used in industry and academia; however, they exhibit considerable geographic dialect bias. Some strategies, such as retraining, can minimise the impact of this bias on downstream natural language processing tasks. We argue there is also considerable bias in the often open-source training data used to automate different types of language classification tasks. To test this, we trained two multiclass classification models: a hate speech and offensive language detection model derived from US-based English language tweets, and a homophobic and transphobic language detection model derived from English language YouTube comments from India. We first retrained the pretrained language model on samples of social media language. We then applied the classification models to monthly samples of English language tweets from Aotearoa New Zealand (used as a control), the United States, and India between 2018 and 2023. Both models indicated that hate speech and offensive language, as well as homophobic and transphobic language, increased across all three locations between 2018 and 2023. However, a topic analysis of the detection results found that both multiclass classification models failed to accurately detect hate speech, offensive language, and homophobic and transphobic language in the Aotearoa New Zealand context. For example, tweets containing the words 'bugger' and 'trigger' were frequently misclassified as offensive language despite not exhibiting offensive messaging. This suggests there is geographic dialect bias in the training data, and this bias significantly compromises the efficacy of these multiclass classification models. We argue that training data for multiclass classification models should not be generalised, and that their design needs to be relevant to the social, political, and linguistic context of a community.
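
For illustration, the sketch below shows one way a multiclass classifier of this kind could be fine-tuned from XLM-RoBERTa using the Hugging Face Transformers Trainer API. It is a minimal sketch rather than the implementation used in the study: the checkpoint name, the three-way label scheme, the toy training examples, and the training hyperparameters are assumptions for demonstration, and the annotated US tweet or Indian YouTube corpora would replace the placeholder dataset.

from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Toy stand-in for an annotated corpus; the real labelled social media posts go here.
train_data = Dataset.from_dict({
    "text": ["example tweet one", "example tweet two"],
    "labels": [0, 2],  # assumed scheme: 0 = hate speech, 1 = offensive, 2 = neither
})

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

def tokenize(batch):
    # Convert raw text into XLM-RoBERTa subword token IDs.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_data = train_data.map(tokenize, batched=True)

# Three output classes, matching the assumed hate speech / offensive / neither scheme.
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=3)

args = TrainingArguments(
    output_dir="xlmr-hate-speech",   # where checkpoints are written (illustrative name)
    num_train_epochs=3,              # illustrative hyperparameters only
    per_device_train_batch_size=16,
)

Trainer(model=model, args=args, train_dataset=train_data).train()

A model fine-tuned in this way could then be run over the monthly tweet samples to produce the per-location counts discussed above; the actual corpora, label taxonomies, and evaluation pipeline are those described in the paper, not the placeholders shown here.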