Ethnic Stereotypes in Language: A Validated Sentence–Word Priming Stimulus Database for EEG/Neuroimaging, Behavioral, and LLM Research
Description
This database provides a validated set of sentences designed for research on ethnic stereotyping and social cognition. The materials are suitable for use in EEG/ERP experiments, neuroimaging studies (fMRI, MEG), behavioral paradigms, and computational/LLM modelling of social bias. The stimuli were validated on a native Italian-speaking population.

The database consists of 330 sentence–word pairs organized across three experimental conditions:
• Incongruent (N = 110): sentences in which the terminal target word violates the ethnic stereotype associated with the speaker's accent (e.g., a speaker with an African accent described as a pilot or a surgeon)
• Congruent (N = 110): sentences in which the terminal word is consistent with prevailing ethnic stereotypes (e.g., a speaker with an African accent described as working in a field or selling items at a market stall)
• Neutral (N = 110): control sentences mentioning Italian individuals or regional groups, with no ethnic accent or stereotype content

Sentences are distributed across seven ethnic/accent categories: African, Chinese, Roma/Sinti, Eastern European, Asian, Latin American, and Arabian.
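As an illustration of the design described above, the following is a minimal sketch of a sanity check that a research group might run after receiving the written stimuli. The file format and field names are not documented here, so the record keys "condition" and "category", the lowercase condition labels, and the function name check_design are all hypothetical; only the counts and category names come from the description.

```python
from collections import Counter

# Values taken from the database description.
CONDITIONS = ("incongruent", "congruent", "neutral")
PAIRS_PER_CONDITION = 110
CATEGORIES = {
    "African", "Chinese", "Roma/Sinti", "Eastern European",
    "Asian", "Latin American", "Arabian",
}


def check_design(rows):
    """Return a list of problems found in a list of stimulus records.

    Each record is a dict with the hypothetical keys "condition" and
    "category"; the real files may use different field names.
    """
    problems = []
    counts = Counter(row["condition"] for row in rows)

    # Every condition should contain exactly 110 sentence-word pairs.
    for condition in CONDITIONS:
        if counts[condition] != PAIRS_PER_CONDITION:
            problems.append(
                f"{condition}: expected {PAIRS_PER_CONDITION}, "
                f"found {counts[condition]}"
            )

    # Flag condition labels that are not part of the design.
    for unknown in sorted(set(counts) - set(CONDITIONS)):
        problems.append(f"unknown condition label: {unknown}")

    # Assumption: neutral items carry no accent category (empty or None),
    # so only non-empty categories are checked against the seven listed.
    for row in rows:
        category = row.get("category")
        if category and category not in CATEGORIES:
            problems.append(f"unknown category: {category}")

    return problems
```

An empty list returned by check_design means the records match the stated 3 × 110 design; any other result lists the mismatches in plain text.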
All three conditions are matched on the following variables:
• Sentence length in characters
• Sentence length in words
• Terminal word length in characters
• Lexical frequency of the terminal word
• Grammatical category and imageability of the terminal word
• Gender of the protagonist (male / female / mixed)

Intended uses
• EEG/ERP studies of stereotyping, social prediction, and language processing (N400, LAN, LP components)
• Neuroimaging (fMRI/MEG) studies of social cognition and predictive coding
• Behavioral studies of implicit ethnic bias, semantic priming, and accent perception
• Training, testing, or probing Large Language Models (LLMs) on social bias, world knowledge, and stereotype content

Availability of auditory stimuli
A professionally recorded auditory version of these stimuli (sentences spoken with native-accented voices across the seven ethnic/accent categories) is also available. The auditory stimuli are not shared openly, but may be made available to research groups for non-profit academic purposes under a co-authorship agreement. Groups wishing to use the auditory materials in a study intended for publication are invited to contact the authors to discuss the terms of scientific collaboration, which will include co-authorship on any resulting paper. To inquire, write to: mado.proverbio@unimib.it
Files are not publicly available
You can contact the author to request the files
Steps to reproduce
This material can be freely used for research purposes, or as part of a scientific collaboration, provided that appropriate credit is given to the source:
Brusa, A. and A. M. Proverbio. Voices and Prejudice: Accent-based Information Affects the Perceived Competence and Social Attractiveness of the Speaker. PsyArXiv, 8 Feb. 2024. https://doi.org/10.31234/osf.io/3tmpf

Terms and conditions
This material cannot be used for commercial purposes. It cannot be transferred to technological devices of any type and sold. It cannot be placed on any internet website, nor provided to profit-making companies, including private AI companies, or to the media.

To request the written stimulus set for non-profit academic research purposes, or for scientific cooperation (including sharing of EEG data for AI modelling and classification), write to: mado.proverbio@unimib.it

Requests are accepted from faculty members and independent researchers only. Requests sent directly by undergraduate students, or submitted by students on behalf of their advisor, will not be accepted. Students must ask their faculty advisor to submit the request directly or to be included in CC.
Institutions
- University of Milano-Bicocca, Lombardy, Milan
Additional Metadata for University of Milano - Bicocca
| Language | English |
| Date the data was collected | 2024-01-31T23:00:00.000Z |
| UniMiB Research Centres | Centro di Neuroscienze di Milano |
| ERC Keywords | PE6_9 Human computer interaction and interface, visualisation and natural language processing, SH4_8 Language learning and processing (first and second languages), SH4_4 Neuropsychology |
| SSD Classification | M-PSI/02 - PSICOBIOLOGIA E PSICOLOGIA FISIOLOGICA, L-LIN/01 - GLOTTOLOGIA E LINGUISTICA, INF/01 - INFORMATICA |
| Geolocation | Italy |

