2 Experiment 2

Effects of a PSA and Usage Modeling on Memory and Written Production


2.1 Motivation

The results of Experiment 1 suggest that people can learn to associate pronouns with a person, but that accuracy for they/them remains lower than for he/him and she/her. Although remembering which characters used they/them was a strong predictor of producing singular they, accuracy in the sentence completion task was significantly lower than in the multiple-choice memory task. Experiment 2 investigated what kinds of exposure can support accurately remembering and producing singular they. The first factor tested is conceptual knowledge about singular they and about discussing gendered language preferences. Recent results show that participants are more likely to interpret they as singular, as intended, rather than plural after being told explicitly that the character uses they/them pronouns (Arnold et al., 2021; see Section 0.4.4). This is consistent with prior experiments about the generic masculine: When a course instructor explained why they would be using generic she instead of generic he (Adamsky, 1981), and when alternatives were taught as options to students (Flanagan & Todd-Mancillas, 1982), students were less likely to use generic he in their assignments and more likely to use gender-neutral alternatives or generic she. Similarly, in German (where nouns are gender-marked), reading brief arguments in favor of gender-neutral language increased participants’ use of gender-neutral generic nouns (Koeser & Sczesny, 2014).

The second factor tested is exposure. As singular they becomes more common and accepted (Balhorn, 2004; Camilliere et al., 2021; Hekanaho, 2020; Minkin & Brown, 2021; Parker et al., 2019), speakers are increasingly likely to be exposed to it via media and social circles, and many of these instances do not come prefaced with a discussion about pronouns or gender identity. Potentially comparable results from studies about non-sexist language reforms are mixed: students who saw alternatives to generic masculine forms modeled in task instructions increased their use of non-sexist forms, but did not decrease their use of generic masculine forms (Cronin & Jreisat, 1995). In German, women were more likely to use alternatives to generic masculine role nouns after reading a text modeling them, but men did not change their language use until the instructions drew their attention to the gendered language used (Koeser et al., 2015).

2.2 Methods

The design and analysis plan were preregistered on the Open Science Framework. Materials, de-identified data, and analysis code are available at this dissertation’s GitHub repository.

2.2.1 Participants

A total of 427 responses were collected on Amazon Mechanical Turk; the task took approximately 20 minutes to complete. Participants were required to be in the U.S. and comfortable reading and writing in English. Of these, 107 participants were excluded for nonsensical responses in the sentence completion task, leaving 320 participants in the final data set. As in Experiment 1, participants were asked about their age (M = 38.79, SD = 11.54), gender (194 male, 125 female, 1 did not provide), and English experience (304 “native (learned from birth)”, 14 “fully competent, but not native”, 1 “limited but adequate competence”, 1 “some familiarity”) in order to characterize the sample.

2.2.2 Materials & Procedure

Participants read one of two 500-word PSAs. The pronoun PSA was modified from a GLSEN resource and discussed talking about gendered language preferences, using singular they, and responding after misgendering someone (GLSEN, 2020). The neutral PSA was modified from a Humane Society resource and discussed the importance of spaying/neutering cats and dogs (Humane Society, 2020). Participants also read 2 fictional biographies, each of which made repeated third-person reference to a single character, in order to model pronoun use without explicitly commenting on it. The character in the first biography had a feminine name and was referred to with they/them or she/her pronouns (4 subject, 1 object, 4 possessive). The character in the second biography had a masculine name and was referred to with they/them or he/him pronouns (7 subject, 7 possessive). The other materials were identical to Experiment 1.

Participants read 1 PSA and 1 pair of biographies (2 they/them characters, or 1 he/him character and 1 she/her character). These were crossed to create 4 between-participants conditions [PSA: Gendered Language vs Unrelated; Biographies: They vs He/She]. Participants then completed the same pronoun memory and production tasks as in Experiment 1. Participants were randomly assigned to 1 of 3 lists within each condition, counterbalancing the name-pronoun combinations. The experiment was coded and hosted using PCIbex (Zehr & Schwartz, 2018).
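The crossing described above can be sketched in a few lines of Python. This is only an illustration of the design's structure: the labels and the `assign` helper are hypothetical, and the sketch does not reproduce how PCIbex actually handled assignment in the experiment.

```python
import itertools
import random

# Factor levels as named in the text (labels are illustrative).
PSA_LEVELS = ["Gendered Language", "Unrelated"]
BIOGRAPHY_LEVELS = ["They", "He/She"]
LISTS = [1, 2, 3]  # counterbalancing lists for name-pronoun combinations

# 2 x 2 crossing gives the 4 between-participants conditions.
CONDITIONS = list(itertools.product(PSA_LEVELS, BIOGRAPHY_LEVELS))

# Each condition contains 3 lists, for 12 assignment cells in total.
CELLS = [(psa, bio, lst) for (psa, bio) in CONDITIONS for lst in LISTS]


def assign(rng):
    """Hypothetical helper: draw one (PSA, Biography, list) cell at random."""
    return rng.choice(CELLS)


if __name__ == "__main__":
    rng = random.Random(2020)  # arbitrary seed, for reproducibility only
    print(len(CONDITIONS), "conditions;", len(CELLS), "cells")
    print("example assignment:", assign(rng))
```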

2.3 Predictions

The PSA contains information about why paying attention to gendered language matters, mentions singular they as an option and shows examples of its usage, and provides scripts for talking about gendered language preferences (GLSEN, 2020). The PSA thus addresses conceptual knowledge about singular they and misgendering, and was chosen to be similar to Diversity, Equity, and Inclusion materials that people may see in their schools or workplaces. If learning or being reminded of this information affects language use, we predict that participants who read the gendered language PSA will be more accurate at remembering and producing they/them, compared to participants who read the unrelated PSA.

As singular they becomes more common and accepted (Balhorn, 2004; Camilliere et al., 2021; Hekanaho, 2020; Minkin & Brown, 2021; Parker et al., 2019), speakers are increasingly likely to be exposed to it via media and social circles, and many of these instances do not come prefaced with a discussion about gendered language or gender identity. As such, the biographies model the use of singular they, but do not explicitly call attention to it. The biography genre allows for repeated reference to one individual, giving participants multiple examples and making it more straightforward to interpret they as singular and not plural. If seeing singular they modeled supports learning, we predict that participants who read the stories about characters referred to with they/them pronouns will be more accurate at remembering and producing singular they, compared to participants who read the stories about characters referred to with he/him and she/her pronouns.

2.4 Results

Three logistic mixed-effects models were fit. The first two tested the effects of Pronoun, PSA, and Biography on memory accuracy (Table 2.1) and production accuracy (Table 2.2), and the third related the two measures (Table A.12). The fixed effects of PSA and Biography were mean-centered and effects coded; all other model specifications followed Experiment 1 (Baayen et al., 2008; Bates et al., 2015; R Core Team, 2023; Voeten, 2023). For all three models, the most complex random effects structure that converged included only by-item intercepts, and no by-participant effects.
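To make the coding scheme concrete, the following Python sketch maps one trial onto the numeric predictors, using the contrast values printed in the predictor labels of Tables 2.1 and 2.2. The function and key names are hypothetical, the values are taken as printed (they may be rounded), and the actual models were fit in R, so this is an illustration rather than the analysis code.

```python
# Contrast values as printed in the predictor labels of Tables 2.1 and 2.2.
# Each pronoun maps to (They vs He + She, He vs She).
PRONOUN_CONTRASTS = {
    "they": (-0.66, 0.0),
    "he": (0.33, -0.5),
    "she": (0.33, 0.5),
}
PSA_CODES = {"Unrelated": -0.5, "Gendered Language": 0.5}
BIOGRAPHY_CODES = {"He/She": -0.5, "They": 0.5}


def code_trial(pronoun, psa, biography):
    """Hypothetical helper: numeric predictors for a single trial."""
    they_vs_heshe, he_vs_she = PRONOUN_CONTRASTS[pronoun]
    p = PSA_CODES[psa]
    b = BIOGRAPHY_CODES[biography]
    return {
        "they_vs_heshe": they_vs_heshe,
        "he_vs_she": he_vs_she,
        "psa": p,
        "biography": b,
        # Interaction terms are products of the coded main effects.
        "they_vs_heshe:psa": they_vs_heshe * p,
        "they_vs_heshe:biography": they_vs_heshe * b,
        "psa:biography": p * b,
        "they_vs_heshe:psa:biography": they_vs_heshe * p * b,
    }


if __name__ == "__main__":
    # A they/them character, seen after the gendered language PSA
    # and the they/them biographies.
    print(code_trial("they", "Gendered Language", "They"))
```

Because each contrast sums to zero across its levels, the intercept and lower-order terms in such a model are evaluated at the average of the other factors rather than at a reference level.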

2.4.1 Memory

In the multiple-choice memory task (Table 2.1), participants responded more accurately than not across pronouns and training conditions (β = 1.00, z = 21.47, p < .001). He/him and she/her (M = 0.77–0.82 between PSA and Biography conditions) were remembered more accurately than they/them (M = 0.51–0.59) (β = 1.22, z = 16.02, p < .001). There was no difference in accuracy between he/him and she/her, and they/them was misremembered as he/him and she/her at similar rates (Figure 2.1A). The lower accuracy of they/them compared to she/her and he/him (Figure 2.1B) was attenuated when participants read the gendered language PSA compared to the unrelated PSA (β = -0.36, z = -2.38, p < .05).

Table 2.1: Experiment 2: Model results for the effects of Pronoun, PSA, and Biography on Memory Accuracy.

Predictors                                            Log-Odds  SE      z        p
(Intercept)                                           0.996     0.046   21.467   <0.001
Pronoun: They (-.66) vs He (+.33) + She (+.33)        1.218     0.076   16.023   <0.001
Pronoun: He (-.5) vs She (+.5)                        0.140     0.111   1.265    0.206
Biography: He/She (-.5) vs Biography: They (+.5)      -0.176    0.076   -2.304   0.021
PSA: Unrelated (-.5) vs PSA: Gendered Language (+.5)  0.022     0.076   0.289    0.772
Pronoun (They vs He + She) * PSA                      -0.362    0.152   -2.381   0.017
Pronoun (He vs She) * PSA                             0.121     0.199   0.609    0.543
Pronoun (They vs He + She) * Biography                -0.161    0.152   -1.060   0.289
Pronoun (He vs She) * Biography                       0.049     0.199   0.245    0.807
PSA * Biography                                       -0.073    0.153   -0.481   0.630
Pronoun (They vs He + She) * PSA * Biography          -0.131    0.304   -0.431   0.666
Pronoun (He vs She) * PSA * Biography                 -0.088    0.398   -0.220   0.826

Random Effects
τ00 Name                                              0.008
N Name                                                12
Observations                                          3840
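
Because the estimates in Table 2.1 are on the log-odds scale, it can help to translate a few of them into probabilities and odds ratios. The Python sketch below does this for the intercept and the They vs He + She contrast. It assumes that the intercept can be read as roughly the grand-mean accuracy (given the centered coding, and ignoring the by-item random intercepts) and that one unit of the contrast approximates the gap between they/them and he/him + she/her; both are simplifications, and the variable names are illustrative.

```python
import math


def inv_logit(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Estimates copied from Table 2.1 (memory accuracy model).
INTERCEPT = 0.996
THEY_VS_HE_SHE = 1.218

# Approximate grand-mean probability of a correct memory response.
grand_mean_accuracy = inv_logit(INTERCEPT)

# Approximate odds ratio for he/him + she/her relative to they/them.
odds_ratio = math.exp(THEY_VS_HE_SHE)

if __name__ == "__main__":
    print(f"grand-mean accuracy: about {grand_mean_accuracy:.2f}")
    print(f"odds ratio (he/she vs they): about {odds_ratio:.2f}")
```

Under these assumptions the intercept corresponds to an accuracy of about 0.73, which falls between the condition means reported for they/them and for he/him + she/her.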

Figure 2.1: Experiment 2: [A] Pronoun accuracy in the multiple-choice memory task, split by PSA and Biography conditions. By-participant means are shown as points; error bars indicate 95% CIs calculated over the by-participant means. [B] Means and 95% CIs of memory accuracy for he/him + she/her characters and they/them characters, comparing PSA and Biography conditions. The distribution of responses is shown in the appendix (Figure A.7).

Participants also learned that each character had 1 of 3 pets, an attribute designed to have the same distributional characteristics as the 3 pronouns but to be less marked. As in Experiment 1, there was no significant difference between accuracy for they/them characters’ pets (M = 0.54) and pronouns (β = 0.05, z = 0.48, p = .63). Accuracy for the 12 possible jobs was relatively high (M = 0.37), confirming that the experiment was not too difficult for participants. Job and pet accuracy are discussed in more detail in the appendix (Section A.2.1).

2.4.2 Production

Responses were coded by whether the sentence continuation used he/him, she/her, they/them, or no pronouns to refer to the character (Figure 2.2). Responses that did not include a pronoun were 4% of the data and are included in the analysis as incorrect responses (Table 2.2). Across all conditions, participants produced the correct pronoun more often than not (β = 0.70, z = 13.47, p < .001). He/him and she/her (M = 0.79–0.90 between PSA and Biography conditions) were produced more accurately than they/them (M = 0.11–0.33) (β = 3.16, z = 32.95, p < .001). He/him was produced somewhat more accurately than she/her (β = -0.26, z = -2.11, p < .05). The relative difficulty of they/them was attenuated with the gendered language PSA (β = -1.91, z = -10.00, p < .001), and there was a significant interaction between PSA and Biography (β = 0.43, z = 2.34, p < .05). These effects were qualified by a three-way interaction between Pronoun, PSA, and Biography (β = 0.88, z = 2.31, p < .05). A follow-up analysis probing this interaction found that the gendered language PSA reduced the relative difficulty of they/them more when paired with the biographies that used he/him and she/her (β = -2.35, z = -8.50, p < .001) than when paired with the biographies that used they/them (β = -1.47, z = -5.57, p < .001). However, examining the means for the two conditions with the gendered language PSA (red in Figure 2.2B) indicates that the difference in relative accuracy for they/them compared to he/him + she/her is due to Biography affecting accuracy for he/him + she/her characters, but not accuracy for they/them characters. Finally, an exploratory analysis measured the proportion of participants who produced singular they at all, regardless of accuracy (Table A.13). Participants who read the gendered language PSA were more likely to produce singular they at least once (β = 0.93, z = 3.95, p < .001), with proportions rising from 26% and 35% in conditions that read the unrelated PSA to 48% and 57% in conditions that read the gendered language PSA.

Table 2.2: Experiment 2: Model results for the effects of Pronoun, PSA, and Biography on Production Accuracy.

Predictors                                            Log-Odds  SE      z        p
(Intercept)                                           0.695     0.052   13.469   <0.001
Pronoun: They (-.66) vs He (+.33) + She (+.33)        3.156     0.096   32.951   <0.001
Pronoun: He (-.5) vs She (+.5)                        -0.261    0.123   -2.113   0.035
PSA: Unrelated (-.5) vs PSA: Gendered Language (+.5)  0.107     0.091   1.178    0.239
Biography: He/She (-.5) vs Biography: They (+.5)      -0.059    0.091   -0.653   0.514
Pronoun (They vs He + She) * PSA                      -1.907    0.191   -9.999   <0.001
Pronoun (He vs She) * PSA                             0.266     0.227   1.170    0.242
PSA * Biography                                       0.426     0.182   2.345    0.019
Pronoun (They vs He + She) * Biography                -0.160    0.191   -0.840   0.401
Pronoun (He vs She) * Biography                       0.085     0.227   0.372    0.710
Pronoun (They vs He + She) * PSA * Biography          0.882     0.381   2.313    0.021
Pronoun (He vs She) * PSA * Biography                 0.134     0.454   0.296    0.768

Random Effects
τ00 Name                                              0.007
N Name                                                12
Observations                                          3840
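
The follow-up estimates reported in the text for the three-way interaction can be related to the coefficients in Table 2.2. With Biography coded as -.5 and +.5, the Pronoun (They vs He + She) * PSA effect within each Biography level is approximately the two-way coefficient plus the Biography code times the three-way coefficient. The Python sketch below carries out that arithmetic as a consistency check; it is not necessarily the procedure used for the follow-up analysis itself, and the function name is hypothetical.

```python
# Coefficients copied from Table 2.2 (production accuracy model).
PRONOUN_X_PSA = -1.907      # Pronoun (They vs He + She) * PSA
PRONOUN_X_PSA_X_BIO = 0.882  # Pronoun (They vs He + She) * PSA * Biography

# Biography codes as given in the table's predictor labels.
BIOGRAPHY_CODES = {"He/She": -0.5, "They": 0.5}


def pronoun_by_psa_at(biography):
    """Pronoun * PSA effect (log-odds) within one Biography level."""
    return PRONOUN_X_PSA + BIOGRAPHY_CODES[biography] * PRONOUN_X_PSA_X_BIO


if __name__ == "__main__":
    for level in BIOGRAPHY_CODES:
        print(level, round(pronoun_by_psa_at(level), 2))
    # prints:
    # He/She -2.35
    # They -1.47
```

Both values match the follow-up estimates given in the text (β = -2.35 with the he/him and she/her biographies, β = -1.47 with the they/them biographies).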

Figure 2.2: Experiment 2: [A] Pronoun accuracy in the written sentence completion task, split by PSA and Biography conditions. By-participant means are shown as points; error bars indicate 95% CIs calculated over the by-participant means. [B] Mean production accuracy for he/him + she/her characters and they/them characters, split by PSA and Biography conditions. [C] Number of times each participant produced singular they, split by PSA and Biography conditions. The distribution of all pronoun responses is shown in the appendix (Figure A.7).

2.4.3 Memory Predicting Production

The third model tested the effects of memory accuracy, pronoun, PSA, and Biography on production accuracy (Table A.12). In addition to the effects described above, participants were more likely to accurately use a character’s pronouns in the sentence completion task if they had remembered that character’s pronouns in the multiple-choice task (β = 0.83, z = 8.00, p < .001). No other interactions with memory accuracy were significant. Examining the combined distribution of responses, it was again more common to remember but not produce they/them than to produce but not remember they/them (Figure 2.3).

Figure 2.3: Experiment 2: [A] Production accuracy, split by memory accuracy in the prior task, then by PSA and Biography conditions. The lighter colors indicate trials where memory had been incorrect, and the darker colors indicate trials where memory had been correct. Error bars indicate 95% CIs calculated over trials. [B] Distribution of combined memory and production accuracy, split by PSA and Biography conditions.

2.5 Discussion

In Experiment 2, participants read a PSA about either gendered language or an unrelated topic, then two fictional biographies in which either both characters used they/them or one character used he/him and the other used she/her. Participants then completed the same character learning, memory, and production tasks as in Experiment 1. Reading the PSA about gendered language (which explained why people are talking more about their preferences for gendered language, how they/them pronouns work, and how to respond if someone corrects you) increased how likely participants were to produce singular they at least once and improved their accuracy when doing so. Seeing singular they modeled in the biographies did not directly affect memory or production, but did interact with the PSA. This demonstrates that while learning singular they may be difficult, it is not impossible, and even brief interventions can support this learning.

Compared to Experiment 1, which included undergraduates participating for course credit, Amazon MTurk participants vary more—particularly in terms of age, race, education, and socioeconomic status—but are still not fully representative of English speakers in the U.S. context (Arechar & Rand, 2021; Levay et al., 2016). MTurk participants lean more liberal than U.S. adults, and are more likely to agree that trans people are discriminated against and to support marriage equality and anti-discrimination laws for gay people (Chandler et al., 2019; Levay et al., 2016). Most, but not all, participants in both experiments reported being native English speakers; while all participants in Experiment 1 were physically located in the U.S., in Experiment 2 the web-based restrictions that limited participation to U.S.-based individuals may not have been foolproof.

However, overall performance was broadly similar across the two studies despite the sampling differences. While participants in the Unrelated PSA + He/She Biographies condition (the condition in Experiment 2 most similar to Experiment 1) were less likely to correctly produce they than participants in Experiment 1 (M1A = 0.29, M1B = 0.39, M2 = 0.11), this is unlikely to be due to overall lower accuracy or attention to the task. Looking at the memory questions unrelated to pronouns, participants in Experiment 2 were numerically more accurate than participants in Experiment 1 for both the characters’ jobs (M1A = 0.21, M1B = 0.29, M2 = 0.37) and pets (M1A = 0.41, M1B = 0.43, M2 = 0.54). While participants in Experiment 2 were less likely to have experience with singular they than participants in Experiment 1, the PSA manipulation was intended to provide participants with some of the social context around gendered language that they may have been less familiar with. In sum, these findings show that providing people with brief information about how they/them pronouns work, why people use them, and why people choose to talk directly about the gendered language they prefer supported people’s use of singular they. Whether this effect varies depending on participants’ prior knowledge and experience is an area for future research.

The finding that reading a brief PSA increased both the overall usage and accuracy of singular they is promising, given that a PSA is an easily implemented and not time-intensive tool. Nevertheless, in order to be useful in applied contexts, future research will need to investigate whether the effects of the gendered language PSA and other learning interventions persist past the duration of an experiment. Future work should also investigate whether the effects on written production extend to spoken production.