May 13, 2020

A New Textualist Argument in the Title VII Cases, and the Risks of Corpus Linguistics

Ryan Nees, Stanford Law School, Class of 2021

The Title VII cases pending before the Supreme Court are a notable test for conservative proponents of textualism, who find themselves uncomfortably confronted in Zarda and Harris Funeral Homes with strong arguments from the interpretive school most closely associated with Justice Antonin Scalia. That debate has been well-documented, with some conservatives already speculating in the Wall Street Journal and on Twitter that they may have lost it.

Still, there are two more months before the Supreme Court is likely to rule, and thus more shadow briefing to be done. So last week James C. Phillips sought to recast the textual evidence with a subtly modified claim relying on corpus linguistics. For a textualist like Justice Gorsuch, who may be attracted to petitioners’ analysis but concerned about the outcome that would result, Phillips’s intervention could provide important new textual evidence.

Phillips’s intriguing argument is worth considering carefully, and is an instructive application of the larger project urging judges to make greater use of corpus linguistics to derive contemporaneous meaning. He argues, in summary, that the petitioners’ textualist reading depends upon “separately analyzing and then amalgamating . . . three parts” – namely, the words “discriminate,” “against,” and “sex.” Analyzed in that way, the LGBTQ petitioners may have a point, and Justice Gorsuch seemed to acknowledge as much at oral argument. The better approach, Phillips says, would be to assess the distinct meaning of the phrase “discriminate against,” which has its own highly specific connotation entailing prejudice as a motivation, especially when the phrase is paired with a suspect class.

Corpus linguistics could be a useful resource to identify idiomatic meaning of this sort. But the surprising consequence is that, the more idiomatic the meaning uncovered, the more purposivist the reasoning starts to appear. And for reasons I explain, Phillips’s analysis is unpersuasive on its own terms, relying on a limited linguistic corpus while overlooking any possible legalistic meaning that prevailed at the time. It seeks to create a new term of art even as textualists have traditionally disfavored doing so. As the corpus-linguistic method rises in popularity as a tool of statutory interpretation, the Title VII case study demonstrates its risks.

A. The Standard Textualist Claim in the Title VII Cases

From early in the Title VII litigation, the gist of the textualist question has been: does the plain text of the statute, which forbids discrimination “because of . . . sex,” also encompass sexual orientation and gender identity? At oral argument in Zarda, Pam Karlan began with a simple comparison: “When a[n] employer fires a male employee for dating men but does not fire female employees who date men, he violates Title VII.” The standard response to that claim has been that it gets the analysis wrong: it fails to isolate sex as the reason for the firing because, when a male employee who dates men is substituted for a hypothetical woman who dates men, what changes is not only the gender of the employee, but also their sexual orientation. The man is fired not because he’s a man, but because he’s gay; and if he were instead a woman who dated men, that woman would necessarily be straight. The better comparison, it is said, is between a gay man and a lesbian woman. Only firing that man, and not the woman, would qualify under Title VII as sex discrimination.

At oral argument, Justice Kagan said any such possible imprecision didn’t matter, because the Court need not isolate gender as the sole basis for an employer’s action. Karlan’s comparator sufficed to show that, by varying gender, sex was at least a “but for” cause of the discrimination, and the statute required nothing more. Remarkably, Justice Gorsuch suggested he might agree (see page 44 of the transcript). Even if the plaintiffs’ argument did not “isolate the sole or proximate cause,” he said, and even if sexual orientation necessarily varied with sex, that only showed that “perhaps there are two causal factors at work here . . . [and] isn’t one of them sex?” If the answer is yes, the case would appear to end there for any textualist who ordinarily would resolve ambiguity with reference to a statute’s overall purpose only after the text has been judged ambiguous (if at all).

B. The “Principle of Compositionality”

Phillips’s textual analysis differs from that which has so far preoccupied commentators and the Court because he argues that “discriminate against,” especially when paired with a particular suspect class, functions like an idiom with its own distinct meaning, a “linguistic unit . . . by the time of Title VII’s enactment” in 1964. That unit, Phillips argues, “refers only to adverse treatment that rests on prejudice or bias . . . directed at some or all men in particular, or at some or all women in particular.” Because the litigants seem to agree that Congress didn’t specifically intend to prohibit discrimination on the basis of sexual orientation or gender identity, Phillips’s argument thereby effectively incorporates purpose into the very meaning of the words of the statute, rather than leaving it for consideration only after textual ambiguity is identified. Borrowing from linguistics, he associates his approach with the “principle of compositionality”: when a phrase operates as a composite unit with collective meaning that cannot be understood through the analysis of each of its parts. Having defined “discriminate against” as textually referring only to prejudice against members of a group, and noting that all such groups were identified in the statute, Phillips concludes that the text cannot be read literally to reach discrimination against LGBTQ people (who were, after all, not enumerated in the statute). If Congress had intended a literalist reading, it could have used general words rather than this particular idiomatic phrase.

1. A Questionable Specialized Meaning

To prove the prejudice-oriented meaning he claims, Phillips gathers evidence from corpus linguistics, and supports it with contemporaneous dictionaries. He argues that the Corpus of Historical American English (COHA) at the time of the statute’s passage reflects a common and particular usage of the phrase “discriminate against”; that the phrase’s collocates (the words that most often appear before or after the phrase) in that corpus suggest a special meaning limited to prejudice against members of particular suspect classes; that the phrase’s binomials (especially common pairs of words) are associated with such prejudice; and that dictionaries at the time are generally consistent with such a reading.
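For readers unfamiliar with the method, collocate counting of the kind described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `right_collocates`, the four-word window, and the toy sentences are all hypothetical, invented here for demonstration. They are not drawn from COHA and do not reproduce Phillips's procedure, and the sketch counts only words that follow the phrase, not those that precede it.

```python
import re
from collections import Counter

# Matches the four forms of the phrase discussed in the text.
PHRASE = re.compile(r"\bdiscriminat(?:e|es|ed|ing) against\b", re.IGNORECASE)

def right_collocates(sentences, window=4):
    """Count words appearing within `window` words after the phrase.

    Hypothetical helper for illustration only; real corpus tools also
    count preceding words and apply statistical weighting.
    """
    counts = Counter()
    for sentence in sentences:
        for match in PHRASE.finditer(sentence):
            following = re.findall(r"[a-z']+", sentence[match.end():].lower())
            counts.update(following[:window])
    return counts

# Toy sentences invented for this sketch (NOT actual corpus data).
toy_sentences = [
    "The tariff would discriminate against imported products.",
    "The state law discriminated against interstate commerce.",
    "The employer was discriminating against union members.",
    "The rule discriminates against interstate carriers.",
]

counts = right_collocates(toy_sentences)
print(counts.most_common(3))
```

A tally like this shows which words tend to follow the phrase, but it cannot show what the phrase means in any given sentence, which is why reading the underlying sentences matters.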

But Phillips’s analysis, if conceptually appealing, goes awry. First, he begins by observing that “discriminate” and “against” are frequently paired together in COHA, but he finds only 125 instances of the phrases “discriminating against,” “discriminated against,” “discriminate against,” or “discriminates against” in COHA’s 48 million-word corpus of books, magazines, and newspapers covering the 1950s and 1960s. Even if one can rely on the unusual frequency with which “discriminate” and “against” appear together (most of the time), prejudicial collocates following the phrase collectively appear only about half the time. Reading the sentences themselves,[1] rather than relying on collocate statistics, shows that “discriminate against” (and its permutations) was used just as often to refer to non-prejudicial subjects like one country discriminating against another’s products, or a state discriminating against interstate commerce, or Americans coming to discriminate in their diets against “beef having yellowish fat.” The phrase “discriminate against,” in other words, is (at best) just as likely to embrace either of the two ordinary meanings of the word “discriminate” alone: either to distinguish, or to subordinate.

Second, in concluding that the phrase “discriminate against” (and its permutations) had a subordination-oriented meaning in 1964, Phillips overlooks the possibility that the phrase’s usage in the legal context may have differed from popular usage. Though Phillips’s analysis only relies on COHA, another American English corpus is available from the same time period: one that collects all issued U.S. Supreme Court opinions. Though it is a quarter the size – comprising some 12 million words, from more than 4,000 opinions issued in the 1950s and 1960s – it nonetheless contains 392 references to the forms of the phrase “discriminate against” that Phillips analyzed in COHA (i.e., more than three times as many). It turns out that the word “discriminate” (and its permutations) is even more likely to precede “against” in the legal corpus (about 70% of the time) than in the popular language corpus (about 50% of the time).
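Because the two corpora differ in size, the raw counts understate the gap. A minimal sketch of the normalization, using only the approximate figures given above (125 hits in roughly 48 million words; 392 hits in roughly 12 million words), follows. The function name `per_million` is a hypothetical helper, and the results are only as precise as the rounded corpus sizes.

```python
def per_million(hits, corpus_words):
    """Occurrences per million words of corpus text (hypothetical helper)."""
    return hits / corpus_words * 1_000_000

# Approximate figures as stated in the text above.
coha = per_million(125, 48_000_000)    # popular-language corpus
scotus = per_million(392, 12_000_000)  # Supreme Court opinions corpus

print(f"COHA: {coha:.1f} per million words")
print(f"Supreme Court corpus: {scotus:.1f} per million words")
print(f"Ratio: {scotus / coha:.1f}x")
```

On these figures the phrase appears roughly a dozen times more often per word in the Supreme Court corpus than in COHA, even though the raw count is only about three times larger.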

In this legal context, the collocation-based connections to particular types of prejudiced motivations become even less compelling. Of the nearly 400 usages of the phrase “discriminate against” (and its permutations) in Supreme Court opinions during the 1950s and 1960s, the vast majority (clicking “context” after the searches[2]) refer not to prejudice against suspect classes, but to issues like employers discriminating against unionized employees in labor disputes, states and companies discriminating against other states’ and companies’ trade, and one governmental unit discriminating against another. Fewer than a quarter of the collocates used a variation of the phrase to refer to prejudicial motivation; larger categories referred to discriminatory trade restrictions and labor law issues. It is true, of course, that the opinions the Supreme Court issues reflect only the kinds of cases for which it grants cert. But the overwhelming skew in the data suffices to at least show that the specialized meaning of the phrase that Phillips claims existed at the time was even less true for legal writers than for popular ones. And it’s perhaps more likely that Congressional statute drafters wrote more like Supreme Court Justices than like authors of books and magazines.

Phillips also writes that textualist evidence can be derived from binomial pairs, like “cease and desist” or “aid and abet,” in which one word often appears with another. He writes that “the most common form of the binomial prejudice and [WORD] was prejudice and discrimination, appearing twice as often as any other word following the phrase prejudice and.” But that, of course, tells us only about words following “prejudice,” and little about what usually follows (or precedes) the only word that matters, the one in the statute: “discrimination.” After all, the most common noun that precedes “and discrimination” in the corpus as a whole (reaching beyond the 1950s and 1960s for illustration) is “taste” – but the fact that “discrimination” often follows “taste and” doesn’t tell us much that’s meaningful about “discrimination.” It tells us mostly about “taste.”

Finally, Phillips cites dictionaries to indicate that the phrase “discriminate against” was also defined in relation to prejudice. It is true that this is one of two traditional definitions. But all of the dictionaries Phillips cites also include a distinction-oriented definition, and it’s not clear why we should opt for the prejudice-oriented definition when the distinction-oriented one was used at least as commonly in COHA and much more commonly in the corpus of Supreme Court decisions. At least two dictionaries published during the 1950s and 1960s – the Oxford Illustrated Dictionary and the Chambers’s Twentieth Century Dictionary – don’t have prejudice-oriented definitions at all.

2. The Risks in Specialized Meanings

The work Phillips’s evidence claims to do is to make the word “discriminate” mean something narrower than a mere “causal link” to an adverse employment action. Even if sex is, in a kind of word game, one of many but-for causes of discrimination against a gay or transgender person, it is not the central, motivating, and prejudicial reason. One could quibble over what this really achieves: the discrimination alleged against the plaintiffs was motivated by prejudice too, and it’s not clear that prejudice on the basis of sexual orientation and prejudice on the basis of gender identity are not both species of discrimination on the basis of gender stereotypes, in the same way that sexual harassment of masculine women and effeminate men has been held to be (as in Price Waterhouse and Oncale; a note in the Yale Law Journal makes this point well).

But even if the evidence better supported the linguistic claim, there would be other textual problems with Phillips’s approach as well. First, words are generally to be given their ordinary meanings. This is true with limited exceptions, as when there is a clearly borrowed common law meaning, or when the statute itself has defined a term – and even when Congress has defined a word or phrase, “It should take the strongest evidence to make us believe that Congress has defined a term in a manner repugnant to its ordinary and traditional sense” (Babbitt, Scalia, J., dissenting). Neither exception applies to the phrase “discriminate against.”

Second, whatever textual value corpus linguistics has, the Title VII example shows how equivocal the evidence can be. Some semantic tools are more reliable linguistic short-hands than others. It wouldn’t make sense to elevate weak corpus linguistics evidence over comparatively strong structural evidence.

Here, that structural evidence is unusually strong, because the statute exempts employment practices that formally, but not functionally, discriminate when they constitute “bona fide occupational qualifications.” If a prejudicial motive were required to violate Title VII, it would be bizarre to create a prominent scheme of exceptions to a public policy combating subordination. The statute expressly contemplated, and created exceptions for, otherwise facially discriminatory employment actions not motivated by prejudice, like mandatory retirement ages for pilots or denominational requirements for pastors. (And as a matter of precedent, the Supreme Court has already ruled, in Automobile Workers v. Johnson Controls, that “[t]he beneficence of an employer’s purpose does not undermine the conclusion that an explicit gender-based policy is sex discrimination under § 703(a).”)

Phillips’s argument demonstrates how frustrating it can be for some textualists to deny themselves purposivist evidence: it tempts them to find particular forms of intent in words themselves, because intent generally can’t be considered at the level of an overall statutory scheme. At the very least, if textualist judges are to accept corpus linguistics evidence of meaning, which Phillips has elsewhere urged should be prominently integrated into originalist methodology and textualist statutory interpretation, they should demand clearer and better evidence than that which exists in the Title VII cases.

[1] These links don’t work in all browsers and could not be captured. To re-create the results, search “discriminate* against” and “discriminating against” in List for the 1950 and 1960 sections of COHA and click “context.”

[2] Again, these links don’t work in all browsers and could not be captured. To re-create the results, search “discriminate* against” and “discriminating against” in List for the 1950 and 1960 sections of the Supreme Court corpus and click “context.”
