Abstract
There is a need for more publicly available corpora of legal language. To help fill this gap, we have developed the Corpus of U.S. State Statutes, or CorUSSS, a new corpus comprising the statutory code from all 50 U.S. states. In total the corpus contains 1,785,742 texts, each of which represents the statutory text associated with a unique Universal Citation in one of the 50 U.S. states’ codes. This corpus provides us with the ability to explore language use in statutes within or across all 50 states. After motivating the need for this corpus, we describe its design and the methods we used to collect, clean and store the texts. We then report on a case study that illustrates the utility of this corpus for addressing important questions in statutory interpretation by investigating whether the word "information" can be used to refer to statements that are non-factual. We conclude with a call for researchers in law and corpus linguistics to rely on both legal and ordinary language when investigating questions of interpretation.
Original language | English (US) |
---|---|
Article number | 100047 |
Journal | Applied Corpus Linguistics |
Volume | 3 |
Issue number | 2 |
State | Published - Aug 2023 |
Keywords
- Legal corpora
- Legal language
- Statutes
- Statutory interpretation
- Textualism
ASJC Scopus subject areas
- Linguistics and Language
- Social Sciences (miscellaneous)