A Taxonomy of Federal Litigation
For the last two years, Christy Boyd and I, along with some friends, have been working on a paper on how attorneys construct complaints. The project began when we were coding some other detritus of federal litigation and decided to collect the causes of action in complaints, so that we could understand the legal issues in our cases better than Nature of Suit (NOS) codes alone permitted. Soon enough, we got to thinking that our causes of action were pled in distinctively patterned ways. Obviously, this isn’t an earth-shaking insight: most first-year law students have thought, at one time or another, that each of their classes’ exam fact patterns could easily substitute for any other. That is, causes of action are alternative, mutually complementary theories that channel a limited number of fact patterns into claims to legal relief. Everyone knows that contract and tort claims are pled together, and that constitutional claims come accompanied by state law torts. But we thought it’d be worthwhile to nail down this insight using an analysis very similar to the one that enables Amazon to tell you which books you might like: if you plead a particular cause of action, what other causes of action are you likely to bring in the same case?
We gathered a set of 2,500 complaints (from a much larger sample of federal complaints derived through RECAP). The complaints were sampled to be fairly representative of all federal litigation, excluding pro se, social security, and prisoner petition cases. The sample contained 11,500 individual causes of action – around 4.6 causes of action per case. Guided by co-authors at Temple’s Center for Data Analytics, we used spectral clustering to examine the relationship between causes of action. Two years later and presto, a (draft) paper is up on SSRN! The ungainly title is Building a Taxonomy of Litigation: Clusters of Causes of Action in Federal Complaints. I welcome your comments, and your suggestions for a better title. Follow me after the jump for an exploration of our findings.
The figure below lays out a basic descriptive picture of the types of causes of action in our data. As you can see, almost one in three causes of action in federal court sounds in tort. Contract claims are the second most common legal theory advanced. (For more on the details of coding, including a discussion of the troublesome “bare claims for relief,” you’ll have to read the paper.)
Another cool descriptive question we can ask concerns the pairing of causes of action with one another. The Figure below depicts the most common pairs of causes of action within the data. For example, if a case had three causes of action, A/B/C, we identified three pairs: (1) cause of action A with cause of action B; (2) A with C; and (3) B with C. The Figure depicts those pairings for all the causes of action in the data (gray bars) as well as for cases with 10 or fewer causes of action.
This shows that the most common pairing is tort claim paired with tort claim. Nearly as common is tort claim paired with contract claim.
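For readers who want to see the pairing step concretely, here is a minimal sketch of how the A/B/C enumeration described above could be counted. It is not the code behind the paper: the function name `count_pairs`, the `max_causes` parameter, and the toy cases are all hypothetical, and each case is assumed to be a simple list of cause-of-action category labels.

```python
from collections import Counter
from itertools import combinations


def count_pairs(cases, max_causes=None):
    """Count unordered pairs of cause-of-action categories across cases.

    cases: list of lists of category labels, one inner list per complaint
           (an assumed input format, not the paper's actual data layout).
    max_causes: if given, skip cases with more than this many causes.
    """
    counts = Counter()
    for causes in cases:
        if max_causes is not None and len(causes) > max_causes:
            continue
        # Every pair of causes within the same case, e.g. A/B/C -> AB, AC, BC.
        for a, b in combinations(causes, 2):
            # Sort so that (tort, contract) and (contract, tort) share one key.
            counts[tuple(sorted((a, b)))] += 1
    return counts


# Hypothetical toy data, invented for illustration only.
cases = [
    ["tort", "tort", "contract"],
    ["contract", "fraud"],
    ["civil rights", "tort"],
]

all_pairs = count_pairs(cases)                  # analogous to "all cases"
small_pairs = count_pairs(cases, max_causes=2)  # analogous to a size cutoff

for pair, n in all_pairs.most_common():
    print(pair, n)
```

A case with n causes of action contributes n(n-1)/2 pairs, so a handful of very long complaints can dominate the totals. That is presumably one reason to also look at cases under a size cutoff.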
That Figure then helps to set the stage for the following figure, which is the last for this post. In it, the nodes (red dots) represent the spatial distribution of causes of action, with the node’s relative size indicating the frequency of the cause of action in the data. The edges (gray lines) depict the relationship between causes of action, with stronger co-occurrences represented with thicker lines.
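The ingredients of a network picture like this can be sketched in a few lines. Again, this is an illustration under assumptions rather than our actual pipeline: `build_graph` and the toy cases are hypothetical, and I assume here that an edge's weight is the number of cases in which two categories appear together (counting each case once), which may differ from the weighting used in the paper.

```python
from collections import Counter
from itertools import combinations


def build_graph(cases):
    """Return (node_freq, edge_weight) for a co-occurrence network.

    node_freq:   how often each category appears (drives node size).
    edge_weight: number of cases in which two categories co-occur
                 (drives line thickness); an assumed weighting scheme.
    """
    node_freq = Counter()
    edge_weight = Counter()
    for causes in cases:
        node_freq.update(causes)
        # Use the distinct categories in a case so each case adds
        # at most 1 to any given edge.
        for a, b in combinations(sorted(set(causes)), 2):
            edge_weight[(a, b)] += 1
    return node_freq, edge_weight


# Hypothetical toy data, invented for illustration only.
cases = [
    ["contract", "tort", "fraud"],
    ["contract", "fraud"],
    ["tort", "contract"],
    ["tax"],
]

node_freq, edge_weight = build_graph(cases)

# A node with no edges never co-occurs with any other category.
isolated = [n for n in node_freq if not any(n in e for e in edge_weight)]

print(node_freq)
print(edge_weight)
print(isolated)
```

In this toy example the commercial categories end up tied together by weighted edges, while a category that only ever appears alone has no edges at all.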
The Figure illustrates the close relationship between certain kinds of commercial causes of action – contract / tort / fraud – and the relative isolation of others – like tax, which stands alone. But to really understand how causes of action relate to one another, and what those relationships can tell us about attorney strategy, we need to dig into clustering analysis. I’ll save that for another post. In the meantime, enjoy the paper!