Coding the Law: An Introduction to Algorithms in the Law

See Spot; See Spot Run: Using ML to Spot Fact Patterns
~23-45 Minutes. Protip: You can watch YouTube videos at more than 1X speed.^†

Trojan Horse, Canakkale, Turkey. Photo by Peter Reed.

Optional Media. If you want to learn more about some of the topics discussed in the video above, and you have some free time, you might enjoy the following.

Spot builds upon data from the Learned Hands online game, a partnership between the LIT Lab and Stanford's Legal Design Lab. Learned Hands aims to crowdsource the labeling of laypeople's legal questions for the training of machine learning (ML) classifiers/issue spotters. Currently, this labeling is limited to publicly available historic questions from the r/legaladvice forum on Reddit. See Stanford and Suffolk Create Game to Help Drive Access to Justice.
Legal Issues Taxonomy (LIST). This taxonomy is what Learned Hands uses to label training data for Spot. It's worth noting that adoption of LIST, formerly NSMIv2, is one of the primary goals of Spot. As you may have gleaned from our discussion of data standards, it can be hard to get folks to adopt a standard. It's a chicken and egg problem. Folks want to use the standard that everyone else is using because a standard's value is a function of its community. Unfortunately, when there is no pre-existing community, it can be hard to get the ball rolling. Spot is an attempt to do this. We're building a shiny new AI tool that folks want to use. It just so happens that you have to label things in LIST for it to be useful. It's a Trojan Horse. ;)

Readings
~ 1 Hour 45 Minutes

Weapons of Math Destruction: The Targeted Citizen (Chapter 10) (20 pages)
How Not to Be Wrong: There is No Such Thing as Public Opinion (chapter 17) and "Out of Nothing I Have Created A Strange New Universe" (chapter 18) (55 pages)
Massachusetts Information for Voters - 2020 Ballot Questions: Question 2: Ranked-Choice Voting (5 pages). Following up on Ellenberg's discussion of ranked-choice voting, it seemed worth noting that Question 2 is on this year's MA ballot. FWIW, Question 1 makes reference to open data standards. So this class is proving super relevant. Yay!
The Supreme Court is taking on Google and Oracle one last time (6 pages)

Knowledge Base

Everyone comes to this adventure with a different background. So this section is designed to be a menu of sorts. If you already know a topic well, you can skip the relevant material. Just answer the questions below, and section(s) will disappear accordingly. That being said, if a section doesn't disappear, you should do it. Any time you save skipping a topic, however, should be spent working on your final project or reading ahead in either Weapons of Math Destruction or How Not to Be Wrong. FYI, we will be reading all of Weapons of Math Destruction and all but parts III and V of How Not to Be Wrong.

All of that being said, let's see if we can pare things down.

Are you proficient with QnA Markup?

No / I've never heard of it.
I've used it before, but I wouldn't mind a refresher.
I've used it before, and I feel comfortable skipping the introduction.

Do you have a good text editor? I'm not asking about a word processor; there's a difference.

Yes.
No.
I don't know.

Do you have a GitHub account, and do you know how to use it?

Yes.
No.

Using the Spot API
~6-12 Minutes

For those of you not working in Pythonanywhere, here is the notebook: spotAPI.ipynb. Visit the Spot website to create an account and get your API token. If you want to jump straight to the documentation, here's the link.

Making Predictions
~8-16 Minutes

We are working with the notebook file training.ipynb (pre-loaded for those of you using Pythonanywhere).

Ready to Go?

Before we add to your mission, let's make sure we're on the same page. Don't worry: your answers to these questions are saved only to this device. It's just a self-test to make sure you know what you need to succeed on your mission. This is by no means an exhaustive test of what you need to know, but if you find yourself missing something, take it as a suggestion to revisit the materials above. If you pared things down based on an answer to the Knowledge Base questions, consider changing the answer and reviewing the material.

If you want to save output from QnA Markup without cutting and pasting text, what browser should you use?

Safari
Firefox
Chrome

Can you use spaces to indent tags in QnA Markup?

Yes.
No.

Your Mission
~1 Hour 50 Minutes

Clean and feed the following data (i.e., challenge_calls.csv and challenge_people.csv) into the best-performing classifier you trained on the Dewey, Cheetham, and Howe data. That is, use your best model to predict whether each call in this data is or isn't a take. Then produce a list of those calls that are takes. You will be asked to share the call IDs for these calls as part of your work log below (e.g., [175, 234, 327]).

NOTE: If you are using the class's Pythonanywhere accounts, the two csv files mentioned above should already be in the same directory as your notebooks. Also, I would like to remind you that you can ask for help on our Slack channel if you're not sure what your next step should be. This mission asks you to tie together a lot of prior work and make some connections. So it's understandable if you have questions.
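If it helps to see the shape of the mission before you open a notebook, here is a minimal sketch of the clean, join, predict, and list-the-IDs flow. Everything specific in it is an assumption made for illustration: the column names (call_id, person_id, duration, prior_clients), the tiny inline tables standing in for challenge_calls.csv and challenge_people.csv, and the placeholder_predict rule, which is NOT a trained model. In your own work you would read the real files, use their real columns, and pass in your best classifier's predict method instead.

```python
import csv
import io

# Hypothetical stand-ins for challenge_calls.csv and challenge_people.csv.
# The real files' columns will differ; adjust the names to match them.
CALLS_CSV = """call_id,person_id,duration
175,1,320
176,2,
234,3,610
327,1,90
"""
PEOPLE_CSV = """person_id,prior_clients
1,2
2,0
3,5
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_and_join(calls, people):
    """Attach each caller's data to their call and drop incomplete rows.

    Dropping rows with missing values is only one cleaning choice;
    filling them in is another.
    """
    people_by_id = {p["person_id"]: p for p in people}
    rows = []
    for call in calls:
        person = people_by_id.get(call["person_id"])
        if person is None or call["duration"] == "":
            continue  # skip calls we cannot turn into a full feature row
        rows.append({
            "call_id": int(call["call_id"]),
            "features": [float(call["duration"]),
                         float(person["prior_clients"])],
        })
    return rows


def take_ids(rows, predict):
    """Return the call IDs whose predicted label is 1 (a take)."""
    labels = predict([r["features"] for r in rows])
    return [r["call_id"] for r, label in zip(rows, labels) if label == 1]


def placeholder_predict(features):
    """Stand-in rule, NOT a trained model: long calls count as takes."""
    return [1 if duration > 300 else 0 for duration, _ in features]


rows = clean_and_join(load_rows(CALLS_CSV), load_rows(PEOPLE_CSV))
print(take_ids(rows, placeholder_predict))  # [175, 234]
```

The point of passing predict in as an argument is that the same take_ids function works whether you hand it this toy rule or a fitted model's predict method, as long as it returns one 0/1 label per row.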

Update: Based on several conversations I’ve had this week, I want to provide you all with this notebook (Level 9 Notebook.ipynb) to help you through this mission. If you take this notebook, read through it, run it, and turn in its output, that will meet this level's expectations. Of course, my hope is that you will do more than this, but it’s important for you to know that you don’t have to unless you want to exceed expectations. So I’m attaching a stretch goal to incentivize a little more than meeting expectations. Stretch Goal: create a model that gets an F1 score in excess of 0.7 on the challenge data. Good luck!
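For the stretch goal, it may help to recall what an F1 score measures: it is the harmonic mean of precision (how many predicted takes were real takes) and recall (how many real takes you caught). Below is a small sketch of that arithmetic, assuming labels are coded as 1 for a take and 0 otherwise; the example labels are made up for illustration and are not from the challenge data. If you are using scikit-learn, sklearn.metrics.f1_score computes the same quantity.

```python
def f1_score(y_true, y_pred):
    """Compute F1 for binary labels where 1 means "take"."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # no correct takes means precision and recall are 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
print(f1_score(y_true, y_pred))  # 0.75
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.75 and so is F1, which would clear the 0.7 bar.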

Self-Reflection and Logging Your Work
~20 min

As we do at the end of every level, we ask that you take a few minutes to reflect on how things are going. I've also included a set of reading questions to cue things up for our synchronous discussion. Your answers will be shared with me, and they will let me know that I can look for any project work you may have posted. That being said, you've almost completed Level 9. Tell me how it's going by completing the form linked below.

Log and reflect on your work

Synchronous Meet Up, AKA our Class Time
1 hour | October 26, 2020 @ 4pm Eastern

If you're an enrolled student, we'll be meeting at this link on Monday October 26th at 4pm via Zoom. If you don't have the password, and you are a registered student, DM me on Slack, and I can give you the password. If you're not an enrolled student, I'm afraid you can't join us.

We will use this time to: (1) troubleshoot any issues folks might have had working through your mission; and (2) discuss the readings.


Coding the Law Suffolk Law School: Fall 2020 by @Colarusso
