This is an archival version of Coding the Law's Fall 2023 course site.
Click the green flag to start. Game by simiko. See original. This game was made in Scratch, an educational programming language. We introduce coding with Scratch in Level 4 if you want to try your hand at making something similar.

Coding the Law
Suffolk Law School: Fall 2023
by @Colarusso

A self-guided LegalTech Adventure for folks with or without prior coding experience.

Data Standards
~2-7 min. Protip: You can watch YouTube videos at more than 1X speed.

XKCD comic on standards
Source: Standards from xkcd.

Optional. If all of this talk about data makes you scream, "show me the data," the following is for you. I have collected several places you might want to look for curated data sets, to give you an idea of what data look like when they're collected in nice structured forms.

  • Measures for Justice. An attempt to collect state-level criminal justice data.
  • Data.gov. The federal government's open data portal.
  • USA Facts. A private initiative to address the gaps in access to data needed by governments to make policy decisions across the US.
  • Google's Data Set Search. A tool for searching across a number of publicly available data sets.

Readings
~ 1 Hour 17 Minutes

What the Heck is Word2Vec? Neural Nets for Lawyers
11-33 min. Protip: You can watch YouTube videos at more than 1X speed.

FWIW, preparing instructional material is an exercise in compression, and it's not lossless. The hope is that you now have a very high-level sense of how things work. For example, I glossed over how the activation function behaves with regard to the word2vec "hidden"/projection layer. Spoiler: it's not a sigmoid! Actually, there is no activation function. We just pass on the weights. That being said, I didn't explain weights very deeply. So again, as it says in the title—oversimplification. ;)
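To make the "no activation function" point concrete, here is a minimal sketch in plain Python. The three-word vocabulary and the weight values are made up for illustration, and the function names are hypothetical; nothing here comes from a trained model. It only shows that multiplying a one-hot input by the weight matrix hands back the matching row of weights unchanged.

```python
# Toy illustration: with no activation function, the projection ("hidden")
# layer is a one-hot vector times a weight matrix, which is just a row lookup.

vocab = ["court", "judge", "banana"]   # hypothetical vocabulary
weights = [                            # made-up weights, one row per word
    [0.2, -0.1],
    [0.3, -0.2],
    [-0.7, 0.9],
]

def one_hot(word):
    """Return a vector with a 1 at the word's index and 0 everywhere else."""
    return [1 if w == word else 0 for w in vocab]

def project(word):
    """Multiply the one-hot input by the weight matrix; no activation is applied."""
    x = one_hot(word)
    n_dims = len(weights[0])
    return [
        sum(x[i] * weights[i][d] for i in range(len(vocab)))
        for d in range(n_dims)
    ]

hidden = project("judge")
# The result is exactly the "judge" row of the weight matrix.
assert hidden == weights[vocab.index("judge")]
```

In other words, the layer's output for a word is that word's vector, which is why the weights themselves end up serving as the word embeddings.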

Same Stats, Different Graphs
Source: Autodesk.

Optional Media. If you want to learn more about some of the topics discussed in the video above, and you have some free time, you might enjoy the following.

See Spot; See Spot Run: Using ML to Spot Fact Patterns
~15-45 Minutes. Protip: You can watch YouTube videos at more than 1X speed.

Trojan Horse
Trojan Horse, Canakkale, Turkey. Photo by Peter Reed.

Optional Media. If you want to learn more about some of the topics discussed in the video above, and you have some free time, you might enjoy the following.

  • Spot builds upon data from the Learned Hands online game, a partnership between the LIT Lab and Stanford's Legal Design Lab. Learned Hands aims to crowdsource the labeling of laypeople's legal questions for the training of machine learning (ML) classifiers/issue spotters. Currently, this labeling is limited to publicly available historic questions from the r/legaladvice forum on Reddit. See Stanford and Suffolk Create Game to Help Drive Access to Justice.
  • Legal Issues Taxonomy (LIST). This taxonomy is what Learned Hands uses to label training data for Spot. It's worth noting that adoption of LIST, formerly NSMIv2, is one of the primary goals of Spot. As you may have gleaned from our discussion of data standards, it can be hard to get folks to adopt a standard. It's a chicken and egg problem. Folks want to use the standard that everyone else is using because a standard's value is a function of its community. Unfortunately, when there is no pre-existing community, it can be hard to get the ball rolling. Spot is an attempt to do this. We're building a shiny new AI tool that folks want to use. It just so happens that you have to label things in LIST for it to be useful. It's a Trojan Horse. ;)

Get Ready to Use the Spot API
~5 Minutes

Visit the Spot website, create an account, and get your API token. FWIW, if you want to read ahead, you can skim the documentation at this link.
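If you want a feel for what using a token looks like, here is a rough sketch using only Python's standard library. Everything specific in it is an assumption rather than Spot's confirmed interface: the endpoint URL is a hypothetical placeholder, and the bearer-token header and the `text` field in the JSON body are guesses at a common pattern. Check the Spot documentation for the real endpoint and parameters. The sketch only builds the request; it does not send anything.

```python
import json
import urllib.request

# Hypothetical placeholder: swap in the endpoint given in the Spot documentation.
SPOT_ENDPOINT = "https://example.invalid/spot-endpoint/"

def build_spot_request(token, text):
    """Build (but do not send) a POST request carrying the text to classify.

    Assumptions: bearer-token authentication and a JSON body with a "text" key.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        SPOT_ENDPOINT,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_spot_request("YOUR_API_TOKEN", "My landlord is evicting me.")
# To actually send it, once the endpoint is real: urllib.request.urlopen(req)
```

Keeping the token in a variable rather than pasting it throughout your code makes it easier to avoid sharing it by accident.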

Self-Reflection and Logging Your Work
~20 min

As we do at the end of every level, we ask that you take a few minutes to reflect on how things are going. I've also included a set of reading questions to queue things up for our synchronous discussion. Your answers will be shared with me, which lets me know that I can look for any project work you may have posted. That being said, you've almost completed Level 6. Tell me how it's going by completing the form linked below.

Synchronous Meet Up, AKA our Class Time
October 10, 2023 @ 4pm Eastern

If you're an enrolled student, we'll be meeting in Sargent Hall Room 305 on Tuesday, October 10th at 4pm. Our remote backup is to meet via Zoom at this link. You should have received the password from me earlier. If you don't have the password and you are a registered student, DM me on Teams, and I can give it to you. If you're not an enrolled student, I'm afraid you can't join us.

We will use this time to: (1) troubleshoot any issues folks might have had working through the knowledge base; (2) look at and talk about your mission; and (3) discuss the readings.

Time estimates are just that: estimates. The assumptions used to calculate reading time are as follows: 48 pages is assumed to take roughly an hour to read. When working with non-paginated texts, it is assumed that a page is roughly equal to 250 words. Video estimates are given as a range, from 3X-speed to 1X-speed viewing. Estimates for coding are based on past experience. Each level should include about 6 hours and 40 min of work.
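The arithmetic behind these estimates can be sketched in a few lines of Python. The constants come straight from the assumptions above; the function names are hypothetical, and this is an illustration, not the script actually used to build the site.

```python
PAGES_PER_HOUR = 48    # reading pace assumed above
WORDS_PER_PAGE = 250   # conversion for non-paginated texts
MAX_VIDEO_SPEED = 3    # fastest playback speed assumed above

def reading_minutes(pages=0, words=0):
    """Estimate reading time in minutes from a page count and/or a word count."""
    total_pages = pages + words / WORDS_PER_PAGE
    return total_pages / PAGES_PER_HOUR * 60

def video_minutes(runtime_minutes):
    """Return the (fastest, slowest) viewing time for a video, in minutes."""
    return runtime_minutes / MAX_VIDEO_SPEED, runtime_minutes

print(reading_minutes(pages=48))    # 60.0
print(reading_minutes(words=3000))  # 3000 words is 12 pages -> 15.0
print(video_minutes(33))            # (11.0, 33)
```

A 33-minute video, for instance, works out to the "11-33 min" style of range shown next to the videos on this page.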