Optional Media.
If you want to learn more about some of the topics discussed in the video above, and you have some free time, you might enjoy the following.
- Hill for the data scientist: an xkcd story. So if correlation isn't causation and rejecting the null hypothesis isn't enough for us to prove the alternative hypothesis, how can we ever know anything? Well, we can use common sense and some rules of thumb. Hill's Criteria are a set of guidelines for evaluating causation, and this resource explains them using xkcd comics.
- If you want to learn more about the replication of scientific results and what has come to be known as the replication crisis, you may enjoy these podcast episodes from Hi Phi Nation: Hackademics I and Hackademics II.
- If you would like to explore the idea of significance tests a little more, this Khan Academy lesson is a nice distillation: The idea of significance tests.
- Conference Diversity Distribution Calculator. "This calculator models the probability distribution for male/female speaker balance assuming random selection, which roughly follows a binomial distribution. It was inspired by the work of Dave Wilkinson and Paul Battley, who made similar models and found that the likelihood of an unbiased selection process yielding a line-up with no women at all is far lower than intuition might suggest, and – depending on the numbers you plug in – can often be far lower than the likelihood of their over-representation. That is to say: in an unbiased selection, you’re significantly more likely to see more than the expected number of women than none at all."
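To make the binomial claim in the last item concrete, here is a minimal sketch in Python. It is not the calculator's actual code. The line-up size (20 speakers) and the share of women in the speaker pool (25%) are hypothetical numbers chosen only for illustration, and the function names are made up for this example. It compares the chance of a line-up with no women at all against the chance of seeing more women than expected, assuming each speaker is picked independently and at random.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def speaker_balance(n_speakers, p_women):
    """Return (P(no women), P(more women than expected)) under random selection."""
    expected = n_speakers * p_women
    p_none = binomial_pmf(0, n_speakers, p_women)
    # Sum the probabilities of every count strictly above the expected number.
    p_over = sum(
        binomial_pmf(k, n_speakers, p_women)
        for k in range(n_speakers + 1)
        if k > expected
    )
    return p_none, p_over


# Hypothetical inputs: 20 speakers drawn from a pool that is 25% women.
p_none, p_over = speaker_balance(20, 0.25)
print(f"P(no women): {p_none:.4f}")
print(f"P(more women than expected): {p_over:.4f}")
```

With these assumed inputs, an all-male line-up comes out far less likely than over-representation, which is the pattern the quoted description points to. Try other values to see how the two probabilities move.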
Knowledge Base
Everyone comes to this adventure with a different background. So this section is designed to be a menu of sorts. If you already know a topic well, you can skip the relevant material. Just answer the questions below, and section(s) will disappear accordingly. That being said, if a section doesn't disappear, you should do it. Any time you save skipping a topic, however, should be spent working on your final project or reading ahead in either Weapons of Math Destruction or How Not to Be Wrong. FYI, we will be reading all of Weapons of Math Destruction and all but parts III and V of How Not to Be Wrong.
All of that being said, let's see if we can pare things down.
Are you proficient with QnA Markup?
You've gained roughly 30 minutes by dropping a video introduction to QnA Markup. FWIW, you're going to be asked to create an interview in QnA Markup at the end of this Level. If you find yourself with questions, change this answer to unhide the QnA introduction.
Do you have a good text editor? I'm not asking about a word processor; there's a difference.
You've gained roughly 10 minutes by dropping a section on installing a text editor.
Do you have a GitHub account, and do you know how to use it?
You've gained roughly 20 minutes by dropping a GitHub exercise that walks you through creating a repo, making a pull request, and so on.
Your Mission: Machine Learning In Production with Google Sheets
~8-15 Minutes
This discussion builds on the school closing example introduced back in level 4, when we talked about success metrics, and built upon in levels 6, 7, and 9. If you're unfamiliar with Google Sheets, you can learn more on the Google Sheets website.
Your Final Project
3+ Hours
We're entering the home stretch. Remember to ask questions in Teams if you're stuck, and when we next meet, we'll do rounds—checking in with everyone to see where they are at. See The Final Project Rubric.
Self-Reflection and Logging Your Work
~20 min
As we do at the end of every level, we ask that you take a few minutes to reflect on how things are going. I've also included a set of reading questions to cue things up for our synchronous discussion. That being said, you've almost completed Level 10. Tell me how it's going by completing the form linked below.
Synchronous Meet Up, AKA our Class Time
~1.8 hours | November 7, 2022 @ 4pm Eastern
If you're an enrolled student, we'll be meeting in Sargent Hall Room 325 on Monday, November 7th, at 4pm. If you're not an enrolled student, I'm afraid you can't join us.
We will use this time to: (1) discuss the readings; and (2) review your work on your final project. That is, we'll go around the class and check in with everyone about their progress. We'll also work to help folks strategize about next steps and overcoming any blockers.
† Time estimates are just that: estimates. The assumptions used to calculate reading time are as follows: 48 pages is assumed to take roughly an hour to read. When working with non-paginated texts, it is assumed that a page is roughly equal to 250 words. Videos assume both 3X and 1X viewing. Estimates for coding are based on past experience. Each level should include about 6 hours and 40 min of work.
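The reading-time assumptions in the note above can be sketched as a small Python function. This is an illustration of the stated arithmetic only (48 pages per hour, 250 words per page), not the script actually used to produce the estimates; the function and constant names are hypothetical.

```python
PAGES_PER_HOUR = 48   # stated assumption: 48 pages takes roughly an hour
WORDS_PER_PAGE = 250  # stated assumption for non-paginated texts


def reading_minutes(pages=0, words=0):
    """Estimate reading time in minutes from a page count, a word count, or both."""
    equivalent_pages = pages + words / WORDS_PER_PAGE
    return equivalent_pages / PAGES_PER_HOUR * 60


print(reading_minutes(pages=48))    # 48 pages is one hour: 60.0
print(reading_minutes(words=3000))  # 3000 words is 12 pages: 15.0
```

Taken together, the two assumptions work out to a reading speed of about 200 words per minute.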