Explaining Data Science to High School Students
My attempt to make data science cool (because it is)
Since the day my kids were born, I wanted to be the cool dad on career day. I imagined I'd share advice that would change a life. So when my son Jack took computer science, I volunteered. I work in tech, so it should be easy, right? A week before my talk, my story seemed dry. Fortunately, I had an insider.
What High School Students Want to Know
I asked Jack what he thought the class would want to know. “Dad, duh, the first thing we want to know is how much money you can make.”
Money wasn’t the inspirational opening I envisioned, but, if that’s what they wanted to know, why not? A few weeks later in class, I started with Jack’s question. “The average salary for a data scientist is $113,000 a year.”
Blank faces. Jack had the same look. But I was ready.
"That's $67 / hour," I said.
It was a cheap ploy, but I had their attention.
Why Data Science is Cool
“What’s cool about data science is that you use it to change the world,” I continued. I shared the simplest definition of data science I could find: data scientists extract meaning from data. They help us understand the world and discover new things.
I gave some examples. Al Gore’s global warming movie. The explanation of how border walls work from the New York Times. ESPN on how the Patriots won the Super Bowl.
Then I showed them a bunch of numbers (shown at right). I asked what they saw. I expected the blank stares. Data, unadorned, means nothing.
These numbers tell the story of Napoleon’s march on Russia in 1812. The graphic below, by Charles Joseph Minard, is featured in The Visual Display of Quantitative Information by Edward Tufte, which Amazon called “one of the best 100 books of the 20th century.” Tufte has called it the best statistical graphic of all time.
I walked the class through the story Minard’s graphic tells.
The tan jagged area represents the size of Napoleon’s army. Beginning at left, Napoleon entered Russia in 1812 with 422,000 troops. The tan ink dwindles as they travel east, from left to right. The troops were dying. The Russian "scorched earth" strategy was working: the Russians destroyed food and shelter as they retreated.
About three months later the French reached Moscow. 322,000 had died en route.
The French turned around. Their return is shown in black, from right to left.
Minard added a line graph at the bottom that displays temperature, on the Réaumur scale he used. At the beginning of the retreat, it was 0 degrees. It dropped as low as 30 degrees below zero.
The thin black sliver at far left drops the mic on Napoleon’s story. Only 10,000 survived.
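The arithmetic behind that story is simple enough to check. Here is a minimal Python sketch that uses only the three figures quoted above (422,000 troops at the start, 322,000 lost on the way to Moscow, 10,000 survivors). The function and variable names are my own, made up for illustration, and the numbers are the rounded ones from this walkthrough, not a full dataset.

```python
# Figures quoted in the walkthrough above (rounded).
TROOPS_AT_START = 422_000
LOST_ON_ADVANCE = 322_000
SURVIVORS = 10_000


def campaign_summary(start, lost_on_advance, survivors):
    """Derive stage-by-stage numbers from the three quoted figures.

    Hypothetical helper for illustration only.
    """
    reached_moscow = start - lost_on_advance
    lost_on_retreat = reached_moscow - survivors
    return {
        "reached_moscow": reached_moscow,
        "lost_on_retreat": lost_on_retreat,
        "total_lost": start - survivors,
        "survival_rate": survivors / start,
    }


summary = campaign_summary(TROOPS_AT_START, LOST_ON_ADVANCE, SURVIVORS)
print(f"Reached Moscow:  {summary['reached_moscow']:,}")
print(f"Lost on retreat: {summary['lost_on_retreat']:,}")
print(f"Survival rate:   {summary['survival_rate']:.1%}")
```

By these figures, 100,000 troops reached Moscow, 90,000 more were lost on the retreat, and roughly one soldier in forty made it back. The numbers are correct, but they don't land the way the thin black sliver does.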
Minard’s graphic is the Mona Lisa of statistical graphics: the longer you linger, the more you see. It shows army size, temperature, and the direction and location of troops all at once.
For example, look at this small section, near the middle of the image. It shows 50,000 French troops approaching the Berezina River, near Minsk, in November of 1812. The temperature is dropping from -11 to -20. This is the site of an infamous encounter with the Russians at Berezina.
The black line tells the ominous tale: 25,000 died.
In French, “Berezina” is now synonymous with disaster. Paintings of Berezina hang in the Musée de l’Armée in Paris. They are considered to be great works of art.
I think both images are great works of art. Each tells the story in a very different way.
In The Visual Display of Quantitative Information (1983), Edward Tufte estimated that between 900 billion and 2 trillion images of statistical graphics are printed each year.
That’s why data science is cool. Like great art, great data science tells a story. It spreads ideas. It shares insight. It can inspire and surprise. It can reveal.
Why does Minard’s graphic matter?
Minard's graphic matters. In one image, it shows how devastating war can be.
Generations have learned from it. It is said that, from 1850 to 1860, France’s public ministers made a point of having their portraits painted with Minard’s visualizations in the background. Minard is now revered alongside the greats of data visualization.
Last weekend I was watching 60 Minutes. In a flash, I noticed Napoleon’s March to Moscow hanging on U.S. General H.R. McMaster's wall, over his right shoulder, just above his Mac!
A great statistical graphic can have lasting impact, for decades. There are so many ways to make a difference in the world. Minard did, with data science.
Weave data science into the fabric of whatever you study.
Ironically, Minard wasn't a data scientist. He was a civil engineer. So my parting advice for Jack’s high school class was not that everyone should be a data scientist. My advice was for everyone to weave some data science into the fabric of their studies.
To illustrate, I asked them to shout out their favorite subject. With each one, I gave an example of how data science helps that field. Sometimes it’s hard! But I’ve managed to give examples from biology, security, wind energy, marine biology, financial trading, home automation, social media, fashion, agriculture, law enforcement, manufacturing, sociology, cybersecurity, gaming, restaurant management, dance, and sports.
Being a data scientist isn’t for everyone, but AI will impact everyone. So my big takeaway for the class was this: weave a strand of data science into the fabric of your studies. If you hate math, at least take a class in statistics. If you like data science even just a little, consider a minor. If you like it a lot, it’s a fantastic major. And artificial intelligence should be the fourth “R” of education: Reading, wRiting, aRithmetic, and aRtificial intelligence.
My Talk Worked. On Me, Too
Jack overheard a few kids talking about how much data scientists make. The fact that they were discussing anything I said seemed like success. But the best thing about my talk is that it helped me reconnect with why I love data science. It IS cool!