Here is another example of the Impression vs Meaning principle. Call it “Making Statistics Lie For You”. It is a simple, common, powerful way a message-bearer can bias the recipient without even deviating from objectivity. Consider this recent headline:
“Chicago is in the bottom half of cities rated for family-friendliness. A new study from WalletHub ranks Chicago 78th out of 150 cities studied.”(1)
Two statements of fact and they appear to be objective. Upon a little research I believe the statements to be true: the study exists; there are 150 cities on the list; Chicago is ranked at 78. Then why does the headline make me feel ill, sorry I live in Chicago, ready to grimace or move away? I contend it’s the word choice “bottom half”. Please note that in a field of 150, you need to rank 76 or lower to be in the bottom half. At 78, Chicago only just barely made it! But the word choice delivers a strong negative impression even while remaining objective.
Another outlet reported “Chicago Ranks Low In Family Friendliness.” I would contend “low” may be subjective, and this word choice makes me ready to put the house up for sale — but who would want it?
Let’s take a look at the study that prompted all this bad juju. The WalletHub analysis looked at the 150 most populous municipalities in the United States for which reliable data was available. The 150 number was entirely arbitrary. If Chicago had ranked 78 out of a group of 1000, we’d all be proud as peanuts! “Chicago lands in the Top 8% of Family-Friendliest Cities!” And if they had cut off the list at 80, we would be ashamed to even mention the study. So is a 78th rank good or bad, low or high? As you see, it is a matter of context.
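The arithmetic behind those three headlines is easy to check. Here is a minimal sketch; the function name `rank_summary` is my own invention for illustration and has nothing to do with how WalletHub reports its results.

```python
def rank_summary(rank, list_size):
    """Describe a rank relative to the size of the list it came from."""
    # Share of the list at or above this rank, as a percentage.
    top_percent = 100 * rank / list_size
    # Ranks 1..list_size/2 are the top half; anything past that is the bottom half.
    half = "top" if rank <= list_size / 2 else "bottom"
    return top_percent, half


# The same 78th place, judged against three different list sizes.
for size in (80, 150, 1000):
    pct, half = rank_summary(78, size)
    print(f"78th of {size}: top {pct:.1f}%, {half} half")
```

The rank never changes; only the size of the list does, and with it the headline you could write.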
Next consider Rank versus Score. So we are 78th on the list. But what do the scores look like? Are we in a 78-way tie for First Place? Or is our score wallowing at the bottom? The mere ranking does not tell us that. So I looked up the study and plotted the data:
That silly ball in the middle marks Chicago’s 78th place rank, with a score of about 52. The dip at the right indicates that the lowest-scoring ten cities scored really low, and the ski-jump at the left indicates the top 15 cities or so scored really really high. And that great big gentle slope in the middle? Call that the Plateau of Mediocrity, where apparently most of us live, with Chicago smack in the middle.
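To see why a rank by itself says so little, here is a small sketch using two made-up score tables. The cities "A" through "D" and all of their scores are hypothetical and are not taken from the study; the helper `rank_and_gap` is likewise just an illustration.

```python
def rank_and_gap(scores, city):
    """Return (rank, points behind the leader) for one city in a score table."""
    # Order the city names from highest score to lowest.
    ordered = sorted(scores, key=scores.get, reverse=True)
    rank = ordered.index(city) + 1
    # Round to one decimal so floating-point noise does not clutter the result.
    gap = round(scores[ordered[0]] - scores[city], 1)
    return rank, gap


# Two hypothetical tables: one tightly bunched, one widely spread.
bunched = {"A": 52.3, "B": 52.2, "C": 52.1, "D": 52.0}
spread = {"A": 90.0, "B": 75.0, "C": 60.0, "D": 20.0}

print(rank_and_gap(bunched, "D"))  # last place, but a whisker behind the leader
print(rank_and_gap(spread, "D"))   # last place, and far behind the leader
```

City "D" finishes last in both tables, yet in one it is practically tied for first. The rank is identical; the story is not.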
There’s a different way data like this is usually graphed. It’s called a histogram. You pick a number of bins and divide up your data by bin. Then you count how many data points (cities) you have in each bin. The result shows the distribution of the data:
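The binning recipe above can be sketched in a few lines. The scores below are invented purely for illustration and are not the WalletHub data; the bin width of 5 and the starting edge of 35 are also arbitrary choices of mine.

```python
from collections import Counter


def histogram(scores, bin_width, start):
    """Count how many scores fall in each bin, keyed by the bin's left edge."""
    counts = Counter()
    for s in scores:
        # Index of the bin this score falls into, counting up from `start`.
        k = int((s - start) // bin_width)
        counts[start + k * bin_width] += 1
    return dict(sorted(counts.items()))


# Made-up scores for illustration only; these are NOT the study's numbers.
scores = [36.5, 41.2, 44.8, 47.1, 48.9, 49.5, 50.3,
          51.0, 52.0, 53.7, 56.4, 61.9, 72.3]

# Print a sideways text histogram, one row of # marks per bin.
for left_edge, count in histogram(scores, bin_width=5, start=35).items():
    print(f"{left_edge:>2}-{left_edge + 5:<2} {'#' * count}")
```

Even with thirteen pretend cities, the tallest stack lands in the middle, which is the shape the real data shows.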
You can see there are lots of cities with scores around 50: looks like 35 of them. There are two cities with scores around 72, and nine cities with scores around 37. With a score of 52, Chicago is in that big stack right in the middle, along with 23% of the cities in the study.(2) If I look one bin to the left and right and include the cities with very similar scores, I find Chicago is keeping company with fully 60% of the study group.(3) Call it a 90-way tie for Middle.
This kind of distribution is so common it is called the Normal Distribution. Invite 150 cities to church, and note their arrival times. Plot a histogram and it will likely look a lot like this. Measure the lengths of 150 left thumbs, plot the data, and it will look much the same. Chicago’s Family Friendliness score is as average as can be. It’s normal.
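For the skeptical, here is a quick check of the shares quoted above against what a textbook normal distribution predicts. The bin counts are the ones given in footnote (3); the helper `share_within` is my own name, and it relies on the standard result that the fraction of a normal distribution within k standard deviations of the mean is erf(k/√2). Treating the three middle bins as roughly "one standard deviation either side" is an assumption on my part, not something the study states.

```python
import math


def share_within(k_sigma):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k_sigma / math.sqrt(2))


total_cities = 150
middle_bins = [26, 35, 29]  # bin counts quoted in footnote (3)

print(f"Chicago's bin:            {35 / total_cities:.0%}")
print(f"Three middle bins:        {sum(middle_bins) / total_cities:.0%}")
print(f"Normal, within one sigma: {share_within(1):.1%}")
```

The study's 60% in the middle sits in the same neighborhood as the roughly 68% a normal curve puts within one standard deviation.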
Yet another outlet reported it this way: “Chicago is 78th in the U.S. for family-friendliness, study says.” There is no bias in this headline. So why not report one of the following? “Chicago ranks solidly in the middle”, “Chicago leads the nation in mediocrity”, “Chicago is perfectly normal”, “Chicago Scores Average in Study”. The choice is up to the author or editor. Without even straying from objectivity, the message-bearer can strongly bias your impression even while sticking to the actual meaning. Beware!
(1) This aired on National Public Radio, WBEZ-Chicago on September 8 or 9, 2016. I cannot for the life of me find the actual report to cite. Sorry.
(2) 35 ÷ 150 ≈ 23%
(3) The numbers of cities in the three big bins in the middle are 26 + 35 + 29 = 90 cities in the middle. 90 ÷ 150 = 60%. In a normal distribution, about 68% of the values fall within one standard deviation of the mean, so roughly 60-70% of everything lands in the middle. Homework: What does that say about the electorate?