Monthly Archives: December 2017

Robert Kelchen on higher ed data in 2017

Robert Kelchen:

It was a great year for data on higher education outcomes. The release of the College Scorecard in 2015 was a big step forward for researchers, policymakers, and the public—providing the first comprehensive institutional-level data on earnings and student loan repayment rates. (And the Department of Education recently signed a five-year agreement to keep getting earnings data from Treasury, allaying the fears of some about data in the Trump administration.) This year also saw the long-awaited release of graduation rates for Pell Grant recipients and part-time/transfer students via the Integrated Postsecondary Education Data System.

But the data release that stole the show in 2017 was from the Equality of Opportunity Project, a tremendous and well-funded collaboration by several top economists. With a well-coordinated release in the New York Times, the team made available its college-level data on the percentage of students from lower-income families who reached higher income quintiles by their early 30s. This highlighted the good work of many moderately selective public and private nonprofit colleges, as well as the incredible share of super-wealthy students at Ivy League institutions. (The dataset also has marriage rates by college, which I had fun playing around with.) One caution: since the data come from tax records, some colleges are aggregated in strange ways. Be mindful of that when using this great dataset.

Siddhartha Mukherjee on a common post hoc abuse of statistical analysis

Siddhartha Mukherjee:

then a second instinct takes over: Why not try to find the people for whom the drug did work?…

But it’s also a treacherous seduction…

Perhaps the most stinging reminder of these pitfalls comes from a timeless paper published by the statistician Richard Peto. In 1988, Peto and colleagues had finished an enormous randomized trial on 17,000 patients that proved the benefit of aspirin after a heart attack. The Lancet agreed to publish the data, but with a catch: The editors wanted to determine which patients had benefited the most. Older or younger subjects? Men or women?

Peto, a statistical rigorist, refused — such analyses would inevitably lead to artifactual conclusions — but the editors persisted, declining to advance the paper otherwise. Peto sent the paper back, but with a prank buried inside. The clinical subgroups were there, as requested — but he had inserted an additional one: “The patients were subdivided into 12 … groups according to their medieval astrological birth signs.” When the tongue-in-cheek zodiac subgroups were analyzed, Geminis and Libras were found to have no benefit from aspirin, but the drug “produced halving of risk if you were born under Capricorn.” Peto now insisted that the “astrological subgroups” also be included in the paper — in part to serve as a moral lesson for posterity. I’ve often thought of Peto’s paper as required reading for every medical student.
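To see why slicing a trial into arbitrary subgroups invites artifactual conclusions, here is a minimal Python sketch. It is not Peto's analysis and does not use his data. The event risks (12% for control, 10% for treated), the 0.05 significance threshold, and the function names `familywise_error` and `simulate_subgroups` are all illustrative assumptions. The drug's true effect is identical for every simulated patient, and each patient is assigned to one of 12 groups purely at random, so any difference between groups is noise.

```python
import random


def familywise_error(n_subgroups, alpha=0.05):
    """Chance of at least one false positive across independent tests.

    Assumes the tests are independent and each uses threshold alpha.
    """
    return 1 - (1 - alpha) ** n_subgroups


def simulate_subgroups(n_patients=17000, n_groups=12,
                       risk_control=0.12, risk_treated=0.10, seed=0):
    """Return the treated/control risk ratio within each random subgroup.

    The risks are hypothetical and the same in every subgroup, so the
    true risk ratio is risk_treated / risk_control everywhere.
    """
    rng = random.Random(seed)
    # counts[group][arm] = [events, patients]; arm 0 = control, 1 = treated
    counts = [[[0, 0], [0, 0]] for _ in range(n_groups)]
    for _ in range(n_patients):
        group = rng.randrange(n_groups)  # stand-in for a birth sign
        arm = rng.randrange(2)           # randomized treatment assignment
        risk = risk_treated if arm else risk_control
        event = rng.random() < risk
        counts[group][arm][0] += event
        counts[group][arm][1] += 1
    ratios = []
    for (c_events, c_n), (t_events, t_n) in counts:
        ratios.append((t_events / t_n) / (c_events / c_n))
    return ratios


# With 12 independent looks at the 0.05 level, a spurious "finding"
# somewhere is close to a coin flip.
print(round(familywise_error(12), 2))

# Subgroup risk ratios scatter around the single true value.
ratios = simulate_subgroups()
print(min(ratios), max(ratios))
```

Under these assumptions, the per-group risk ratios vary even though no group truly responds differently. The variation comes from sampling noise in groups that each hold only about a twelfth of the patients.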