The Band of Debunkers Busting Bad Scientists

Stanford’s president and a high-profile physicist are among those taken down by a growing wave of volunteers who expose faulty or fraudulent research papers

Leif Nelson, Uri Simonsohn and Joe Simmons, who run the blog Data Colada. Photos: Ian Bates for The Wall Street Journal (2); Edu Bayer for The Wall Street Journal

An award-winning Harvard Business School professor and researcher spent years exploring the reasons people lie and cheat. A trio of behavioral scientists examining a handful of her academic papers concluded her own findings were drawn from falsified data.

It was a routine takedown for the three scientists—Joe Simmons, Leif Nelson and Uri Simonsohn—who have gained academic renown for debunking published studies built on faulty or fraudulent data. They use tips, number crunching and gut instincts to uncover deception. Over the past decade, they have come to their own finding: Numbers don’t lie but people do. 

“Once you see the pattern across many different papers, it becomes like a one in quadrillion chance that there’s some benign explanation,” said Simmons, a professor at the Wharton School of the University of Pennsylvania and a member of the trio who report their work on a blog called Data Colada. 

Simmons and his two colleagues are among a growing number of scientists in various fields around the world who moonlight as data detectives, sifting through studies published in scholarly journals for evidence of fraud. 

At least 5,500 faulty papers were retracted in 2022, compared with 119 in 2002, according to Retraction Watch, a website that keeps a tally. The jump largely reflects the investigative work of the Data Colada scientists and many other academic volunteers, said Dr. Ivan Oransky, the site’s co-founder. Their discoveries have led to embarrassing retractions, upended careers and retaliatory lawsuits. 

Neuroscientist Marc Tessier-Lavigne stepped down last month as president of Stanford University, following years of criticism about data in his published studies. Posts on PubPeer, a website where scientists dissect published studies, triggered scrutiny by the Stanford Daily. A university investigation followed, and three studies he co-wrote were retracted.

Marc Tessier-Lavigne stepped down as president of Stanford University in August. Photo: Winni Wintermeyer for The Wall Street Journal

Stanford concluded that although Tessier-Lavigne didn’t personally engage in research misconduct or know about misconduct by others, he “failed to decisively and forthrightly correct mistakes in the scientific record.” Tessier-Lavigne, who remains on the faculty, declined to comment.

The hunt for misleading studies is more than academic. Flawed social-science research can lead to faulty corporate decisions about consumer behavior or misguided government rules and policies. Errant medical research risks harm to patients. Researchers in all fields can waste years and millions of dollars in grants trying to advance what turn out to be fraudulent findings.

The data detectives hope their work will keep science honest, at a time when the public’s faith in science is ebbing. The pressure to publish papers—which can yield jobs, grants, speaking engagements and seats on corporate advisory boards—pushes researchers to chase unique and interesting findings, sometimes at the expense of truth, according to Simmons and others.

“It drives me crazy that slow, good, careful science—if you do that stuff, if you do science that way, it means you publish less,” Simmons said. “Obviously, if you fake your data, you can get anything to work.”   

The journal Nature this month alerted readers to questions raised about an article on the discovery of a room-temperature superconductor—a profound and far-reaching scientific finding, if true. Physicists who examined the work said the data didn’t add up. University of Rochester physicist Ranga Dias, who led the research, didn’t respond to a request for comment but has defended his work. Another paper he co-wrote was retracted in August after an investigation suggested some measurements had been fabricated or falsified. An earlier paper from Dias was retracted last year. The university is looking closely at more of his work.

Experts who examine suspect data in published studies count every retraction or correction of a faulty paper as a victory for scientific integrity and transparency. “If you think about bringing down a wall, you go brick by brick,” said Ben Mol, a physician and researcher at Monash University in Australia who investigates clinical trials in obstetrics and gynecology. His alerts have prompted journals to retract some 100 papers, with investigations ongoing in about 70 others.

Among those looking into other scientists’ work are Elisabeth Bik, a former microbiologist who specializes in spotting manipulated photographs in molecular biology experiments, and Jennifer Byrne, a cancer researcher at the University of Sydney who helped develop software to screen papers for faulty DNA sequences that would indicate the experiments couldn’t have worked.

“If you take the sleuths out of the equation,” Oransky said, “it’s very difficult to see how most of these retractions would have happened.”

Leif Nelson, left, and Joe Simmons at the University of California, Berkeley. Photo: Ian Bates for The Wall Street Journal

Training by accident

The origins of Data Colada stretch back to Princeton University in 1999. Simmons and Nelson, fellow grad-school students, played in a cover band called Gibson 5000 and on a softball team called the Psychoplasmatics. Nelson and Simonsohn got to know each other in 2007, when they were faculty members in the business school at the University of California, San Diego.

The trio became friends and, in 2011, published their first joint paper, “False-Positive Psychology.” It included a satirical experiment that used accepted research methods to demonstrate that people who listened to the Beatles song “When I’m Sixty-Four” grew younger. They wanted to show how research standards could accommodate absurd findings. “They’re kind of legendary for that,” said Yoel Inbar, a psychologist at the University of Toronto Scarborough. The study became the most cited paper in the journal Psychological Science.

When the trio launched Data Colada in 2013, it became a site to air ideas about the benefits and pitfalls of statistical tools and data analyses. “The whole goal was to get a few readers and to not embarrass ourselves,” Simmons said. Over time, he said, “We have accidentally trained ourselves to see fraud.”

They co-wrote an article published in 2014 that coined the now-common academic term “p-hacking,” which describes cherry-picking data or analyses to make statistically insignificant results appear significant. Their early work contributed to a shift in research methods, including the practice of sharing data so other scientists can try to replicate published work.
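The trap they named is easy to simulate. The sketch below, written in Python purely for illustration (it is not the trio's code, and the setup is hypothetical), runs experiments on pure noise: with one pre-registered test, a false positive appears about 5% of the time, but letting the analyst report whichever of five outcome measures "worked" roughly quadruples that rate.

```python
# A minimal p-hacking simulation (an illustrative sketch, not Data Colada's
# code). Both groups are drawn from the same distribution, so any
# "significant" difference between them is a false positive by construction.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
TRIALS, N, OUTCOMES, ALPHA = 10_000, 30, 5, 0.05

honest_hits = hacked_hits = 0
for _ in range(TRIALS):
    # Five unrelated "outcome measures" per participant, all pure noise.
    group_a = rng.normal(size=(N, OUTCOMES))
    group_b = rng.normal(size=(N, OUTCOMES))
    pvals = [ttest_ind(group_a[:, j], group_b[:, j]).pvalue
             for j in range(OUTCOMES)]
    honest_hits += pvals[0] < ALPHA    # one pre-registered test
    hacked_hits += min(pvals) < ALPHA  # report whichever test "worked"

print(f"honest false-positive rate:   {honest_hits / TRIALS:.3f}")  # ~0.05
print(f"p-hacked false-positive rate: {hacked_hits / TRIALS:.3f}")  # ~0.23
```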

“The three of them have done an amazing job of developing new methodologies to interrogate the credibility of research,” said Brian Nosek, executive director of the Center for Open Science, a nonprofit based in Charlottesville, Va., which advocates for reliable research.

Nelson, who teaches at the Haas School of Business at the University of California, Berkeley, is described by his partners as the big-picture guy, able to zoom out of the weeds and see the broad perspective.

Simonsohn is the technical whiz, at ease with opaque statistical techniques. “It is nothing short of a superpower,” Nelson said. Simonsohn was the first to learn how to spot the fingerprints of fraud in data sets.

Working together, Simonsohn said, “feels a lot like having a computer with three core processors working in parallel.”

The men first eyeball the data to see if they make sense in the context of the research. The first study Simonsohn examined for faulty data on the blog was obvious. Participants were asked to rate an experience on a scale from zero through 10, yet the data set inexplicably had negative numbers. 

Uri Simonsohn at the Esade Business School in Barcelona, Spain. Photo: Edu Bayer for The Wall Street Journal

Another red flag is an improbable claim, such as a study reporting that a runner sprinted 100 yards in half a second. Such findings always get a second look. “You immediately know, no way,” said Simonsohn, who teaches at the Esade Business School in Barcelona, Spain. Another telltale sign is perfect data in small data sets; real-world data is chaotic and random.

Any one of those can trigger an examination of a paper’s underlying data. “Is it just an innocent error? Is it p-hacking?” Simmons said. “We never rush to say fraud.”
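First-pass checks like these are simple enough to script. The following Python sketch (a hypothetical helper with made-up thresholds, not a tool Data Colada has published) flags the two red flags described above: values outside a survey's possible range, and small samples that are implausibly uniform.

```python
# A hypothetical first-pass screen for the red flags described above
# (a sketch for illustration; not software used or published by Data Colada).
import numpy as np


def screen_ratings(values, lo=0, hi=10):
    """Flag simple anomalies in a column of survey ratings on a lo-hi scale."""
    values = np.asarray(values, dtype=float)
    flags = []
    # Red flag 1: impossible values, such as negative numbers on a 0-10 scale.
    if (values < lo).any() or (values > hi).any():
        flags.append("values outside the instrument's possible range")
    # Red flag 2: "too perfect" data. Real small samples are noisy, so
    # near-zero variance across dozens of responses is suspicious.
    if len(values) >= 20 and values.std() < 0.1:
        flags.append("implausibly uniform responses for the sample size")
    return flags


print(screen_ratings([7, 8, -2, 5, 9]))  # catches the impossible negative entry
print(screen_ratings([6] * 40))          # catches the too-perfect column
```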

To keep up with their blog and other ventures, the trio text almost daily on a group chat, meet on Zoom about once a week and email constantly.

Simonsohn’s phone pinged in August while he was on vacation with his family in the mountains of Spain. Simmons and Nelson broke the news that they were being sued for defamation in a $25 million lawsuit.

“I was completely dumbfounded and terrified,” Nelson said.

‘She’s usually right’

Bad data goes undetected in academic journals largely because the publications rely on volunteer experts to ensure the quality of published work, not to detect fraud. Journals don’t have the expertise or personnel to examine underlying data for errors or deliberate manipulation, said Holden Thorp, editor in chief of the Science family of journals. 

Thorp said he talks to Bik and other debunkers, noting that universities and other journal editors should do the same. “Nobody loves to hear from her,” he said. “But she’s usually right.” 

The data sleuths have pushed journals to pay more attention to correcting the record, he said. Most have hired people to review allegations of bad data. Springer Nature, which publishes Nature and some 3,000 other journals, has a team of 20 research staffers, said Chris Graf, the company’s research integrity director; that is twice as many as when he took over in 2021.

Retraction Watch, which with the research organization Crossref keeps a log of some 50,000 papers discredited over the past century, estimated that, as of 2022, about eight papers had been retracted for every 10,000 published studies.

Bik and others said it can take months or years for journals to resolve complaints about suspect studies. Of nearly 800 papers that Bik reported to 40 journals in 2014 and 2015 for running misleading images, only a third had been corrected or retracted five years later, she said. 

The work isn’t without risk. French infectious-disease specialist Didier Raoult threatened to sue Bik after she flagged alleged errors in dozens of papers he co-wrote, including one touting the benefits of hydroxychloroquine to treat Covid-19. Raoult said he stood by his research.

Elisabeth Bik at home in California. Photo: Clara Mokri for The Wall Street Journal

Honest work

Simonsohn got a tip in 2021 about the data used in papers published by Harvard Business School professor Francesca Gino. Her well-regarded studies explored moral questions: Why do some people lie? What reward drives others to cheat? What factors influence moral behavior?

The three scientists examined the data underlying four of her studies and identified what they said were irregularities: numbers in the data sets appeared to have been manually altered. In December 2021, they sent their discoveries to Harvard, which conducted its own investigation.

Harvard concluded Gino was “responsible for ‘research misconduct,’” according to her lawsuit against Harvard, Nelson, Simmons and Simonsohn. The Harvard Business School asked journals that published the four papers to retract them, saying her results were invalid.

In June this year, the trio posted their conclusions about Gino’s studies on Data Colada. Data in four papers, they said, had been falsified. When they restored what they hypothesized was the correct information in one of the four studies, the results no longer supported the study’s findings. The posts sent the social-science community into an uproar.

Gino is on administrative leave, and the school has begun the process of revoking her tenure. In her lawsuit, Gino said Harvard’s investigation was flawed as well as biased against her because of her gender. A business school spokesman declined to comment. The suit also contends that the Data Colada blog posts falsely accused her of fraud. The three scientists said they stood by their posted findings.

Gino, through her lawyer, denied wrongdoing. She is seeking at least $25 million in damages. “We vehemently disagree with any suggestion of the use of the word fraud,” said Gino’s lawyer Andrew Miltenberg. Gino declined to comment. 

Miltenberg said Gino was working on a rebuttal to Data Colada’s conclusions. 

In August, a group of 13 scientists organized a fundraiser that in a month collected more than $300,000 to help defray Data Colada’s legal costs.

“These people are sending a very costly signal,” Simmons said. “They’re paying literal dollars to be, like, ‘Yeah, scientific criticism is important.’”

Write to Nidhi Subbaraman at nidhi.subbaraman@wsj.com
