Is It True? Kinda Depends on Why You're Saying It
There's rigorous editorial fact-checking, and then there are readers
Hi there! This is TITLE-ABS-KEY(“science journalism”), a newsletter about science journalism research. In the previous issue, I read an academic essay by a fellow journalist who questioned the true utility of the “science journalism vs. science communication” divide we’ve all come to respect.
This issue deviates a bit from the traditional scope of this section (consider it a refreshing summer break from professional navel-gazing). I look at a curious study that is not about science journalism itself but has really pertinent implications for our work.
Today’s paper: Handley-Miner, I. J., Pope, M., Atkins, R. K., et al. (2023). The intentions of information sources can affect what information people think qualifies as true. Sci Rep 13, 7718. DOI: 10.1038/s41598-023-34806-4.
Why: I found myself reading this Nieman Lab story on what journalists think of journalism research (TLDR not a lot of praise there) and decided I needed a bit less introspection.
Abstract: The concept of truth is at the core of science, journalism, law, and many other pillars of modern society. Yet, given the imprecision of natural language, deciding what information should count as true is no easy task, even with access to the ground truth. How do people decide whether a given claim of fact qualifies as true or false? Across two studies (N = 1181; 16,248 observations), participants saw claims of fact alongside the ground truth about those claims. Participants classified each claim as true or false. Although participants knew precisely how accurate the claims were, participants classified claims as false more often when they judged the information source to be intending to deceive (versus inform) their audience, and classified claims as true more often when they judged the information source to be intending to provide an approximate (versus precise) account. These results suggest that, even if people have access to the same set of facts, they might disagree about the truth of claims if they attribute discrepant intentions to information sources. Such findings may shed light on the robust and persistent disagreements over claims of fact that have arisen in the “post-truth era”.
I think this is actually a really good way to understand the truly depressing issue of post-truth and “alternative facts” (I don’t like repeating this absurd and malicious phrase but alas) and why it’s so hard to deal with it.
In an abstract, factchecker-sandbox world, the process itself, as it is supposedly run by both journalists and audiences, is straightforward. You take a claim, you separate the facts underlying that claim from opinions, and you check those facts against primary sources, i.e. sources as close in time and space as possible to the object, person, event or whatever, and those giving the least “processed” information. If the claim matches the facts, it’s classified as true; if not, it’s false. Easy peasy.
Our reality is much thornier, even after you control for audiences having very few means of getting to primary sources themselves, an attention deficit, and low media literacy. If only all facts about the world were as simple and handily verifiable as the proverbial ‘two plus two equals four’ — as my favorite Vox article famously states, everything we eat both causes and prevents cancer. Try checking that one.
This is intentionally extreme, of course, but what about your ubiquitous and generally benign confidence intervals? If you expect something to increase by 2-11%, can you really say you expect a double-digit increase? Or what about RCP8.5, the famous very-high-emissions scenario, which is not implausible but much less likely than its ‘business as usual’ framing would suggest?
And then there is the other, more philosophical issue at play here: humans deciding something is true or false is not nearly as straightforward as using the = operator or a toddler’s shape sorter toy. There’s natural language with all its glorious imperfections, and the process inevitably relies on a lot of priors, including what we know and how we feel about the messenger. The Russian government famously loooooves to talk about civil rights violations in the US. Since I have no reason whatsoever to believe they really give two shits about oppressed US citizens, it will take a lot of work and overwhelming evidence to convince me to sort any claim of theirs on this matter into the ‘true’ category.
And the point isn’t that I’m still going to disagree with them just because, even if they say two plus two equals four, as a typical propaganda strawman suggests. It’s that on anything even slightly more complex than that equation, I’m giving them negative — not zero, negative — benefit of the doubt.
This paper (which I’m finally getting to, I promise) explores one possible input into that mental true/false sorting process for complex claims: do we trust the intentions of the messenger not to be malicious? That is, do they have that negative benefit of the doubt in our minds that suggests no mistake of theirs is ever honest, and everything they say can and should be interpreted against them?
Here’s how the authors set up the problem they are studying — it’s really neat:
Imagine, for example, a scientific paper is published that predicts a novel virus will infect 1.21% of the country’s population in the next month. A news headline reports, “Scientific paper predicts new virus will infect 1% of the nation’s population by next month.” Even if people knew the results of the original scientific paper, they might disagree about the veracity of this headline. Given the imprecision of natural language, what factors influence people’s decisions about whether a piece of information qualifies as true or false?
Again, this example is so elegant I’m going to take it into my science journalism courses. First, the deviation from the primary source is literally a matter of decimal points; second, there is a simple reason, clear to anyone studying journalism, why those 0.21% are not in the headline; third, if a novel virus will indeed infect 1.21% of the country’s population, it will have also infected 1%, so at least the headline is rounding down and… not completely wrong in a technical sense? And fourth, as the authors rightly observe, it’s really unlikely that people will evaluate this statement as true or false based only on the size of that 0.21% discrepancy.
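Since this little sandbox is basically arithmetic, here’s a toy sketch of mine (nothing from the paper) of what a purely mechanical fact-check of that headline would do, depending on how charitable you tell it to be:

```python
# Toy fact-check of the "paper says 1.21%, headline says 1%" example.
# Entirely my own illustration; the numbers come from the authors' hypothetical.

GROUND_TRUTH = 1.21    # % of the population, per the (hypothetical) scientific paper
HEADLINE_CLAIM = 1.0   # % of the population, per the (hypothetical) headline

def strict_check(claim: float, truth: float) -> bool:
    """The '=' operator view of truth: any deviation at all makes the claim false."""
    return claim == truth

def charitable_check(claim: float, truth: float, decimals: int = 0) -> bool:
    """A rounding-aware view: the claim is true if it matches the ground truth
    at the precision the headline itself uses (whole percentage points here)."""
    return round(claim, decimals) == round(truth, decimals)

print(strict_check(HEADLINE_CLAIM, GROUND_TRUTH))      # False: 1.0 != 1.21
print(charitable_check(HEADLINE_CLAIM, GROUND_TRUTH))  # True: both round to 1%
```

Neither rule, of course, captures what the study is actually about: the same headline can land in either bucket depending on what readers think the headline writer was trying to do.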
People’s general ability to factor the intentions of a speaker into interpreting what they said is well-known and well-established in research; it’s also why the sarcasm sign might be the best-known joke from the show that used it. The authors narrow it down to a very specific interpretation — is this true? — and create a big shortcut in the factchecking process by presenting the primary source right along with the claim. So it’s not about checking readers’ media literacy or whether they operate in the dreaded realm of “alternative facts,” but about finding out what matters in true/false sorting, if not the 0.21%.
They offer three reasons why this is an important way to frame the task:
For example, if situational factors such as perceived intentions of an information source affect what people consider to be true, it could be that some contemporary disagreements over claims of truth are not actually about the objective accuracy of a news report, political claim, or scientific finding, but about the intentions of the journalist, politician, or scientist.
This is what keeps bugging me about the whole alternative facts discourse: it shifts attention away from the truly malicious action of disinformation peddlers, which is twisting the intentions of messengers (and side note — oh my, one of those things on the list of sources is reeeally not like the others).

Additionally, it could suggest that getting people to “agree on the facts” may be insufficient for resolving disagreements over the truth, and that interventions should also focus on boosting trust in the intentions of society’s most important information sources.
That is why factchecking alone would not save us even if it weren’t generally starved of resources.

Moreover, it can shed light on how people react to intentionally misleading information, a particularly important topic given that information that is not completely false can still be misleading, and possibly even more convincing than complete falsehoods.
Again, I think those of us who are not experts on disinformation, myself included, usually vastly underestimate the malice coming from that sector. Convincing people we need zero tolerance for this — for intentionally misleading information — would go a long way in restoring sanity.
The paper describes two studies. In the first one, a sample of U.S. Republicans and Democrats looked at six stimuli, each consisting of a real scientific finding and a real-looking but very carefully fabricated news report about that finding. All were about politically charged scientific findings, and here’s what’s politically charged in the US in 2023: climate change, COVID-19 itself and COVID-19 vaccines, abortion, and guns. Whew.
The made-up news reports were very believably slanted in accordance with one of the two dominant political ideologies, and occasionally the reports were randomly attributed not to “a news outlet” but to CNN, HuffPost, MSNBC, the New York Times, Fox News, Breitbart, the Sean Hannity Show, or the Wall Street Journal.
The authors asked two key questions — a 1-5 rating of the degree to which the outlet was trying to purposefully mislead its audience about the original scientific finding, and a true/false judgment about the report — and followed up with a bunch of others, such as a true/false/unsure judgment about the finding itself.
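To keep the design straight in my own head, here’s how I’d sketch a single observation from that first study; the field names and allowed values are my guesses for illustration, not the authors’ actual data schema.

```python
# Hypothetical record for one participant x one stimulus in study 1.
# Field names are mine, invented for illustration only.
from dataclasses import dataclass
from typing import Literal, Optional

@dataclass
class Study1Observation:
    participant_party: Literal["Republican", "Democrat"]
    topic: str                            # e.g. "climate change", "COVID-19 vaccines"
    attributed_outlet: Optional[str]      # e.g. "Fox News", or None for just "a news outlet"
    report_slant: Literal["liberal", "conservative"]
    perceived_intent_to_mislead: int      # the 1-5 scale about the outlet
    report_judged_true: bool              # the key true/false sorting of the report
    finding_judgment: Literal["true", "false", "unsure"]  # follow-up about the finding itself
```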
[Interesting side note: they had also planned to see whether a match between the designed slant of the report and the perceived slant of the outlet — i.e. Breitbart gonna Breitbart — led people to perceive a greater intention to mislead and shift their true/false sorting. But to do that, they classified the Wall Street Journal as a conservative media outlet, whereas in actuality it ranks as moderate and overall neutral. This NYT story provides some insight into why the authors had misclassified it and into the wild media times we live in.]
Finding: The more participants judged news outlets as intending to mislead, the more likely they were to classify the outlets’ reports as false.
I wonder if the same correlation exists between true/false sorting and the perceived intentions of sources in the story as well; climate scientists, as we all know, are all in on a huge global conspiracy to push climate change that brings them millions of dollars. (I sure hope the authors are right about people being pretty good at discerning intent, and sarcasm in particular.) That would have big implications for science journalism, especially as it looks to diversify voices beyond the usual ivory tower.
The second study had 24 stimuli which also included a description of the intention: so here, people had the ‘ground truth’ not just about the claim (the fact) but also about the intent. I feel this takes us slightly further from the real world than I’d like, but it’s valuable for this study — in part because it helped suggest ‘intent’ is about more than just deception, and that people can still infer it on their own rather than take what’s explicitly stated at face value.
Here’s a taste of the stimuli; the bracketed alternatives are what was randomly varied between participants. Note the numbers in the claim: both are off compared to the ground truth, but in evidently different ways, I guess mimicking a typo (an honest or dishonest mistake) and a rounding up or down.
A team of scientists conducted a study on the side effects of an important new cancer medication and discovered that 11.3% of patients using this medication experienced side effects. In a TV interview, the scientists, trying to [accurately explain how safe the drug is / increase profits for the pharmaceutical company by misleading the public], reported that [10.3% / 10%] of patients experience side effects from this medication.
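Just to make the randomization concrete, here’s a rough sketch of how I imagine those bracketed pieces being assigned; the condensed template and the condition names are mine, not the authors’ actual materials code.

```python
import random

# Sketch of the 2 x 2 manipulation in the cancer-drug stimulus above:
# stated intent (inform vs. deceive) crossed with the reported value
# (a non-rounded incorrect number vs. a rounded one). Illustration only.
INTENTS = {
    "inform": "trying to accurately explain how safe the drug is",
    "deceive": "trying to increase profits for the pharmaceutical company by misleading the public",
}
REPORTED_VALUES = {
    "non_rounded_incorrect": "10.3%",  # exactly one point off the 11.3% ground truth
    "rounded": "10%",                  # a round number, also not the ground truth
}

def build_stimulus(rng: random.Random) -> dict:
    """Randomly assign one intent and one reported value and fill the template."""
    intent_key = rng.choice(list(INTENTS))
    value_key = rng.choice(list(REPORTED_VALUES))
    text = (
        "A team of scientists ... discovered that 11.3% of patients ... experienced side effects. "
        f"In a TV interview, the scientists, {INTENTS[intent_key]}, reported that "
        f"{REPORTED_VALUES[value_key]} of patients experience side effects from this medication."
    )
    return {"intent": intent_key, "reported_value": value_key, "text": text}

print(build_stimulus(random.Random(42)))
```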
Here’s the part of this research that is absolutely predictable yet still fascinating to me. Technically — on a factchecking-sandbox, is-this-number-correct level — all stimuli in the first study were true. And because of the number manipulations, everything in the second study was technically false. And yet obviously this was not what readers thought.
Finding: Participants were more likely to judge an information source’s report as false when the source was said to be trying to deceive its audience, and also when the source reported a non-rounded, incorrect value.
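For my own intuition, here’s the shape of a tabulation that would surface both effects at once, given a table of responses like the sketches above; it’s emphatically not the authors’ actual statistical analysis, which I’d expect to be a proper regression.

```python
import pandas as pd

# Sketch (mine) of how one could tabulate study-2-style responses:
# the share of "false" judgments broken down by stated intent and by
# which kind of incorrect value was reported. No real data involved.
def false_rates(df: pd.DataFrame) -> pd.DataFrame:
    """Expects columns 'intent', 'reported_value', and a boolean 'report_judged_true'."""
    return (
        df.assign(judged_false=~df["report_judged_true"])
          .groupby(["intent", "reported_value"])["judged_false"]
          .mean()
          .unstack("reported_value")
    )
```

If the finding holds, the ‘deceive’ row and the non-rounded column would show the higher false rates.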
In the general discussion section, the authors unpack my ‘negative benefit of the doubt’ idea by pointing out that true/false sorting is not an isolated exercise but feeds directly into assessing the reliability of a source — and it’s a feedback loop, i.e. that established reliability then affects future sorting too.
The two dominant theoretical explanations in the misinformation literature suggest that either motivated reasoning or lack of critical reasoning are to blame for endorsement of misinformation. Both of these accounts focus on the role that discrepant beliefs about the true state of the world play in contemporary disagreements over the truth of claims of fact. Our work suggests that discrepant beliefs about the ground truth may not be necessary to produce these disagreements; discrepant attributions of intent may be sufficient.
I did not set out to read this paper for AI content, which I feel at this point is a bit of a staple of this newsletter section, but there is AI content! Of course there is. ChatGPT and other LLMs use reinforcement learning from human feedback to better respond to human queries and preferences. Thus it is important to understand what lay people think qualifies as true since that same conceptualization of truth might be getting trained into these models through reinforcement learning from human feedback.
But it’s not just that, I think. A model can have no real intention; it is just physically incapable of any intention yet. We know this for a fact. And we know that these models return bogus responses constructed not as a genuine answer but as something that superficially, albeit convincingly, mimics that answer (think completely made-up paper citations for real scientists: their fakeness defeats the very purpose of citations, yet the language model still returns them because it is incapable of understanding and following that purpose).
We also know that this behavior is often called ‘hallucinations’ because people like to and will anthropomorphize algorithms. There is evidence AI authorship on its own does not affect credibility or trustworthiness (even though it kind of feels like it should). It is plausible people will ascribe intentions, at least to the AI creators if not to the machines themselves, and sort accordingly — or not.
So how would these same studies play out if we replaced CNN and Breitbart with ChatGPT in the attributed sources? I'll be on the lookout for a study like that for my next break from professional navel-gazing.
That’s it! If you enjoyed this issue, let me know. If you also have opinions on science journalism research or would like to suggest a paper for me to read in one of the next issues, you can leave a comment or just respond to the email.
Cheers! 👩‍🔬