Welcome back to TITLE-ABS-KEY(“science journalism”), a newsletter about science journalism research. In the previous issue, I continued my journey of watching, with a tired sigh, non-journalists try to help journalists, for the sake of credibility.
This time, I’m stepping off the beaten track, and arguably into the wild blue yonder, to watch philosophers build a computational model of science journalism.
Today’s paper: O'Connor, C., Weatherall, J., Mohseni, A. (2022) The Best Paper You'll Read Today: Media Biases and the Public Understanding of Science. Preprint DOI: 10.31222/osf.io/hpks9.
Why: Even though this is still a preprint, the issue it is trying to unpack — the curatorial role of science journalism and how it shapes public perception of science — has fascinated me ever since I was in charge of reading through dozens and dozens of embargoed press releases as a junior science desk reporter. And the method… well, we’ll see.
Abstract: Scientific curation, where scientific evidence is selected and shared, is essential to public belief formation about science. Yet common curation practices can distort the body of evidence the public sees. Focusing on science journalism, we employ computational models to investigate how such distortions influence public belief. We consider these effects for agents with and without confirmation bias. We find that standard journalistic practices can lead to significant distortions in public belief; that pre-existing errors in public belief can drive further distortions in reporting; that practices that appear relatively unobjectionable can produce serious epistemic harm; and that, in some cases, common curation practices related to fairness and extreme reporting can lead to polarization.
So, I have to confess that, when I first came across this paper and decided to save it for a future newsletter issue, I did that mostly on the basis of a catchy title and compelling abstract as well as my prior interest in the topic — which probably says something about the curatorial practices of science journalists.
And now I am influencing public belief with regard to science journalism studies, my goodness. You may walk away from this newsletter thinking that research into science journalism is significantly kookier than it really is.
And I did not even need a computational model for that paragraph!
But then again, I also love a decent modeling exercise, and that is why I intend to read the whole paper, including the methods section, and add some context to what the authors are saying. Not sure the computational model will allow for that level of nuance in its representations of science journalists, though.
Anyway, modeling-related snark aside, this paper sets out to talk about a deceptively plain but incredibly important reality of science: it’s essentially a hyperobject, and we only ever come in contact with its projections onto our three-dimensional space. The authors use a more conventional way to put it: the public — including scientists the moment they step outside their expertise — perceives science through a curatorial lens and shapes their beliefs about science based on the products of those curation efforts.
Curators are individuals, organizations, or, increasingly, algorithms that select research outputs to share, popularize, review, or amplify.
And can I just say — talk about a buried lede! One of those things is decidedly not like the others, and perhaps it's algorithmic curation that merits more attention. But I am not an algorithm (I swear!) writing a newsletter about research into algorithms, so if there is a study on that out there somewhere, I won’t know.
For this paper, I’d argue it actually does not go far enough in establishing this fundamental truth. First, scientists also rely on curation even within their area of expertise — textbooks! review articles! journals themselves! journal clubs and discussion spaces! algorithms! (I swear I’m not one.)
It’s just that, to the authors, these curatorial practices are apparently much more robust and don’t present a risk of epistemic harm. That is, they don’t lead to people forming inaccurate beliefs about scientific evidence.
Only textbooks are addressed directly in the paper, and that same paragraph quite elegantly exposes the flaw in this logic. The authors write: “Textbook writers, for instance, typically seek to help students form accurate beliefs.”
That is an impressive level of uncritical thinking about textbook publishing, which can be and usually is intensely political — just ask the US and Russia.
And then there are journals and their publication biases as well as corporate influence on research. So I don’t know if simply presuming the epistemically good intentions behind curatorial practices in the academic space is a good strategy. Or a good belief about science?
But the other missing bit is science journalists themselves — journalists also rely heavily on products of curation, not just by institutional science communicators (PR people) but also by other journalists. I’m not even talking about reusing content; by merely doing their job and following a topic, journalists are exposed to a firehose of curated information.
That’s more levels and effects of curation to model than the authors have covered in the paper. But even this many is probably enough to drive you and me slightly mad.
The paper focuses on journalists and their curation practices ostensibly because of the conflicting pulls they experience:
Journalists are bound by professional norms related to balance, fairness, and truth-telling, which are intended to yield good epistemic effects. Moreover, many journalists are personally motivated to improve reader beliefs by reporting accurately. But media companies and journalists also have a financial motivation to maximize readership, advertising revenue, and so on. These latter goals are, at best, independent of epistemic aims, and in some cases they conflict with them.
The authors go on to describe three particular curation practices that emerge from this pressure cooker:
hyperbole, or journalists exaggerating or sensationalizing claims to garner attention,
extremity bias, or cherry-picking surprising, novel, or extreme events to report,
fair reporting, or giving equally weighted attention to claims from all sides of an issue, which can lead to false balance when those sides are not equal in terms of available evidence.
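To make those three definitions a little more concrete, here is a minimal sketch of how each practice could warp a stream of reported events. This is my illustration, not the authors' model: the ground-truth distribution, the exaggeration factor, the extremity threshold, and the reference point standing in for what readers currently believe are all invented for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground truth: a stream of "events" (say, effect sizes), here a made-up normal distribution
actual_events = rng.normal(loc=0.5, scale=1.0, size=10_000)

def hyperbole(events, factor=1.5):
    """Exaggerate every reported value by a fixed factor."""
    return events * factor

def extremity_bias(events, reference=0.0, threshold=1.5):
    """Report only events that sit far away from a reference point."""
    return events[np.abs(events - reference) > threshold]

def fair_reporting(events, reference=0.0, n_per_side=100):
    """Report the same number of events from each side of a reference point,
    no matter how lopsided the actual evidence is."""
    above = events[events > reference]
    below = events[events <= reference]
    k = min(n_per_side, len(above), len(below))
    return np.concatenate([rng.choice(above, size=k, replace=False),
                           rng.choice(below, size=k, replace=False)])

print(f"actual mean:    {actual_events.mean():+.2f}")
print(f"hyperbole:      {hyperbole(actual_events).mean():+.2f}")
print(f"extremity bias: {extremity_bias(actual_events).mean():+.2f}")
print(f"fair reporting: {fair_reporting(actual_events).mean():+.2f}")
```

Even the seemingly innocuous fair_reporting filter pulls the reported mean away from the actual one in this toy example, simply because it forces a 50/50 split onto lopsided evidence.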
These are all real issues in journalism, although arguably professional science journalists are acutely aware of them and least susceptible to the epistemic problems they create. For example, a science news story about an outlier is substantially more likely to include context on the baseline, and science journalists in newsrooms are usually the ones bugging their colleagues about false balance.
The authors claim that extremity bias and fair reporting, as they define them, are largely compatible with journalistic norms, but again, that’s definitely not the case for science journalism. That said, I agree that most science-related content in the media doesn't actually come from science journalists. And there have never been enough of us to exert the kind of pressure on the rest of the industry that would tip the scales against sensational, breakthrough-focused, or (un)fair reporting.
So the computational model researchers set out to build is describing journalism in general – maybe I’m not the one being modeled here after all…
It is still a model, so by definition, it is a simplified representation of reality; after we go through the model itself, I’ll talk about the simplifications and how I feel about them. The way it works is as follows:
There’s the actual distribution of events, that is, the ground truth on how often certain things happen;
Journalists observe that and, in their reporting, build a reported distribution – the transformation, or distortion, is determined by the three curatorial practices described above. Handily, the authors present a visual guide in the paper;
Agents (presumably members of the public, but essentially anyone who’s using the reporting to try and learn more about science) have their own sense of the ground truth before they come in contact with media – this is the prior distribution;
After engaging with media, agents form a posterior distribution, which reflects how the prior distribution has shifted after incorporating media reports.
Interestingly, in some versions of the core model, agents have confirmation bias: they can choose to filter incoming media content by accepting evidence that supports their beliefs and rejecting evidence that refutes them.
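Here is a toy sketch of that prior-to-posterior step as I would set it up myself – emphatically not the authors' actual model. I’m assuming a standard conjugate normal-normal Bayesian update applied to one reported event at a time, with confirmation bias modeled as simply throwing away reports that land too far from the agent’s current belief; the prior, the observation noise, and the tolerance threshold are all made up for illustration.

```python
import numpy as np

def update_belief(prior_mean, prior_var, reports, obs_var=1.0,
                  confirmation_bias=False, tolerance=1.0):
    """Update a normal prior over the quantity of interest, one reported
    event at a time (standard conjugate normal-normal update).

    With confirmation_bias=True, the agent ignores any report that lands
    more than `tolerance` predictive standard deviations from its current mean.
    """
    mean, var = prior_mean, prior_var
    for x in reports:
        if confirmation_bias and abs(x - mean) > tolerance * np.sqrt(var + obs_var):
            continue  # evidence clashes with the current belief, so it gets rejected
        var_post = 1.0 / (1.0 / var + 1.0 / obs_var)   # posterior variance
        mean = var_post * (mean / var + x / obs_var)   # posterior mean
        var = var_post
    return mean, var

rng = np.random.default_rng(1)
reported = rng.normal(loc=0.5, scale=1.0, size=200)  # the reported distribution

# The same badly-off prior (-1.0), without and with confirmation bias
print(update_belief(-1.0, 1.0, reported))
print(update_belief(-1.0, 1.0, reported, confirmation_bias=True))
```

The confirmation-biased agent only accepts reports that sit close to what it already believes, so its posterior stays anchored near its prior; swap in a distorted reported distribution, like the ones in the earlier sketch, and you get the combinations of curation and bias the paper explores.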
Okay, so I think these models rely on a couple of assumptions that are not true of the real world and can interfere with how well the models reflect it. First, in the initial run without curatorial practices, the reported distribution is identical to the actual distribution, which implies that the only reason the two might ever diverge is editorial choices? But that’s just not true unless you assume an infinite number of journalists with unlimited resources. (I WISH!)
And remember – journalists rely on curation as well, so there’s a whole extra layer of filters between the actual and reported distributions, and those filters are not extrinsic to the rest of the system: they play into known biases and reinforce them. So it’s not just journalistic curation that creates the downstream effects, is it?
Pretty much the same concern applies to the next step in the process: agents do not have limitless capacity to consume media, and the way the reported distribution maps onto their posterior distribution is not exact.
(The paper itself lists different limitations of the modeling, such as assuming normal distributions and a vast simplification of how agents process media content – basically, people don’t get confused and uncertain in these models.)
So even without the self-correcting practices of science journalists – that is, if you assume the absolute worst about science coverage – or the influence of curation on journalists themselves, I’m not sure these models really isolate the effects of these curation practices, as the paper concludes:
First, all three biases can lead to distorted beliefs, even in cases where learners would otherwise develop an accurate picture of the world. Second, current misunderstandings in public belief can drive these journalistic distortions, thus creating negative feedback effects. This happens when current, inaccurate beliefs shape what is considered extreme, or fair, and thus drive selection of particularly misleading reporting. Third, in the presence of confirmation bias, both extremity bias and fairness are very harmful to the beliefs of media consumers. This is somewhat surprising because both practices involve selecting true events to report, which is often considered relatively innocuous compared to, say, lying or exaggeration. Last, both fair reporting and extremity bias can lead to polarization in the presence of confirmation bias.
And that’s where my affection for modeling runs out, in a way. Sure, in principle I agree that there’s always a significant gap between an actual field of research and how that field is represented in the media. For astronomy or mathematics, this gap may feel benign (I say feel because we don’t really know); for public health, it is often deadly – as in, it is the kind of ignorance that kills people. And it’s something that science communicators and journalists must keep in mind.
But did we need a (smart and beautiful) computational model to be able to say that? If the purpose of this study is merely to show that these journalistic practices are not harmless, I feel like maybe we’ve known that for a while by now.