Subject: How ProPublica Uses AI Responsibly in Its Investigations

From	xxxxxx <[email protected]>
Date	March 15, 2025 12:50 AM

HOW PROPUBLICA USES AI RESPONSIBLY IN ITS INVESTIGATIONS

Charles Ornstein
March 13, 2025
ProPublica

_When our reporters prompted a large language model to help identify
“woke” themes in a database of grants, AI helped them tell a vital
accountability story about science funding and Ted Cruz._

Keyboard, Photo by Sunil Ray on Unsplash

_ProPublica is a Pulitzer Prize-winning investigative newsroom. Sign
up for The Big Story newsletter to receive stories like this one in
your inbox._

Series: A Closer Look: Examining the News

In February, my colleague Ken Schwencke saw a post on the social media
network Bluesky about a database released by Sen. Ted Cruz purporting
to show more than 3,400 “woke” grants awarded by the National Science
Foundation that “promoted Diversity, Equity, and Inclusion (DEI) or
advanced neo-Marxist class warfare propaganda.”

Given that Schwencke is our senior editor for data and news apps, he
downloaded the data, poked around and saw some grants that seemed far
afield from what Cruz, a Texas Republican, called “the radical
left’s woke nonsense.” The grants included what Schwencke thought
was a “very cool sounding project” on the development of advanced
mirror coatings for gravitational wave detectors at the University of
Florida, his alma mater.

The grant description did, however, mention that the project
“promotes education and diversity, providing research opportunities
for students at different education levels and advancing the
participation of women and underrepresented minorities.”

Schwencke thought it would be interesting to run the data through an
AI large language model — one of those powering ChatGPT — to
understand the kinds of grants that made Cruz’s list, as well as why
they might have been flagged. He realized there was an accountability
story to tell.

In that article, Agnel Philip and Lisa Song found that “Cruz’s
dragnet had swept up numerous examples of scientific projects funded
by the National Science Foundation that simply acknowledged social
inequalities or were completely unrelated to the social or economic
themes cited by his committee.”

Among them: a $470,000 grant to study the evolution of mint plants and
how they spread across continents. As best Philip and Song could tell,
the project was flagged because of two specific words used in its
application to the NSF: “diversify,” referring to the biodiversity
of plants, and “female,” where the application noted how the
project would support a young female scientist on the research team.

Another involved developing a device that could treat severe bleeding.
It included the words “victims” — as in gunshot victims — and
“trauma.”

Neither Cruz’s office nor a spokesperson for Republicans on the
Senate Committee on Commerce, Science and Transportation responded to
our requests for comment for the article.

The story was a great example of how artificial intelligence can help
reporters analyze large volumes of data and try to identify patterns.

First, we told the AI model to mimic an investigative journalist
reading through each of these grants to identify whether they
contained themes that someone looking for “wokeness” may have
spotted. And crucially, we made sure to tell the model not to guess if
it wasn’t sure. (AI models are known to hallucinate, and we wanted
to guard against that.)
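For readers who want to picture the mechanics, here is a minimal sketch of that kind of setup in Python. It is not the prompt or code ProPublica used. The instruction wording, the JSON reply format, the `review_grant` and `fake_model` names and the quote check are all assumptions made for illustration. The idea it shows is simple: ask the model to abstain when unsure, and keep only answers backed by words that actually appear in the grant description.

```python
import json
from typing import Callable

# Hypothetical instructions, written for this sketch only.
INSTRUCTIONS = (
    "You are an investigative journalist reviewing a grant description. "
    "List any themes that someone searching for 'wokeness' might have flagged. "
    "Quote the exact words from the description that support each theme. "
    "If you are not sure, do not guess: return an empty list. "
    'Reply with JSON only, shaped like {"themes": [{"theme": "...", "quote": "..."}]}.'
)


def build_prompt(description: str) -> str:
    """Combine the fixed instructions with one grant description."""
    return f"{INSTRUCTIONS}\n\nGrant description:\n{description}"


def review_grant(description: str, ask_model: Callable[[str], str]) -> list:
    """Return only the themes whose quote literally appears in the description.

    `ask_model` stands in for whatever function sends a prompt to a language
    model and returns its text reply. Unparseable replies count as "no answer".
    """
    raw = ask_model(build_prompt(description))
    try:
        themes = json.loads(raw).get("themes", [])
    except (json.JSONDecodeError, AttributeError):
        return []
    if not isinstance(themes, list):
        return []
    haystack = description.lower()
    return [
        t for t in themes
        if isinstance(t, dict)
        and isinstance(t.get("quote"), str)
        and t["quote"]
        and t["quote"].lower() in haystack
    ]


def fake_model(prompt: str) -> str:
    """Canned reply used instead of a real model: one grounded theme, one invented."""
    return json.dumps({"themes": [
        {"theme": "diversity", "quote": "advancing the participation of women"},
        {"theme": "class warfare", "quote": "seize the means of production"},
    ]})


EXAMPLE_DESCRIPTION = (
    "The project promotes education and diversity, providing research "
    "opportunities for students at different education levels and advancing "
    "the participation of women and underrepresented minorities."
)

if __name__ == "__main__":
    for theme in review_grant(EXAMPLE_DESCRIPTION, fake_model):
        print(theme["theme"], "->", theme["quote"])
```

The quote check is a cheap guard against hallucinated evidence. It does not replace the human review described below.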

For newsrooms new to AI and readers who are curious how this worked in
practice, here’s an excerpt of the actual prompt we used:

Of course, members of our staff reviewed and confirmed every detail
before we published our story, and we called all the named people and
agencies seeking comment, which remains a must-do even in the world of
AI.

Philip, one of the journalists who wrote the query above and the
story, is excited about the potential new technologies hold but is
also proceeding with caution, as is our entire newsroom.

“The tech holds a ton of promise in lead generation and pointing us
in the right direction,” he told me. “But in my experience, it
still needs a lot of human supervision and vetting. If used correctly,
it can both really speed up the process of understanding large sets of
information, and if you’re creative with your prompts and critically
read the output, it can help uncover things that you may not have
thought of.”

This was just the latest effort by ProPublica to experiment with using
AI to help do our jobs better and faster, while also using it
responsibly, in ways that aid our human journalists.

In 2023, in partnership with The Salt Lake Tribune, a Local Reporting
Network partner, we used AI to help uncover patterns of sexual
misconduct among mental health professionals disciplined by Utah’s
licensing agency. The investigation relied on a large collection of
disciplinary reports, covering a wide range of potential violations.

To narrow in on the types of cases we were interested in, we prompted
AI to review the documents and identify ones that were related to
sexual misconduct. To help the bot do its work, we gave it examples of
confirmed cases of sexual misconduct that we were already familiar
with and specific keywords to look for. Each result was then reviewed
by two reporters, who used licensing records to confirm it was
categorized correctly.
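The workflow above, with confirmed examples and keywords going into the prompt and two human reviews coming after, could be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the code used in the investigation. The `Candidate` class, the `build_few_shot_prompt` function, the prompt wording and the YES/NO/UNSURE answer format are all hypothetical.

```python
from dataclasses import dataclass, field


def build_few_shot_prompt(examples: list[str], keywords: list[str], document: str) -> str:
    """Assemble a prompt from confirmed example cases, keywords and one report."""
    shots = "\n\n".join(
        f"Confirmed example {i}:\n{ex}" for i, ex in enumerate(examples, 1)
    )
    return (
        "Decide whether the disciplinary report below involves sexual misconduct.\n"
        f"Terms to look for: {', '.join(keywords)}\n\n"
        f"{shots}\n\n"
        f"Report to classify:\n{document}\n\n"
        "Answer YES, NO or UNSURE."
    )


@dataclass
class Candidate:
    """A report the model flagged, waiting for human confirmation."""
    doc_id: str
    text: str
    reviews: dict[str, bool] = field(default_factory=dict)  # reviewer -> verdict

    def record_review(self, reviewer: str, confirmed: bool) -> None:
        """Store (or overwrite) one reviewer's verdict."""
        self.reviews[reviewer] = confirmed

    @property
    def is_confirmed(self) -> bool:
        """True only when at least two reviewers have weighed in and all agree."""
        return len(self.reviews) >= 2 and all(self.reviews.values())
```

In this sketch a flagged report counts as confirmed only after two separate reviewers have both signed off, so a model's answer alone never moves a document into the confirmed pile.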

In addition, during our reporting on the 2022 school shooting in
Uvalde, Texas, ProPublica and The Texas Tribune obtained a trove of
unreleased raw materials collected during the state’s investigation.
This included hundreds of hours of audio and video recordings, which
were difficult to sift through. The footage wasn’t organized or
clearly labeled, and some of it was incredibly graphic and disturbing
for journalists to watch.

We used self-hosted open-source AI software to securely transcribe and
help classify the material, which enabled reporters to match up
related files and to reconstruct the day’s events, showing in
painstaking detail how law enforcement’s lack of preparation
contributed to delays in confronting the shooter.

We know full well that AI does not replicate the very time-intensive
work we do. Our journalists write our stories, our newsletters, our
headlines and the takeaways at the top of longer stories. We also know
that there’s a lot about AI that needs to be investigated, including
the companies that market their products, how they train them and the
risks they pose.

But to us, there’s also potential to use AI as one of many reporting
tools that enables us to examine data creatively and pursue the
stories that help you understand the forces shaping our world.

_Charles Ornstein is managing editor, local, overseeing ProPublica’s
local initiatives. These include offices in the Midwest, South,
Southwest and Northwest, a joint initiative with the Texas Tribune,
and the Local Reporting Network, which works with local news
organizations to produce accountability journalism on issues of
importance to their communities._

_Agnel Philip, Ken Schwencke, Hannah Fresques and Tyson Evans
contributed reporting._


