The Online News Association 2023 conference just wrapped up in Philadelphia, and while yours truly wasn’t in attendance, I certainly availed myself of the YouTube videos. Thanks, ONA!
One thing that caught my attention was The Marshall Project explaining how they had iterated on their Banned Books Project. For context, each U.S. state’s prison administration bans a long list of books, and The Marshall Project has been looking at which books are banned and for what reason(s).
At ONA, Andrew Rodriguez Calderón, a computational journalist at The Marshall Project, explained that the team worked through various iterations of the prompting to get OpenAI’s ChatGPT to produce meaningful work.
Learning No. 1: To define good prompts, you need to lock down, very precisely, what you are trying to do
Meaningful work starts with a proper definition of what you are trying to get to. In the first pass of the work, Calderón explained, the team went back to stakeholders with a preliminary version to check on quality. What they realised wasn’t so much that the quality was poor because the AI had done a poor job, but rather that what the stakeholders were after wasn’t quite what the team had set out to produce in the first place.
Learning No. 2: Know when to use a Mechanical Turk
Have you ever heard the term Mechanical Turk? It’s a system that seems to be based on automation but really is powered by humans. Because sometimes it’s just easier to get a human to do something than to figure out the long, painful way to automate it. And this is a bit of a paradox, but when the team at The Marshall Project first got started, they didn’t over-invest in the “computer work” part of it. Rather, they created a good chunk of the early examples manually and then asked ChatGPT to do the last mile. This allowed them to work on Point No. 1 — making sure that, in the first place, the goal was properly defined.
Learning No. 3: Before you prompt, think about the system prompt
Calderón explained how, armed with feedback from stakeholders, the team went back to the journalistic work of imagining what the best possible outcome article would look like. They came up with a structure with subheads by topic, giving a definition for each topic.
This part is called system prompting: providing additional information to make sure the system (ChatGPT) will arrive at its output with the same underlying assumptions a human would.
This approach can look like telling a system what tone it should adopt, or the level of expertise it may assume, or, in the case of the Banned Books Project, both tone and instructions for how to handle the material and how to surface inadequacies in it.
The main prompt looked like this, per Calderón:
You [ChatGPT] are an expert in reading excerpts of prison policies and summarizing them into predefined sub-heads with definitions and using the “important instructions.”
I will provide you with a list of sub-heads with definitions. When you summarize, you will group them under the subheads based on relevance.
- When summarizing the notes into the sub-heads, if you do not find information relevant to the sub-head in the notes, include a default sentence: “There is no information relevant to this sub-head in the policy.”
- You will ignore any special or text formatting in the notes. You will also ignore bullet points of any kind in the notes.
- When you are ready to receive the sub-heads with definitions, say, “Ready for sub-heads.”
- When you are ready for the notes, say, “Ready for notes.”
(If you want to see a written version of this talk, Calderón also documented this on Medium.)
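As a rough illustration, a system prompt like the one above could be wired into OpenAI’s chat API along these lines. This is a minimal sketch, not The Marshall Project’s actual code; the sub-heads, definitions, and notes below are invented stand-ins.

```python
# Sketch: assembling a system-prompted request for a chat model.
# The prompt text and data are illustrative, not the project's real material.

SYSTEM_PROMPT = (
    "You are an expert in reading excerpts of prison policies and "
    "summarizing them into predefined sub-heads with definitions. "
    "If you find no information relevant to a sub-head, write: "
    "'There is no information relevant to this sub-head in the policy.' "
    "Ignore any special or text formatting and any bullet points in the notes."
)

def build_messages(subheads: dict, notes: str) -> list:
    """Assemble the messages payload: the system prompt first, then the
    sub-heads with definitions, then the policy notes to summarise."""
    subhead_text = "\n".join(
        f"{name}: {definition}" for name, definition in subheads.items()
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Sub-heads with definitions:\n{subhead_text}"},
        {"role": "user", "content": f"Notes:\n{notes}"},
    ]

messages = build_messages(
    {"Content restrictions": "Rules about what subject matter is disallowed."},
    "Publications depicting weapons manufacture are prohibited.",
)

# The actual API call would then look something like:
#   client = openai.OpenAI()
#   response = client.chat.completions.create(model="gpt-4", messages=messages)
```

Keeping the system prompt separate from the user-supplied material, as here, is what lets the same instructions be reused across every policy excerpt the team feeds in.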
During the panel part of the presentation at ONA, Calderón spoke about the time saved by this approach — and whether there even was any time saved. Interestingly, he noted that whether time was saved was still a matter of debate among team members, but what had definitely changed was where the labour was invested.
In this respect, this is always something to keep in mind when we work to automate anything: We add labour in one place to save it elsewhere. Sometimes the gain isn’t necessarily time (and therefore cost) but accuracy; sometimes the gain is purely the cognitive satisfaction of working on a certain kind of problem — one would rather figure out smart prompts than robotically plough through a ton of material and summarise it over and over.
But even setting aside the question of whether actual efficiency was gained, we also have to think about what we get from these projects over the long term: direct reusability (sometimes), or a deepening understanding of how to sketch and develop such projects. That alone is an investment in efficiency over time.
If you’d like to subscribe to my bi-weekly newsletter, INMA members can do so here.