3 lessons in prompting, iterating with generative AI

By Ariane Bernard

INMA

New York City, Paris


Hi everyone.

Hello from the other side of a quick summer break. I had some time to catch up on all the essentials, like reading some great long reports and articles (see the “FAWWW” section of this newsletter, which comes with a funny Insta Reel as well). I’m now getting ready for fall: our data master class in October (see the agenda here) and the new season of Only Murders in the Building.

My inbox is a spoiler-free zone but whether you want to chat data or exchange Only Murders in the Building theories, you’ll find me at ariane.bernard@inma.org.

Ariane

The science, not art, of prompting and iterating with a generative AI tool

The Online News Association’s 2023 conference just wrapped up in Philadelphia, and while yours truly wasn’t in attendance, I certainly availed myself of the YouTube videos. Thanks, ONA!

One thing that caught my attention was The Marshall Project explaining how they had iterated on their Banned Books Project. For context, many books are banned by each U.S. state’s prison administration, and The Marshall Project has been looking at which books are banned and for what reason(s).

The Marshall Project used ChatGPT to help produce its Banned Books Project.

At ONA, Andrew Rodriguez Calderón, a computational journalist at The Marshall Project, explained that the team worked through various iterations of the prompting to get OpenAI’s ChatGPT to produce meaningful work.

Learning No. 1: To define good prompts, you need to lock down, very precisely, what you are trying to do.

Meaningful work starts with a proper definition of what you are trying to get to. In the first pass of the work, Calderón explained, the team went back to stakeholders with a preliminary version to check on quality. What they realised wasn’t so much that the quality was poor because the AI had done a poor job, but that what the stakeholders were after wasn’t quite what the team had set out to produce in the first place.

Learning No. 2: Know when to use a Mechanical Turk.

Have you ever heard the term Mechanical Turk? It’s a system that seems to be based on automation but is really powered by humans, because sometimes it’s just easier to get a human to do something than to figure out the long, painful way to automate it. And this is a bit of a paradox, but when the team at The Marshall Project first got started, they didn’t over-invest in the “computer work” part of it. Rather, they created a good chunk of the early examples manually and then asked ChatGPT to do the last mile. This allowed them to work on Point No. 1: making sure that, in the first place, the goal was properly defined.

Learning No. 3: Before you prompt, think about the system prompt.

Calderón explained how, armed with feedback from stakeholders, the team went back to the journalistic piece of imagining what the best possible article would look like. They came up with a structure of subheads by topic, giving a definition to each topic.

Andrew Rodriguez Calderón of The Marshall Project shows the system prompt for the AI involved in its Banned Books Project.

This part is called system prompting: providing additional information to make sure the system (ChatGPT) approaches the task with the same underlying assumptions that a human would.

This particular approach can look like telling a system what tone it should use, or what level of expertise it may assume, or, in the case of the Banned Books Project, both tone and instructions for how to handle the material and how to surface inadequacies in it.

The main prompt looked like this, per Calderón:

You [ChatGPT] are an expert in reading excerpts of prison policies and summarizing them into predefined sub-heads with definitions and using the “important instructions.”

I will provide you with a list of sub-heads with definitions. When you summarize, you will group them under the subheads based on relevance.

Important instructions:

  1. When summarizing the notes into the sub-heads, if you do not find information relevant to the sub-head in the notes, include a default sentence: “There is no information relevant to this sub-head in the policy.”
  2. You will ignore any special or text formatting in the notes. You will also ignore bullet points of any kind in the notes.
  3. When you are ready to receive the sub-heads with definitions, say, “Ready for sub-heads.”
  4. When you are ready for the notes, say, “Ready for notes.”

(If you want to see a written version of this talk, Calderón also documented this on Medium.) 
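To make the mechanics concrete, here is a minimal sketch of how a system prompt like the one above can be wired into the OpenAI chat API. This is not The Marshall Project’s code; the model name and the placeholder strings are my own stand-ins.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

system_prompt = (
    "You are an expert in reading excerpts of prison policies and "
    "summarizing them into predefined sub-heads with definitions, "
    "using the important instructions."
)

subheads = "Sub-head: ... Definition: ..."               # placeholder sub-heads with definitions
notes = "Excerpt of the prison policy to summarise ..."  # placeholder notes

response = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[
        {"role": "system", "content": system_prompt},  # the system prompt sets the ground rules
        {"role": "user", "content": subheads},         # then the sub-heads with definitions
        {"role": "user", "content": notes},            # then the notes to summarise
    ],
)

print(response.choices[0].message.content)

The point is simply that a system prompt is not some special feature so much as a first, privileged message: everything that follows is interpreted in light of it.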

During the panel part of the presentation at ONA, Calderón spoke about the time saved by the approach they used — and whether there even was any time saved. Interestingly, he noted that whether time was saved was still a matter of debate among team members, but what was definitely different was where the labour was invested.

In this respect, this is always something to keep in mind when we work to automate anything: We add labour in one place to save it elsewhere. Sometimes the gain isn’t necessarily time (and therefore cost) but accuracy; sometimes the gain is purely cognitive, sparing a human from working on the same kind of problem over and over (most of us would rather figure out smart prompts than robotically plough through a ton of material and summarise it).

But even if the question of whether actual efficiency was gained remains open, we also have to think about what we gain from these projects over the long term: direct reusability (sometimes), or a deepening understanding of how to sketch and develop these projects. That is really an investment in efficiency over time.

Bonus content: What is prompting?

You’ve probably heard this term: “a prompt for ChatGPT” or “prompt engineering.” What does it mean?

First, let me make this into an SAT question: “Prompting is to a generative AI system what a command is to a traditional computer programme.” 

Now that I have not only made everything more murky but also given you light PTSD from your high school days, I’ll explain.

Prompting is the term used for issuing commands to generative AI systems.

In traditional computer programming, a command is an order (an instruction) given to a programme for it to execute something, like “turn right 90 degrees” (I’m using LOGO here for my example, because I am a 1980s kid). A command has to be provided in the specific syntax the programme expects — meaning that even a typo would derail things and the command would fail.

Prompting is the term we use for issuing commands to generative AI systems. From a lexical perspective, the choice of the term “prompting” is apt in that a prompt is less a literal, direct order than an attempt at driving the system in a certain direction.

And the reason this is less direct and literal is that prompting takes place in natural language — natural language, in computer terms, being the language of humans (whatever language they happen to speak). So the upside is that the human interacting with the programme does not have to speak a computer language, and also doesn’t need to think about the order of operations for carrying out the prompt.

What it does mean, however, is that we are shifting some amount of work to the computer system. A command in traditional computing leaves no room for ambiguity. But prompting an AI means the AI first has to parse your prompt, work it through its engines to “understand” it, and then actually execute it.
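To make the contrast tangible, here is a hypothetical side-by-side sketch. The first half is a traditional command (Python’s turtle module, a descendant of LOGO); the second hands a similar intent to a model as a natural-language prompt, assuming the OpenAI client and a placeholder model name.

import turtle
from openai import OpenAI

# A traditional command: exact syntax, deterministic outcome.
t = turtle.Turtle()
t.right(90)  # misspell "right" or drop the argument and the programme simply fails

# A prompt: natural language the system first has to parse and interpret.
client = OpenAI()  # reads OPENAI_API_KEY from the environment
reply = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[{"role": "user", "content": "Please turn so that you are pointing to the right."}],
)
print(reply.choices[0].message.content)  # whatever the model made of the request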

You can readily appreciate that there will be many opportunities for misunderstanding between the human giving the prompt and what the system does as it tries to figure out what the human wants it to do.

As a result, the art of prompting is really disambiguation: turning what humans would generally infer from context into language that is complete and unambiguous enough that a computer will be able to properly understand what is being asked.

Sometimes, more advanced prompt engineering may involve explaining what way-station tasks the AI should take to carry out the prompt. For example, when Ippen.Media looked for ways to include quotes in their summaries, they broke the task down into, first, extracting quotes from the text and keeping them in memory, then summarising and including those quotes where appropriate. Asking the system to do everything in one prompt proved a non-starter, but setting up the job as a series of prompts of more manageable complexity allowed Ippen to tackle the challenge.
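Ippen’s pipeline isn’t spelled out in code here, but a minimal sketch of the two-step idea, assuming an OpenAI-style chat client and a placeholder model name, might look like this:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    # One self-contained prompt, one answer.
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

article = "Full text of the article goes here ..."  # placeholder

# Step 1: a narrow task: pull out the quotes and keep them in memory (ours, not the model's).
quotes = ask(f"List every direct quote in the following article, verbatim:\n\n{article}")

# Step 2: a second prompt that hands those quotes back alongside the summarisation task.
summary = ask(
    "Summarise the following article in three sentences, working in these quotes "
    f"where appropriate:\n\nQuotes:\n{quotes}\n\nArticle:\n{article}"
)

print(summary)

Neither prompt on its own asks for much; it’s the chaining that makes the overall task tractable.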

For more: Here’s an in-depth dive into prompting, from Microsoft (written for non-technical humans).

Further afield on the wide, wide Web

Some good reads from the wider world of data: 

  • This week I shall start with dessert: This funny reel, personally selected for me by the Instagram personalisation algorithm: The comedian Elle Cordova giving us a peek into the “break room” of the AI assistants. Siri, Alexa, and the gang are not happy when GPT-4 walks in.
  • It’s back-to-school season, and in the U.S., that means this is the great semester of college applications for high school seniors. The New York Times looked at how universities are handling generative AI in the mix of college admission essays, with a sidebar where the author put chatbots to work on the essay questions of schools like Harvard and Yale. As ever, the chatbots made up a lot of stuff. (Links are gift links to jump over the paywall.)
  • “Why protecting paywalled content from AI bots is difficult business” from Digiday. It comes down to identifying crawlers — and whether the crawler is trying to copy your content to aggregate it. But it’s an imperfect science, and, as publishers in this article note, the crawling also changes all the time, with new bots, changes of IP, and other switcharoos. Still, The Guardian is reporting that a growing number of publishers added blocks against bots in August.
  • “The illusion of AI’s existential risk” by a group of computer science and mathematics professors from Canada and a fellow at Google Research, in Noéma magazine. A long read but a good one, which argues that doomsday-AI scenarios hinge on the compounded presence of a number of assumptions, each with extremely long odds. In the meantime, the authors argue, AI does have identified, far more likely risks, and those are the ones to worry about.
  • OpenAI published a paper proposing an approach for using GPT-4 for content moderation. Content moderation is a reliable cost centre for publishers, and I have yet to meet one who was not interested in reducing the costs of moderation. What’s great about a filtering problem like content moderation is that we can absolutely satisfy ourselves with imperfect solutions, as long as they deliver some value over at least a portion of the problem.
  • Report: “Automating Democracy: Generative AI, Journalism, and the Future of Democracy” by Amy Ross Arguedas and Felix M. Simon. The report followed a symposium at Balliol College at Oxford University and is a very useful overview of some of the recurring topics that come up around generative AI, from big concepts like how it may affect democracy to misinformation and trust.
  • Long read: “The AI Power Paradox” in Foreign Affairs. The most interesting bit of this article is its analysis of AI regulation filtered through the lens of the diplomatic push-and-pull between the big powers, primarily the U.S. and China. There is an ambiguity, the authors argue, between the desire to regulate and, on the other hand, the desire to gain an edge in building the latest and greatest technology. Separately, adding to the complexity of regulation is the fact that the technology is both highly complex and fluid, while the lawmaking process usually operates at a deficit of technical information and is slow to respond.

About this newsletter

Today’s newsletter is written by Ariane Bernard, a Paris- and New York-based consultant who focuses on publishing utilities and data products, and is the CEO of a young incubated company, Helio.cloud

This newsletter is part of the INMA Smart Data Initiative. You can e-mail me at Ariane.Bernard@inma.org with thoughts, suggestions, and questions. Also, sign up to our Slack channel.
