How I Reverse-Engineered the GPT-3 Prompt Behind a Popular Site — IdeasAI
I am a fan of indie developer Pieter Levels. He is open, competent, and versatile. He showed this again in his latest project, IdeasAI, where he generates startup ideas with GPT-3 and lets people vote on them.
Note: GPT-3 is currently the largest language model and can generate remarkably human-like text. Learn more about this marvellous technology on YouTube.
There is an ongoing discussion about how to build a business around GPT-3, with Bakz T. Future writing a great article on the issue. Generally speaking, the moat defending such a business must be built around distribution or other technology, because guessing the text (the prompt) that is sent to GPT-3 is relatively easy.
One of the most prominent business use cases is deploying GPT-3 as a PPC copywriting tool. The commercial value is extremely high, but many competing projects popped up once that value became apparent: namely UseBroca, CopySmith, HeadLime, MagicFlow, and Copy.ai. Copy.ai is likely the most promising of these, currently generating $11k MRR, but it is not yet clear that it will be the winner in this segment.
With this in mind, I decided to try to reverse-engineer the prompt used in IdeasAI, to see how difficult it really is, and I even dare to suggest some tweaks.
Experiment n.1
The simplest thing to do at the beginning is to take six examples and see how well GPT-3 can follow the pattern. I did not pick random examples, as they can be polluted by noise; instead, I took the six most popular ideas of the month and ran with them.
Note: For the experiments, I used an alternative interface for GPT-3 called Prompt.ai.
PROMPT
RESULTS (10 completions, concatenated)
That doesn’t look that bad. GPT-3 got the pattern right away, and all ten examples are decent for a first attempt. However, the results still lack the variety that IdeasAI has: phrases like “A new kind of search…”, “A tool for restaurants…”, and “A way to help people…” keep recurring, and 2/10 completions contain extra text.
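For reference, this kind of few-shot setup can be sketched roughly as follows. The example ideas, the list formatting, and the API parameters below are my own placeholders, not the actual shots or settings from the experiment:

```python
# Sketch of a few-shot prompt: a handful of existing ideas, one per
# line, ending with an open item so the model's completion becomes
# the next idea in the list. The ideas are illustrative placeholders.
EXAMPLES = [
    "A tool that turns long-form articles into short social media posts.",
    "A marketplace where remote workers can rent desks by the hour.",
    "A way to help freelancers send invoices and get paid faster.",
]

def build_prompt(examples):
    """Join the example ideas into a few-shot prompt that ends with
    an open list item for GPT-3 to complete."""
    lines = ["- " + idea for idea in examples]
    lines.append("- ")  # open item for the model to continue
    return "\n".join(lines)

prompt = build_prompt(EXAMPLES)
print(prompt)

# The completion itself would then be requested from the GPT-3 API,
# e.g. (hypothetical parameters):
#   openai.Completion.create(engine="davinci", prompt=prompt,
#                            temperature=0.9, max_tokens=64, stop="\n")
```

With `stop="\n"` the model is cut off at the end of the line, which already helps against completions spilling into extra text.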
Experiment n.2
In the second experiment, I took some examples out, hoping to give GPT-3 more space to complete a wider range of ideas. I also added an intent line, “The following are ideas for startups, apps, marketplaces, platforms:”, to substitute for the missing examples, and lowered the temperature slightly to 0.85.
PROMPT
RESULTS (10 completions, concatenated)
Not great, not terrible. Our results are more versatile (e.g. Kickstarter), but polluted with information about funding (3/10). I assume the problem is one of our shots, where I left in information about funding. Let’s run another round without it.
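A quick way to catch this kind of pollution automatically is a simple post-filter over the completions. A minimal sketch, where the keyword patterns and sample completions are my own and purely illustrative:

```python
import re

# Illustrative funding-related patterns; a real filter would be tuned
# to whatever actually leaks from the prompt's shots.
FUNDING_RE = re.compile(
    r"\$\d|funding|raised|seed round|series [a-z]\b", re.IGNORECASE
)

def is_clean_idea(completion):
    """Reject completions that mention funding or spill onto extra lines."""
    text = completion.strip()
    if "\n" in text:  # extra text beyond a single idea line
        return False
    return not FUNDING_RE.search(text)

samples = [
    "A way to help people find a dog sitter nearby.",
    "A platform for booking chefs. Raised $2M in seed funding.",
]
print([is_clean_idea(s) for s in samples])  # → [True, False]
```

Of course, fixing the shots themselves, as done here, attacks the root cause rather than the symptom.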
PROMPT WITHOUT INFORMATION ABOUT FUNDING
RESULTS (10 runs, concatenated)
Much better: “Fiverr for cats” will surely be a unicorn. We no longer see funding information, 10/10 completions are startup ideas, and versatility is also quite good. One completion has extra text, so I will add one more example in the hope of further improving accuracy: “A way to help people make better decisions in healthcare, by making it possible to compare the cost and effectiveness of different treatments.”
Experiment n.3
I tested the completion beforehand, and it is solid. So I ran the prompt 50 times to test its consistency.
PROMPT
RESULTS (50 completions, concatenated)
I think we are done here. The accuracy is pretty high: only one completion contained extra text. This prompt could very likely run behind IdeasAI without much hassle. Of course, the prompt is only one part of the success; as we discovered, execution and distribution matter more for GPT-3-based businesses.
RESULTS WITH EXTRA TEXT
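The consistency check itself boils down to a tally over repeated completions. A sketch of that loop, with a stub standing in for the actual GPT-3 call (a real run would request a fresh completion from the API each time):

```python
from itertools import cycle

def measure_consistency(generate, n=50):
    """Run the prompt n times and report the fraction of completions
    that come back as a single, clean idea line (no extra text)."""
    clean = 0
    for _ in range(n):
        completion = generate().strip()
        if completion and "\n" not in completion:
            clean += 1
    return clean / n

# Stub mimicking the observed outcome: 49 clean ideas, 1 with extra text.
fake = cycle(
    ["A way to help people compare treatment costs."] * 49
    + ["An idea.\nFounded in 2018, the startup..."]
)
rate = measure_consistency(lambda: next(fake), n=50)
print(rate)  # → 0.98
```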
Experiment n.4
We’ve arrived at the suggested improvement. In my opinion, Pieter is aiming too much toward accuracy rather than diversity, likely because there are too many overly similar examples. Thus GPT-3 does not have enough space to come up with interesting results (or to make mistakes). So let me introduce a futuristic-oriented prompt…
PROMPT
RESULTS (10 completions, concatenated)
Future, here we come. We have teleportation, blockchain, drones, 3D printing, and, of course, chatbots.
Final thoughts
As we can see, designing prompts often involves a lot of trial and error, and the number of examples and the wording used matter a great deal. The cost-benefit of this optimization depends heavily on how many times the prompt will be used. At the time of writing, there are over 9,000 ideas on IdeasAI, so in my opinion the optimization makes sense there.
Pieter, please, if you are reading this, I would be happy to hear your feedback and see how much your prompt differs. And please keep shipping. You are a huge inspiration.