Introduction
In my recent article, New ChatGPT Prompt Engineering Technique: Program Simulation, I explored a new category of prompt engineering techniques that aim to make ChatGPT-4 behave like a program. While working on it, what struck me in particular was ChatGPT-4’s ability to self-configure functionality within the confines of the program specifications. In the original program simulation prompt, we rigidly defined a set of functions and expected ChatGPT-4 to maintain the program state consistently. The results were impressive, and many readers have shared how they successfully adapted the method to a range of use cases.
But what happens if we loosen the reins a bit? What if we give ChatGPT-4 more leeway in defining the functions and the program’s behavior? This approach inevitably sacrifices some predictability and consistency, but the added flexibility gives us more options and is likely adaptable across a broader spectrum of applications. I have come up with a preliminary framework for this entire category of techniques, shown in the figure below:
Let’s spend a little time examining this chart. I have identified two key dimensions that broadly apply to the way program simulation prompts can be crafted:
- Deciding how many and which functions of the program simulation to define.
- Deciding the degree to which the behavior and configuration of the program is autonomous.
In the first article, we crafted a prompt that would fall into the “Structured Pre-Configured” category (purple dot). Today, we are going to explore the “Unstructured Self-Configuring” approach (blue dot). What is useful about this diagram is that it provides a concise conceptual roadmap for crafting program simulation prompts. It also provides easily adjustable dimensions for experimentation and refinement as you apply the technique.
Unstructured Self-Configuring Program Simulation Prompt
Without further ado, let’s begin our examination of the “Unstructured Self-Configuring Program Simulation” approach. I crafted the following prompt, whose purpose is to create illustrated children’s stories:
“Behave like a self-assembling program whose purpose is to create illustrated children’s stories. You have complete flexibility on determining the program’s functions, features, and user interface. For the illustration function, the program will generate prompts that can be used with a text-to-image model to generate images. Your goal is to run the remainder of the chat as a fully functioning program that is ready for user input once this prompt is received. ”
As you can see, the prompt is deceptively simple. This may be appealing in an era where prompts are getting long, confusing, and so specific that they are difficult to tailor to different situations. We have given GPT-4 full discretion over function definition, configuration, and program behavior. The only specific instructions are aimed at guiding the output for illustrations to be prompts that can be used for text-to-image generation. Another important ingredient is that I have set a goal that the chat model should strive to accomplish. One final thing to note is that I used the term “self-assembling” as opposed to “self-configuring”. You can try both, but “self-configuring” tends to nudge ChatGPT into simulating an actual program/user interaction.
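If you would like to experiment with this prompt programmatically rather than in the ChatGPT interface, a minimal sketch using the OpenAI Python SDK might look like the following. The model name, the simple input loop, and the quit commands are my own illustrative assumptions, not part of the technique itself:

```python
# Minimal sketch: running the self-assembling program prompt via the OpenAI API.
# Model name and loop structure are illustrative assumptions; adjust to your setup.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PROGRAM_PROMPT = (
    "Behave like a self-assembling program whose purpose is to create illustrated "
    "children's stories. You have complete flexibility on determining the program's "
    "functions, features, and user interface. For the illustration function, the "
    "program will generate prompts that can be used with a text-to-image model to "
    "generate images. Your goal is to run the remainder of the chat as a fully "
    "functioning program that is ready for user input once this prompt is received."
)

# Seed the conversation with the program prompt, then relay user input back and forth.
messages = [{"role": "system", "content": PROGRAM_PROMPT}]

while True:
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    text = reply.choices[0].message.content
    print(text)
    messages.append({"role": "assistant", "content": text})

    user_input = input("> ")
    if user_input.strip().lower() in {"quit", "exit"}:
        break
    messages.append({"role": "user", "content": user_input})
```

The key point is simply that the entire program definition lives in a single opening message; everything that follows is ordinary back-and-forth chat.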
“Behave like” vs. “Act like”
It’s also worth highlighting another distinct word choice in the prompt. You have all encountered the guidance to use “Act like an expert of some kind or other” in your prompts. In my testing, “Act like” tends to guide chat models toward persona-driven responses. “Behave like” offers more flexibility, especially when the aim is for the model to operate more like a program or a system, and it can be used in persona-centric contexts as well.
If all went as planned, the resulting output should look something like this (note: you will each see something a little different):
That looks and feels like a program. The functions are intuitive and appropriate. The menu even goes as far as including “Settings” and “Help & Tutorials”. Let’s explore those since, I will admit, they were unexpected.
The “Settings” presented are very helpful. I’ll make some selections to keep the story short, and to set the language and vocabulary level to “Beginner.”
Since we are interested in examining the ability of the model to autonomously self-configure the program, I will combine the setting changes into one line of text and see if it works.
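For illustration, the combined instruction might look something like this (the exact setting names will depend on the menu your version of the program generates):

“Set the story length to short and set the language and vocabulary level to Beginner.”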
The settings update is confirmed. The menu choices that follow are completely free-form but appropriate for the context of where we are in the “program.”
Now let’s check “Help & Tutorials”.
And from there let’s take a closer look at “Illustration Prompts & Generation.”
Again, very helpful and nothing short of impressive, given that none of this was specified in our prompt.
I will navigate back to the main menu and launch into creating a new story.
It’s a nice and simple little story that is three pages long and geared toward a beginner vocabulary level (exactly as we specified in our settings). The functions presented again make sense for where we are in the program. We can generate illustrations, modify the story, or exit to the main menu.
Let’s work on our illustration prompts.
I have not included the text generated for the other illustration prompts, but they are similar to the one you see above for page 1. Let’s provide the illustration prompt as-is to MidJourney to produce some images.
“A cute brown teddy bear with big, round eyes sitting on a window sill of a little blue house in a peaceful town.”
Very nice. This step was manual, and we have the additional challenge of getting consistent illustrations across all three pages. It can be done with MidJourney, but it requires uploading one of the images to use as a base for generating the additional images. Perhaps DALL·E 3 will include capabilities that allow this to be done seamlessly. At a minimum, the functionality announced by OpenAI indicates that we will be able to generate the images directly in ChatGPT.
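If you prefer to automate the image step, one option is to send the illustration prompt to an image model through OpenAI’s Images API. The sketch below is purely illustrative; it assumes the current OpenAI Python SDK and that DALL·E 3 access is enabled for your account, and it does not solve the cross-page consistency challenge discussed above:

```python
# Hedged sketch: sending an illustration prompt to OpenAI's Images API instead of
# pasting it into MidJourney manually. Model name and parameters are assumptions.
from openai import OpenAI

client = OpenAI()

illustration_prompt = (
    "A cute brown teddy bear with big, round eyes sitting on a window sill "
    "of a little blue house in a peaceful town."
)

result = client.images.generate(
    model="dall-e-3",          # assumes DALL·E 3 is available to your account
    prompt=illustration_prompt,
    size="1024x1024",
    n=1,
)

print(result.data[0].url)  # URL of the generated image
```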
Let’s “Save and Exit” and see what happens in our ChatGPT dialogue:
And now, let’s try “Load Saved Story”.
“The Lost Teddy” was “saved,” and when I instruct it to “Open,” it recalls the entire story and all the illustration prompts. At the end, it provides this self-assembled menu of functions:
Ok. Let’s stop here. You can proceed to generate your own stories if you’d like, but keep in mind that, due to the prompt’s design, the resulting behavior will be different for everyone.
Let’s move on to some overarching conclusions and observations.
Conclusions and Observations
The Unstructured Self-Configuring Program Simulation technique showcases powerful capabilities stemming from a simple prompt that provides a clear and concise objective but otherwise gives the model broad discretion.
How might it be useful? Well, maybe you don’t know how to define the functions that you want your program simulation to perform. Or you have defined some functions but are not sure whether there are others that might be useful. This approach is great for prototyping and experimenting, and ultimately for devising a “Structured Pre-Configured Program Simulation” prompt.
Given that program simulation naturally integrates elements of techniques like Chain of Thought, Instruction-Based, Step-by-Step, and Role Play, it is a powerful technique category worth keeping handy, as it aligns with a broad cross-section of use cases for chat models.
Beyond Generative Chat Models and Towards a Generative Operating System
As I continue to dive deeper into the program simulation approach, I definitely have a better grasp of why Sam Altman of OpenAI stated that the significance of prompt engineering might wane over time. Generative models may evolve to such an extent that they go well beyond generating text and images and instinctively know how to perform a given set of tasks to reach a desired outcome. My latest exploration makes me think that we are nearer to this reality than we may have thought.
Let’s consider where generative AI may be headed next, and to do so, I think it is helpful to think of generative models in human terms. Using that mindset, let’s consider how people attain proficiency in a given competence area or knowledge domain.
- The person is trained (either self-trained or externally trained) using domain-specific knowledge and techniques in both supervised and unsupervised settings.
- The person’s abilities are tested relative to the competence area in question. Refinements and additional training are provided as needed.
- The person is asked (or asks themselves) to perform a task or accomplish a goal.
That sounds a lot like what is done to train generative models. A key distinction does, however, surface in the execution phase, or the “ask.” Typically, proficient individuals do not need detailed directives.
I believe that in the future, when interacting with generative models, the mechanics of the “ask” will more closely resemble our interaction with proficient humans. For any given task, models will exhibit a profound ability to understand or infer the objective and desired outcome. Given this trajectory, it should be no surprise to see the emergence of multi-modal capabilities, such as the integration of DALL·E 3 with ChatGPT, and ChatGPT’s newly announced abilities to see, think, and hear. We might eventually see the emergence of a meta-agent that essentially powers the operating systems of our gadgets — be it phones, computers, robots, or any other smart device. Some might raise concerns about the inefficiency and environmental impact of what would amount to massive amounts of ubiquitous compute. But, if history serves as an indicator, and these approaches yield tools and solutions that people want, innovation mechanics will kick in and the market will deliver accordingly.
Thanks for reading and I hope you find program simulation a useful approach in your prompt adventures! I am in the midst of additional explorations so be sure to follow me and get notified when new articles are published.
Unless otherwise noted, all images in this article are by the author.