A prompt pattern catalog to enhance prompt engineering with ChatGPT (arxiv.org)
130 points by aliparlakci on June 5, 2023 | 25 comments


While somewhat useful, there are no systematic ablation or comparison studies, no datasets, and no quantitative evals here. It's all anecdotal, so it should probably be a blog post and not a "paper".


"systematic ablation" - https://stats.stackexchange.com/questions/380040/what-is-an-.... How do you ablate a black box? You work with the inputs. I agree that this article lacks rigor, but I don't mind some structured thinking about what seems to work. I don't think the authors over-promised here.


I think of this paper as being in the same category as the GoF Design Patterns book.

Short video summary: https://youtu.be/ueRuMDb-cPo


A couple more that I found useful:

>List all the entities in the text that are ambiguous. Entities are ambiguous if you don't understand them, if they have multiple meanings and you can't decide which one fits the context, or if they are phrasal terms whose meaning can differ across cultures.

This allows the LLM to identify parts of the previous prompt that it did not fully understand.

>List all the entities and relationships to other entities. Be precise, avoid duplicates, list each relationship only once.

This lets you compress the past context for longer tasks.
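
Here's a minimal sketch of how the compression prompt can be wired in, using the OpenAI chat API; the model choice and the system-message handling are illustrative assumptions, not part of the pattern itself:

```python
# Minimal sketch: summarize a long history into an entity/relationship list,
# then carry only that list forward in place of the full transcript.
import openai  # openai<1.0-style API, current as of this thread

COMPRESS_PROMPT = (
    "List all the entities and relationships to other entities. "
    "Be precise, avoid duplicates, list each relationship only once."
)

def compress_history(history: list[dict]) -> list[dict]:
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=history + [{"role": "user", "content": COMPRESS_PROMPT}],
    )
    summary = response["choices"][0]["message"]["content"]
    # Replace the full history with the compact entity/relationship summary.
    return [{"role": "system", "content": f"Context so far:\n{summary}"}]
```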

>Make a list of Google searches needed to build a factual answer, one sentence per search.

You get the list, and then:

>For each sentence, extract the main entity and decide whether to find more information about it on Google, Wikipedia, ...; answer in JSON

This is my "ghetto" agent, it's not iterative, but it's stable enough to handle the unexpected

> Is the context enough to factually answer the user's question? Answer yes or no. Only answer yes if you are sure you can answer

This makes the agent able to loop, or to ask the user for more context.
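
Roughly, the whole flow fits in a page of Python. This is only a sketch of the steps described above: the prompts are paraphrased, and `run_search`, the model choice, and the JSON shape are assumptions.

```python
import json
import openai  # openai<1.0-style API

def ask(prompt: str) -> str:
    r = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return r["choices"][0]["message"]["content"]

def answer_factually(question: str, run_search) -> str:
    # 1. Plan: one search per sentence of the eventual answer.
    plan = ask("Make a list of Google searches needed to build a factual "
               f"answer, one sentence per search.\n\nQuestion: {question}")
    # 2. Route: for each sentence, pick the main entity and a source (JSON out).
    routes = ask("For each sentence, extract the main entity and decide "
                 "whether to find more information about it on Google, "
                 f"Wikipedia, ...; answer in JSON.\n\n{plan}")
    context = "\n".join(run_search(step) for step in json.loads(routes))
    # 3. Gate: only answer if the gathered context is sufficient.
    verdict = ask("Is the context enough to factually answer the user's "
                  "question? Answer yes or no. Only answer yes if you are "
                  f"sure you can answer.\n\nContext:\n{context}\n\n"
                  f"Question: {question}")
    if verdict.strip().lower().startswith("yes"):
        return ask(f"Context:\n{context}\n\nAnswer factually: {question}")
    return "I need more context - could you clarify the question?"
```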


It's maybe interesting to think about how a simple chat history, accumulated over time, can supply some of the knowledge needed for improved interactions, relative to the more complex patterns outlined here.

Here's a similar paper I ran across this morning: https://arxiv.org/abs/2305.18323. The GitHub repo is here: https://github.com/billxbf/ReWOO

I indexed both documents with my own project, which uses semantic graphs to help with prompt assembly: https://github.com/FeatureBaseDB/DoctorGPT. DoctorGPT doesn't have dynamic prompt chaining yet, but I'm working on it. I hesitate to post any of DoctorGPT's analysis of these papers here, because it would be generated by the LLM and not by me... and some people seem to take issue with that, given this is a human forum.

My sense is that semantic knowledge graphs (SKGs) are important in refining questions, offering alternative approaches, managing context, reflecting on LLM responses, and more.
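
As a toy illustration of what I mean by prompt assembly over a semantic graph (this is not DoctorGPT's actual code; the graph contents and functions are made up for the example):

```python
import networkx as nx

# Index facts as entity -> entity edges with a labeled relation.
graph = nx.DiGraph()
graph.add_edge("ReWOO", "tool calls", relation="decouples reasoning from")
graph.add_edge("ReWOO", "token usage", relation="reduces")

def assemble_prompt(question: str, entities: list[str]) -> str:
    # Pull the graph neighborhood of the mentioned entities into the prompt.
    facts = [
        f"{u} {d['relation']} {v}"
        for e in entities if e in graph
        for u, v, d in graph.edges(e, data=True)
    ]
    return "Context:\n" + "\n".join(facts) + f"\n\nQuestion: {question}"

print(assemble_prompt("What does ReWOO optimize?", ["ReWOO"]))
```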


I really like the troubleshooting advice for the installation of DoctorGPT: "These instructions are long, so ensure you follow them carefully. It is suggested you use ChatGPT to assist you with any errors. Simply paste in the entire content of this README into ChatGPT before asking your question about the install process."


I think you shouldn't call it "prompt engineering" until you do some real "engineering". For example, if you use Guidance from Microsoft, where you write a prompt "template" and let Guidance filter the logit probabilities at the corresponding points for you, I'd claim that this is "engineering" (the way we use the word as SWEs, anyway) because it's analogous to writing an actual program.
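
For the unfamiliar, a sketch of what such a template looks like in Guidance's mid-2023 syntax (the model and labels here are just examples, and the exact API may have changed since):

```python
import guidance

guidance.llm = guidance.llms.OpenAI("text-davinci-003")

# {{select}} constrains generation to the given options by filtering logits,
# so the output is guaranteed to be one of the labels.
classify = guidance("""Classify the sentiment of this sentence.
Sentence: {{sentence}}
Sentiment: {{select 'sentiment' options=labels}}""")

result = classify(sentence="The install instructions actually worked!",
                  labels=["positive", "negative", "neutral"])
print(result["sentiment"])
```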

I was actually so upset about everyone claiming to be a "prompt engineer" without doing any real engineering that I wrote a snarky GitHub gist about it, which was on the front page for a bit. It mainly points out that NLP folks can't even do real prompt engineering the way Stable Diffusion folks can, because we haven't built the right tooling for it yet.

https://gist.github.com/Hellisotherpeople/45c619ee22aac6865c...


I recognize there's plenty of catnip here when it comes to calling this "engineering" or not; however, whatever you want to call it (prompt fiddling?), the techniques are crucial if you want to achieve reasonably consistent output from current-state LLMs. As models improve, concerns about context-window limitations will diminish, and it will be easier to discern user intent.

These are good straight-to-the-point guides:

- Prompt Engineering by BrexHQ: https://github.com/brexhq/prompt-engineering

- OpenAI guidance: https://help.openai.com/en/articles/6654000-best-practices-f...

- https://devblogs.microsoft.com/dotnet/gpt-prompt-engineering...

- (great examples): https://www.deeplearning.ai/short-courses/chatgpt-prompt-eng...

- (~1 hr) Karpathy talk: https://www.youtube.com/watch?v=bZQun8Y4L2A

tl;dr:

- Begin with the best and most forgiving model available, optimizing it later if necessary.

- Write your prompts in a clear, specific, and detailed manner.

- Be mindful, however, of the tradeoff between including more detail and increasing latency or cost.

- If you're doing any complex reasoning, ask the model to show its work; this helps "stretch out" computation over more tokens.

- "Escape" any included or quoted text properly to avoid confusing the model.

- Keep track of your token budget, considering the context window in conversations and response length (see the token-counting sketch after this list).

- Employ an iterative process of measuring, adjusting, and improving your prompt engineering techniques.
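
On the token-budget point, a small sketch using OpenAI's tiktoken tokenizer; the 4,096-token window and the 512-token reply reservation are assumptions matching gpt-3.5-turbo at the time:

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def tokens_left(messages: list[str], context_window: int = 4096,
                reserved_for_reply: int = 512) -> int:
    # Count tokens already spent, leaving headroom for the model's response.
    used = sum(len(enc.encode(m)) for m in messages)
    return context_window - reserved_for_reply - used

print(tokens_left(["You are a helpful assistant.", "Summarize this README."]))
```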


Thanks for sharing this list! (parent comment)


Here is a more colorfully formatted version of their recipe list.

https://docs.google.com/document/d/1G0TGB16lOf6hhGNUDnrM-kQy...

It's not a permalink, so I recommend making your own local copy if you like it.


Thank you!


How much overlap is there currently between LLM prompt engineering and (for example) academic test/assignment question guidelines?


I can't wait until there is a standardized ChatGPT Bible to guide this emerging machine-priest caste. And these posts will be the equivalent of the dead sea scrolls.


Is it just me, or is calling it "Engineering" absolutely ridiculous?


Matches well with the definition of the engineering method provided by Bill Hammack, “the engineer guy” [1]:

> Solving problems using rules of thumb that cause the best change in a poorly understood situation using available resources.

1: https://youtu.be/_ivqWN4L3zU


Lots of other people think that, but I disagree.

Have you tried getting great, repeatable results out of an LLM? It requires great depth of knowledge - about both how LLMs work and the specific topic you are trying to build against - plus a methodical process in figuring out what works and what doesn't.

I see no reason not to label that "engineering".

I also think it's important to distinguish between prompting and prompt engineering. Prompting is when people type prompts in a box. Prompt engineering, by my own definition, is when developers build further software on top of LLMs.


So is medicine “human engineering”?

Then again, I’m a software engineer who doesn’t consider my job engineering, and I hate the title.


Isn't it? Wrangling a poorly understood system with partial knowledge and experimental skill seems like engineering to me.


I don’t know. The title ’engineer’ is overloaded. It’s like Sandwich Artist.

If we accept your definition, then I am a food engineer, my own human engineer, a coffee engineer, a child engineer, and a dog engineer.

The word engineer has lost all meaning and this looks like title inflation.


Engineering is the application of science; as the science here is weak at best, calling it engineering seems a stretch. Perhaps “prompt author” would be more appropriate.


How about prompt construction? I first heard about it from the maintainer of LangChain, and I also came across it in Azure docs: https://learn.microsoft.com/en-us/azure/cognitive-services/o...


I think there is a science here that people are working to discover. Prompt engineering is slowly turning into its own language, and the more it's studied like this, the more valid it will become to call it engineering.


It’s not ridiculous

it’s just like the search tool engineers that would engineer the string to put in your google search

/s


Given that the term 'software engineer' is now almost universally accepted (outside of legal jurisdictions where engineering = liability), I don't see why prompt engineering is a bad term.


1. It's already taken hold.

2. It looks great on LinkedIn.



