Prompt engineering rose to prominence with the release of OpenAI’s GPT models. Since then, practitioners have experimented heavily with LLM prompts, from simple queries to elaborate, creative ones, to probe the capabilities of LLMs. For more domain-specific tasks, however, we often need the output in a certain format to save cost on API calls (or system memory).
It is often easier and cheaper to get the desired output from LMs by prompting them than by fine-tuning them. This blog looks at some prompt engineering techniques for getting better-structured outputs from LMs.
Prompt Engineering
Prompt Engineering, also known as In-Context Prompting, refers to methods for communicating with an LLM to steer its behavior toward desired outcomes without updating the model weights. It is an empirical science: the effect of a prompt engineering method can vary widely across models, so it requires heavy experimentation and heuristics.
Best Practices of Prompt Engineering
Zero-Shot Learning
Zero-shot learning prompts the model with the task directly, without any examples, and asks for the result.
E.g., a sentiment analysis example:
text: i'll bet sushi tastes better than ramen.
sentiment:
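A minimal zero-shot call might look like the sketch below, using the OpenAI Python SDK; the model name is illustrative, and any chat-completion model would work the same way.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompt = "text: i'll bet sushi tastes better than ramen.\nsentiment:"

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # e.g. "positive"
```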
Clear and specific instructions
Providing clear and specific instructions within the prompt helps the model generate the desired output. Try to remove any ambiguity from the prompt; if ambiguity is present, the model might generate unexpected output.
Prompt: Describe the order of the Olympic rings
Response: The order of the Olympic rings, from left to right, is: Blue, Yellow, Black, Green, Red
Using explicit constraints
In some cases, you might need the output in a specific format, length, or scope. You can specify these constraints in your prompt, which helps the model generate responses that adhere to the desired limitations.
Prompt: Summarize the working of an air conditioner in three sentences
Response: An air conditioner acts like a heat pump, moving unwanted warmth outdoors. Inside, warm air is drawn across a cold coil filled with refrigerant, absorbing heat and making the air cool. The refrigerant itself then travels outside, where it releases the heat and condenses back into a liquid before returning inside to repeat the cycle.
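When the output feeds into code rather than a human reader, the constraint can be a machine-readable format. Here is a minimal sketch, assuming the OpenAI SDK and an illustrative model name; models can still drift from the requested format, so parse defensively.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

prompt = (
    "Summarize the working of an air conditioner in exactly three sentences. "
    'Return only JSON of the form {"summary": ["<s1>", "<s2>", "<s3>"]}.'
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative
    messages=[{"role": "user", "content": prompt}],
)

try:
    summary = json.loads(response.choices[0].message.content)["summary"]
except (json.JSONDecodeError, KeyError):
    summary = None  # the model ignored the format; retry or tighten the prompt
```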
Controlling output verbosity
Adjusting the verbosity of the model’s responses can be beneficial for obtaining information at the desired level of detail. You can control verbosity by explicitly requesting responses of a specific length or by asking the model to provide more or less information.
Prompt: Briefly explain the working of a car engine
Response: Car engines rely on repeated controlled explosions. Inside each cylinder, a piston sucks in a mix of air and fuel (intake stroke). This mixture is then compressed (compression stroke) for a stronger bang. A spark plug ignites the mixture, causing a small explosion that pushes the piston down (power stroke). Finally, the burned gas is expelled (exhaust stroke) and the cycle repeats. This up-and-down motion of the pistons translates to rotation via a crankshaft, which ultimately powers the car's wheels.
Iterative testing and refining
One of the most effective ways to improve prompt engineering is through an iterative process of testing and refining. Continuously refining your prompts based on the generated responses helps to identify areas that require improvement and fine-tune the model’s behavior.
Initial Prompt: Briefly explain the working of a car engine
Initial Response: Car engines rely on repeated controlled explosions. Inside each cylinder, a piston sucks in a mix of air and fuel …
Refined Prompt: Provide a detailed explanation of the working of a car engine, including its various stages and processes.
Refined Response: Car engines are marvels of controlled explosions, meticulously converting fuel and air into the power that drives your car. Let's delve into the four-stroke cycle, the heart of this process …
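This loop is easy to automate. The sketch below runs several candidate prompts against the same model and compares the outputs so you can pick the better-performing variant (OpenAI SDK, illustrative model name).

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

candidate_prompts = [
    "Briefly explain the working of a car engine.",
    "Provide a detailed explanation of the working of a car engine, "
    "including its various stages and processes.",
]

for prompt in candidate_prompts:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    answer = response.choices[0].message.content
    # Inspect length and coverage, then keep the prompt whose output fits best.
    print(f"{prompt!r} -> {len(answer.split())} words")
```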
Temperature and token control
Adjusting the temperature and maximum token count of the model’s responses can significantly impact the generated content. Lower temperature values result in more focused and deterministic outputs, while higher values lead to more diverse and creative responses; the token limit caps the length (and cost) of the output.
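Both knobs are ordinary request parameters in most LLM APIs. A minimal sketch with the OpenAI SDK (model name and values are illustrative):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative
    messages=[{"role": "user", "content": "Briefly explain the working of a car engine."}],
    temperature=0.2,  # low temperature -> focused, near-deterministic output
    max_tokens=120,   # hard cap on response length (and cost)
)
print(response.choices[0].message.content)
```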
Adapting prompts for domain-specific applications
You can state the domain or industry in your prompt and tailor the question to it, which yields more accurate and relevant information from the model.
Example:
Domain: Web development
Prompt: “Explain the concept of cookies and their importance in modern web development.”
Response: Cookies are small pieces of data stored on the user's device by the web browser while they are browsing a website. They are used to remember information about the user, such as login status, preferences, and other session details. Here’s a brief overview of their importance in modern web development:
Session Management
Personalization
Tracking and Analytics
Targeted Advertising
Few-shot Learning
Few-shot learning shows the model a few high-quality examples so that it understands the intent and generates the desired output for the task. It often leads to better performance than zero-shot learning. One drawback of this technique is its high token consumption: after adding the examples and the actual task, the prompt might exceed the model's token limit. A minimal sketch of assembling such a prompt follows the example below.
Text: as a singular character study, the drama was perfect
Sentiment: positive
Text: once you get through the accents, all the performances becomes abysmal and pain to watch
Sentiment: negative
Text: the paintings provide a beauty, conquer over fear, exposed as history corners them
Sentiment: positive
Text: this is one for the ages & great things to come
Sentiment:
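Here is a minimal sketch of how a few-shot prompt like the one above can be assembled programmatically from labeled demonstrations (the helper name is illustrative):

```python
EXAMPLES = [
    ("as a singular character study, the drama was perfect", "positive"),
    ("once you get through the accents, all the performances become abysmal "
     "and painful to watch", "negative"),
    ("the paintings provide a beauty, conquer over fear, exposed as history "
     "corners them", "positive"),
]

def build_few_shot_prompt(query: str) -> str:
    """Labeled demonstrations first, then the unlabeled query."""
    blocks = [f"Text: {text}\nSentiment: {label}" for text, label in EXAMPLES]
    blocks.append(f"Text: {query}\nSentiment:")
    return "\n\n".join(blocks)

print(build_few_shot_prompt("this is one for the ages & great things to come"))
```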
In few-shot learning, the LM can get biased with the provided examples due to the following reasons:
- Majority label bias - exists when there is an unbalanced distribution of labels among the examples
- Recency bias - the model repeats the labels from the last/most recent examples
- Common token bias - the model tends to produce common tokens due to their high frequency in text, while rare tokens are often ignored
To counter these biases, you can calibrate the label probabilities output by the model so that they are uniform when the input is a content-free string such as "N/A", as in the sketch below.
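This is the contextual calibration idea from Zhao et al. (2021). A minimal numeric sketch with illustrative probabilities:

```python
import numpy as np

# Label probabilities the model assigns to the content-free input "N/A"
# (obtained from the model's logits in practice; values here are illustrative).
p_content_free = np.array([0.7, 0.3])  # P(positive), P(negative)

# Uncalibrated probabilities for a real test input.
p_test = np.array([0.6, 0.4])

# Rescale so the content-free input would map to a uniform distribution,
# then renormalize. The prompt's bias toward "positive" is damped.
calibrated = p_test / p_content_free
calibrated /= calibrated.sum()
print(calibrated)  # ~[0.39, 0.61]
```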
Tips for Example Selection & Ordering
You can use k-NN in the embedding space to select examples that are semantically similar to the test input (see the sketch after this list).
You can use graph-based approaches to select a diverse and representative set of examples, skipping candidates that are neighbors of nodes already selected.
A good practice would be to keep the examples random and diverse to avoid majority label bias and recency bias.
You can order the examples to have a balanced distribution of labels.
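A minimal sketch of k-NN example selection, assuming the sentence-transformers library and an illustrative embedding model:

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed embedding library

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative embedding model

pool = [
    ("as a singular character study, the drama was perfect", "positive"),
    ("the performances were abysmal and painful to watch", "negative"),
    # ... in practice, a much larger labeled pool
]
query = "this is one for the ages & great things to come"

pool_emb = model.encode([text for text, _ in pool])
query_emb = model.encode(query)

# Cosine similarity between the query and every candidate example.
sims = pool_emb @ query_emb / (
    np.linalg.norm(pool_emb, axis=1) * np.linalg.norm(query_emb)
)

k = 2  # keep the k nearest neighbours as in-context examples
nearest = [pool[i] for i in np.argsort(-sims)[:k]]
```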
Instruction Prompting
Few-shot prompting can be expensive in token usage and leaves less room for the actual input, which can cause a loss of context.
When interacting with instruction-tuned models, we should describe the task in detail, covering every constraint and specifying what to do with out-of-context inputs.
[Instruction + Task]: Given a sentence, translate it from English to French.
[Example 1]:
Input: The cat is on the table.
Output: Le chat est sur la table.
[Example 2]:
Input: I am going to the market.
Output: Je vais au marché.
[Question]:
Input: She loves reading books.
Output: [Your model's response here]
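The template above is easy to render in code. A minimal sketch, including the out-of-context fallback the instruction calls for (all names are illustrative):

```python
INSTRUCTION = (
    "Given a sentence, translate it from English to French. "
    "If the input is not an English sentence, reply with exactly 'N/A'."
)

EXAMPLES = [
    ("The cat is on the table.", "Le chat est sur la table."),
    ("I am going to the market.", "Je vais au marché."),
]

def build_instruction_prompt(question: str) -> str:
    """Instruction first, then demonstrations, then the unanswered question."""
    parts = [INSTRUCTION]
    parts += [f"Input: {src}\nOutput: {tgt}" for src, tgt in EXAMPLES]
    parts.append(f"Input: {question}\nOutput:")
    return "\n\n".join(parts)

print(build_instruction_prompt("She loves reading books."))
```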
Chain-of-Thought (CoT)
The model generates its reasoning step by step, as a sequence of sentences known as a reasoning chain or rationale, which eventually leads to the final answer.
Two main types of CoT prompting:
Few-shot CoT prompts the model with a few demonstrations of high-quality reasoning chains, manually and explicitly showing the steps used to solve a problem.
Prompt: Mary has 10 candies. She eats 3 candies and then buys 7 more. How many candies does Mary have now? Let's think step by step.
Response:
Mary starts with 10 candies.
She eats 3 candies.
So she has 10 − 3 = 7 candies left.
Then she buys 7 more candies.
Now she has 7 + 7 = 14 candies.
So, Mary has 14 candies.
Prompt: John has 15 books. He gives 5 books to his friend and then receives 8 more books as a gift. How many books does John have now? Let's think step by step.
Response: ...
Zero-shot CoT
In zero-shot prompting, the model is asked to solve a problem with reasoning steps without being provided with examples.
Prompt: Mary travels 5 miles. She reaches the destination in 10 minutes. What is Mary's speed? Let's think step by step.
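In code, zero-shot CoT amounts to appending a reasoning trigger to the question (OpenAI SDK, illustrative model name; the expected reasoning is 5 miles / 10 minutes = 0.5 miles per minute, i.e. 30 mph):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

question = (
    "Mary travels 5 miles. She reaches the destination in 10 minutes. "
    "What is Mary's speed?"
)

# Appending the trigger phrase is all zero-shot CoT requires.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative
    messages=[{"role": "user", "content": question + " Let's think step by step."}],
)
print(response.choices[0].message.content)
```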