Prompt Engineering Best Practices for Structured AI Outputs

Data Science

Date : 10/23/2024


Learn expert prompt engineering techniques for optimizing LLM outputs. Explore best practices for zero-shot learning, controlling verbosity, iterative refining, and more, to enhance AI-driven solutions in data science and industry-specific applications.

K C Prakash
Associate Manager, Data Science


Prompt engineering emerged with the release of OpenAI's GPT. Since then, the world has experimented heavily with LLM prompts, from simple queries to inventive challenges, to test the capabilities of these models. For domain-specific tasks, however, we often need output in a specific format to reduce the cost of API calls (or system memory).

It is often easier and cheaper to get the desired output from an LLM by prompting it than by fine-tuning it. This blog looks at several prompt engineering techniques for getting better-structured outputs from LLMs.

Prompt Engineering

Prompt Engineering, also known as In-Context Prompting, refers to methods for communicating with an LLM to steer its behavior toward desired outcomes without updating the model weights. It is an empirical science: the effect of a prompt engineering method can vary widely across models, so it requires heavy experimentation and heuristics.

Best Practices of Prompt Engineering

Zero-Shot Learning

Zero-shot learning prompts the model with the task directly, without providing any examples.

For example, sentiment analysis:

text: i'll bet sushi tastes better than ramen.

sentiment:
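A zero-shot prompt like the one above can be assembled in code. The sketch below is a minimal illustration; the function name and formatting are ours, not tied to any particular API.

```python
def zero_shot_prompt(text: str) -> str:
    """Build a zero-shot sentiment prompt: just the task, no examples."""
    return f"text: {text}\nsentiment:"

prompt = zero_shot_prompt("i'll bet sushi tastes better than ramen.")
print(prompt)
```

The returned string is sent to the model as-is; the model is expected to complete the line after `sentiment:`.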

Clear and specific instructions

Providing clear and specific instructions within the prompt helps the model generate the desired output. Remove any ambiguity from the prompt; otherwise, the model might generate unexpected output.

Prompt: Describe the order of Olympic circles

Response: The order of the Olympic rings, from left to right, is: Blue, Yellow, Black, Green, Red

Using explicit constraints

In some cases, you may need output in a specific format, length, or scope. Specifying these constraints in your prompt helps the model generate responses that adhere to the desired limitations.

Prompt: Summarize the working of an air conditioner in three sentences

Response: An air conditioner acts like a heat pump, moving unwanted warmth outdoors. Inside, warm air is drawn across a cold coil filled with refrigerant, absorbing heat and making the air cool. The refrigerant itself then travels outside, where it releases the heat and condenses back into a liquid before returning inside to repeat the cycle.

Controlling output verbosity 

Adjusting the verbosity of the model's responses can be beneficial for obtaining information at the desired level of detail. You can control verbosity by explicitly requesting responses of a specific length or by asking the model to provide more or less information.

Prompt: Briefly explain the working of a car engine

Response: Car engines rely on repeated controlled explosions. Inside each cylinder, a piston sucks in a mix of air and fuel (intake stroke). This mixture is then compressed (compression stroke) for a stronger bang. A spark plug ignites the mixture, causing a small explosion that pushes the piston down (power stroke). Finally, the burned gas is expelled (exhaust stroke) and the cycle repeats. This up-and-down motion of the pistons translates to rotation via a crankshaft, which ultimately powers the car's wheels.

Iterative testing and refining 

One of the most effective ways to improve prompt engineering is through an iterative process of testing and refining. Continuously refining your prompts based on the generated responses helps to identify areas that require improvement and fine-tune the model’s behavior.

Initial Prompt: Briefly explain the working of a car engine

Initial Response: Car engines rely on repeated controlled explosions. Inside each cylinder, a piston sucks in a mix of air and fuel …

Refined Prompt: Provide a detailed explanation of the working of a car engine, including its various stages and processes.

Refined Response: Car engines are marvels of controlled explosions, meticulously converting fuel and air into the power that drives your car. Let's delve into the four-stroke cycle, the heart of this process …

Temperature and token control 

Adjusting the temperature and token count of the model’s responses can significantly impact the generated content. Lower temperature values result in more focused and deterministic outputs, while higher values lead to more diverse and random/creative responses.
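Under the hood, temperature rescales the model's logits before sampling. The toy sketch below illustrates the effect; the logit values are invented purely for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
low_t = softmax_with_temperature(logits, 0.2)   # near-deterministic: top token dominates
high_t = softmax_with_temperature(logits, 2.0)  # flatter: more diverse sampling
```

In practice you do not compute this yourself: you set a temperature and a maximum token count as request parameters when calling the model's API.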

Adapting prompts for domain-specific applications 

You can specify the domain or industry in your prompt to get more accurate and relevant information from the model.

Example: 
Domain: Web development

Prompt: “Explain the concept of cookies and their importance in modern web development.”

Response: Cookies are small pieces of data stored on the user's device by the web browser while they are browsing a website. They are used to remember information about the user, such as login status, preferences, and other session details. Here’s a brief overview of their importance in modern web development:

  • Session Management
  • Personalization
  • Tracking and Analytics
  • Targeted Advertising

Few-shot Learning

Few-shot learning shows high-quality examples to the model so it understands the intent and generates the desired output for the task. It often leads to better performance than zero-shot learning. One drawback of this technique is high token consumption: after adding the examples and the actual task to the prompt, you might exceed the token limit.

Text: as a singular character study, the drama was perfect

Sentiment: positive

Text: once you get through the accents, all the performances becomes abysmal and pain to watch

Sentiment: negative

Text: the paintings provide a beauty, conquer over fear, exposed as history corners them

Sentiment: positive

Text: this is one for the ages & great things to come

Sentiment:
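A few-shot prompt like the one above can be built programmatically from labeled examples. This is a minimal sketch; the function name and formatting are illustrative, not tied to any particular API.

```python
def few_shot_prompt(examples, query):
    """Prepend labeled (text, sentiment) demonstrations before the unlabeled query."""
    parts = [f"Text: {text}\nSentiment: {label}" for text, label in examples]
    parts.append(f"Text: {query}\nSentiment:")
    return "\n\n".join(parts)

examples = [
    ("as a singular character study, the drama was perfect", "positive"),
    ("once you get through the accents, all the performances becomes abysmal and pain to watch", "negative"),
]
print(few_shot_prompt(examples, "this is one for the ages & great things to come"))
```

The model sees the pattern established by the demonstrations and completes the final `Sentiment:` line.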

In few-shot learning, the LM can become biased by the provided examples for the following reasons:

  • Majority label bias - occurs when there is an unbalanced distribution of labels among the examples
  • Recency bias - occurs when the model repeats the labels from the most recent examples
  • Common token bias - occurs when the LM favors common tokens due to their high frequency in text, while rare tokens are often ignored

To mitigate such bias, you can calibrate the label probabilities output by the model so that they are uniform when the input is a content-free string such as "N/A".
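One way to implement this calibration is to measure the label probabilities the model assigns to the content-free "N/A" input, divide them out, and renormalize. The sketch below illustrates the idea; the probability values are invented for illustration.

```python
def calibrate(probs, baseline):
    """Correct label probabilities by the model's bias on a content-free input.

    probs:    label -> probability for the real input
    baseline: label -> probability for the content-free ("N/A") input
    """
    adjusted = {label: probs[label] / baseline[label] for label in probs}
    z = sum(adjusted.values())
    return {label: v / z for label, v in adjusted.items()}

# Suppose the model leans toward "positive" even on "N/A":
baseline = {"positive": 0.7, "negative": 0.3}
raw = {"positive": 0.6, "negative": 0.4}
print(calibrate(raw, baseline))  # "negative" wins once the bias is removed
```

With a uniform baseline the calibration is a no-op, which is the desired behavior for an unbiased model.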

Tips for example selection and ordering

You can use k-NN in the embedding space to select examples that are semantically similar to the test input.

You can use graph-based approaches to select a diverse and representative set of examples, avoiding candidates that are close neighbors of already-selected ones.

A good practice is to keep the examples random and diverse to avoid majority label bias and recency bias.

You can order the examples to have a balanced distribution of labels.
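The k-NN selection tip above can be sketched with cosine similarity over example embeddings. The toy two-dimensional vectors below stand in for real sentence embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def select_examples(query_vec, candidates, k):
    """Pick the k candidates whose embeddings are most similar to the query."""
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:k]]

candidates = [
    {"text": "great movie", "vec": [1.0, 0.1]},
    {"text": "tax forms due", "vec": [0.0, 1.0]},
    {"text": "loved the film", "vec": [0.9, 0.2]},
]
print(select_examples([1.0, 0.0], candidates, 2))
```

In a real pipeline the vectors would come from a sentence-embedding model, and the selected texts would be formatted as few-shot demonstrations.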

Instruction Prompting

Few-shot prompting can be expensive in token usage and may limit the length of the remaining input, which can cause a loss of context.

When interacting with instruction models, we should describe the task in detail, covering every constraint and specifying what to do in case of out-of-context data.

[Instruction + Task]: Given a sentence, translate it from English to French.

[Example 1]:

Input: The cat is on the table.
Output: Le chat est sur la table.

[Example 2]:

Input: I am going to the market.
Output: Je vais au marché.

[Question]:

Input: She loves reading books.
Output: [Your model's response here]

Chain-of-Thought (CoT)

The model generates its reasoning step by step, as a sequence of sentences known as reasoning chains or rationales, that eventually leads to the final answer.

Two main types of CoT prompting:

Few-shot CoT prompts the model with a few demonstrations of high-quality reasoning chains, explicitly showing the steps needed to solve a problem.

Prompt: Mary has 10 candies. She eats 3 candies and then buys 7 more. How many candies does Mary have now? Let's think step by step.

Response:

Mary starts with 10 candies.

She eats 3 candies.

So, she has 10 − 3 = 7 candies left.

Then she buys 7 more candies.

Now she has 7 + 7 = 14 candies.

So, Mary has 14 candies.

Prompt: John has 15 books. He gives 5 books to his friend and then receives 8 more books as a gift. How many books does John have now? Let's think step by step.

Response: ...

Zero-shot CoT

In zero-shot CoT, the model is asked to solve a problem with reasoning steps without being provided with any examples.

Prompt: Mary travels 5 miles. She reaches the destination in 10 mins. What is the speed of Mary? Let's think step by step.
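The answer the model's reasoning chain should reach can be verified with a quick calculation: 5 miles in 10 minutes is 5 × 60 / 10 = 30 miles per hour.

```python
distance_miles = 5
time_minutes = 10
speed_mph = distance_miles * 60 / time_minutes  # convert minutes to hours
print(speed_mph)  # 30.0
```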

Optimize your AI outputs with best practices in prompt engineering! Explore practical data science services that enhance AI performance. Reach out for more information!

 
