Best Practices for Prompt Engineering

When you write input prompts for CodeWhisperer, such as natural language comments, one important concept is prompt engineering. Prompt engineering is the process of refining your interactions with large language models (LLMs) to improve the model's output. In this case, we want to refine the prompts we give CodeWhisperer so that it produces better code.

Practice 1: Keep prompts specific & concise

When crafting prompts for CodeWhisperer, be concise without losing sight of your objective. Overly complex prompts lead to poor results. A good prompt contains just enough information to convey the request clearly and concisely.

Example:

*Good prompt*
# create a function that eliminates duplicate lines in a text file

*Ineffective prompt*
# create a function to look for lines of code that are seen multiple times throughout the file and delete them

In summary, focused, straightforward prompts help CodeWhisperer understand exactly what you want and provide better output.
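
As a rough illustration, the good prompt above could lead to a function along these lines. This is a hand-written sketch, not actual CodeWhisperer output; the name dedupe_file, the in-place rewrite, and the returned count are all assumptions made for the example.

```python
def dedupe_file(path):
    # Hypothetical helper: removes duplicate lines from the file at `path`, in place
    with open(path) as f:
        lines = f.readlines()
    # dict.fromkeys keeps the first occurrence of each line and preserves order
    unique_lines = list(dict.fromkeys(lines))
    with open(path, 'w') as f:
        f.writelines(unique_lines)
    # Report how many duplicate lines were dropped
    return len(lines) - len(unique_lines)
```

Lines are compared exactly, including the trailing newline, so a final line without a newline is treated as different from the same text followed by one.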

Practice 2: Additional context in prompts

Additional context can aid prompt engineering. While specific and concise prompts are crucial, some additional context helps CodeWhisperer's comprehension.
It gives the model more information about the task, topic, and goal of the requested response.

Example:

# Create a function called "remove_duplicate_lines" that takes a filename as a parameter
# The function should:
# 1. Open the file and read each line
# 2. Check if the line has already been seen, if so skip it
# 3. If it's unique, write it to a new file
# 4. Close both files
# 5. Return the name of the new file
def remove_duplicate_lines(filename):
    output_filename = 'output.txt'
    # The with statement closes both files when the block exits
    with open(filename) as rf, open(output_filename, 'w') as wf:
        # A set only contains unique elements, so it acts as our duplicate check
        seen = set()
        # Iterate over each line
        for line in rf:
            # Check if the current line was not seen before
            if line not in seen:
                # Write it to the output file
                wf.write(line)
                # Add the current line to our seen set
                seen.add(line)
    return output_filename

Practice 3: Utilizing multiple comments

Multiple comments can be a useful technique in prompt engineering. When used strategically, they let you offer more context without sacrificing brevity or cluttering the prompt.
They let you break a problem or task into smaller sub-problems or steps, which helps models like CodeWhisperer understand what is being asked at a granular level.
Each comment can focus on a specific aspect or requirement, which keeps individual comments focused and clear.
You can experiment with the ordering of comments to see how it affects the response. Different orderings may work better for certain types of problems.

Example:

# Create a function that removes duplicate lines from a text file
# The function should take a filename as a parameter
# Open the input file and read all lines into a set
# A set will only contain unique elements
# Iterate over each line of the file
# Check if the line is already in the set. If so, skip it. If not, add it to the set and write to the output file
# Close both files
# Return the name of the output file

Practice 4: Context taken from comments and code

Taking context from both comments and code can be an effective technique when prompting CodeWhisperer. It lets the model understand the overall task or problem from multiple pieces of contextual information.
The code sample provides structural and syntactic context that guides the model toward generating syntactically valid code in the correct programming language and style.

Example:

import pandas as pd
#open a csv file and return a list of lines in alphabetical order
#Insert a period at the end of each line
#Replace duplicate lines with a single line
def open_csv(filename):
    df = pd.read_csv(filename)
    df = df.sort_values(by='line')
    df = df.drop_duplicates(subset='line')
    df['line'] = df['line'] + '.'
    return df['line'].tolist()

Because pandas is imported, CodeWhisperer infers that we likely intend to use it in the solution. This lets it provide a more relevant recommendation using pandas functions such as read_csv(), sort_values(), and drop_duplicates().
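
To see how much the import steers the suggestion, compare a roughly equivalent version that uses only the standard library, which is the kind of solution one might get when no pandas import is present. This is an illustrative sketch, not CodeWhisperer output; the function name open_csv_stdlib is hypothetical, and like the example above it assumes the CSV has a header column named line.

```python
import csv


def open_csv_stdlib(filename):
    # Hypothetical stdlib counterpart of open_csv; assumes a header column called 'line'
    with open(filename, newline='') as f:
        values = [row['line'] for row in csv.DictReader(f)]
    # set() drops duplicates and sorted() orders the remaining values alphabetically
    return [value + '.' for value in sorted(set(values))]
```

Both versions sort, deduplicate, and append a period, but the surrounding code decides which library the solution is built on.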

Practice 5: Prompts with cross-file context

When prompts draw on cross-file context, CodeWhisperer can generate code that is more coherent and consistent with the existing codebase. This helps maintain a unified coding style and architecture.
CodeWhisperer uses context from one file to generate code recommendations in another file. In the example below, it analyzes the function, understands its purpose and interface, and generates code to validate it.

Example:

import unittest
from example4 import open_csv

#create unit tests for the open_csv function from example4.py file
class TestOpenCsv(unittest.TestCase):
    def test_open_csv(self):
        # Assumes the 'line' column of example4.csv contains only a, b, and c
        self.assertEqual(open_csv('example4.csv'), ['a.', 'b.', 'c.'])

In this example, we write a comment that references the open_csv function to prompt CodeWhisperer for unit tests. With this prompt, we use CodeWhisperer's cross-file context to help generate the tests.
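
Generated tests are easier to trust when the expected values can be checked against known input. The following is a minimal sketch, assuming the CSV has a line column: the test writes its own temporary file, so the expected list follows directly from the fixture. The class name, the fixture file name, and the stand-in open_csv (defined here only so the block runs without pandas) are assumptions for illustration.

```python
import csv
import os
import tempfile
import unittest


def open_csv(filename):
    # Stand-in for example4.open_csv so this sketch runs on its own;
    # in a real test file you would write `from example4 import open_csv`.
    with open(filename, newline='') as f:
        values = {row['line'] for row in csv.DictReader(f)}
    return [value + '.' for value in sorted(values)]


class TestOpenCsvWithFixture(unittest.TestCase):
    def setUp(self):
        # Write a small CSV with a known 'line' column, including one duplicate
        self.tmpdir = tempfile.TemporaryDirectory()
        self.path = os.path.join(self.tmpdir.name, 'fixture.csv')
        with open(self.path, 'w', newline='') as f:
            f.write('line\nb\na\nb\nc\n')

    def tearDown(self):
        # Remove the temporary directory and the fixture file inside it
        self.tmpdir.cleanup()

    def test_sorted_unique_with_period(self):
        self.assertEqual(open_csv(self.path), ['a.', 'b.', 'c.'])
```

Saved in a test module, this can be run with python -m unittest.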

Practice 6: Chain of thought prompting

Chain of thought prompting is a prompt engineering technique that lets large language models (LLMs) produce more natural, contextual outputs by linking multiple prompts together to solve a complex problem.
With CodeWhisperer, we can use this technique to break a very complex coding task into smaller steps, so that CodeWhisperer provides more accurate suggestions for the use case.

Example:

With a single-prompt approach, the suggestion only validates the file name; it never reads the input or processes the file:

import logging
import os
'''
Take a user's input using the input() function, store it in a variable called filename, and create a function
that will validate the input using the isalnum() method and ensure the file ends in .csv, then process
the file accordingly.
'''
def validate_file(filename):
    # isalnum() is False for any string containing '.', so check the name without its extension
    stem, extension = os.path.splitext(filename)
    return stem.isalnum() and extension == '.csv'

With a chain-of-thought approach, each comment prompts for one step:

import logging
import os
from example4 import open_csv

# Comment 1
# Take a user's input using the input() function and store it in a variable called filename
filename = input("Enter the name of the file you want to read: ")

# Comment 2
# Create a function that will take a filename as an input
def open_file(filename):
    # Comment 3
    # Validate the input using the isalnum() method and ensure the file ends in .csv, then process the file using logging.info()
    # isalnum() is False for any string containing '.', so check the name without its extension
    stem, extension = os.path.splitext(filename)
    if stem.isalnum() and extension == '.csv':
        lines = open_csv(filename)
        logging.info(lines)
        return lines
    else:
        print('Invalid file name')
        return None
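
One thing to check when running the chain-of-thought result: Python's root logger only emits messages at WARNING level and above by default, so logging.info(lines) prints nothing until logging is configured. The following is a minimal sketch of one way to enable it; the format string and the sample list are arbitrary choices for illustration.

```python
import logging

# Lower the root logger's threshold to INFO so logging.info() output is shown.
# force=True (Python 3.8+) replaces any handlers that were configured earlier.
logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s', force=True)

# Sample data standing in for the list returned by open_csv()
logging.info(['a.', 'b.', 'c.'])
```

With this configuration in place before open_file() is called, the logged lines are written to standard error.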