Using LLMs to generate code snippets can greatly assist developers in their coding. While LLMs cannot perform as well as a senior engineer (yet), they can be used to create code skeletons or provide useful starting points.
To make the most of this approach, start by setting the context with a clear problem statement or outline of your required code. Proceed by asking specific questions about the code you need, such as function definitions, loop structures, or syntax usage.
If the initial response is unsatisfactory, provide additional context or ask follow-up questions to guide the LLM in the right direction. Once the model generates a code snippet, review it for correctness and adapt it to your requirements.
Note: While LLMs can be a valuable resource, applying your programming expertise to validate and refine the generated snippets is essential.
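The workflow above (context first, then a specific request, then any supporting data) can be sketched as a small helper that assembles the prompt text. This is only an illustration: `build_prompt` and its parameter names are hypothetical, and the result is a plain string that you would paste into whichever LLM interface you use.

```python
def build_prompt(context: str, task: str, sample: str = "") -> str:
    """Assemble a prompt: context first, then the specific request, then sample data."""
    parts = [f"Context: {context.strip()}", f"Task: {task.strip()}"]
    if sample.strip():
        # Supporting data is optional; include it only when provided.
        parts.append("Sample data:\n" + sample.strip())
    return "\n\n".join(parts)


prompt = build_prompt(
    context="I am writing a Python script that works with CSV files.",
    task="Write a function that reads a CSV file and prints each row.",
    sample="Name,Age,City\nJohn,25,New York",
)
print(prompt)
```

If the first response misses the mark, the same helper can be called again with a richer context or a narrower task, which mirrors the follow-up step described above.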
Let's try to generate some code using an LLM. For this example, we will be using ChatGPT and asking the LLM to help us read the following CSV file in Python.
Name,Age,City
John,25,New York
Emily,30,London
Michael,35,Paris
Sophia,28,Tokyo
Daniel,32,Sydney
Olivia,27,Berlin
David,29,Toronto
Emma,31,Rome
Liam,26,Madrid
A prompt is an input or query that guides the model to generate a relevant response or output. Let’s start with a short and simple prompt.
Prompt: How can I read a CSV file in Python?
Output:
import csv

filename = "data.csv"

with open(filename, "r") as file:
    reader = csv.reader(file)
    for row in reader:
        # Process each row here
        print(row)
The model returned generic code, as expected. Let's give it additional context so that it can fill in the "# Process each row here" comment. We will pass the contents of our CSV file to the model as context. This works in our case because the file is small; for larger files, sending just the header and the column data types might be better.
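For a larger file, a compact summary of the header and column types could be produced with a few lines of Python before writing the prompt. The sketch below is one possible approach, not a standard recipe: `infer_type` and `summarize_csv` are hypothetical helper names, the type inference only distinguishes int, float, and str, and it looks at the first few rows only.

```python
import csv
import io
from itertools import islice


def infer_type(values):
    """Return 'int', 'float', or 'str' depending on what every value parses as."""
    if not values:
        return "str"
    for caster, label in ((int, "int"), (float, "float")):
        try:
            for value in values:
                caster(value)
            return label
        except ValueError:
            continue
    return "str"


def summarize_csv(text, sample_rows=5):
    """Map each column name to a type inferred from the first few rows."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(islice(reader, sample_rows))
    return {name: infer_type([row[name] for row in rows]) for name in reader.fieldnames}


csv_text = "Name,Age,City\nJohn,25,New York\nEmily,30,London\nMichael,35,Paris\n"
summary = summarize_csv(csv_text)
print(", ".join(f"{name} ({kind})" for name, kind in summary.items()))
# prints: Name (str), Age (int), City (str)
```

The printed line is short enough to paste into a prompt in place of the full file contents.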
Newer models such as Google’s Gemini offer context windows of up to 2 million tokens! This can allow us to send entire code bases as context!
Prompt: Process the following CSV file to extract the name and city for each row:
Name,Age,City
John,25,New York
Emily,30,London
Michael,35,Paris
Sophia,28,Tokyo
Daniel,32,Sydney
Olivia,27,Berlin
David,29,Toronto
Emma,31,Rome
Liam,26,Madrid
Output:
import csv

filename = "data.csv"

# Reading data from CSV file
with open(filename, "r") as file:
    reader = csv.DictReader(file)
    for row in reader:
        name = row["Name"]
        city = row["City"]
        print(f"Name: {name}, City: {city}")
The LLM has used the additional context we provided to process each row based on our requirements. The code appears to be correct; however, let’s validate the code by running it.
While LLMs typically generate safe code, it is good practice to review the code thoroughly before execution. The generated code may be unsafe or could result in harmful or unexpected behavior.
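Part of that review can be done without executing anything. As a rough sketch, Python's standard `ast` module can parse a generated snippet and list what it imports and whether it calls functions such as `eval`. The watch lists (`RISKY_MODULES`, `RISKY_CALLS`) and the `review` function are assumptions made for illustration; a check like this supplements reading the code and does not replace it.

```python
import ast

generated_code = '''
import csv

filename = "data.csv"
with open(filename, "r") as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row["Name"], row["City"])
'''

# Hypothetical watch lists; adjust them to your own environment.
RISKY_MODULES = {"os", "subprocess", "shutil", "socket"}
RISKY_CALLS = {"eval", "exec", "__import__"}


def review(source):
    """Return (imported_modules, warnings) without executing the source."""
    tree = ast.parse(source)
    imported, warnings = set(), []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imported.add(node.module.split(".")[0])
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in RISKY_CALLS:
                warnings.append(f"line {node.lineno}: call to {node.func.id}()")
    warnings.extend(f"imports risky module: {name}" for name in sorted(imported & RISKY_MODULES))
    return imported, warnings


modules, warnings = review(generated_code)
print("Imports:", sorted(modules))
print("Warnings:", warnings or "none found")
```

For the CSV snippet, the only import reported is `csv` and no warnings are raised, which matches what a manual read of the code suggests.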
The generated code is reproduced below. Run it to check that it works.
import csv

filename = "data.csv"

# Reading data from CSV file
with open(filename, "r") as file:
    reader = csv.DictReader(file)
    for row in reader:
        name = row["Name"]
        city = row["City"]
        print(f"Name: {name}, City: {city}")
Voila! The code is working as expected and printing the names and cities. We can also tune the output format as per our needs.
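As one example of tuning the output, the sketch below prints the name and city in aligned columns instead of the "Name: ..., City: ..." form. It reads from an in-memory string rather than data.csv so that it runs anywhere; `format_rows`, the `SAMPLE` data, and the 10-character column width are choices made for this illustration.

```python
import csv
import io

# A two-row excerpt of the lesson's CSV, kept in memory for the example.
SAMPLE = """Name,Age,City
John,25,New York
Emily,30,London
"""


def format_rows(file_obj):
    """Return one aligned 'name | city' line per CSV row."""
    reader = csv.DictReader(file_obj)
    # '<10' left-aligns the name in a field 10 characters wide.
    return [f"{row['Name']:<10}| {row['City']}" for row in reader]


lines = format_rows(io.StringIO(SAMPLE))
print("\n".join(lines))
```

Because `format_rows` accepts any file-like object, the same function works with `open("data.csv", "r")` once the formatting looks right.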
Modern LLMs are good at explaining code as well as generating it. Let’s prompt the LLM to explain the code it generated.
Prompt: Please explain the following code line by line:
import csv

filename = "data.csv"

# Reading data from CSV file
with open(filename, "r") as file:
    reader = csv.DictReader(file)
    for row in reader:
        name = row["Name"]
        city = row["City"]
        print(f"Name: {name}, City: {city}")
Output:
Certainly! Here's an explanation of each line of the code:
import csv: This line imports the csv module, which provides functionality for working with CSV files in Python.
filename = "data.csv": This line assigns the filename “data.csv” to the variable filename. This is the name of the CSV file that we want to read.
with open(filename, "r") as file: This line opens the CSV file specified by the filename variable in read mode ("r"). The with statement ensures that the file is properly closed after it is processed, even if there are exceptions.
reader = csv.DictReader(file): This line creates a DictReader object called reader using the csv.DictReader class. The DictReader treats the first row of the CSV file as the column headers and allows us to access the data using column names as keys.
for row in reader: This line starts a loop that iterates over each row in the CSV file. The reader object acts as an iterator, and each iteration provides a dictionary-like row object representing a row of data.
name = row["Name"]: This line retrieves the value of the “Name” column from the current row using the column name as the key. It assigns the value to the variable name.
city = row["City"]: This line retrieves the value of the “City” column from the current row using the column name as the key. It assigns the value to the variable city.
print(f"Name: {name}, City: {city}"): This line prints the extracted values of name and city using an f-string. It displays the name and city information for each row in the CSV file.
By executing this code, it reads the CSV file, iterates through each row, extracts the “Name” and “City” values from each row, and prints them out in the specified format.
As you can see, the LLM is able to generate a very verbose explanation for the code. Let's try a more concise explanation prompt.
Prompt: Please give a brief overview (50 words) of the following code:
import csv

filename = "data.csv"

# Reading data from CSV file
with open(filename, "r") as file:
    reader = csv.DictReader(file)
    for row in reader:
        name = row["Name"]
        city = row["City"]
        print(f"Name: {name}, City: {city}")
Output:
This code imports the CSV module and assigns a file name data.csv. It then opens the file and reads its contents using the DictReader function from the CSV module. The code iterates over each row in the file and extracts the values of the “Name” and “City” columns. It then prints the name and city for each row in the file.
Code generation with LLMs involves giving the LLM a prompt describing the desired code’s functionality. The LLM then generates code that (hopefully) matches your description. Learning how to use prompts effectively is a key skill when it comes to using LLMs. It's like giving an AI assistant instructions on what program to write. This can save time for developers and even inspire new coding ideas. However, LLMs are still under development, so generated code might require review and refinement for accuracy and security.