How to implement the SmartDataframe of PandasAI
PandasAI is a Python library that extends pandas with a natural language interface. It uses a large language model (LLM) to generate Python code that answers questions about data, performs data analysis, and produces visualizations. In this Answer, we will learn how to use PandasAI's SmartDataframe to analyze a dataframe.
What is SmartDataframe?
SmartDataframe is a class in PandasAI that provides a high-level interface to the library. It wraps a dataframe and lets us ask questions about our data, perform data analysis, and generate visualizations in natural language.
Implementation
Let’s see how to create a SmartDataframe in Python and query it in natural language.
import pandas as pd
from pandasai import SmartDataframe

# Sample DataFrame
df = pd.DataFrame({
    "Movie Title": ["The Shawshank Redemption", "The Godfather", "Pulp Fiction", "The Dark Knight", "Forrest Gump", "Inception", "Schindler's List", "The Matrix", "Fight Club", "The Lord of the Rings: The Fellowship of the Ring"],
    "Year": [1994, 1972, 1994, 2008, 1994, 2010, 1993, 1999, 1999, 2001],
    "IMDb Rating": [9.3, 9.2, 8.9, 9.0, 8.8, 8.8, 8.9, 8.7, 8.8, 8.8],
    "Runtime (minutes)": [142, 175, 154, 152, 142, 148, 195, 136, 139, 178],
    "Genre": ["Drama", "Crime", "Crime", "Action", "Drama", "Action", "Biography", "Action", "Drama", "Adventure"]
})
from pandasai.llm import OpenAI
llm = OpenAI(api_token="OPENAI_API_KEY")

df = SmartDataframe(df, config={"llm": llm})
answer = df.chat('What are the five best movies?')
print(answer)

Note: Make sure to replace OPENAI_API_KEY with your actual OpenAI API key.
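As an aside to the note above, one common alternative to pasting the key into the script is to read it from an environment variable. The following is a minimal sketch under that assumption: the helper name read_api_key and the variable name OPENAI_API_KEY are illustrative choices made here, not part of PandasAI.

```python
import os


def read_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the environment variable `var_name`.

    Hypothetical helper for illustration; it is not part of PandasAI.
    """
    # Treat a missing or whitespace-only value as "not set".
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

The returned string could then be passed as the api_token argument in place of the hard-coded placeholder, for example OpenAI(api_token=read_api_key()).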
Code explanation
Line 2: We import SmartDataframe from pandasai. This is the class we will query in natural language.
Lines 5–11: We create a sample dataframe of movies containing each movie's title, year of release, IMDb rating, runtime in minutes, and genre.
Lines 12–13: We import the OpenAI class from the pandasai.llm module and initialize the language model (referred to as llm here) with our API key.
Lines 15–17: We instantiate a SmartDataframe object with the dataframe and the LLM, ask a question about our data in natural language using the chat method, and print the answer.
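To sanity-check an answer from the LLM, we can compute the same result directly with pandas. The sketch below assumes that "best" means the highest IMDb rating, which is only one plausible reading of the prompt, and the code PandasAI actually generates may differ. The names movies and top_five are chosen here for illustration.

```python
import pandas as pd

# Same sample data as in the listing, reduced to the columns the question needs.
movies = pd.DataFrame({
    "Movie Title": ["The Shawshank Redemption", "The Godfather", "Pulp Fiction",
                    "The Dark Knight", "Forrest Gump", "Inception",
                    "Schindler's List", "The Matrix", "Fight Club",
                    "The Lord of the Rings: The Fellowship of the Ring"],
    "IMDb Rating": [9.3, 9.2, 8.9, 9.0, 8.8, 8.8, 8.9, 8.7, 8.8, 8.8],
})

# Assumption: "best" means the highest IMDb rating.
top_five = movies.nlargest(5, "IMDb Rating")
print(top_five.to_string(index=False))
```

If the LLM's answer lists different movies, it may have interpreted "best" differently, so it helps to state the ranking criterion explicitly in the prompt.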
Now, let's try a new prompt and observe the response generated by the LLM.
import pandas as pd
from pandasai import SmartDataframe

# Sample DataFrame
df = pd.DataFrame({
    "Movie Title": ["The Shawshank Redemption", "The Godfather", "Pulp Fiction", "The Dark Knight", "Forrest Gump", "Inception", "Schindler's List", "The Matrix", "Fight Club", "The Lord of the Rings: The Fellowship of the Ring"],
    "Year": [1994, 1972, 1994, 2008, 1994, 2010, 1993, 1999, 1999, 2001],
    "IMDb Rating": [9.3, 9.2, 8.9, 9.0, 8.8, 8.8, 8.9, 8.7, 8.8, 8.8],
    "Runtime (minutes)": [142, 175, 154, 152, 142, 148, 195, 136, 139, 178],
    "Genre": ["Drama", "Crime", "Crime", "Action", "Drama", "Action", "Biography", "Action", "Drama", "Adventure"]
})
from pandasai.llm import OpenAI
llm = OpenAI(api_token="OPENAI_API_KEY")

df = SmartDataframe(df, config={"llm": llm})
answer = df.chat('Which is the second best movie of Crime genre?')
print(answer)
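As with the first prompt, the expected answer can be cross-checked in plain pandas. This is a minimal sketch, again assuming that "best" refers to the IMDb rating. The variable names are illustrative, and this is not necessarily the code the LLM produces.

```python
import pandas as pd

# Same sample data as in the listing, reduced to the columns the question needs.
movies = pd.DataFrame({
    "Movie Title": ["The Shawshank Redemption", "The Godfather", "Pulp Fiction",
                    "The Dark Knight", "Forrest Gump", "Inception",
                    "Schindler's List", "The Matrix", "Fight Club",
                    "The Lord of the Rings: The Fellowship of the Ring"],
    "IMDb Rating": [9.3, 9.2, 8.9, 9.0, 8.8, 8.8, 8.9, 8.7, 8.8, 8.8],
    "Genre": ["Drama", "Crime", "Crime", "Action", "Drama", "Action",
              "Biography", "Action", "Drama", "Adventure"],
})

# Keep only Crime movies, then rank them by rating (assumed meaning of "best").
crime = movies[movies["Genre"] == "Crime"].sort_values("IMDb Rating", ascending=False)

# Position 1 is the second row of the ranked table.
second_best = crime.iloc[1]["Movie Title"]
print(second_best)
```

In this sample, only two movies are tagged Crime, so the second row is Pulp Fiction (8.9), behind The Godfather (9.2).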