How to use YData Profiling for exploratory data analysis?
YData Profiling is an open-source Python package that automates exploratory data analysis. Given a pandas DataFrame, it generates an interactive, web-based report with just a few lines of code. The report summarizes the dataset as a whole and each column individually, which helps us understand even wide, complex datasets quickly.
Now, we’ll look at how to install ydata-profiling and use it to create an interactive report for a given dataset.
Installation
The ydata-profiling module can be easily installed using the pip command provided below:
pip install ydata-profiling
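To confirm that the installation worked, we can ask Python which version it finds. The snippet below is a small sketch that uses only the standard library; the helper name installed_version is hypothetical and not part of ydata-profiling.

```python
from importlib import metadata, util


def installed_version(dist_name="ydata-profiling", module_name="ydata_profiling"):
    """Return the installed version string, or None if the package is missing."""
    # find_spec only locates the module; it does not import it.
    if util.find_spec(module_name) is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print("ydata-profiling version:", version if version else "not installed")
```

If the output says "not installed", rerun the pip command in the same environment that runs the script.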
Syntax
The ydata-profiling module provides a class called ProfileReport that generates the report for the provided dataset.
ydata_profiling.ProfileReport(df, **kwargs)
Parameters
ProfileReport takes one required parameter along with multiple optional parameters that customize the report.
Note: Only the df parameter is required. The rest are optional.
Arguments | Type | Description |
df | DataFrame | Dataset to be analyzed |
| boolean | If |
title | string | Title for the report, shown in the header and title bar. |
pool_size | int | Number of workers in the pool. Default value is the number of CPUs. |
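The optional parameters are passed as keyword arguments. The sketch below assumes the keyword names title, minimal, and pool_size; minimal is used here as an example of a boolean option and may not be the one the table describes. The helpers report_options and build_report are hypothetical names introduced for illustration.

```python
def report_options(title, quick=False, workers=0):
    """Collect keyword arguments for ProfileReport (hypothetical helper)."""
    return {
        "title": title,        # shown in the report header and title bar
        "minimal": quick,      # assumed option: skip the more expensive computations
        "pool_size": workers,  # assumed option: 0 leaves the worker count to the library
    }


def build_report(df, **options):
    """Create a report for df using the given options."""
    # Imported lazily so report_options can be used without the package installed.
    from ydata_profiling import ProfileReport

    return ProfileReport(df, **options)
```

A call such as build_report(data, **report_options("Iris Report", quick=True)) would then return a report object for the DataFrame data.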
Example
The following code shows how we can use ProfileReport with a pandas DataFrame and serve the resulting report from a Flask app:
from flask import Flask, render_template
import pandas as pd
from ydata_profiling import ProfileReport
app = Flask(__name__, template_folder='template')

@app.route('/')
def home():
    data = pd.read_csv("IRIS.csv")  # load the dataset into a DataFrame
    profile = ProfileReport(data)  # build the profiling report
    profile.to_file("template/Profiling_Report_Results.html")  # the template folder must already exist
    return render_template("Profiling_Report_Results.html")  # looked up inside the template folder

if __name__ == '__main__':
    app.run(debug=True, host="0.0.0.0", port=5000)
Explanation
Line 1: We import Flask and render_template from the flask library.
Line 2: We use import pandas as pd to import the pandas library.
Line 3: We use from ydata_profiling import ProfileReport to import the ProfileReport class from the ydata-profiling library.
Line 4: We create a Flask app whose templates are loaded from the template folder.
Line 8: We read the dataset using the pd.read_csv() function.
Line 9: We use profile = ProfileReport(data) to generate the report for the dataset.
Line 10: We write the result to an .html file using the profile.to_file() method.
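Flask is only needed to serve the report in a browser. To produce just the .html file, the same three steps (read, profile, write) can run as a plain script. The following is a minimal sketch; the helper names report_path_for and profile_csv and the "_profile.html" suffix are assumptions made for illustration.

```python
from pathlib import Path


def report_path_for(csv_path, out_dir="."):
    """Derive an .html report path from a CSV path (hypothetical helper)."""
    return Path(out_dir) / (Path(csv_path).stem + "_profile.html")


def profile_csv(csv_path, out_dir="."):
    """Profile a CSV file and write the report as an .html file."""
    # Imported lazily so the path helper above works without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    data = pd.read_csv(csv_path)
    out_path = report_path_for(csv_path, out_dir)
    report = ProfileReport(data, title=f"Profile of {Path(csv_path).name}")
    report.to_file(out_path)
    return out_path
```

Calling profile_csv("IRIS.csv") would write IRIS_profile.html to the current directory, and the file can then be opened directly in a browser.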