Generate synthetic dataset
Explore how to generate synthetic datasets in scikit-learn for both classification and regression tasks. Understand key parameters for creating controlled data distributions, visualize sample data, and prepare artificial datasets to enhance your machine learning experiments.
In the last lesson, we showed how to load scikit-learn's built-in datasets.
In addition to those built-in datasets, scikit-learn also provides functions that generate synthetic data drawn from controlled distributions.
Generate classification dataset
One of the functions scikit-learn provides for building artificial datasets is make_classification, which generates a random n-class classification dataset.
Note: The default number of classes is 2; you can change it with the n_classes parameter.
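As a minimal sketch of how make_classification can be called, the snippet below builds a small two-class dataset and inspects its shape. The parameter values (100 samples, 4 features, a fixed random_state) are illustrative assumptions chosen for this sketch, not values taken from the lesson.

```python
import numpy as np
from sklearn.datasets import make_classification

# Illustrative values only (assumptions for this sketch).
X, y = make_classification(
    n_samples=100,     # number of rows to generate
    n_features=4,      # total number of feature columns
    n_informative=2,   # features that actually carry class signal
    n_redundant=0,     # no linear combinations of informative features
    n_classes=2,       # the default, written out here for clarity
    random_state=42,   # fixed seed so the data is reproducible
)

print(X.shape)       # (100, 4): one row per sample, one column per feature
print(y.shape)       # (100,): one class label per sample
print(np.unique(y))  # the distinct class labels that were generated
```

Setting random_state makes repeated runs return the same data, which helps when comparing models on an artificial dataset. To generate more classes, raise n_classes, keeping in mind that n_classes times n_clusters_per_class must not exceed 2 raised to the power of n_informative.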
In this example below, the data has two ...