What is sklearn.datasets.fetch_olivetti_faces() in Python?
The fetch_olivetti_faces() function from the sklearn.datasets module loads the Olivetti faces dataset, downloading it first if it is not already cached locally.
Dataset overview
The Olivetti faces dataset contains 10 different 64x64 grayscale images of each of 40 distinct subjects (classes). The images of a subject vary in lighting conditions, facial expressions, and so on. In total, there are 400 samples (images per subject * classes), each with a dimensionality of 4096 (64 * 64 pixels). The target values are integers from 0 to 39 that indicate the identity of the person.
Details
Type | Description |
Classes | 40 |
Total samples | 400 |
Dimensionality | 4096 |
Features | real values between 0 and 1 |
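The figures in the table follow directly from the image layout. The short sketch below recomputes them; the labels array is a synthetic stand-in (an assumption for illustration, not the downloaded data), so the snippet runs without fetching anything.

```python
import numpy as np

n_subjects = 40            # classes
images_per_subject = 10    # images of each person
height, width = 64, 64     # pixels per image

n_samples = n_subjects * images_per_subject   # total samples
n_features = height * width                   # dimensionality of a flattened image

# Stand-in labels: one integer subject id per image (illustrative only)
labels = np.repeat(np.arange(n_subjects), images_per_subject)
counts = np.bincount(labels)                  # number of images per subject

print(n_samples, n_features)  # 400 4096
```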
Syntax
sklearn.datasets.fetch_olivetti_faces(*,
data_home=None,
shuffle=False,
random_state=0,
download_if_missing=True,
return_X_y=False
)
Parameters
- data_home: This parameter is of type str and its default value is None. It specifies another download and cache folder for the datasets.
- shuffle: This parameter is of type bool and its default value is False. If it is True, the order of the images is shuffled so that images of the same person are not grouped together.
- random_state: This parameter is of type int (a RandomState instance or None is also accepted) and its default value is 0. It seeds the random number generator used to shuffle the dataset.
- download_if_missing: This parameter is of type bool and its default value is True. If it is False and the data is not available locally, an IOError (an alias of OSError in Python 3) is raised instead of downloading the data.
- return_X_y: This parameter is of type bool and its default value is False. If it is True, the tuple (data, target) is returned instead of the Bunch object.
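As a rough illustration of what shuffle and random_state control, the sketch below applies one seeded permutation to a pair of arrays. Here shuffle_together is a hypothetical helper and the arrays are synthetic stand-ins, so this is not scikit-learn's actual implementation; it only shows that a fixed seed gives a reproducible order and that rows and labels stay aligned.

```python
import numpy as np

def shuffle_together(data, target, random_state=0):
    """Permute the rows of data and target with the same seeded order."""
    rng = np.random.RandomState(random_state)
    order = rng.permutation(len(target))
    return data[order], target[order]

# Stand-in arrays: labels grouped by subject, and one "feature" per row
# that simply stores the row's own label so alignment is easy to check.
target = np.repeat(np.arange(40), 10)
data = target.reshape(-1, 1).astype(np.float32)

X1, y1 = shuffle_together(data, target, random_state=0)
X2, y2 = shuffle_together(data, target, random_state=0)

print(np.array_equal(y1, y2))      # same seed, same order
print((X1[:, 0] == y1).all())      # rows and labels remain aligned
```

With the real dataset, the corresponding call would be fetch_olivetti_faces(shuffle=True, random_state=0, return_X_y=True), which returns the shuffled (data, target) pair.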
Return value
data: It is a dictionary-like Bunch object with the following attributes:
- data: ndarray of shape (400, 4096). Each row corresponds to a flattened face image whose original size is 64x64 pixels.
- target: ndarray of shape (400,). These are the labels of the face images. They range from 0 to 39 and correspond to the subject IDs.
- images: ndarray of shape (400, 64, 64). Each entry is a face image that belongs to one of the 40 subjects or classes of the dataset.
- DESC: It holds the description of the modified Olivetti faces dataset.
(data, target): A tuple returned instead of the Bunch object if return_X_y is set to True.
Note:
A shape of (400,) denotes a one-dimensional array of 400 labels.
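The relationship between the data and images attributes can be sketched with stand-in arrays of the documented shapes. The random pixel values and the label layout below are assumptions made so the snippet runs offline; with the real dataset, faces.data, faces.images, and faces.target would take their place.

```python
import numpy as np

rng = np.random.RandomState(0)

# Stand-ins with the documented shapes (not the real face images)
images = rng.rand(400, 64, 64).astype(np.float32)   # like faces.images
data = images.reshape(len(images), -1)              # like faces.data
target = np.repeat(np.arange(40), 10)               # like faces.target

# A row of data is a flattened image; reshaping recovers the 64x64 picture
first_image = data[0].reshape(64, 64)

print(data.shape, images.shape, target.shape)
```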
Explanation
The code below shows how the fetch_olivetti_faces(*[, …]) function works.
# Load useful libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces

# Check whether an image is a true color (RGB) image
def is_colored(image):
    # A color image has three channels
    if len(image.shape) == 3:
        R, G, B = image[:, :, 0], image[:, :, 1], image[:, :, 2]
        # Identical channels mean a grayscale image stored as RGB
        return not ((R == G).all() and (G == B).all())
    return False

# method to show images as grid
def show_images(images, grid=True, total_cols=2, figsize=(30, 20)):
    assert len(images) > 0
    assert isinstance(images[0], np.ndarray)
    # number of images to display, i.e., 6 in this example
    totalImages = len(images)
    total_cols = min(totalImages, total_cols)
    total_rows = int(totalImages / total_cols) + (1 if totalImages % total_cols != 0 else 0)
    # Create a grid of subplots.
    fig, axes = plt.subplots(total_rows, total_cols, figsize=figsize)
    # Create list of axes for easy iteration.
    if isinstance(axes, np.ndarray):
        list_axes = list(axes.flat)
    else:
        list_axes = [axes]
    # draw each image in its own cell of the grid
    for i in range(totalImages):
        img = images[i]
        list_axes[i].imshow(img, cmap='gray')
        list_axes[i].grid(grid)
    for i in range(totalImages, len(list_axes)):
        list_axes[i].set_visible(False)

# loading dataset
image_data = fetch_olivetti_faces()
# creating list of 6 images
images = list(image_data.images[:6])
# Using show_images method to display images
show_images(images, figsize=(30, 20))
plt.show()  # render the figure when run as a script
- Lines 7-13: The is_colored() helper checks whether an image array has three color channels.
- Lines 15-36: The show_images() function displays a list of images as a grid.
- Line 39: Fetches the Olivetti faces dataset (originally from AT&T Laboratories Cambridge) into image_data.
- Line 41: Creates a list of the first 6 images as the images variable.
- Line 43: Calls show_images() to display the images as a grid.
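The grid size used by show_images() comes down to a ceiling division. The sketch below isolates that step; grid_shape is a hypothetical helper introduced here for illustration and is not part of the original code or of any library.

```python
import math

def grid_shape(n_images, max_cols=2):
    """Return (rows, cols) for laying out n_images in at most max_cols columns."""
    if n_images <= 0:
        raise ValueError("n_images must be positive")
    cols = min(n_images, max_cols)        # never more columns than images
    rows = math.ceil(n_images / cols)     # round up so a partial row still fits
    return rows, cols

print(grid_shape(6))  # (3, 2)
print(grid_shape(5))  # (3, 2)
print(grid_shape(1))  # (1, 1)
```

With an odd number of images, the last row has an empty cell, which is why show_images() hides the axes that are left over.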