Generating datasets for clustering

Synthetic data contains no records of real individuals, which makes it a valid, privacy-conscious alternative to raw data under privacy laws and regulations. The make_blobs() function from scikit-learn's datasets module generates data points that follow a Gaussian (normal) distribution around each cluster center.
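To make the idea of Gaussian clusters concrete, here is a minimal NumPy-only sketch of drawing points around fixed centers. It is an illustration of the concept, not the actual implementation of make_blobs(); the centers, the sample count per center, and the seed are hypothetical values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility

# Hypothetical cluster centers and spread, picked only for illustration.
centers = np.array([[-5.0, 0.0], [5.0, 0.0]])
cluster_std = 3.0
samples_per_center = 500

# For each center, draw points from a normal distribution with that mean
# and the same standard deviation along every feature.
points = np.vstack([
    rng.normal(loc=center, scale=cluster_std, size=(samples_per_center, 2))
    for center in centers
])

# Label each point with the index of the center it was drawn around.
labels = np.repeat(np.arange(len(centers)), samples_per_center)

print(points.shape)  # (1000, 2)
```

Each row of points is one sample, and labels records which center it belongs to, which mirrors the pair of arrays that make_blobs() returns.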

In this exercise, you will generate a dataset of 15000 samples.

numpy has already been imported as np, and the custom function plot_data_points() has been provided again for this exercise.

Import the function for generating clustering datasets from scikit-learn's datasets module.
Generate a dataset of 15000 samples with 2 features, 2 centers, and a cluster standard deviation of 3.
Print the shape of the resulting generated data.
Inspect the resulting data points in a 2-dimensional scatter plot.
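The four steps above can be sketched as follows. This is one possible solution, not necessarily the course's reference answer. The random_state argument is an assumption added so the result is reproducible, and scatter_blobs() is a hypothetical stand-in for the provided plot_data_points() helper, whose signature is not shown here.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Generate 15000 samples with 2 features around 2 centers,
# each cluster having a standard deviation of 3.
# random_state is an assumption, added only for reproducibility.
X, labels = make_blobs(
    n_samples=15000,
    n_features=2,
    centers=2,
    cluster_std=3,
    random_state=42,
)

# Print the shape of the generated data: (n_samples, n_features).
print(X.shape)  # (15000, 2)


def scatter_blobs(points, cluster_labels):
    """Hypothetical stand-in for the course's plot_data_points() helper."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    # Color each point by the cluster it was generated from.
    plt.scatter(points[:, 0], points[:, 1], c=cluster_labels, s=2)
    plt.xlabel("feature 0")
    plt.ylabel("feature 1")
    plt.show()
```

Calling scatter_blobs(X, labels) draws the 2-dimensional scatter plot. With a standard deviation of 3, the two clusters may overlap depending on where the randomly chosen centers land.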
