
Logistics eCommerce model: k-means analysis

Now that you have gained your first insight into the model outputs, you can deepen your understanding of the patterns and relationships between results using cluster analysis.

You will use the k-means algorithm to understand the main drivers of your model's behavior and to classify data points into groups with similar properties. This will help identify bottlenecks in the real-world e-commerce/logistics operation that your model represents.
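To make "classifying data points into groups" concrete, here is a minimal sketch on made-up data, not the course dataset. The two synthetic groups and all variable names are assumptions for illustration; vq is the SciPy helper that assigns each observation to its nearest centroid.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq, whiten

rng = np.random.default_rng(0)

# Hypothetical data: two well-separated groups of 2-D points
group_a = rng.normal(loc=[1.0, 1.0], scale=0.1, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.1, size=(50, 2))
points = np.vstack([group_a, group_b])

# Scale each column to unit standard deviation before clustering
scaled = whiten(points)

# Find two centroids, then label each point with its nearest centroid
codebook, distortion = kmeans(scaled, 2)
labels, dists = vq(scaled, codebook)

print("Centroids (whitened units):")
print(codebook)
print("Points per cluster:", np.bincount(labels))
```

Points that share a label sit closest to the same centroid, which is what lets you read each cluster as a group of similar observations.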

kmeans and whiten have been imported from scipy.cluster.vq, and matplotlib.pyplot has been imported as plt. The original and whitened datasets contain the columns listed below. The index variable p gives the column index of each process in the datasets.

  • column 1 (p=0): time_requests
  • column 2 (p=1): time_packaging
  • column 3 (p=2): time_shipping
  • column 4 (p=3): sum/total time
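The column layout above can be sketched with synthetic numbers. Everything here is hypothetical: the duration ranges, the number of orders, and the assumption that the fourth column is the plain sum of the first three. whiten divides each column by its standard deviation, so every whitened column ends up with unit standard deviation.

```python
import numpy as np
from scipy.cluster.vq import whiten

rng = np.random.default_rng(42)
n_orders = 200  # hypothetical number of simulated orders

# Hypothetical process durations in days (not the course data)
time_requests = rng.uniform(0.5, 2.0, n_orders)
time_packaging = rng.uniform(1.0, 3.0, n_orders)
time_shipping = rng.uniform(2.0, 10.0, n_orders)

# Columns p=0..2 hold the processes; p=3 is assumed to be their sum
demo_processes = np.column_stack([time_requests, time_packaging, time_shipping])
demo_processes = np.column_stack([demo_processes, demo_processes.sum(axis=1)])

# Whitening rescales each column by its own standard deviation
demo_whitened = whiten(demo_processes)

print("Std per column before:", demo_processes.std(axis=0))
print("Std per column after: ", demo_whitened.std(axis=0))
```

Without this rescaling, the column with the largest spread (here shipping and total time) would dominate the distance calculation in k-means.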

This exercise is part of the course

Discrete Event Simulation in Python


Exercise instructions

  • Whiten the record_processes_np array to prepare it for k-means clustering.
  • Run the k-means method from the SciPy package on the whitened array, setting it to find three clusters.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Whiten the record_processes_np array
whitened = ____(record_processes_np)

# Run the k-means method on whitened, using three clusters
codebook, distortion = ____(whitened, ____)

fig, axs = plt.subplots(3)
for p in range(3):
    axs[p].scatter(whitened[:, 3], whitened[:, p], marker=".", label=process_names[p])
    axs[p].scatter(codebook[:, 3], codebook[:, p], label='Cluster Centroids')
    axs[p].legend(loc='center left', bbox_to_anchor=(1, 0.5))
    axs[p].set_ylabel('Process duration (days)')
    axs[p].set_xlabel('Total duration (days)')
plt.show()
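
Whitened values and centroids are expressed in units of each column's standard deviation rather than in days. If you want to read the centroids in the original units, one option is to multiply them by the per-column standard deviation. The sketch below shows this on made-up data; the array demo, its value ranges, and all variable names are assumptions, not part of the exercise.

```python
import numpy as np
from scipy.cluster.vq import kmeans, whiten

rng = np.random.default_rng(1)

# Hypothetical durations in days, laid out like the exercise columns
parts = rng.uniform([0.5, 1.0, 2.0], [2.0, 3.0, 10.0], size=(300, 3))
demo = np.column_stack([parts, parts.sum(axis=1)])

# whiten divides each column by this standard deviation
col_std = demo.std(axis=0)
demo_whitened = whiten(demo)

# Cluster in whitened space, asking for three centroids
demo_codebook, demo_distortion = kmeans(demo_whitened, 3)

# Undo the scaling to express the centroids in days
centroids_days = demo_codebook * col_std

print("Centroids (whitened units):")
print(demo_codebook)
print("Centroids (days):")
print(centroids_days)
```

Reading the centroids in days makes it easier to see which process accounts for most of the total duration in each cluster.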