Clustering: Extracting Patterns from Data

This repository contains a project that uses clustering to extract patterns from data. The dataset holds consumption data for anonymized customers of a bank. The KMeans method groups these customers into clusters based on how they use the bank's services. The silhouette, Davies-Bouldin, and Calinski-Harabasz metrics are then used to evaluate how well the clusters are separated. Finally, the characteristics of each cluster are analyzed to identify the customer pattern it represents.

The goal of this project is to explore methods for clustering customers based on their banking service usage patterns and to analyze the characteristics of each cluster.
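The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the real banking dataset is not shown here, so the data is a synthetic stand-in generated with `make_blobs`, and the number of clusters (4), the number of features (5), and the use of `StandardScaler` are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data standing in for the customer usage table.
n_clusters = 4
X, _ = make_blobs(n_samples=500, centers=n_clusters, n_features=5, random_state=42)

# Scale features so that no single variable dominates the Euclidean distance.
X_scaled = StandardScaler().fit_transform(X)

# Fit KMeans and assign each sample to a cluster.
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=42)
labels = kmeans.fit_predict(X_scaled)

# Internal validation metrics (no ground-truth labels needed).
scores = {
    "silhouette": silhouette_score(X_scaled, labels),                # higher is better, in [-1, 1]
    "davies_bouldin": davies_bouldin_score(X_scaled, labels),        # lower is better, >= 0
    "calinski_harabasz": calinski_harabasz_score(X_scaled, labels),  # higher is better
}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")

# Simple cluster profile: size and mean of each (scaled) feature per cluster.
sizes = np.bincount(labels, minlength=n_clusters)
cluster_means = np.vstack(
    [X_scaled[labels == k].mean(axis=0) for k in range(n_clusters)]
)
for k in range(n_clusters):
    print(f"cluster {k}: size={sizes[k]}, means={np.round(cluster_means[k], 2)}")
```

Comparing the per-cluster means against each other is one simple way to describe what distinguishes each group; in practice the same loop could be repeated for several values of `n_clusters` to see how the three metrics change.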

The project focuses on the following:
- Discover how to validate and interpret results with unlabeled data.
- Learn techniques to help interpret cluster information.
- Extract information about customer behavior using data from a credit card company.
- Use scikit-learn to generate clusters and calculate different validation metrics.
- Understand the mathematics behind validation metrics: silhouette, Davies-Bouldin, and Calinski-Harabasz.
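
For reference, the three validation metrics named above are commonly defined as shown below. These are the standard textbook definitions, not formulas taken from this project; the notation ($k$ clusters $C_1, \dots, C_k$, $n$ samples, centroids $c_q$, overall mean $c$, Euclidean distance $d$) is chosen here only for illustration.

```latex
% Silhouette coefficient of sample i (the score is the mean of s(i) over all samples)
%   a(i): mean distance from i to the other samples in its own cluster
%   b(i): smallest mean distance from i to the samples of any other cluster
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad -1 \le s(i) \le 1

% Davies-Bouldin index (lower is better)
%   \sigma_q: mean distance from the samples of cluster q to its centroid c_q
\mathrm{DB} = \frac{1}{k} \sum_{q=1}^{k} \max_{r \ne q}
              \frac{\sigma_q + \sigma_r}{d(c_q, c_r)}

% Calinski-Harabasz index (higher is better)
%   ratio of between-cluster to within-cluster dispersion
\mathrm{CH} = \frac{\sum_{q=1}^{k} n_q \,\lVert c_q - c \rVert^2 \,/\, (k - 1)}
                   {\sum_{q=1}^{k} \sum_{x \in C_q} \lVert x - c_q \rVert^2 \,/\, (n - k)}
```

Silhouette rewards samples that sit closer to their own cluster than to the nearest other one, Davies-Bouldin penalizes pairs of clusters that are wide relative to the distance between their centroids, and Calinski-Harabasz grows when centroids are spread far apart compared with the scatter inside each cluster.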

Developed: Sep 2023

Published: Jul 15, 2024