gas-turbine-emision-clustering

A machine learning project that compares the performance of clustering algorithms on gas turbine emission data.

Datasets 💾

Algorithms 🤖

  • K-Medoids (PAM)
  • CLARA
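
For readers unfamiliar with K-Medoids, the following is a minimal sketch of a PAM-style swap search in plain Python. It is not this repository's code (which relies on the R `cluster` package), and it simplifies real PAM: the initial medoids are just the first `k` points rather than the result of a BUILD phase, and any improving swap is accepted immediately. The function names and toy data are illustrative assumptions.

```python
from itertools import product


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(euclidean(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Simplified PAM: swap medoids with non-medoids while the cost drops."""
    medoids = list(range(k))  # assumption: first k points as starting medoids
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for slot, candidate in product(range(k), range(len(points))):
            if candidate in medoids:
                continue
            # Try replacing one medoid with a non-medoid point.
            trial = medoids[:slot] + [candidate] + medoids[slot + 1:]
            cost = total_cost(points, trial)
            if cost < best - 1e-12:
                medoids, best, improved = trial, cost, True
    # Assign every point to the cluster of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: euclidean(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


# Toy data (made up for illustration): two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = pam(points, 2)
print(medoids, labels)
```

Unlike k-means centroids, the medoids returned here are indices of actual observations, which is why the method is less sensitive to outliers.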

Packages 📦︎

  • readxl
  • Amelia
  • ggplot2
  • GGally
  • knitr
  • caret (confusion matrix)
  • openintro
  • dplyr
  • cluster (cluster analysis)
  • factoextra (cluster visualization)
  • clValid (cluster validation)

Conclusion 💻︎


  • PAM has lower connectivity than CLARA. Lower connectivity is better, so PAM wins on this measure.
  • PAM has a higher Dunn index than CLARA. A higher Dunn index is better, so PAM wins on this measure.
  • PAM has a silhouette width closer to 1 than CLARA. Values closer to 1 are better, so PAM wins on this measure.
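
To make the three measures concrete, here is a hedged sketch of how each can be computed from points and cluster labels. It follows the usual textbook definitions (connectivity as a 1/rank penalty for nearest neighbours in another cluster, Dunn index as minimum separation over maximum diameter, silhouette as the mean of (b - a) / max(a, b)). The repository obtains these values from the R `clValid` package, so the function names, the `n_neighbors` default, and the toy data below are assumptions for illustration only.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def connectivity(points, labels, n_neighbors=2):
    """Add 1/rank whenever a point's rank-th nearest neighbour is in another cluster."""
    n = len(points)
    penalty = 0.0
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: euclidean(points[i], points[j]))
        for rank, j in enumerate(others[:n_neighbors], start=1):
            if labels[j] != labels[i]:
                penalty += 1.0 / rank
    return penalty


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter."""
    pairs = [(i, j) for i in range(len(points)) for j in range(i + 1, len(points))]
    separation = min(euclidean(points[i], points[j])
                     for i, j in pairs if labels[i] != labels[j])
    diameter = max(euclidean(points[i], points[j])
                   for i, j in pairs if labels[i] == labels[j])
    return separation / diameter


def mean_silhouette(points, labels):
    """Average silhouette width over all points (needs at least two clusters)."""
    n = len(points)

    def mean_dist(i, cluster):
        members = [j for j in range(n) if labels[j] == cluster and j != i]
        if not members:
            return None
        return sum(euclidean(points[i], points[j]) for j in members) / len(members)

    widths = []
    for i in range(n):
        a = mean_dist(i, labels[i])
        if a is None:  # singleton cluster: width 0 by convention
            widths.append(0.0)
            continue
        b = min(mean_dist(i, c) for c in set(labels) if c != labels[i])
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / n


# Toy data (made up): one sensible labelling and one deliberately mixed labelling.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = [0, 0, 0, 1, 1, 1]
mixed = [0, 1, 0, 1, 0, 1]
for name, labels in (("good", good), ("mixed", mixed)):
    print(name, connectivity(points, labels),
          dunn_index(points, labels), mean_silhouette(points, labels))
```

On this toy data the sensible labelling scores lower connectivity and higher Dunn index and silhouette width than the mixed one, which is the same direction of comparison used in the bullets above.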

However, CLARA is faster, because it is designed for large datasets: it clusters samples of the data rather than the full dataset.
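
The speed difference comes from where the expensive medoid search runs. A rough sketch of the CLARA idea, under stated assumptions: draw a few random samples, find medoids on each small sample, score those medoids against the full dataset, and keep the best set. Here an exhaustive search over the sample stands in for PAM to keep the example short, and the parameter names (`n_samples`, `sample_size`, `seed`) and the synthetic data are hypothetical, not taken from this repository or from the R `cluster::clara` interface.

```python
import random
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid (medoids are points)."""
    return sum(min(euclidean(p, m) for m in medoids) for p in points)


def best_medoids_in_sample(sample, k):
    """Exhaustive k-medoids on a small sample (a stand-in for PAM in this sketch)."""
    return min(combinations(sample, k), key=lambda ms: total_cost(sample, ms))


def clara(points, k, n_samples=5, sample_size=12, seed=0):
    """Search for medoids on samples only, then score them on all points."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(n_samples):
        sample = rng.sample(points, min(sample_size, len(points)))
        medoids = best_medoids_in_sample(sample, k)
        cost = total_cost(points, medoids)  # evaluated on the full dataset
        if cost < best_cost:
            best, best_cost = medoids, cost
    return best, best_cost


# Synthetic data (made up): two 7x7 grids of points, far apart from each other.
grid = [(x * 0.1, y * 0.1) for x in range(7) for y in range(7)]
points = grid + [(10 + x, 10 + y) for x, y in grid]
medoids, cost = clara(points, 2)
print(medoids, cost)
```

Because the medoid search only ever sees `sample_size` points, its cost does not grow with the full dataset; the trade-off is that the result depends on the samples drawn, which is consistent with PAM scoring better on the validation measures above.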