Implementing machine learning by comparing the accuracy of clustering algorithms on gas turbine emission data
Algorithms compared:
- K-Medoids (PAM)
- CLARA
R packages used:
- readxl
- Amelia
- ggplot2
- GGally
- knitr
- caret (confusion matrix)
- openintro
- dplyr
- cluster (cluster analysis)
- factoextra (cluster visualization)
- clValid (cluster validation)
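As a rough illustration of how the two algorithms are called through the `cluster` package, here is a minimal sketch. The data is a synthetic stand-in (three Gaussian blobs), not the emission dataset, and the choice of `k = 3`, `samples = 50` and `sampsize = 100` is an assumption made only for this example.

```r
library(cluster)  # provides pam() and clara()

set.seed(42)

# Hypothetical data: three Gaussian blobs in two dimensions (300 rows total),
# standing in for the scaled emission features.
x <- rbind(
  matrix(rnorm(200, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(200, mean = 3, sd = 0.5), ncol = 2),
  matrix(rnorm(200, mean = 6, sd = 0.5), ncol = 2)
)
x <- scale(x)

k <- 3  # assumed number of clusters

# PAM searches for medoids using the full dataset.
pam_fit <- pam(x, k = k)

# CLARA runs PAM on random subsamples and keeps the best set of medoids.
clara_fit <- clara(x, k = k, samples = 50, sampsize = 100)

# Cross-tabulate the two cluster assignments to see how far they agree.
print(table(pam = pam_fit$clustering, clara = clara_fit$clustering))
```

Cluster labels are arbitrary, so agreement shows up as one dominant cell per row of the table rather than as a diagonal.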
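- PAM's connectivity is lower than CLARA's. Lower connectivity is better, so PAM outperforms CLARA on this measure.
- PAM's Dunn index is higher than CLARA's. A higher Dunn index is better, so PAM outperforms CLARA on this measure as well.
- PAM's silhouette is closer to 1 than CLARA's. Values near 1 indicate compact, well-separated clusters, so PAM again outperforms CLARA.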
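The three internal validation measures can be computed along the following lines. This is a sketch on synthetic data, not the project's actual script: the data, `k = 3`, and the helper `internal_scores()` are hypothetical, while `connectivity()` and `dunn()` come from `clValid` and `silhouette()` from `cluster`.

```r
library(cluster)  # pam(), clara(), silhouette()
library(clValid)  # connectivity(), dunn()

set.seed(42)

# Hypothetical data standing in for the scaled emission features.
x <- rbind(
  matrix(rnorm(200, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(200, mean = 3, sd = 0.5), ncol = 2),
  matrix(rnorm(200, mean = 6, sd = 0.5), ncol = 2)
)
d <- dist(x)  # Euclidean distances between all observations

pam_cl <- pam(x, k = 3)$clustering
clara_cl <- clara(x, k = 3, samples = 50)$clustering

# Hypothetical helper: collect the three measures for one cluster assignment.
internal_scores <- function(cl, d) {
  c(
    connectivity = connectivity(distance = d, clusters = cl),  # lower is better
    dunn = dunn(distance = d, clusters = cl),                  # higher is better
    silhouette = mean(silhouette(cl, d)[, "sil_width"])        # closer to 1 is better
  )
}

scores <- rbind(
  PAM = internal_scores(pam_cl, d),
  CLARA = internal_scores(clara_cl, d)
)
print(round(scores, 3))
```

On well-separated toy data like this, both methods will usually score almost identically. The differences reported above come from the emission data itself.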
However, CLARA is faster, because it is designed for large datasets: it applies PAM to samples of the data instead of to the full dataset.
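One way to check the speed difference is to time both calls with `system.time()`. The sketch below uses hypothetical random data with an assumed size of 3000 rows and 4 columns. Actual timings depend on the machine and on the dataset, so no expected output is shown.

```r
library(cluster)  # pam(), clara()

set.seed(1)

# Hypothetical data: 3000 observations with 4 numeric features.
n <- 3000
x <- matrix(rnorm(n * 4), ncol = 4)

# Elapsed wall-clock time in seconds for each algorithm.
t_pam <- system.time(pam(x, k = 3))["elapsed"]
t_clara <- system.time(clara(x, k = 3, samples = 5))["elapsed"]

print(c(pam = unname(t_pam), clara = unname(t_clara)))
```

The gap typically widens as `n` grows, since PAM works with all pairwise dissimilarities while CLARA only handles those within each sample.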