Wednesday, 29 November 2017

Machine Learning Podcast

Linear Digressions

Machine Learnings 101

Book "Advanced Statistical Computing"

Advanced Statistical Computing

Monday, 13 November 2017

Heredoc With R Code Embedded

How to qsub an R script as a heredoc?

Learning Data Science

K-means Clustering

Cited from "Exploring Assumptions of K-means Clustering using R":

K-means clustering rests on two assumptions about the clusters: first, that they are spherical, and second, that they are of similar size. The spherical assumption helps the algorithm separate the clusters as it works through the data. If it is violated, the clusters formed may not be what one expects. The size assumption, in turn, helps determine the cluster boundaries and, with them, how many data points each cluster should contain. It also brings an advantage. Each cluster in K-means is defined by the mean of all the data points assigned to it. With this assumption, one can start with the cluster centers anywhere: the algorithm should still converge to the same final clusters as it would with the initial centers placed as far apart as possible.
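The claim about starting points can be tried on a toy example. The sketch below is in Python rather than R, uses only the standard library, and is not taken from the cited article. It assumes two well-separated, equally sized, roughly spherical blobs and k = 2. The names `make_blobs` and `kmeans` are hypothetical helpers written for this illustration. It runs plain Lloyd-style K-means from several random starts and from a deliberately far-apart start, so the final centers can be compared.

```python
import random


def make_blobs(seed=0, n=50):
    """Two spherical blobs of equal size around (0, 0) and (10, 10) (assumed toy data)."""
    rng = random.Random(seed)
    blob_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    blob_b = [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(n)]
    return blob_a + blob_b


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, init_centers, max_iter=100):
    """Plain Lloyd iteration; returns the final centers, sorted for easy comparison."""
    centers = [tuple(c) for c in init_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return sorted(centers)


points = make_blobs()

# Several runs, each starting from two randomly chosen data points.
results = []
for init_seed in range(5):
    init = random.Random(init_seed).sample(points, 2)
    results.append(kmeans(points, init))

# One run starting from the two data points that are farthest apart.
far_init = max(
    ((p, q) for p in points for q in points),
    key=lambda pq: sq_dist(pq[0], pq[1]),
)
far_result = kmeans(points, far_init)

for r in results + [far_result]:
    print([tuple(round(v, 3) for v in c) for c in r])
```

In this setup the runs are expected to agree on the final centers. With overlapping, elongated, or very unevenly sized clusters, or with more clusters, different starting points can lead K-means to different local optima, which is why implementations commonly offer multiple random restarts.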