Recent years have seen substantial advances in theory and algorithms, together with gains in computational power. Data has also never been easier to obtain online. As a result, machine learning techniques have become increasingly popular for solving real-world problems. A number of machine learning techniques are now available for both continuous and categorical targets. K Nearest Neighbor (KNN), Gradient Boosting Machine (GBM), Random Forest, and Support Vector Machine (SVM) are among the techniques that have gained popularity in the recent past.
This paper focuses on one such technique, KNN. We discuss its fundamental principles, methodology, features, algorithm, and pros and cons as a modeling technique.
We propose the KNN methodology as a way to put the greater computational power of organizations' technology infrastructure to use. Further, we expect that over time KNN will stop being the pure black-box exercise it is for many modelers today, as modelers develop the “art” of determining an appropriate value of k. This paper should be particularly useful for institutions and individuals who apply KNN to make business decisions or conduct research.