Module 6: Traditional ML Methods


Topic 2: Clustering and Nearest Neighbors

Unsupervised learning methods

Our second set of foundational ML methods covers a traditional unsupervised method (clustering) and a method that does not build an explicit model (nearest neighbors). Your reading for this topic is Section 19.7 (Nonparametric models).

Clustering

Clustering is often seen as the canonical unsupervised method. I sometimes argue that it is not purely unsupervised, since you do give the algorithm some information to make it work (for example, K-means needs to be told the number of clusters), but that information is very minimal. There are a lot of clustering methods out there, and I give an overview of a basic one that is used in many applications. K-means clustering forms the core of many of the fancier clustering algorithms, so it is a good one to learn about first.
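
To make the idea concrete, here is a minimal sketch of the standard K-means loop (often called Lloyd's algorithm): assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until the assignments stop changing. The function names, the random initialization from the data, and the fixed seed are assumptions made for illustration; they are not taken from the slides.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iters=100, seed=0):
    """Return (centroids, assignments) for a list of tuple points.

    Assumes 1 <= k <= len(points). Initial centroids are k distinct
    data points chosen at random (one simple choice among several).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iters):
        # Assignment step: index of the closest centroid for each point.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing changed, so the loop has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = mean_point(members)
    return centroids, assignments
```

Calling `k_means(points, k=2)` on a list of 2-D tuples returns the final centroids and one cluster index per point. Note that the only guidance the algorithm receives is `k`; the result can depend on the initial centroids, so it is common to try several seeds.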

Copy of the slides from me, copy of the slides I also showed from Dr Moore

K-nearest Neighbors

Lazy learning sounds like the perfect learning method for a student who is smack in the middle of lots of deadlines, right? This next model is a lazy learning model because, instead of learning a specific model and its parameters (such as a tree or a set of cluster centers), it lets the data be its own model: the training examples are simply stored, and the real work happens when a prediction is requested.

The video covers two methods: K-nearest neighbors and kernel regression. They sound different, but kernel regression is really just the same nearest-neighbor idea with the neighbors weighted by how close they are to the query point.
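
One way to see the connection is to write both predictors side by side. The sketch below assumes a regression setting, Euclidean distance, and a Gaussian kernel with a `bandwidth` parameter; those choices and all function names are illustrative assumptions, not necessarily the ones used in the video.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_regress(train_x, train_y, query, k):
    """Plain average of the targets of the k training points nearest to query."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


def kernel_regress(train_x, train_y, query, bandwidth):
    """Weighted average of ALL targets; closer points get larger weights.

    Uses a Gaussian kernel (an assumption). If the query is extremely far
    from every training point, the weights can underflow to zero; this
    sketch does not guard against that case.
    """
    weights = [
        math.exp(-euclidean(x, query) ** 2 / (2 * bandwidth ** 2))
        for x in train_x
    ]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, train_y)) / total
```

In K-nearest neighbors every one of the k closest points has weight 1 and everything else has weight 0. Kernel regression replaces that hard cutoff with weights that fade smoothly with distance, and with a very small bandwidth it behaves much like 1-nearest neighbor.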

Copy of the slides from me, copy of the slides I also showed from Dr Moore

Note that there is a more advanced, but still conceptually simple and powerful, method shown at the end of Dr Moore's slides that I did not cover but that might be useful for one of your ML projects: locally weighted regression. Feel free to read about it in the slides and to use it in the project if you think it would be a good fit!
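
If you want a feel for locally weighted regression before reading the slides, here is a rough one-dimensional sketch: for each query, fit a straight line by weighted least squares, with weights that favor nearby training points, and evaluate that line at the query. The Gaussian weighting, the `bandwidth` parameter, and the function name are assumptions for illustration; the slides may set it up differently.

```python
import math


def locally_weighted_regress(xs, ys, query, bandwidth):
    """Predict y at `query` from 1-D data using a locally fitted line."""
    # Gaussian weights: training points near the query count the most.
    w = [math.exp(-((x - query) ** 2) / (2 * bandwidth ** 2)) for x in xs]
    total = sum(w)
    # Weighted means of x and y.
    mean_x = sum(wi * x for wi, x in zip(w, xs)) / total
    mean_y = sum(wi * y for wi, y in zip(w, ys)) / total
    # Weighted least-squares slope of the local line.
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0  # flat line if x has no spread
    return mean_y + slope * (query - mean_x)
```

Compared with kernel regression, which predicts a local weighted average (a local constant), this fits a local line, so it can follow a trend right up to the edge of the data instead of flattening out there.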

Exercise

Complete the exercise on clustering and nearest neighbors.