Large Margin Nearest Neighbors


(Thanks to John Blitzer, who gave me this cake for my 30th birthday.)

Here is the original image from the paper:




Large Margin Nearest Neighbor Classifiction is a NIPS05 paper in which we show how to learn a Mahanalobis distance metric for k-nearest neighbor (kNN) classification by semidefinite programming. The metric is trained with the goal that the k-nearest neighbors always belong to the same class while examples from different classes are separated by a large margin. On seven data sets of varying size and difficulty, we find that metrics trained in this way lead to significant improvements in kNN classification---for example, achieving a test error rate of 1.3% on the MNIST handwritten digits. Our approach has many parallels to support vector machines, including a convex objective function based on the hinge loss, but does not require modifications for problems with large numbers of classes.

More details about the algorithm can be found on the Wiki page and in the original JMLR paper.

MATLAB CODE

Stable version: Download Page

(If you have trouble compiling mex files, try to run the demo without install.m - the binaries for several architectures are included.)

Fernando J. Iglesias Garcia has implemented LMNN in Python and included it in the Shogun toolbox. He has also created a very nice iPython notebook that guides you through the steps of getting LMNN to work.


The code is based on the very simple alternating projection algorithm. Please let me know about any problems you might encounter with the implementation.