Title: How to Do Machine Learning on Massive Astronomical Datasets

Abstract: I will describe algorithms and data structures that allow the most powerful machine learning methods, which often scale quadratically or even cubically with the number of data points, to run many orders of magnitude faster than naive implementations. Such techniques can make previously impossible statistical analyses tractable at the scale of entire sky surveys. I will discuss scalable algorithms we have developed for n-point correlations, friends-of-friends, nearest neighbors, kernel density estimation, nonparametric Bayes classification, principal component analysis, local linear regression, isometric non-negative matrix factorization, hidden Markov models, k-means, support vector machine-like classifiers, Gaussian process regression, and Gaussian graphical model inference, among others. In addition to techniques inspired by computational geometry, fast multipole methods, and Monte Carlo integration, we employ a distributed framework that can be thought of as a higher-order version of Google's MapReduce. Our algorithms have enabled several first-of-a-kind large-scale cosmological analyses.
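
As a rough illustration of the kind of speedup the abstract refers to, the sketch below counts pairs of points closer than a radius r, the basic operation behind a two-point correlation estimate. It is not the algorithm presented in the talk: it is a minimal single-tree kd-tree with bounding-box pruning, written for this note, and all names in it (`KDNode`, `count_within`, `tree_pair_count`, `leaf_size`) are hypothetical. The brute-force version checks every pair; the tree version skips whole boxes that lie entirely outside the radius and counts whole boxes that lie entirely inside it without visiting their points.

```python
import random


def sqdist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def brute_force_pair_count(points, r):
    """Count unordered pairs closer than r by testing all N*(N-1)/2 pairs."""
    r2 = r * r
    count = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if sqdist(points[i], points[j]) < r2:
                count += 1
    return count


class KDNode:
    """kd-tree node storing its bounding box and the number of points inside."""

    def __init__(self, points, leaf_size=8, depth=0):
        dims = len(points[0])
        self.lo = [min(p[d] for p in points) for d in range(dims)]
        self.hi = [max(p[d] for p in points) for d in range(dims)]
        self.count = len(points)
        self.points = None
        self.left = self.right = None
        if len(points) <= leaf_size:
            self.points = points  # leaf: keep the points themselves
        else:
            axis = depth % dims  # cycle through the coordinate axes
            pts = sorted(points, key=lambda p: p[axis])
            mid = len(pts) // 2  # median split
            self.left = KDNode(pts[:mid], leaf_size, depth + 1)
            self.right = KDNode(pts[mid:], leaf_size, depth + 1)


def count_within(node, q, r2):
    """Count points in the subtree whose squared distance to q is below r2."""
    dmin = 0.0  # squared distance from q to the nearest point of the box
    dmax = 0.0  # squared distance from q to the farthest corner of the box
    for x, lo, hi in zip(q, node.lo, node.hi):
        dmin += max(lo - x, 0.0, x - hi) ** 2
        dmax += max(x - lo, hi - x) ** 2
    if dmin >= r2:
        return 0  # exclusion prune: the whole box is out of range
    if dmax < r2:
        return node.count  # inclusion prune: the whole box is in range
    if node.points is not None:
        return sum(1 for p in node.points if sqdist(p, q) < r2)
    return count_within(node.left, q, r2) + count_within(node.right, q, r2)


def tree_pair_count(points, r):
    """Count unordered pairs closer than r using one tree query per point."""
    root = KDNode(points)
    r2 = r * r
    total = sum(count_within(root, p, r2) for p in points)
    # Each query also counts the point itself, and each pair is seen twice.
    return (total - len(points)) // 2


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.random(), rng.random()) for _ in range(2000)]
    print(brute_force_pair_count(pts, 0.05), tree_pair_count(pts, 0.05))
```

Both functions return the same count; the tree version simply avoids most of the distance evaluations when r is small relative to the extent of the data. Dual-tree variants, which traverse two trees at once, extend the same pruning idea further.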