I just remembered the mathematical buzzword/buzzterm(?) for the analysis of machine learning: "statistical learning theory". The Notices of the American Mathematical Society ran an expository article on it some time in the last 10 years. That might not sound great to you, but it's probably the best of its kind unless it's been updated. You can probably find it for free by searching Google Scholar with the obvious keywords.

I used to be interested in machine learning, specifically support vector machines, since I liked their mathematical rigor. I read most of a book on support vector machines. I attended a workshop on mathematical learning theory (I forget exactly what the phrase is), PAC learning and that sort of thing. There's a whole new branch of mathematics about it. I think Stephen Smale (of Smale horseshoe fame) switched from dynamical systems to mathematical learning theory.

I didn't actually DO much with it, such as programming. I didn't have anyone to work with, and I didn't have enough knowledge or background to conduct research. So I gave it up.

Someone else mentioned R. It's free and great for stats, so I'll bet people use it for machine learning. You'd have no trouble picking it up, given your background.

