This is the website for the statistical machine learning summer seminar, which takes place on Tuesdays at 10:00 am in KED B015.
Mathematical methods in data science: a lightning introduction
Date: May 13, 2014
Speaker: Vladimir Pestov
This is the opening lecture of a summer study group on statistical machine learning, organized for the benefit of the speaker's students. The lecture is based on a mini-course of the same title, taught in Brazil to an audience of about 50 undergraduates in late April and early May: http://mtm.ufsc.br/coloquiosul/notas_minicurso_8.pdf The expected length of the lecture is between 2 and 3 hours.
Slides for this presentation:
Probably Approximately Correct Learning
Date: May 20, 2014
Speaker: Stan Hatko
Hamming Cube and Other Stuff
Date: May 27, 2014
Speaker: Sabrina Sixta
Python Tutorial
Date: June 3, 2014
Speaker: Emilie Idene
k-NN Learning Rule
Date: June 3, 2014
Speaker: Yue Dong
Ottawa Mathematics Conference 2014 Talks
At the Ottawa Mathematics Conference 2014, several talks were given by members of this seminar. These talks are:
Borel Dimensionality Reduction of Data and Supervised Learning
Speaker: Stan Hatko
Abstract: In this talk we discuss Borel dimensionality reduction of datasets with the purpose of subsequently applying supervised learning algorithms. We will start by introducing the notions of a classifier, a learning algorithm (in particular the k-NN learning algorithm), and universal consistency of a learning algorithm. Any universally consistent learning algorithm, for instance k-NN, remains universally consistent after an injective Borel map is applied. This means we can reduce the dimensionality of a high-dimensional dataset by applying an injective Borel map into a lower-dimensional space and subsequently apply a supervised learning algorithm. We will give some concrete examples of applying Borel dimensionality reduction to actual datasets. We will see that selecting a different Borel map at each step, depending on the sample, is equivalent to choosing from a family of metrics on the domain at each step for the k-NN learning algorithm. We would like to determine under what conditions this produces a universally consistent classifier and avoids the problem of overfitting. We will show that k-NN with respect to sample-dependent norms is universally consistent provided the family of norms satisfies certain conditions.
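The idea of reducing dimension with an injective Borel map before running k-NN can be sketched as follows. This is a minimal illustration, not the construction from the talk: we interleave the decimal digits of two coordinates, which is an injective Borel map from the discretized unit square into the line, and then train an ordinary k-NN classifier on the one-dimensional image.

```python
# Minimal sketch (an illustrative assumption, not the talk's actual
# construction): reduce R^2 data to R via a digit-interleaving
# injective Borel map, then apply the k-NN learning rule.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def interleave(x, y, digits=6):
    """Injectively map (x, y) in [0, 1)^2 to one real number by
    interleaving the first `digits` decimal digits of x and y."""
    xi = int(x * 10**digits)
    yi = int(y * 10**digits)
    out = 0
    for d in range(digits):
        dx = (xi // 10**(digits - 1 - d)) % 10
        dy = (yi // 10**(digits - 1 - d)) % 10
        out = out * 100 + dx * 10 + dy
    return out / 10**(2 * digits)

rng = np.random.default_rng(0)
X = rng.random((500, 2))
y = (X[:, 0] + X[:, 1] > 1).astype(int)        # a simple synthetic label

Z = np.array([[interleave(a, b) for a, b in X]]).T   # 2-D -> 1-D image
clf = KNeighborsClassifier(n_neighbors=5).fit(Z[:400], y[:400])
print("test accuracy on reduced data:", clf.score(Z[400:], y[400:]))
```

Note that such a map preserves injectivity (and hence, by the talk's theorem, universal consistency in the limit) but badly distorts the metric, so finite-sample accuracy may suffer; that tension is exactly why the choice of Borel map matters.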
A new feature selection technique: Mass Transportation Score
Speaker: Gaël Giordano
Abstract: The presentation introduces and defines the notions of learning algorithms and feature selection. We then focus on a new feature selection technique called the Mass Transportation Score (MTS), developed by Dr. Pestov's research team at the University of Ottawa. The MTS is based on the notion of distance between two finitely supported measures. We first rigorously construct the MTS, then investigate its performance for an individual feature using a genetic dataset from the UOHI.
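The general idea of scoring a feature by a distance between two finitely supported measures can be illustrated with a small sketch. This is a hypothetical simplification, not the actual MTS definition from the talk: for a single feature, we compare the two class-conditional empirical distributions with the one-dimensional earth mover's (Wasserstein) distance.

```python
# Hypothetical sketch of a mass-transportation-style feature score:
# the Wasserstein distance between the empirical distributions of a
# feature restricted to each class. Not the exact MTS construction.
import numpy as np
from scipy.stats import wasserstein_distance

def transport_score(feature, labels):
    """Earth mover's distance between the feature's empirical
    distributions on class 0 and class 1 samples."""
    return wasserstein_distance(feature[labels == 0], feature[labels == 1])

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 1000)
informative = rng.normal(loc=2.0 * y)       # mean shifts with the label
noise = rng.normal(size=1000)               # independent of the label

print("informative feature:", transport_score(informative, y))  # large
print("noise feature:", transport_score(noise, y))              # near zero
```

A feature whose class-conditional distributions are far apart in transportation distance is more useful for classification, which is the intuition such a score formalizes.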
On the Problem of Universal Consistency of the kNN Classifier in Banach Spaces
Speaker: Yue Dong
Abstract: The kNN classifier is one of the oldest natural classification algorithms for labeling data. The classical theorem of Stone says that this algorithm is universally consistent in finite-dimensional Euclidean space. Since then, the result has been generalized in many directions. In particular, we will show that the algorithm is no longer universally consistent in some infinite-dimensional spaces. Of particular interest is the problem of universal consistency in more general infinite-dimensional Banach spaces (so-called functional learning). In this talk, we will survey and discuss what is known about this problem.
Proof of Optimality of the Bayes Classifier
Date: June 10, 2014
Speaker: Gaël Giordano
Concentration Function and Other Stuff
Date: June 17, 2014
Speaker: Sabrina Sixta
VC dimension
Date: June 17, 2014
Speaker: Gaël Giordano
Unsupervised Learning, Quantum Machine Learning
Date: June 24, 2014
Speaker: Samuel Buteau
Decision Trees and Random Forests: Part 1
Date: July 8, 2014
Speaker: Stan Hatko
Information on how to create decision trees and random forests in R.
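The talk covers building these models in R; as a rough companion, the same workflow can be sketched in Python with scikit-learn (an assumption on our part, not the talk's actual code): fit a single decision tree and a random forest on a standard dataset and compare their test accuracy.

```python
# Sketch of the decision tree / random forest workflow using
# scikit-learn (the talk itself uses R; this is only an analogue).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=3).fit(X_tr, y_tr)          # one tree
forest = RandomForestClassifier(n_estimators=100,
                                random_state=0).fit(X_tr, y_tr)     # bagged ensemble

print("single tree accuracy:", tree.score(X_te, y_te))
print("random forest accuracy:", forest.score(X_te, y_te))
```

The forest averages many trees grown on bootstrap samples with random feature subsets, which typically reduces variance relative to a single tree.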
More on the Concentration Function
Date: June 17, 2014
Speaker: Sabrina Sixta
HPCVL Documentation
Random Forests: Part 2
Date: July 22, 2014
Speaker: Stan Hatko
Random Forests: Part 3
Date: July 22, 2014
Speaker: Stan Hatko