Dataset Augmentation in Feature Space

Speaker: Graham Taylor (University of Guelph and Canadian Institute for Advanced Research)

Time: 1:30-2:30 pm, January 27 (Friday), 2017

Room: KED B015

Abstract: Dataset augmentation, the practice of applying a wide array of domain-specific transformations to synthetically expand a training set, is a standard tool in supervised learning. While effective in tasks such as visual recognition, the set of transformations must be carefully designed, implemented, and tested for every new domain, limiting its reuse and generality. In this talk, I will describe two recent efforts that transform data not in input space, but in a feature space found by unsupervised learning. The first is motivated by evidence that people mentally simulate transformations in space while comparing examples, so-called "mental rotation". We employ a model that learns relations between inputs rather than the inputs themselves. This "relational" model actively transforms pairs of examples so that they are maximally similar in some feature space while respecting the learned transformational constraints. The second effort takes a more direct approach to domain-agnostic dataset augmentation. We start with data points mapped to a learned feature space and apply simple transformations such as adding noise, interpolating, or extrapolating between them. Working in the space of context vectors generated by sequence-to-sequence recurrent neural networks, this simple technique is demonstrated to be effective for both static and sequential data.
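
As a rough illustration of the three feature-space transformations named in the abstract (adding noise, interpolating, and extrapolating), the following is a minimal sketch and not the speaker's implementation. The function names, the `scale` and `lam` parameters, the per-dimension noise scaling, and the random stand-in for learned feature vectors are all assumptions made for this example; in practice the vectors would come from a trained encoder.

```python
import numpy as np


def add_noise(features, scale=0.1, rng=None):
    """Perturb each feature vector with Gaussian noise.

    Assumption: noise is scaled by the per-dimension standard deviation
    of the feature set, times a hypothetical `scale` factor.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = features.std(axis=0, keepdims=True)
    return features + scale * sigma * rng.standard_normal(features.shape)


def interpolate(a, b, lam=0.5):
    """Move `a` a fraction `lam` of the way toward `b`."""
    return a + lam * (b - a)


def extrapolate(a, b, lam=0.5):
    """Move `a` a fraction `lam` of the a-to-b distance away from `b`."""
    return a + lam * (a - b)


# Random vectors stand in for learned features (hypothetical data).
rng = np.random.default_rng(0)
features = rng.standard_normal((6, 4))
a, b = features[0], features[1]

noisy = add_noise(features, rng=rng)   # same shape as `features`
mid = interpolate(a, b)                # midpoint between a and b
beyond = extrapolate(a, b)             # pushed past a, away from b
```

In a supervised setting, one would presumably pair each vector only with neighbours that share its label so that the synthetic point can inherit that label; that pairing step is left out of this sketch.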

Dr. Taylor will also briefly introduce NextAI, a Canadian program for advancing AI-enabled entrepreneurship.