# Data Science and Machine Learning Group

Wang, J., Xu, C., Yang, X. and Zurada, J. (2017). A Novel Pruning Algorithm for Smoothing Feed-forward Neural Networks based on Group Lasso. IEEE Transactions on Neural Networks and Learning Systems. Accepted.

Zou, B., Tang, Y., Xu, C., Xu, J. and You, X. (2017). K-Times Markov Sampling for SVMC. IEEE Transactions on Neural Networks and Learning Systems. Accepted.

[N. Denis, M. Fraser, B. Paget, *Novel measure of regional homogeneity
for fMRI data analysis*. Poster at University of Ottawa Brain Health Research Day (2017).]

M. Fraser, Multi-step learning and underlying structure in statistical models. NIPS 2016.

[N. Denis, M. Fraser, *In search of dynamic representations of state of mind: exploring mathematical methodologies for capturing higher-order responses in subjects listening to auditory narrative under fMRI.* Poster in "Topological, Geometric and Statistical Techniques in Biological Data Analysis" - workshop at Mathematical Biosciences Institute, Ohio State University (2016).]

Xu, C., Zhang, Y., Li, R. and Wu, X. (2016). On the Feasibility of Distributed Kernel Regression for Big Data. IEEE Transactions on Knowledge and Data Engineering, 28, 3041 - 3052.

H. Duan, V. Pestov and V. Singla, *Text categorization via similarity search: an efficient and effective novel algorithm*. - Similarity Search and Applications (SISAP 2013). Proceedings of the
6th International Conference. A Coruña, Spain, Oct. 2013. Springer
Lecture Notes in Computer Science **8199**, pp. 182-193. doi> 10.1007/978-3-642-41062-8_19. [arXiv version]

A. Stojmirović, P. Andreae, M. Boland, T. W. Jordan and V. G. Pestov, *PFMFind: a system for discovery of peptide homology and function*. - Similarity Search and Applications (SISAP 2013). Proceedings of the
6th International Conference. A Coruña, Spain, Oct. 2013. Springer
Lecture Notes in Computer Science **8199**, pp. 319-324. doi> 10.1007/978-3-642-41062-8_32. [arXiv version]

H. Duan, *Bounding the Fat Shattering Dimension of a Composition
Function Class Built Using a Continuous Logic Connective* - The
Waterloo Mathematics Review **2.1** (2012), pp. 4-20.
Online version.

V. Pestov,
*Is the k-NN classifier in high dimensions affected by the curse of
dimensionality?* - Computers & Mathematics with Applications
**65** (2013), pp. 1427-1437,
doi> 10.1016/j.camwa.2012.09.011.
[arXiv version]

V. Pestov,
*Lower bounds on performance of metric tree indexing schemes for
exact similarity search in high dimensions.* - Algorithmica
**66** (2013), 310-328.
doi> 10.1007/s00453-012-9638-2. [arXiv]

V. Pestov,
*PAC learnability under non-atomic measures: a problem by Vidyasagar.* -
Theoretical Computer Science **473** (2013), 29-45. doi> 10.1016/j.tcs.2012.10.015.
[arXiv]

Gonzalo Navarro and Vladimir Pestov, editors.
*Similarity Search and Applications: 5th International Conference,
SISAP 2012, Toronto, ON, Canada, August 9-10, 2012, Proceedings,*
Springer Lecture Notes in Computer Science **7404**, 2012, 255
pages, ISBN-13: 978-3642321528,
doi> 10.1007/978-3-642-32153-5. The
book webpage.

Damjan Kalajdzievski, *Measurability Aspects of the
Compactness Theorem for Sample Compression Schemes,* M.Sc. thesis,
University of Ottawa, 2012, 64 pp., arXiv.

V. Pestov,
*Indexability, concentration, and VC theory.* - J. Discrete
Algorithms **13** (2012), pp. 2-18.
doi> 10.1016/j.jda.2011.10.002
[arXiv]

V. Pestov,
*PAC learnability versus VC dimension: a footnote to a basic result
of statistical learning,* in: Proceedings of the 2011 International
Joint Conference on Neural Networks (IJCNN'2011), San José, CA
(July 31 - Aug. 5, 2011), pp. 1141-1145,
doi> 10.1109/IJCNN.2011.6033352. [arXiv]

V. Pestov,
*Lower bounds on performance of metric tree indexing schemes for
exact similarity search in high dimensions.* - Proceedings of the
4th International Conference on Similarity Search and Applications
(SISAP 2011), 30 June - 1 July 2011, Lipari, Sicily, Italy. Editor:
Alfredo Ferro, ACM, New York, NY, 2011, pp. 25-32.

V. Pestov,
*A note on sample complexity of learning binary output neural networks
under fixed input distributions.* - in: Proc. 2010 Eleventh
Brazilian Symposium on Neural Networks (São Bernardo do Campo,
SP, Brazil, 23-28 October 2010), IEEE Computer Society, Los
Alamitos-Washington-Tokyo, 2010, pp. 7-12. doi> 10.1109/SBRN.2010.10
[arXiv]

V. Pestov,
*PAC learnability of a concept class under non-atomic measures: a
problem by Vidyasagar.* - in: Proc. 21st Intern. Conference on
Algorithmic Learning Theory (ALT'2010), Canberra, Australia, 6-8 Oct.
2010 (M. Hutter, F. Stephan, V. Vovk, T. Zeugmann, eds.), Lect. Notes
in Artificial Intelligence **6331**, Springer, 2010, pp. 134-147.
[arXiv version]

V. Pestov,
*Intrinsic Dimensionality.* - The SIGSPATIAL Special, Newsletter
of the Association for Computing Machinery special interest group on
spatial information, a special issue on searching in metric spaces,
vol. **2**, No. 2 (2010), 8-11.

V. Pestov,
*Indexability, concentration, and VC theory.* - An invited paper,
Proceedings of the 3rd International Conference on Similarity Search
and Applications (SISAP 2010), 18-19 September 2010, Istanbul, Turkey.
Editors: Paolo Ciaccia and Marco Patella, ACM, New York, NY, 2010, pp.
3-12.

V. Pestov,
*Predictive PAC learnability: a paradigm for learning from
exchangeable input data.* - In: Proc. 2010 IEEE Int. Conference on
Granular Computing (San Jose, CA, 14-16 Aug. 2010), pp. 387-391,
Symposium on Foundations and Practice of Data Mining. doi> 10.1109/GrC.2010.102
[arXiv]

Ilya Volnyansky and V. Pestov,
*Curse of dimensionality in pivot-based indexes*. - Proceedings
of the 2nd International Workshop on Similarity Search and
Applications (SISAP 2009), Prague, Czech Republic, August 29-30, 2009,
T. Skopal and P. Zezula (eds.), IEEE Computer Society, Los
Alamitos--Washington--Tokyo, 2009, pp. 39-46. [arXiv version]

Ilya Volnyansky, *Curse of Dimensionality in the Application of
Pivot-based Indexes to the Similarity Search Problem,* M.Sc.
thesis, University of Ottawa, 2009, 56 pages, arXiv.

V. Pestov,
*An axiomatic approach to intrinsic dimension of a dataset*. -
Neural Networks **21**, 2-3 (2008), 204-213.
(A special volume on Advances in Neural Networks Research: IJCNN
'07, 2007 International Joint Conference on Neural Networks
IJCNN '07.) [arXiv version]

A. Stojmirović, V. Pestov,
*Indexing schemes for similarity search in datasets of short protein
fragments*. - Information Systems **32** (2007), 1145-1165.

V. Pestov,
*Intrinsic dimension of a dataset: what properties does one expect?* -
In: Proceedings of the 20th International Joint Conference on Neural
Networks (IJCNN'2007), Orlando, Florida (Aug. 12--17, 2007), pp.
1775--1780.
[arXiv version]