Similarity search: the metric space approach
Instructor: Prof. Pavel Zezula
Affiliation: Faculty of Informatics, Masaryk University in Brno, Czech Republic
Duration: 20 hours
Date: May 15 - May 19, 2006
Place: Dipartimento di Ingegneria dell'Informazione: Elettronica, Informatica, Telecomunicazioni, via Diotisalvi, meeting room
Contacts: Dott. Fausto Rabitti, Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo” del CNR di Pisa
Contents
Part I: Metric searching in a nutshell
- Foundations of metric space searching (7 hours)
- distance searching problem in metric spaces
- metric distance measures
- similarity queries
- basic partitioning principles
- principles of similarity query execution
- policies to avoid distance computations
- metric space transformations
- principles of approximate similarity search
- advanced issues
- Survey of exiting approaches (3 hours)
- ball partitioning methods
- generalized hyper-plane partitioning approaches
- exploiting pre-computed distances
- hybrid indexing approaches
- approximated techniques
Part II: Metric searching in large collections of data
- Centralized index structures for large databases (4 hours)
- M-tree family
- hash-based metric indexing
- performance trials
- Approximate similarity search (2 hours)
- relative error approximation
- good fraction approximation
- small chance improvement approximation
- proximity-based approximation
- PAC nearest neighbor searching
- performance trials
- Parallel and distributed indexes (4 hours)
- preliminaries
- processing M-trees with parallel resources
- scalable and distributed similarity search
- performance trials
|