Abstract
Handwritten word-spotting is traditionally viewed as an image matching task between one or multiple query word-images and a set of candidate word-images in a database. This is a typical instance of the query-by-example paradigm. In this article, we introduce a statistical framework for the word-spotting problem which employs hidden Markov models (HMMs) to model keywords and a Gaussian mixture model (GMM) for score normalization. We explore the use of two types of HMMs for the word modeling part: continuous HMMs (C-HMMs) and semi-continuous HMMs (SC-HMMs), i.e. HMMs with a shared set of Gaussians. We show on a challenging multi-writer corpus that the proposed statistical framework is always superior to a traditional matching system which uses dynamic time warping (DTW) for word-image distance computation. A very important finding is that the SC-HMM is superior when labeled training data is scarce—as low as one sample per keyword—thanks to the prior information which can be incorporated in the shared set of Gaussians.
| Original language | English |
|---|---|
| Pages (from-to) | 2106-2116 |
| Number of pages | 11 |
| Journal | Pattern Recognition |
| Volume | 42 |
| Issue number | 9 |
| DOI | |
| State | Published - 2012 |
Fingerprint
Dive into the research topics of 'Handwritten word-spotting using hidden Markov models and universal vocabularies'. Together they form a unique fingerprint.