Handwritten word-spotting using hidden Markov models and universal vocabularies

Research output: Article in an indexed journal › Article › peer-reviewed

149 Citations (Scopus)

Abstract

Handwritten word-spotting is traditionally viewed as an image matching task between one or multiple query word-images and a set of candidate word-images in a database. This is a typical instance of the query-by-example paradigm. In this article, we introduce a statistical framework for the word-spotting problem which employs hidden Markov models (HMMs) to model keywords and a Gaussian mixture model (GMM) for score normalization. We explore the use of two types of HMMs for the word modeling part: continuous HMMs (C-HMMs) and semi-continuous HMMs (SC-HMMs), i.e. HMMs with a shared set of Gaussians. We show on a challenging multi-writer corpus that the proposed statistical framework is always superior to a traditional matching system which uses dynamic time warping (DTW) for word-image distance computation. A very important finding is that the SC-HMM is superior when labeled training data is scarce—as low as one sample per keyword—thanks to the prior information which can be incorporated in the shared set of Gaussians.
Original language: English
Pages (from-to): 2106-2116
Number of pages: 11
Journal: Pattern Recognition
Volume: 42
Issue number: 9
DOI
Status: Published - 2009
