
Handwritten word-spotting using hidden Markov models and universal vocabularies

    Research output: Indexed journal article, peer-reviewed

    151 Citations (Scopus)

    Abstract

    Handwritten word-spotting is traditionally viewed as an image matching task between one or multiple query word-images and a set of candidate word-images in a database. This is a typical instance of the query-by-example paradigm. In this article, we introduce a statistical framework for the word-spotting problem which employs hidden Markov models (HMMs) to model keywords and a Gaussian mixture model (GMM) for score normalization. We explore the use of two types of HMMs for the word modeling part: continuous HMMs (C-HMMs) and semi-continuous HMMs (SC-HMMs), i.e. HMMs with a shared set of Gaussians. We show on a challenging multi-writer corpus that the proposed statistical framework is always superior to a traditional matching system which uses dynamic time warping (DTW) for word-image distance computation. A very important finding is that the SC-HMM is superior when labeled training data is scarce—as low as one sample per keyword—thanks to the prior information which can be incorporated in the shared set of Gaussians.
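
For readers unfamiliar with the DTW baseline mentioned in the abstract, the following is a minimal sketch of query-by-example matching with dynamic time warping. It is not the authors' implementation: the representation of a word-image as a sequence of feature vectors, the Euclidean local distance, and the names `dtw_distance` and `rank_candidates` are assumptions made for illustration only.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension feature vectors.

    Assumption: each word-image has already been converted to a sequence of
    feature vectors (one per column or sliding-window position).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean local distance (assumed)
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def rank_candidates(query, candidates):
    """Return candidate indices sorted from best (smallest DTW cost) to worst."""
    return sorted(range(len(candidates)),
                  key=lambda k: dtw_distance(query, candidates[k]))


if __name__ == "__main__":
    query = [[0.0], [1.0], [2.0]]
    database = [[[5.0], [6.0], [7.0]], [[0.0], [1.0], [1.0], [2.0]]]
    print(rank_candidates(query, database))
```

In this sketch the second database entry ranks first because it is a time-stretched copy of the query, which DTW aligns at zero cost. The statistical framework described in the abstract replaces this pairwise distance with a keyword HMM score normalized by a GMM score.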
    Original language: English
    Pages (from-to): 2106-2116
    Number of pages: 11
    Journal: Pattern Recognition
    Volume: 42
    Issue number: 9
    Publication status: Published - 2009
