1. "Learning regions of interest in postal automation" by H. Walischewski
Discusses the digital representation of mail pieces and learning models
that find regions of interest such as the target address. The average
recognition rate is about 92%.
2. "Interval-Algebra based block layout analysis and document template
generation" by R. Singh et al.
Introduces interval algebra to capture block layout descriptions as a
qualitative description of an entire class of documents.
3. "On the application of Voronoi diagrams to page segmentation" by
K. Kise et al.
Aims to develop general representations of the physical structure of
document images by creating a point Voronoi diagram, then an area
Voronoi diagram of connected components, and finally a neighbor graph.
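As a rough illustration of the neighbor graph mentioned in item 3, the
sketch below links two points when their Voronoi cells share an edge,
which is the same as being joined by an edge of the Delaunay
triangulation. This is not the method of Kise et al.: it uses a
brute-force empty-circumcircle test, treats each connected component as
a single hypothetical centroid, assumes no four points lie on one
circle, and skips the area Voronoi stage entirely.

```python
from itertools import combinations


def circumcircle(a, b, c):
    """Return (center_x, center_y, radius_squared), or None if collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    r2 = (ax - ux) ** 2 + (ay - uy) ** 2
    return ux, uy, r2


def voronoi_neighbors(points):
    """Set of index pairs (i, j), i < j, whose Voronoi cells share an edge."""
    edges = set()
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        # Delaunay criterion: no other point lies strictly inside the circle.
        empty = all(
            (px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9
            for m, (px, py) in enumerate(points)
            if m not in (i, j, k)
        )
        if empty:
            edges.update({(i, j), (i, k), (j, k)})
    return edges


# Hypothetical centroids of four connected components (a flat diamond).
centroids = [(0, 0), (2, 1), (4, 0), (2, -1)]
print(sorted(voronoi_neighbors(centroids)))
```

For these four made-up centroids every pair is linked except the two far
ends, (0, 0) and (4, 0), whose cells are separated by the two middle
points. The brute-force test is only practical for small point sets.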
In the discussions, we talked about methods for finding the items of
interest in a document image and agreed that these methods are
application dependent. Hence the characteristics of the items, as well
as of their environment, must be established first. Physical
measurements such as size and distance can then be applied.
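To make this order of work concrete, here is a small sketch that
describes a block's relation to its environment qualitatively, in the
spirit of the interval algebra of item 2, and then takes physical
measurements of size and distance. The Block type, the function names,
and the sample coordinates are all hypothetical, and only a coarse
subset of interval relations is distinguished.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Axis-aligned bounding box of a layout block, in pixels."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def width(self):
        return self.right - self.left

    @property
    def height(self):
        return self.bottom - self.top


def gap(a, b):
    """Euclidean distance between two boxes; 0 if they touch or overlap."""
    dx = max(a.left - b.right, b.left - a.right, 0)
    dy = max(a.top - b.bottom, b.top - a.bottom, 0)
    return (dx * dx + dy * dy) ** 0.5


def interval_relation(a_start, a_end, b_start, b_end):
    """Coarse qualitative relation of interval A to interval B."""
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if (a_start, a_end) == (b_start, b_end):
        return "equals"
    if b_start <= a_start and a_end <= b_end:
        return "during"
    if a_start <= b_start and b_end <= a_end:
        return "contains"
    return "overlaps"


# Made-up coordinates for an address block and a stamp on a mail piece.
address = Block(left=300, top=140, right=500, bottom=200)
stamp = Block(left=530, top=20, right=600, bottom=100)

# Qualitative step: how the horizontal extents relate.
print(interval_relation(address.left, address.right, stamp.left, stamp.right))
# Measurement step: size of the block and distance to its neighbor.
print(address.width, address.height, gap(address, stamp))
```

Here the address block lies horizontally "before" the stamp, measures
200 by 60 pixels, and sits 50 pixels from it. Which relations and
measurements matter would depend on the application.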
On training and learning methods, both supervised and semi-supervised
techniques were discussed. A large number of samples is required to
derive the parameters and thresholds that minimize errors. Probabilistic
techniques and statistics also entered the discussion, as did the use
of ranking.
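A minimal sketch of how a threshold might be derived from labeled
samples so as to minimize errors: it tries a cut between each pair of
neighboring score values and keeps the one with the fewest
misclassifications on the training samples. The function name, scores,
and labels are made up for illustration; a real system would need far
more samples and a separate set for validation.

```python
def best_threshold(scores, labels):
    """Return (threshold, errors) with the fewest training errors.

    A sample is predicted positive when its score >= threshold.
    Ties are resolved in favor of the lowest threshold.
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal length")

    values = sorted(set(scores))
    # Candidates: below all scores, midpoints between neighbors, above all.
    candidates = [values[0] - 1.0]
    candidates += [(lo + hi) / 2.0 for lo, hi in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)

    best = None
    for t in candidates:
        # Count samples whose predicted class disagrees with the label.
        errors = sum((s >= t) != bool(y) for s, y in zip(scores, labels))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


# Hypothetical scores for six samples; label 1 marks a region of interest.
scores = [1, 2, 5, 3, 6, 7]
labels = [0, 0, 0, 1, 1, 1]
print(best_threshold(scores, labels))
```

With these six samples no threshold separates the classes perfectly, so
the best cut still leaves one error. Counting errors equally is itself
an assumption; an application may weigh misses and false alarms
differently.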