DLIA99 Discussion summary:
II: Table recognition and interpretation
Our group, consisting of sixteen people from
seven countries, about equally divided among academic, industrial, and
government institutions, discussed the problem of "table recognition and
interpretation". While it is impossible to do justice to the variety
of ideas that were presented, I attempt to extract some themes below.
We hope that this will be the start of ongoing discussions on these and
related topics. An informal transcript of the discussion is also available
(I am indebted to Amit Mukerjee for permission to copy his notes).
What is a table?
The answer to this question is often application-driven. A person
interested only in the financial data in the third column of row two of
an ascii table within an email message may view the problem quite differently
from someone who wants to capture details of a complex presentational
layout or design a general schema for table metadata. The various
points of view can be organized as follows:
- Physical representation: Spatial primitives and their relationships.
- A separate presentational level can be distinguished.
- Logical representation: Metadata, descriptions of the logical
  roles played by various tabular components.
- Algorithmic representation: Operational definitions, in terms
  of the algorithms and data structures used to describe and recognize tables.
The possibility of recursion is a common theme in all of the above representations.
While there may be a general understanding of the elements of these representations,
the community does not appear to have settled on a standard description
of any of them. Perhaps this indicates that we do not understand
the problem well enough yet.
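As a purely illustrative sketch (not a description agreed on by the group), the levels above and the recursion they share can be written down as a small data structure. All names here (`Box`, `Cell`, `Table`, `nesting_depth`) are hypothetical, and the choice of fields is an assumption made only for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Box:
    """Physical level: an axis-aligned region on the page (units are arbitrary)."""
    x: float
    y: float
    width: float
    height: float


@dataclass
class Cell:
    """One tabular component, described at several levels at once."""
    box: Box                      # physical: where the cell sits
    role: str                     # logical: e.g. "header" or "data"
    content: Union[str, "Table"]  # a cell holds text or, recursively, a table
    style: Dict[str, str] = field(default_factory=dict)  # presentational hints


@dataclass
class Table:
    cells: List[Cell] = field(default_factory=list)


def nesting_depth(table: Table) -> int:
    """Algorithmic level: a recursive walk that mirrors the recursive structure."""
    inner = [nesting_depth(c.content) for c in table.cells
             if isinstance(c.content, Table)]
    return 1 + max(inner, default=0)


inner = Table([Cell(Box(0, 0, 10, 5), "data", "42")])
outer = Table([
    Cell(Box(0, 0, 40, 5), "header", "Revenue", {"font": "bold"}),
    Cell(Box(0, 5, 40, 20), "data", inner),
])
print(nesting_depth(outer))  # 2
```

The sketch keeps the physical, presentational, and logical information on the same object for brevity; an application interested in only one level could just as well store them separately.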
Performance evaluation
Careful study of the performance of implemented systems is an important
way to measure progress. Simply defining what it means to evaluate
performance can clarify the goals of research. Here again it is important
to note that the objective function (i.e., the performance metric) will be application-specific.
A system which can consistently identify the "third column in the second
row" of a table may be all that is needed in one kind of application, and
woefully inadequate in another.
Large-scale evaluations would benefit greatly from databases of empirical
data with corresponding ground truth.
This presents a bit of a chicken-and-egg problem: in order to generate
the ground-truth data, we first have to decide what constitutes "truth"
in this context.
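The application dependence of the metric can be illustrated with a small sketch. It assumes, only for the example, that both the ground truth and the system output are maps from a (row, column) position to the cell text; the two metrics and the sample data are hypothetical and not ones proposed in the discussion.

```python
from typing import Dict, Tuple

# (row, column) -> cell text, with 1-based indices (an assumed convention)
Grid = Dict[Tuple[int, int], str]


def cell_lookup_correct(truth: Grid, predicted: Grid, row: int, col: int) -> bool:
    """Metric A: did the system recover one specific cell?"""
    return (row, col) in truth and predicted.get((row, col)) == truth[(row, col)]


def cell_f1(truth: Grid, predicted: Grid) -> float:
    """Metric B: F1 score over all (position, text) pairs."""
    matches = len(set(truth.items()) & set(predicted.items()))
    if matches == 0:
        return 0.0
    precision = matches / len(predicted)
    recall = matches / len(truth)
    return 2 * precision * recall / (precision + recall)


truth = {(1, 1): "Year", (1, 2): "Units", (1, 3): "Revenue",
         (2, 1): "1998", (2, 2): "120", (2, 3): "4.5"}
# Made-up system output: only the cell of interest is recovered.
predicted = {(2, 3): "4.5"}

print(cell_lookup_correct(truth, predicted, row=2, col=3))  # True
print(round(cell_f1(truth, predicted), 2))                  # 0.29
```

The same output is a complete success under the first metric and a poor result under the second. Both metrics also presuppose that the ground truth has already been fixed as a grid of cells, which is exactly the decision about "truth" discussed above.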
General questions and comments
- The situation in table layout recognition is similar to that of OCR two
  decades ago.
- Perhaps it is more complicated, though. It is not just a matter of
  designing better classifiers.
- Perhaps there is no such thing as a generic table recognition algorithm
  - only special cases.
- Is it possible to classify the table layouts that are actually encountered
  in practice?
- "We've only just begun."
Participants
[Note: The following names are in the order in which they appeared on the
original list (with additions), in an attempt to preserve the spatial layout!]
Name                Affiliation                      Email
Hiroshi Shimodaira  JAIST                            sim@jaist.ac.jp
Andreas Moser       DFKI                             moser@dfki.de
Thomas Bayer        Siemens                          thomas.bayer@kst.siemens.de
Lyse Robadey        University of Fribourg, CH       lyse.robadey@unifr.ch
Amit Mukerjee       IIT Kanpur                       amit@iitk.ac.in
Koichi Kise         Osaka Prefecture University      kise@cs.osakafu-u.ac.jp
Steve Dennis        U.S. DOD                         sjdenni@afterlife.ncsc.mil
Ihsin Phillips      Seattle University               yun@seattleu.edu
Robert Haralick     University of WA                 haralick@ee.washington.edu
Jianying Hu         Bell Labs                        jianhu@bell-labs.com
Ram Kashi           Bell Labs                        ramanuja@bell-labs.com
Oliver Hitz         University of Fribourg           oliver.hitz@unifr.ch
Markus Junker       DFKI                             markus.junker@dfki.de
Steve Simske        Hewlett-Packard                  Steven_Simske@hp.com
Laurent Najman      OCE                              laurent.najman@ocegr.fr
Ching Suen          Concordia University             suen@cenparmi.concordia.ca
Taku Tokuyasu       UC Berkeley                      tokuyasu@cs.berkeley.edu
Thomas Breuel       Xerox Palo Alto Research Center  tbreuel@parc.xerox.com
Marcel Worring      University of Amsterdam          worring@wins.uva.nl
Taku Tokuyasu, discussion chair.
Last modified 11/3/99