CLIN 2005 Abstracts
  • Concept Clustering in Medical Encyclopedia
    Piroska Lendvai (Dept. of Language and Information Science, Tilburg University)
    Based on encyclopedia texts that are semantically annotated at the section, sentence and phrase level, we cluster document headwords into medical concept categories such as "disease", "bodily function" and "treatment". We use the induced knowledge in a QA module for answering medical questions.

    Headwords of documents are classified by supervised learning, based on various characteristics of the article (e.g. the section types, the section subtitles, the words). For example, headwords of documents that contain the section tags definition, symptoms, cause and treatment (such as "meningitis" or "carpal tunnel syndrome") are to be clustered as describing a "disease". Headwords of documents that have the section tags definition, treatment and side effects (such as "sunbath" or "delivery") are to be clustered as "procedures". Yet others, sharing the section tags definition and diagnosis (such as "mri-scan" or "laparoscopy"), are identified as "diagnostic method".
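
    As a rough illustration of classifying a headword from its section tags, the sketch below assigns an unseen document the category of the most similar labelled example. The abstract does not name the learning algorithm or the feature encoding, so the nearest-neighbour rule, the Jaccard similarity, and the names TRAINING, jaccard and classify are all assumptions made for this example; the training entries simply restate the examples from the text.

```python
# Hypothetical labelled examples: headword -> (section tags, concept category).
# The tag sets and labels restate the examples given in the abstract.
TRAINING = {
    "meningitis": ({"definition", "symptoms", "cause", "treatment"}, "disease"),
    "carpal tunnel syndrome": ({"definition", "symptoms", "cause", "treatment"}, "disease"),
    "sunbath": ({"definition", "treatment", "side effects"}, "procedure"),
    "delivery": ({"definition", "treatment", "side effects"}, "procedure"),
    "mri-scan": ({"definition", "diagnosis"}, "diagnostic method"),
    "laparoscopy": ({"definition", "diagnosis"}, "diagnostic method"),
}


def jaccard(a, b):
    """Overlap between two tag sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(section_tags, training=TRAINING):
    """Return the category of the training example with the most similar tag set.

    This is a 1-nearest-neighbour rule, chosen here only for illustration;
    it is not claimed to be the method used in the work described above.
    """
    tags = set(section_tags)
    best_tags, best_label = max(
        training.values(), key=lambda example: jaccard(tags, example[0])
    )
    return best_label


if __name__ == "__main__":
    # An unseen article with only three of the four typical "disease" sections.
    print(classify({"definition", "symptoms", "treatment"}))
```

    In this toy setting only section tags are used as features; section subtitles and words, which the text also mentions, could be added to the sets in the same way.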

    We investigate what features can be utilised in creating the clusters and what performance can be attained by a supervised learning method.