CLIN 2005 Abstracts
  • Capturing Global Mood Levels using Blog Posts
    Gilad Mishne (ISLA, University of Amsterdam)
    Maarten de Rijke (ISLA, University of Amsterdam)
    The blogosphere, the fast-growing totality of weblogs or blog-related webs, is a rich source of information for marketing professionals, social psychologists, and others interested in extracting and mining opinions, views, moods, and attitudes. Many blogs function as an online diary, reporting on the blogger's daily activities and surroundings; this leads a large number of bloggers to indicate what their mood was at the time of posting a blog entry.

    The work we present aims at identifying the intensity of moods within the blogging community during given time intervals. We use a large body of blog posts manually annotated (by the bloggers themselves) with their associated mood. Using this annotation, we identify words which are indicative of certain moods, then learn linear models for estimating the mood levels using the frequencies of the words in blog posts, as well as meta-information about the time interval itself. Our models exhibit a strong correlation with the actual moods reported by the bloggers, and significantly improve over a baseline.
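
    The estimation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the linear models were fitted, so ordinary least squares is assumed here. The feature layout (two word frequencies plus a time-of-day value), the helper names, and all numbers are hypothetical and made up for illustration.

```python
# Illustrative sketch only: ordinary least squares is an assumed fitting
# method, and every number below is invented toy data, not a reported result.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear_model(features, targets):
    """Least-squares fit via the normal equations; the last weight is the bias."""
    rows = [list(f) + [1.0] for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(weights, feature_row):
    """Estimated mood level for one time interval."""
    return sum(w * x for w, x in zip(weights, list(feature_row) + [1.0]))

def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# One row per time interval (hypothetical features):
# [occurrences of "happy" per 1000 words,
#  occurrences of "tired" per 1000 words,
#  time of day as a fraction of 24 hours]
interval_features = [
    [10.0, 2.0, 0.25],
    [4.0, 8.0, 0.50],
    [12.0, 1.0, 0.75],
    [6.0, 5.0, 0.00],
    [9.0, 3.0, 0.90],
]
# Toy "reported mood level" per interval, generated from a known linear rule
# so that the fit can be checked: 0.02*happy - 0.01*tired + 0.05*time + 0.1
true_weights = [0.02, -0.01, 0.05, 0.1]
reported_levels = [predict(true_weights, row) for row in interval_features]

weights = fit_linear_model(interval_features, reported_levels)
estimated_levels = [predict(weights, row) for row in interval_features]
correlation = pearson(estimated_levels, reported_levels)
print(weights, correlation)
```

    Because the toy targets are generated from a known linear rule, the fitted weights should recover that rule and the correlation should be close to 1; real mood data would be noisy, and a held-out set of intervals would be needed to measure the correlation fairly.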

    Our main finding is that, while determining the mood associated with an individual blog post is known to be a very hard task, mainly due to the limited length of posts and the lack of an annotation regime, predicting the intensity of moods over a time span at the aggregate level can be done with a high degree of accuracy, even without extensive feature engineering or model tuning.

    An online version, demonstrating our mood tracking and estimation work, is available at