BAILANDO Projects: Lindi

 


 

     

    Lindi Text Data Mining Project

    Note: This project has been superseded by the BioText project.

    We are developing a text data mining system, Lindi, for Linking Information for Novel Discoveries and Insight. The main goal is to support the automated discovery of new information from large text collections. As a step towards this goal, we are developing empirical algorithms for semantic analysis of natural language text.

    See also: an article on text data mining ideas at Mappa Mundi Magazine.

    Publications

      A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text, Ariel Schwartz and Marti Hearst, to appear in the proceedings of the Pacific Symposium on Biocomputing (PSB 2003), Kauai, Jan 2003. pdf

      The Descent of Hierarchy, and Selection in Relational Semantics, Barbara Rosario, Marti Hearst, and Charles Fillmore, in ACL-02, July 2002. pdf   ps

      Classifying the Semantic Relations in Noun Compounds via a Domain-Specific Lexical Hierarchy, Barbara Rosario and Marti Hearst, in the Proceedings of EMNLP '01, Pittsburgh, PA, June 2001.   pdf   ps

      For an introduction to the ideas behind LINDI, see:

      Untangling Text Data Mining, Marti Hearst, in the Proceedings of ACL'99: the 37th Annual Meeting of the Association for Computational Linguistics, University of Maryland, June 20-26, 1999 (invited paper). html

    Talks

      A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text, PSB'03 ppt

      The Descent of Hierarchy, and Selection in Relational Semantics, ACL-02, July 2002. ppt

      Interfaces for Intense Information Analysis, IBM Workshop on The User Experience of Business Intelligence and Knowledge Management, March 2002. (ppt)

      Classifying the Semantic Relations in Noun Compounds via a Domain-Specific Lexical Hierarchy, EMNLP '01 (ppt)

      Remarks for the Web Data Mining Panel (html) for KDD '97.

      Text Data Mining: Issues, Techniques, and the Relation to Information Access (html) for the UW/MS workshop on data mining, July 1997.

      See also the Text Data Mining Seminar from Fall 1999.