Text mining

Text mining, also known as intelligent text analysis, text data mining or knowledge discovery in text (KDT), refers generally to the process of extracting interesting and non-trivial information and knowledge from unstructured text. Text mining is a young interdisciplinary field that draws on information retrieval, data mining, machine learning, statistics and computational linguistics. Because most information (by a commonly quoted estimate, over 80%) is stored as text, text mining is believed to have high commercial potential.
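As a rough illustration of pulling non-trivial information out of unstructured text, the sketch below ranks the words of each document by TF-IDF, a standard information-retrieval weighting. It is a minimal sketch and not a complete text mining system: the three-sentence corpus and the function names (`tokenize`, `tf_idf`, `top_terms`) are invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(docs):
    """Return one {term: score} dict per document using TF-IDF weighting."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        scores.append({
            # Term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(docs, k=3):
    """Return the k highest-scoring terms of each document."""
    result = []
    for doc_scores in tf_idf(docs):
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
        result.append([term for term, _ in ranked[:k]])
    return result


# Hypothetical toy corpus, made up for illustration only.
corpus = [
    "The protein binds the receptor and the protein activates signalling.",
    "The market report says the market fell as the bank raised rates.",
    "The genome project released the genome sequence to the public.",
]

print(top_terms(corpus, 1))
```

A word such as "the", which occurs in every document, receives a score of zero, while a word that is frequent in one document and absent from the others rises to the top. Real systems add linguistic processing (stemming, entity recognition, parsing) on top of such simple statistics.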

One application of text mining is in bioinformatics, where details of experimental results can be automatically extracted from a large corpus of text and then processed computationally. For example, it has been reported that a support vector machine (SVM), given appropriate training data, can extract details of protein-protein interactions from the literature with greater than 90 percent accuracy.
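The idea of training an SVM to flag sentences that describe an interaction can be sketched as follows. This is only an illustrative toy under stated assumptions, not the method behind the accuracy figure quoted above: it trains a linear SVM without a bias term by plain subgradient descent on the L2-regularised hinge loss, uses bag-of-words counts as features, and relies on made-up sentences with hypothetical protein names (`alpha1`, `beta2`, and so on). The hyperparameters `lam`, `lr` and `epochs` are arbitrary choices.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def featurize(text, vocab):
    """Turn a sentence into a bag-of-words count vector over vocab."""
    vec = [0.0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimise the L2-regularised hinge loss by subgradient descent."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Gradient step on the regulariser (shrinks the weights).
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step, taken only if the margin is violated.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w


# Hypothetical training sentences: +1 = describes an interaction.
train = [
    ("alpha1 interacts with beta2", 1),
    ("gamma3 binds delta4", 1),
    ("beta2 binds to alpha1", 1),
    ("delta4 interacts with gamma3 directly", 1),
    ("alpha1 is expressed in liver tissue", -1),
    ("gamma3 was measured in blood samples", -1),
    ("beta2 is located in the nucleus", -1),
    ("delta4 levels were measured in liver", -1),
]

# Vocabulary: every token seen in training, mapped to a feature index.
vocab = {}
for sentence, _ in train:
    for tok in tokenize(sentence):
        vocab.setdefault(tok, len(vocab))

X = [featurize(s, vocab) for s, _ in train]
y = [label for _, label in train]
weights = train_linear_svm(X, y)


def predict(sentence):
    """Return +1 if the sentence is classified as an interaction, else -1."""
    score = sum(w * x for w, x in zip(weights, featurize(sentence, vocab)))
    return 1 if score > 0 else -1


print(predict("epsilon5 binds zeta6"))
print(predict("epsilon5 is expressed in blood"))
```

The two test sentences use protein names the model has never seen, so the decision rests entirely on context words such as "binds" or "expressed". A practical extraction system would need a large annotated corpus, richer features and proper evaluation on held-out data.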

Some bioinformaticians have termed the body of literature the textome, a name formed by the same "-ome" convention that gave us the genome; however, this term is far from universal.

Probably one of the largest text mining applications in existence is the classified ECHELON surveillance system.

External links

  • Kmining: list of text mining, data mining and KDD scientific conferences
  • KDNuggets: Data Mining, Web Mining, and Knowledge Discovery Guide
  • Text Mining Encyclopedia of Terms
  • Text mining summit 2005

See also

  • Data mining
  • Genomics
  • Information Retrieval
  • Natural language processing
  • Computational linguistics
  • Business intelligence
  • Text analysis tool



    This article is from Wikipedia. All text is available
    under the terms of the GNU Free Documentation License.
    © 2009 Chamas Enterprises Inc.