freeCodeCamp


title	localeTitle
Natural Language Processing	自然语言处理

Natural Language Processing (NLP)

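As Wikipedia puts it, "Natural language processing (NLP) is a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data." Put simply, it is the process by which a computer interprets natural language produced by humans.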

Challenges in NLP

1. Easy or mostly solved

          *Spam detection 
          *Part of Speech Tagging 
          *Named Entity Recognition

2. Intermediate or making good progress

          *Sentiment analysis 
          *Coreference resolution 
          *Word sense disambiguation 
          *Parsing 
          *Machine Translation 
          *Information Extraction

3. Hard or still needing a lot of work

          *Text Summarization 
          *Machine dialog system

Common techniques

         *Structure extraction 
         *Identify and mark sentence, phrase, and paragraph boundaries 
         *Language identification 
         *Tokenization 
         *Acronym normalization and tagging 
         *Lemmatization / Stemming 
         *Entity extraction 
         *Phrase extraction
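
A few of the techniques above (sentence boundary detection, tokenization, and stemming) can be sketched with nothing but Python's standard library. This is a rough illustration under simplifying assumptions, not how any particular NLP library implements them: the helper names `split_sentences`, `tokenize`, and `crude_stem` are hypothetical, the rules only suit simple English text, and the suffix stripping is far cruder than a real stemmer such as Porter's.

```python
import re

# Hypothetical helpers for illustration only; none of these names come from a library.

def split_sentences(text):
    """Naive sentence boundary detection: split on whitespace that follows ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    """Lowercase the sentence and return word tokens plus individual punctuation marks."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?|[^\w\s]", sentence.lower())

# Assumed suffix list, checked in order; real stemmers use many more rules.
SUFFIXES = ("ing", "ed", "es", "s")

def crude_stem(token):
    """Strip at most one common English suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

text = "The cats are chasing mice. Dogs barked loudly!"
for sentence in split_sentences(text):
    tokens = tokenize(sentence)
    print([crude_stem(t) for t in tokens])
```

Abbreviations such as "Dr." or irregular forms such as "mice" defeat rules this simple, which is one reason the libraries listed below are normally used instead.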

Common libraries

          *NLTK, the most widely mentioned NLP library for Python.
          *spaCy, an industrial-strength NLP library built for performance.
          *Gensim, a library for document similarity analysis.
          *TextBlob, a user-friendly and intuitive NLTK interface.
          *CoreNLP, from the Stanford NLP Group.
          *Polyglot, a natural language pipeline that supports massively multilingual applications.


More information: