---
title: Natural Language Processing
localeTitle: 自然语言处理
---
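## Natural Language Processing (NLP)
According to Wikipedia, "Natural language processing (NLP) is a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data." Put simply, NLP is the process by which computers interpret the natural language that humans produce.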
### Challenges in NLP
#### 1. Easy or mostly solved
* Spam detection
* Part-of-speech tagging
* Named entity recognition
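
To make the first item concrete, here is a minimal sketch of spam detection with a multinomial naive Bayes classifier, written with only the Python standard library. The class name `NaiveBayesSpamFilter`, the `tokenize` helper, and the four training messages are all invented for this illustration; a real filter would be trained on a large labeled corpus and would usually come from an established library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSpamFilter:
    """Hypothetical toy classifier: multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        """Record one labeled message ('spam' or 'ham')."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def log_score(self, text, label):
        """Return log P(label) + sum of log P(word | label) over the message."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        # Log prior: fraction of training messages carrying this label.
        score = math.log(self.doc_counts[label] / total_docs)
        for word in tokenize(text):
            count = self.word_counts[label][word]
            # Add-one smoothing keeps unseen words from zeroing the probability.
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def predict(self, text):
        """Return whichever label has the higher log score."""
        return max(("spam", "ham"), key=lambda label: self.log_score(text, label))


# Toy training data, invented for this example.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win a free prize now", "spam")
spam_filter.train("free money click now", "spam")
spam_filter.train("lunch meeting moved to noon", "ham")
spam_filter.train("please review the project notes", "ham")

print(spam_filter.predict("claim your free prize"))   # spam
print(spam_filter.predict("notes from the meeting"))  # ham
```

The classifier treats each message as a bag of words, which is one reason the task counts as mostly solved: word frequencies alone already separate many spam messages from legitimate ones.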
#### 2. Intermediate or making good progress
* Sentiment analysis
* Coreference resolution
* Word sense disambiguation
* Parsing
* Machine translation
* Information extraction
#### 3. Hard or still needs a lot of work
* Text summarization
* Machine dialog systems
### Common Techniques
* Structure extraction
* Identifying and marking sentence, phrase, and paragraph boundaries
* Language identification
* Tokenization
* Acronym normalization and tagging
* Lemmatization / stemming
* Entity extraction
* Phrase extraction
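
Three of the techniques above (sentence boundary detection, tokenization, and stemming) can be sketched with the standard library alone. The function names `split_sentences`, `tokenize`, and `crude_stem` are hypothetical, and the rules are deliberately naive: the sentence splitter breaks on abbreviations such as "Dr.", and the suffix stripper is only a stand-in for a real stemmer such as the Porter algorithm.

```python
import re


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?' (a naive boundary rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Return word tokens and standalone punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def crude_stem(token):
    """Strip a few common English suffixes, keeping at least three characters."""
    word = token.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


text = "The cats chased mice. Dogs were barking loudly!"
for sentence in split_sentences(text):
    tokens = tokenize(sentence)
    # Stem only the word tokens; punctuation is dropped at this step.
    stems = [crude_stem(t) for t in tokens if t.isalnum()]
    print(tokens, stems)
```

Note that `crude_stem("chased")` returns `"chas"` rather than `"chase"`. Stemmers routinely produce non-words like this, which is the main reason lemmatization (mapping a word to its dictionary form) is listed as an alternative.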
### Popular Libraries
* NLTK, one of the most widely used NLP libraries for Python.
* spaCy, an industrial-strength NLP library built for performance.
* Gensim, a library for document similarity analysis.
* TextBlob, a user-friendly and intuitive interface to NLTK.
* CoreNLP, from the Stanford NLP Group.
* Polyglot, a natural language pipeline that supports massively multilingual applications.
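#### More Information:
Further reading:
* See [this article](https://medium.com/@gon.esbuyo/get-started-with-nlp-part-i-d67ca26cc828) for an introduction to NLP.
* See the [Wikipedia article](https://en.wikipedia.org/wiki/Natural_language_processing) on natural language processing for reference.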