Intro to NLP 笔记

Reading time ~5 minutes

之前接触的NLP 知识都不系统借着学习《Introduction to Natural Language Processing》 的机会,系统的了解了解。另外可参考的课程包括:

Intro

做语音识别时特别需要 IPA Chart ,这个上面也配有发音。

NLP 的难点,除了理解语法语义,歧义是其另一大难点,而歧义种类分如下几种:

  • Morphological: Joe is quite impossible. Joe is quite important.
  • Phonetic: Joe’s finger got number.
  • Part of speech: Joe won the first round.
  • Syntactic: Call Joe a taxi.
  • Pp attachment: Joe ate pizza with a fork / with meatballs / with Samantha / with pleasure.
  • Sense: Joe took the bar exam.
  • Modality: Joe may win the lottery.
  • Subjectivity: believes that stocks will rise.
  • Negation: likes his pizza with no cheese and tomatoes.
  • Referential: yelled at Mike. He had broken the bike. yelled at Mike. He was angry at him.
  • Reflexive: John bought him a present. John bought himself a present.
  • Ellipsis and parallelism: gave Mike a beer and Jeremy a glass of wine.
  • Metonymy: called and left a message for Joe.

除去上面提到的问题,不标准的语言(俚语、新词等)、语法错误、字词错误、计算机解析、复杂句、幽默讽刺、指代、潜在意思等。等也是NLP 做起来比较困难的地方。

然后介绍了语言学的一些知识( Linguistic Knowledge):

  • Phonetics and phonology - the study of sounds
  • Morphology - the study of word components
  • Syntax - the study of sentence and phrase structure
  • Lexical semantics - the study of the meanings of words
  • Compositional semantics - how to combine words
  • Pragmatics - how to accomplish goals
  • Discourse conventions - how to deal with units larger than utterances

接着介绍了一些语言学的知识,比如PIE 及衍生语簇及子各语支。 还有比较重要的是语音演变规则:

Grimm’sLaw

  • Voiceless stops turn into voiceless fricatives
  • Voiced stops become voiceless stops
  • Voiced aspirated stops change to voiced stops or fricatives Examples

  • Ancient Greek: πούς, Latin: pēs, Sanskrit: pāda – English: foot, German: Fuß, Swedish: fot
  • Ancient Greek: κύων, Latin: canis, Welsh: ci – English: hound, Dutch: hond, German: Hund

接着介绍了世界语言的几个链接:

Mathjax was not loaded successfully

Original post: http://blog.josephjctang.com/2015-10/notes-of-intro-to-NLP/

2016 記

2015 年做的和沒做的,也大體記錄在了[這裏]({% post_url 2016-01-01-annual-review-and-planning %})。匆匆一年已逝,幾多慨嘆,幾多欣喜。後面列列過去已經做的,以及相對的未來一年的TODO list。主要也是從工作上的個人提升,以及生活上的...… Continue reading

《神经网络》课程笔记

Published on November 06, 2016

搜索广告机制设计

Published on November 02, 2016