NATURAL LANGUAGE

Communication -- the intentional exchange of information brought about
by the production and perception of signs drawn from a shared system
of conventional signs.

	many types of animals can communicate through a 
	relatively small set of signs

Language -- a complex, structured system of signs

	is considered to be unique to humans

	question: did intelligence or language emerge first?
		was it the result of our small noses?


Natural language vs. Formal language

	English, German, ... vs. C, Java, LISP, SQL, ...

	formal languages are invented and rigidly defined

	natural languages emerge and evolve over time


Formal language defined as a set of strings (sequences of 
terminal symbols)

Phrase structure - strings are composed of phrases

	S -> NP VP

	nonterminal symbols represent phrase categories



			CHOMSKY LANGUAGE HIERARCHY

Recursively enumerable		A B -> C
(any combination of symbols on right and left)

Context-sensitive		A B -> B A
(right-hand side must have at least as many symbols as the left-hand side)

Context-free			S -> a S b
(left hand side must be single non-terminal symbol)

Regular				S -> a S
(lhs a single nonterminal, rhs a terminal optionally followed by a nonterminal)
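
The four restrictions above can be checked mechanically for a single
rewrite rule. The following is a minimal sketch, not a standard library
routine: the function name classify_rule and the representation of a
rule as two lists of symbols are assumptions made for illustration, and
it follows the simplified definitions in these notes (for example,
"regular" here means a terminal optionally followed by one nonterminal).

```python
def classify_rule(lhs, rhs, nonterminals):
    """Return the most restrictive class (per these notes) that allows lhs -> rhs.

    lhs and rhs are lists of symbols; any symbol not in `nonterminals`
    is treated as a terminal.
    """
    single_nonterminal_lhs = len(lhs) == 1 and lhs[0] in nonterminals
    if single_nonterminal_lhs:
        starts_with_terminal = len(rhs) >= 1 and rhs[0] not in nonterminals
        # Regular: a terminal, optionally followed by exactly one nonterminal.
        if starts_with_terminal and (
            len(rhs) == 1 or (len(rhs) == 2 and rhs[1] in nonterminals)
        ):
            return "regular"
        return "context-free"
    # Context-sensitive: the right-hand side is at least as long as the left.
    if len(rhs) >= len(lhs):
        return "context-sensitive"
    return "recursively enumerable"


NONTERMINALS = {"S", "A", "B", "C"}
EXAMPLES = [
    (["A", "B"], ["C"]),
    (["A", "B"], ["B", "A"]),
    (["S"], ["a", "S", "b"]),
    (["S"], ["a", "S"]),
]
for lhs, rhs in EXAMPLES:
    print(" ".join(lhs), "->", " ".join(rhs), ":", classify_rule(lhs, rhs, NONTERMINALS))
```

A whole grammar belongs to the least restrictive class needed by any of
its rules, so one rule of the form A B -> C is enough to put a grammar
outside the context-sensitive class.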


Component Steps of Communication

intention 		- S wants H to believe P
generation 		- S chooses words W
synthesis 		- S presents W (most likely directly to H)
perception		- H perceives W'
			  (hopefully W = W', but not always)
analysis		- H infers possible meanings P1, ...  Pn
disambiguation		- H infers S wanted to convey Pi
			  (ideally P = Pi)
incorporation		- H decides whether to believe Pi
			  (might reject Pi as misinformation)

Encoded-message vs. situated language views



				FORMAL GRAMMAR

lexicon -- allowed vocabulary
	grouped into categories for parts of speech
	open classes (nouns, verbs, adjectives, adverbs)
	closed classes (pronoun, article, preposition, conjunction)

grammar -- combining vocabulary into sentences
	rule-based approach common

Example lexicon:
	Noun -> stench breeze nothing wumpus gold east ...
	Verb -> is see smell shoot feel stinks go grab carry kill ...
	Adjective -> right left east south back smelly ...
	Adverb -> here there nearby ahead right left east ...
	Pronoun -> I me you it ...
	Name -> John Mary Boston Texas (College Station) ...
	Article -> the a an ...
	Preposition -> to in on near ...
	Conjunction -> and or but ...

Example grammar:
	S -> NP VP | S Conjunction S
	NP -> Name | Noun | Pronoun | NP PP | Article Adj* Noun
	VP -> Verb | VP Adverb | VP PP | VP NP | VP Adjective
	Adj* -> Adjective Adj* | nil
	PP -> Preposition NP
	RelClause -> that VP
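
To make the example lexicon and grammar concrete, here is a minimal
sketch of a recognizer for them. It is one possible approach (a
memoized search over word spans), not the method these notes prescribe.
The names LEXICON, GRAMMAR and recognizes are assumptions; only the
words actually listed above are included, the multi-word name is left
out, and RelClause is omitted because no other rule refers to it.

```python
from functools import lru_cache

LEXICON = {
    "Noun": {"stench", "breeze", "nothing", "wumpus", "gold", "east"},
    "Verb": {"is", "see", "smell", "shoot", "feel", "stinks", "go",
             "grab", "carry", "kill"},
    "Adjective": {"right", "left", "east", "south", "back", "smelly"},
    "Adverb": {"here", "there", "nearby", "ahead", "right", "left", "east"},
    "Pronoun": {"i", "me", "you", "it"},
    "Name": {"john", "mary", "boston", "texas"},
    "Article": {"the", "a", "an"},
    "Preposition": {"to", "in", "on", "near"},
    "Conjunction": {"and", "or", "but"},
}

GRAMMAR = {
    "S": [("NP", "VP"), ("S", "Conjunction", "S")],
    "NP": [("Name",), ("Noun",), ("Pronoun",), ("NP", "PP"),
           ("Article", "Adj*", "Noun")],
    "VP": [("Verb",), ("VP", "Adverb"), ("VP", "PP"), ("VP", "NP"),
           ("VP", "Adjective")],
    "Adj*": [("Adjective", "Adj*"), ()],   # () plays the role of nil
    "PP": [("Preposition", "NP")],
}


def min_len(symbol):
    """Adj* may cover zero words; every other symbol covers at least one."""
    return 0 if symbol == "Adj*" else 1


def recognizes(sentence, start="S"):
    """Return True if the grammar derives the sentence from `start`."""
    words = tuple(sentence.lower().rstrip(".").split())

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # Can `symbol` produce exactly words[i:j]?
        if symbol in LEXICON:
            return j == i + 1 and words[i] in LEXICON[symbol]
        return any(match(rhs, i, j) for rhs in GRAMMAR[symbol])

    def match(rhs, i, j):
        # Can the symbol sequence `rhs` produce exactly words[i:j]?
        if not rhs:
            return i == j
        head, rest = rhs[0], rhs[1:]
        needed = sum(min_len(s) for s in rest)
        # Leaving room for the remaining symbols keeps left-recursive rules
        # such as NP -> NP PP on strictly shorter spans, so the search ends.
        for k in range(i + min_len(head), j - needed + 1):
            if derives(head, i, k) and match(rest, k, j):
                return True
        return False

    return derives(start, 0, len(words))


print(recognizes("I smell the wumpus"))
print(recognizes("Me grab gold."))
```

Both calls succeed: the grammar has no notion of pronoun case or verb
agreement, so it accepts "Me grab gold." just as readily as a
well-formed sentence.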



			PROBLEMS, SEMANTICS, AMBIGUITY

Problem:

the current grammar overgenerates: it does not distinguish 
pronoun case or verb form ...
	Me grab gold.

Each distinction could be represented using rules, but for n 
distinctions, each with m possible values, m^n specialized 
rules are required.
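
The growth is easy to see by enumerating the specialized categories
directly: every combination of values needs its own symbol, giving m^n
combinations for n distinctions with m values each. A small sketch
follows; the function name specialized_symbols and the two distinctions
chosen (case and number) are illustrative assumptions.

```python
from itertools import product


def specialized_symbols(base, distinctions):
    """List one specialized category per combination of feature values."""
    names = list(distinctions)
    combos = product(*(distinctions[name] for name in names))
    return [f"{base}[{', '.join(combo)}]" for combo in combos]


# Hypothetical distinctions: n = 2 of them, each with m = 2 values.
DISTINCTIONS = {
    "case": ["subjective", "objective"],
    "number": ["singular", "plural"],
}

symbols = specialized_symbols("Pronoun", DISTINCTIONS)
for symbol in symbols:
    print(symbol)
print(len(symbols), "categories for n = 2, m = 2")
```

Adding a third two-valued distinction doubles the list again, which is
why augmenting categories with features is usually preferred over
multiplying out the rules.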

Semantics:

When a phrase, sentence, or set of sentences has been 
parsed and interpreted, the interpretation can be placed in 
the system's knowledge base.

Ambiguity:
	the boy saw the girl on the hill with the telescope

deictic references: it is over there

Famous mistranslation:
	"the spirit is willing but the flesh is weak"
	-> "the vodka is good but the meat is rotten"

Attaching semantic constraints to the parts of a sentence can 
help disambiguate it (unification).
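
As a rough illustration of the idea, unification can be sketched as
merging two sets of feature constraints and failing when they conflict.
This is a simplified, flat version (real feature structures are nested
and may contain variables), and every feature name and value below is
a hypothetical chosen for the telescope example.

```python
def unify(a, b):
    """Merge two flat feature dicts; return None if any feature conflicts."""
    merged = dict(a)
    for feature, value in b.items():
        if merged.get(feature, value) != value:
            return None
        merged[feature] = value
    return merged


# Hypothetical constraint: "saw ... with X" read as an instrument
# requires X to be an optical device.
SEE_WITH = {"role": "instrument", "category": "optical device"}

TELESCOPE = {"head": "telescope", "category": "optical device"}
HILL = {"head": "hill", "category": "place"}

print(unify(SEE_WITH, TELESCOPE))  # constraints are compatible
print(unify(SEE_WITH, HILL))       # conflict on "category"
```

A reading whose constraints fail to unify can be discarded. Readings
that all unify (for example, whether the boy or the girl has the
telescope) remain ambiguous and need other knowledge to resolve.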



			LIMITS OF FORMAL GRAMMARS

Simple formal grammar will not match natural language
	not all speakers agree on what the language is
	language evolves
	some ungrammatical sentences are understandable
	grammaticality of sentences ranges from good to bad

Also, simple formal grammars do not represent semantics, 
and grammar is easier than semantics.

Moreover, the inferred goals of the speaker, the speaker's 
trustworthiness, and the believability of the statement all 
affect interpretation.

But we do this all day, every day, for most of our lives.

Non-literal language
	Metaphor
		"The batter put the ball in orbit."
	Analogy
		"Turning in that final was like scoring the winning
		run in the last game of the World Series."
	Sarcasm
		"No, your other right!"
