
Tuesday, April 12, 2011

Parsing and questions

The usual declarative sentences are considered simple by formal semantics. In fact they are not, but anyway they're the simplest thing there is. They even deceive people into believing that first-order logic is adequate for expressing their meaning. John loves Mary is loves(john, mary). Simple. It gets more interesting with quantifiers, especially when there is more than one (can you spot the ambiguity in Every man loves some woman?), but that's not my point. Remember, this was Kharms' first sentence:
Удивительный случай случился со   мной: я вдруг    забыл, что  идет раньше - 7 или 8
Amazing case happened with me I suddenly forgot what goes earlier 7 or 8
An amazing thing happened to me today, I suddenly forgot what comes first - 7 or 8

The part of it that remains uncommented is the very last one, starting with what. It looks suspiciously like a question. I can imagine saying to myself What comes first, 7 or 8? Damn, I forgot that! Yes, that's definitely a question. So, we have to step outside the comfortable world of declarative sentences and enter the darkness of what the advanced topics of formal semantics are about: interrogatives. Man, they even have a semester-long seminar on questions, only questions and nothing but questions!

But a quick look at their analyses is sufficient for me to realize that I don't like them. I don't want to spend the first 10 years implementing that (well, maybe less: they provide some code in Haskell), and then spend the rest of my life analyzing the resulting sets-of-sets-of-possible-worlds kinds of structures just to understand that they only mean What comes first?. I don't seek absolute truth; for me, the simpler the structure and the closer it is to the surface, the better. Provided, of course, that it's still acceptable as a true interlingua.

That said, I don't have much choice in how to represent the question above. It's a clause containing a verb and a subject, and all three of these entities are unusual in their own ways.

The unusual thing about the verb is that it consists of two words: come first. Actually, it's a more generic verb come X, where X can be any scalar value: first, next, previous, last, 42nd, etc.

The subject is also unusual since it's what, a typical wh-word of the kind many questions start with. It also comes with variants at the end of the clause: 7 or 8. I consider this a special construction, characteristic of questions. Those 7 and 8 are just listed in the semantics in the variants slot of the what frame.

Finally, the clause is unusual since it has to mark its questionness in some way. It would also be nice if it could specify which part of the clause is asked about (here it's the subject what). Both things are solved by one means: the situation corresponding to this clause has a questioned attribute pointing to what. Simple.

Additionally, there should be a way of linking the question clause to the verb it depends on: forgot. It would also be nice to distinguish between the different things one can forget: real things (I forgot my cell phone), facts (I forgot that 2x2=4), values (I forgot the area of Africa) and, finally, answers to questions (I forgot what comes first). At least two of these variants employ clauses: facts and questions. Luckily, a fact's clause definitely won't have a questioned attribute, while in our case it will definitely be there. So we can just say that forgot's theme is the whole situation corresponding to the question, and that seems sufficient for current purposes.

So, now I'm finally ready to present the semantics built for the complete sentence. Well, almost ready. There remains that or in 7 or 8. It's a conjunct, and, since conjuncts are my favorite and a very interesting phenomenon, I'll discuss them later in great detail. So, the interlingual representation of the first Sonnet sentence is this:

A.property=AMAZING
A.type=THING
A.given=false
B.type=HAPPEN
#1.time=PAST
B.arg1=A
B.experiencer=C
C.type=ME
#1.elaboration=#2
--
A.type=ME
B.manner=SUDDENLY
B.type=FORGET
#2.time=PAST
B.arg1=A
B.theme=#3
--
#3.questioned=A
#3.time=PRESENT
B.type=COME_SCALARLY
B.order=EARLIER
B.arg1=A
C.type=7
C.number=true
A.variants=D
D.member=C
D.conj=or
E.type=8
E.number=true
D.member=E
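As a side note, this flat notation is easy to process mechanically. Here is a minimal Python sketch (my own illustration, not part of any actual system) that reads such cue lines into situations and checks the questionness mark on the clause:

```python
def parse_frames(text):
    """Read cue lines like 'A.type=THING' into a list of situations,
    each mapping frame names (A, B, ... and #n) to attribute dicts.
    Situations are separated by '--' lines.  Note: a repeated attribute
    (like D.member above) would overwrite the previous value here;
    a real implementation needs multi-valued slots."""
    situations = [{}]
    for line in text.strip().splitlines():
        line = line.strip()
        if line == "--":
            situations.append({})
            continue
        lhs, value = line.split("=", 1)
        frame, attr = lhs.split(".", 1)
        situations[-1].setdefault(frame, {})[attr] = value
    return situations

# Part of the question clause (situation #3) from the representation above:
question = parse_frames("""
#3.questioned=A
#3.time=PRESENT
B.type=COME_SCALARLY
B.order=EARLIER
B.arg1=A
""")[0]

# A clause is a question iff its situation has the questioned attribute,
# and that attribute tells us which part is asked about (the subject A).
assert question["#3"]["questioned"] == "A"
assert question["B"]["type"] == "COME_SCALARLY"
```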

Wednesday, March 30, 2011

New model

What happens when you hear a sentence? Nobody knows for sure, but there are some guesses on the market. Many people think you build a logical formula and do something with it (perhaps store it and use it for inference). However, I find huge logical expressions terribly hard to analyze, and since I surely need some analysis to translate correctly, this idea doesn't suit me.

So I prefer to think in terms of objects and relations between them. Luckily, I don't have to invent anything, since some psychologists have similar ideas. They argue that during sentence comprehension people don't operate with abstract logical symbols, but mentally simulate the input, and this simulation is what we call meaning. You hear The eagle in the sky, you imagine an eagle in the blue sky with a few clouds and airplanes, and it's natural that the eagle in this picture has its wings spread wide, not, say, folded.

Many experiments seem to support this theory. But actually that doesn't matter: I don't care that much what really happens in the brain. What I care about are ideas that may be useful for parsing. And I find the idea of mental simulation quite useful. The words don't have to determine the meaning; they act like cues which invoke the memories of previous situations where you encountered them. These memories are put together and a new situation is simulated. The situation may involve objects or actions that are not mentioned at all, but they'll be highly accessible in the following discourse, without any complex logical inference. The important things that the hearer learned through simulation are stored in memory for future retrieval, also as parts of other simulations.

This is a completely informal theory. That's great, because I may treat it in any way useful to me. In particular, I assume the simulation engine to be a black box that communicates with the parser and generator via frames. Frames are just objects with attributes and values. Values may be strings or other frames. The parser creates cues like Frame A has attribute 'actor' pointing to frame B or Frame C is described as 'man'. The notation is:

A.actor=B
C.type=man

The parser gives such cues to the simulator as soon as it encounters new words. It also receives feedback, which may help it choose between several competing parses. But it doesn't know what the simulator does internally; it only sees those frames and attributes. That allows me to mock the simulator in any way I want. In the end, my main object of interest is parsing, so I leave building a well-behaved simulator to someone else.
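Since the simulator is a black box, even a trivial mock fits the interface. A possible sketch in Python (the class and method names are my own invention; the post specifies no concrete API):

```python
class MockSimulator:
    """A stand-in for the black-box simulation engine: it just records
    the cues it receives and returns dummy feedback."""

    def __init__(self):
        self.frames = {}                 # frame name -> {attribute: value}

    def cue(self, frame, attr, value):
        """Receive one cue, e.g. cue('A', 'actor', 'B')."""
        self.frames.setdefault(frame, {})[attr] = value

    def feedback(self, frame):
        """A real simulator would score how well this frame fits the
        simulated situation; the mock approves everything it has seen."""
        return 1.0 if frame in self.frames else 0.0

sim = MockSimulator()
sim.cue("A", "actor", "B")               # Frame A has attribute 'actor' -> B
sim.cue("C", "type", "man")              # Frame C is described as 'man'
assert sim.frames["C"]["type"] == "man"
assert sim.feedback("A") == 1.0
```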

The target language generator has access both to the sequence of cues fed by the parser to the simulator and to the simulator itself. The cue sequence is needed to provide as close a translation as possible, so that what was said in the source will be more or less the same as what is generated. The simulator is needed to ensure that the information inferred from the cues in both languages is similar as well. The simulation results may also be used in cases where the generator needs something not explicitly specified in the source but obligatory in the result, for example, pronoun gender when translating from Finnish to Russian.

So far, the model is quite simple. A discourse is a sequence of (numbered: #1, #2, ...) situations. Each situation has a sequence of constraints on frames (referred to by variables: A, B, C, ...) and their attribute values. Situations are also variables, and may have properties assigned as well.

For example, let's consider the first part of Kharms' Sonnet:

An amazing thing happened to me today, I suddenly forgot....

It's currently analyzed as:

----- Situation #1 -----
A.property=AMAZING
A.type=THING
A.given=false
B.type=HAPPEN
#1.time=PAST
B.arg1=A
B.experiencer=C
C.type=ME
#1.elaboration=#2
----- Situation #2 -----
A.type=ME
B.manner=SUDDENLY
B.type=FORGET
#2.time=PAST
B.arg1=A
B.theme=#3
----- Situation #3 -----
...

Sunday, January 17, 2010

Operational Text=>Meaning model

The model has several levels, though they are almost completely different from Melchuk's:

Level 1: text/phonology.
The parser receives it as input and then produces

Level 2: constructions.
The contiguous text is separated into constructions of all sizes, from morphological to complex syntactic ones. Basic constructions are (mostly) word stems; complex constructions combine the simpler ones. The structure is most probably a tree, though not necessarily. Each construction may have information structure. Constructions are ordered and considered executable in that order. Being executed, they give us

Level 3: semantics: a program.
Written in some general-purpose (programming) language with well-defined semantics and a model. The model probably consists of local variables (at most 4-7 of them), the salient context, and general memory, which is organized as a graph of frames and represents the listener's knowledge of the world. The language is imperative and object-oriented, and almost lacks any control flow, being very linear. The basic instructions are frame creation and assignments to local variables or frame slots. If we execute this semantics program, we'll get

Level 4: a change in the frame structure in memory.
Which means a change in the listener's knowledge of the world. Actually, the change may be greater than we'd expect from looking at the semantics program. Each of its simple instructions may cause significant changes that are not directly encoded in that program. This is what we call pragmatics.

We're done! Though it's worth noting that the frames we built might well encode a sequence of actions (e.g. a cooking recipe), which can also be executed via interpretation and result in the listener's real actions in the real world.
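To make Level 3 concrete, here is how a clause like An amazing thing happened to me might look as such a program, sketched in Python. The instruction set is my guess at what "frame creation and assignments" could mean; nothing here is a real implementation:

```python
memory = []            # Level 4: the frame graph in the listener's memory

def new_frame():
    """Basic instruction: create a frame and register it in memory."""
    frame = {}
    memory.append(frame)
    return frame

# "An amazing thing happened to me" as a linear, control-flow-free program:
a = new_frame()                   # assignment to a local variable
a["type"] = "THING"               # assignment to a frame slot
a["property"] = "AMAZING"
b = new_frame()
b["type"] = "HAPPEN"
b["arg1"] = a                     # slots may hold other frames
c = new_frame()
c["type"] = "ME"
b["experiencer"] = c

# Executing the program has changed the memory (Level 4):
assert len(memory) == 3
assert b["arg1"]["property"] == "AMAZING"
```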

Saturday, December 26, 2009

Natural language programming

Programming languages (PL) are for computers, natural languages (NL) are for humans. Computers execute programs; we execute texts. So perhaps it would be useful for NL processing to look for more inspiration in PL processing? I don't mean syntax, which seems much more complicated in NL. I'm talking about semantics and pragmatics.

In NL, semantics is the literal meaning of what's written, and pragmatics is how a human will actually understand it in discourse, what the response will be, and which changes the text will cause in the world. There's something similar in PL. Semantics is about how each construction (assignment, addition, loop, etc.) is executed. It's usually found in the language specification and can be formalized. Pragmatics, OTOH, is what the program does: is it quicksort, a tic-tac-toe game, or Linux? PL pragmatics explores how the separate statement meanings work together to get the job done.

PL semantics is clearly defined and expressible in terms of the same PL (assuming it's Turing-complete). There are plenty of interpreters and compilers proving this. At the same time, PL pragmatics is different: it can't be captured by means of any PL. Generally speaking, you can't tell what a program does just by looking at it. You can't even tell whether it will ever stop. Nevertheless, there are static code analysis tools that do capture some pragmatic aspects of a program, and they actually help find bugs.

So, if we believe Church and Turing, there are two pieces of news for NLP. The good one is that NL semantics can be fully defined in terms of that NL, by human beings. The bad one is that you can write tons of papers analyzing the hidden meanings and ideas in Hamlet and never come up with a complete description. That's pragmatics.

Sunday, October 25, 2009

A solution for John

It appears that I've been too pessimistic in claiming that I can't assemble the meaning of 'John has two sisters' from the word meanings. Blackburn & Bos have taught me:

John: λv (v JOHN)
has: λsubj λobj (subj (λs (obj s id)))
two: λn λy λv (v (∃S (|S|=2 & ∀x∊S (n x y))))
sisters: λx λy SISTER(x,y)

Let's evaluate:

(two sisters) = λy λv (v (∃S (|S|=2 & ∀x∊S SISTER(x,y))))
(has John) = λobj (obj JOHN id)
((has John) (two sisters)):
   ∃S (|S|=2 & ∀x∊S SISTER(x,JOHN))

Note that this semantics also doesn't assume that John has only 2 sisters, so it really looks like an appropriate one.
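The beta-reductions can be checked mechanically. Here is my transcription into Python lambdas that build formula strings (a sketch for verifying the derivation, not Blackburn & Bos's actual Haskell code):

```python
ident = lambda x: x                               # the 'id' combinator

# One lambda per word, mirroring the definitions above:
john    = lambda v: v("JOHN")
sisters = lambda x: lambda y: f"SISTER({x},{y})"
two     = lambda n: lambda y: lambda v: v(f"∃S (|S|=2 & ∀x∊S {n('x')(y)})")
has     = lambda subj: lambda obj: subj(lambda s: obj(s)(ident))

# ((has John) (two sisters)) reduces to the formula from the post:
result = has(john)(two(sisters))
assert result == "∃S (|S|=2 & ∀x∊S SISTER(x,JOHN))"
```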

So, the applicative text structure is restored to its rights. Although I don't much like this kind of solution, because the semantics of 'has' and 'two' depend too much on the fact that the SISTER predicate takes 2 arguments. I'd have to change them both to express 'John has two dogs', thus making 'two' (and every other numeric quantifier) semantically ambiguous. But it's still better than nothing.

Friday, October 23, 2009

Asking right questions

When I said that the semantics of 'John has two sisters' was |{ x | SISTER(x, JOHN) }| = 2, I wasn't quite correct. In fact, there's nothing in the text preventing John from having 5 or 42 sisters. It's the Maxim of Quantity which may limit the sister count to 2. Not being an absolute rule, this maxim can easily be flouted, and in the right context the sentence could actually mean that John has more than 2 sisters.
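A toy model makes the point concrete (my own illustration): in a world where John happens to have three sisters, the exactly-two reading is false, while the weaker, flouted reading survives.

```python
# A world where John has three sisters, as a set of SISTER facts:
sister_of = {("ann", "john"), ("bea", "john"), ("cat", "john")}

sisters_of_john = {x for (x, y) in sister_of if y == "john"}

exactly_two  = len(sisters_of_john) == 2   # |{x | SISTER(x, JOHN)}| = 2
at_least_two = len(sisters_of_john) >= 2   # the weaker reading

assert not exactly_two                     # the strict semantics fails here
assert at_least_two                        # yet the sentence can still be true
```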

Things get even more interesting if we add just one word: John has two beautiful sisters. There just isn't a default meaning here! John may have exactly 2 sisters, both beautiful, or he may have 2 beautiful sisters and another 3 who are not so beautiful.

The question is what the computer should do in such situations. Should it apply pragmatic knowledge and disambiguate everything immediately after syntactic analysis, using the whole context? Or should it maintain an intermediate semantic representation and hand it to some pragmatics module that could infer everything from the semantics? I clearly prefer modularization, i.e. the latter possibility. Of course, I don't presuppose any sequentiality; the modules may run in parallel, interactively.

If we separate semantics from pragmatics, the representation problem arises again, and it's even harder now. The semantic structure should be very generic: it should be interpretable in all the ways that were possible with the original text (minus the resolved lexical/syntactic ambiguities). And at the same time there should be no way of understanding it in any other way. If we just replace = with >= in the meaning of John has two sisters, the pragmatics module still won't be able to apply the Quantity Maxim. Such a meaning could equally well be produced from John has at least two sisters, which is unambiguous with respect to sister count. So it should still be some kind of =2, but in a form open for interpretation. What format could that be? I don't know. Yet.