
Monday, August 27, 2012

Suddenly filler-gap dependencies

Theoreticians say that in sentences like I know who you saw, the deep structure is in fact I know [you saw who], and in the surface structure there's an invisible gap after saw which is filled by who. Many psycholinguistic studies also seem to confirm this: upon seeing who, people start looking for a suitable place for it and only settle down after finding one.

Thanks to Russian free word order, I've had the luxury of ignoring this complexity for a while and treating wh-words as normal verbal arguments, like pronouns. But then came two surprises.

One surprise was that implementing filler-gap dependencies was the easiest way to resolve a nasty ambiguity. Russian has a word что which can be either a complementizer (я знаю, что ты видел его; I know that you saw him) or a wh-word (я знаю, что ты видел; I know what you saw). The first is higher in the structure than the verb, the second is lower. This made my parser suffer: it still doesn't handle visibility ambiguities very well. Now что is no longer a verbal argument directly; it's a filler and also sits higher in the hierarchy, just like in many syntactic theories.

Another surprise was that all this was actually very easy to add to the current parser architecture (given that there's no pied-piping yet). The filler is just a special construction which listens to what the incoming words contribute. If a contribution looks like a suitable head for the filler's grammatical functions, the contribution is enriched accordingly.

An example: the Russian wh-word что can be nominative or accusative. For a normal noun that would mean generating nom and acc construction mites with the noun attribute defined, pointing to a frame with a special wh semantic type. In the filler-gap approach it instead generates a filler construction, which then sits and waits until it sees a contribution containing nom or acc mites with the head attribute defined. E.g. saw as a verb can be a head for both nominative and accusative arguments. The filler construction then adds a nom/acc mite with both head and noun attributes, where the noun points to a frame of wh type and the head comes from the verb.
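
A minimal Python sketch of that mechanism; the class names (Mite, Filler) and the attribute keys are my own illustrative shorthand, not the parser's real API.

class Mite:
    def __init__(self, construction, **attrs):
        self.construction = construction   # e.g. 'nom' or 'acc'
        self.attrs = attrs                 # e.g. head='saw' or noun='wh'
    def __repr__(self):
        return f"{self.construction}{self.attrs}"

class Filler:
    # Created for a wh-word; waits for a contribution that supplies a head.
    def __init__(self, cases, wh_frame):
        self.cases = cases          # grammatical functions the wh-word could fill
        self.wh_frame = wh_frame    # frame with the special wh semantic type

    def react(self, contribution):
        # Enrich a contribution whose mites define a head for one of our cases.
        extra = []
        for mite in contribution:
            if mite.construction in self.cases and 'head' in mite.attrs:
                # the gap: head comes from the verb, noun from the wh-word
                extra.append(Mite(mite.construction,
                                  head=mite.attrs['head'], noun=self.wh_frame))
        return contribution + extra

filler = Filler(cases={'nom', 'acc'}, wh_frame='wh')       # generated by 'что'
saw = [Mite('nom', head='saw'), Mite('acc', head='saw')]   # contributed by 'видел'
print(filler.react(saw))   # the nom/acc mites now carry both head and noun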

So in this respect my parser now works quite similarly to human sentence processing: a wh-word creates an active filler that finds a gap once a verb with suitable argument requirements comes along.

Tuesday, November 8, 2011

NLP using graphs, actually


The reason for the complete rewrite was simple:

Это  отвлекло   нас от   нашего спора.
That distracted us  from our    argument.

Every word makes a contribution to the parsing state by saying that it participates in a number of constructions, and provides some attributes for each of them. So a contribution consists of pairs of constructions and attribute sets. I'll call such pairs mites.

That (это) can add noun attributes to the nom(inative) and acc(usative) constructions, and these two mites contradict each other. Distracted (отвлекло) is a transitive verb, so it defines head for both nom and acc constructions. These two mites don't contradict each other; they're both very welcome.
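
For concreteness, here's how those contributions might be written down in Python; the tuple representation and the contradiction rule are simplifications of mine, not the parser's actual data model.

from itertools import combinations

# each mite: (word, construction, attribute it defines)
eto      = [('это', 'nom', 'noun'), ('это', 'acc', 'noun')]
otvleklo = [('отвлекло', 'nom', 'head'), ('отвлекло', 'acc', 'head')]

def contradict(m1, m2):
    # Simplifying assumption: a word's own case readings (noun mites for
    # different cases) exclude each other; a verb's head mites do not.
    (w1, c1, a1), (w2, c2, a2) = m1, m2
    return w1 == w2 and a1 == a2 == 'noun' and c1 != c2

for a, b in combinations(eto + otvleklo, 2):
    if contradict(a, b):
        print(a, 'contradicts', b)
# ('это', 'nom', 'noun') contradicts ('это', 'acc', 'noun')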

Earlier, I unified mites as soon as they came, and I only kept the results of the unification, not the mites themselves. After the word that, there were nom and acc constructions with noun defined, and nom was (randomly) chosen as active. Active constructions had higher priority, and therefore the verb's nom mite was integrated first. Two nom mites were unified, and the nom construction with both head and noun defined linked the values of these attributes semantically. That's all great. But what to do with acc?

You can't just unify it as well, because there's already a nom construction and it contradicts acc in its first mite. So you have to drop one of the mites completely and keep the other. In this sentence it made sense to drop the acc.noun mite (the accusative alternative of that). The acc.head mite would survive and unify nicely with the real object that comes next, namely us. But that's a hack, and it didn't work when the first это was in fact accusative. A better way would be to preserve both acc mites but mark only the second one as active. That clearly requires per-mite active status, not per-construction.

So now I don't remove any mites automatically when processing word contributions; they all survive. What matters is whether they are active or not. Active mites are guaranteed to be mutually non-contradicting. Only the active mites of the same construction are unified and passed to that construction, which may then make some semantic changes based on that information.

When another word comes, the parser solves a constraint-satisfaction task on graphs. Some of the new mites are chosen to be active, and the old mites may change their active status to accommodate that. The reanalysis (a toy sketch of it follows the list) tries to:
  • minimize the number of active constructions
  • maximize the number of mites in those constructions (more mites mean more information)
  • prefer more recent mites
  • prefer semantically more plausible variants
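
Here's a toy, brute-force rendering of those preferences; the lexicographic scoring, the attribute names (construction, position, plausibility) and the exhaustive search are illustrative assumptions of mine, not the parser's actual algorithm.

from itertools import chain, combinations

def consistent(subset, contradicts):
    return all(not contradicts(a, b) for a, b in combinations(subset, 2))

def score(active):
    constructions = {m.construction for m in active}
    return (
        -len(constructions),                  # fewer active constructions
        len(active),                          # more mites mean more information
        sum(m.position for m in active),      # later positions = more recent mites
        sum(m.plausibility for m in active),  # semantically more plausible variants
    )

def choose_active(mites, contradicts):
    # Brute force over all subsets: fine for a toy example, hopeless for a real parser.
    subsets = chain.from_iterable(combinations(mites, k) for k in range(len(mites) + 1))
    return max((s for s in subsets if consistent(s, contradicts)), key=score)
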
Surprisingly, it works, though twice as slow as before. And the algorithm now seems so clean and logical that I can't find anything to remove! But I definitely should optimize, as I get bored waiting 10 long seconds for the 171 tests to pass.

Tuesday, October 25, 2011

NLP using graphs

Could natural language parsing be a task on graphs? Maybe, at least partly.

Now, during parsing, every word contributes to some constructions. Some of these contributions are mutually exclusive. For example, in Russian a noun typically can't be nominative and accusative at the same time. So this noun's contributions for nom and acc constructions are incompatible.

Different words may also have contradicting contributions. Two nouns in the same clause can't both be accusative, even if their forms are compatible with that alternative (i.e. they both contribute to the acc construction).

So here's the idea. Consider a graph with a vertex for each construction contribution of each word, and an edge wherever two contributions contradict each other in any way. The task is then to find a maximal subset of vertices no two of which are connected, i.e. a maximal independent set.
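
A toy rendering of that formulation, with invented contributions for illustration; the greedy pass below stands in for whatever search a real parser would use.

def maximal_independent_set(vertices, edges):
    chosen, blocked = [], set()
    for v in vertices:            # the iteration order could encode parser preferences
        if v not in blocked:
            chosen.append(v)
            # block v's neighbours (and v itself) for the following iterations
            blocked |= {u for a, b in edges if v in (a, b) for u in (a, b)}
    return chosen

# vertices are (word, construction) contributions; edges mark contradictions
vertices = [('это', 'nom'), ('это', 'acc'), ('нас', 'acc'), ('нас', 'gen')]
edges = [
    (('это', 'nom'), ('это', 'acc')),    # one word, two mutually exclusive cases
    (('это', 'acc'), ('нас', 'acc')),    # two accusatives in one clause
    (('нас', 'acc'), ('нас', 'gen')),
]
print(maximal_independent_set(vertices, edges))
# [('это', 'nom'), ('нас', 'acc')]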

There are, of course, other constraints. The resulting graph should make sense from the semantic viewpoint. The subset should be constructed incrementally and conservatively: if the parser can proceed without reanalyzing the already built structures, it should do so. Finally, this graph task has a very limited scope; e.g. it doesn't apply across clauses.

That's just an idea. Currently my parser doesn't track individual contributions per construction, it just unifies them eagerly. But the test data suggests that might change.

Monday, July 26, 2010

Frames, constructs and the log

Some time has passed, and my ideas about the internal representation of natural language have changed. Here are the new ones.

Every word in the input is taken as a whole and all the information about it is stored in a single frame. The frame may have different aspects: morphological, syntactic, semantic, discourse, etc. The aspect names are called roles, the objects corresponding to these roles are called constructs.

Lexical ambiguities are handled frame-internally. If a word is syntactically ambiguous, its frame has several possible syntactic constructs; a polysemous word may have several different semantic constructs. The frame should eventually choose one construct for each role, and the chosen constructs must not contradict each other. Here's how a frame for the famously ambiguous bank could look:
Once frames with embedded constructs appear in the model, they remain activated for some time and may perform various actions. They may add constructs to their own or another frame. They may establish links to constructs of other roles in the same frame. Or they may create connections to other frames.

Establishing a connection between frames means creating a new frame whose child frames are the ones being connected. The new frame may also have several aspects. For example, in the usual John loves Mary, two extra frames are created to link the predicate with the subject and the object respectively. Each of these frames hosts two constructs: one representing the syntactic relation, the other the semantic one (in experiencer, state and theme terms):
As I've already stated, not only the final structure is important, but also the sequence in which it was constructed. For this, a simple program-like log can be used:

frame: syn=noun, sem=JOHN
frame: syn=verb;transitive, sem=love
frame ^2 ^1: syn=subject+predicate, sem=experiencer+state
frame: syn=noun, sem=MARY
frame ^3 ^1: syn=predicate+object, sem=state+object

Each line here describes a new frame. The frame's children are referred to as ^x, where x is how many lines back the mentioned frame was introduced: ^1 means the previous line, ^2 the one before that. When a frame is created, it has no aspects; their subsequent assignment is reflected in the log.

This log reflects the dynamics of the parsing process and can thus be used in the applications where the order of operations is important, like machine translation.
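
As a sanity check of the format, here's a minimal reader for such a log; the line syntax is taken from the example above, while the class names and parsing details are assumptions of mine.

import re

class Frame:
    def __init__(self, children, aspects):
        self.children = children   # list of child Frame objects
        self.aspects = aspects     # e.g. {'syn': 'noun', 'sem': 'JOHN'}
    def __repr__(self):
        kids = f" {self.children}" if self.children else ""
        return f"Frame({self.aspects}{kids})"

def parse_log(text):
    frames = []
    for line in text.strip().splitlines():
        head, _, rest = line.partition(':')
        # child references like ^2 count frames (lines) backwards
        children = [frames[len(frames) - int(n)] for n in re.findall(r'\^(\d+)', head)]
        aspects = dict(pair.split('=') for pair in rest.strip().split(', '))
        frames.append(Frame(children, aspects))
    return frames

log = """\
frame: syn=noun, sem=JOHN
frame: syn=verb;transitive, sem=love
frame ^2 ^1: syn=subject+predicate, sem=experiencer+state
frame: syn=noun, sem=MARY
frame ^3 ^1: syn=predicate+object, sem=state+object"""

for f in parse_log(log):
    print(f)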

Monday, February 8, 2010

Trees in NL parser

Imagine you're a parser analyzing the famous The horse raced past the barn fell. Additionally, you're operating in an incremental left-to-right fashion, considering a single active parse variant at each moment. Regardless of the theory, you'll probably combine the horse and past the barn into some kind of units, say NP1 and PP2. Before you see the last word, you'll probably consider NP1 to be the subject of the finite verb raced, and PP2 to be its adjunct. But then fell comes. At this moment you realize that the single active parse variant you have is wrong, return to raced and reanalyze it as a passive participle (The horse [that was] raced past the barn fell). This is all well known and boring.

What's not boring are the implications for a computational parser operating in a similar way. First, it should somehow be able to restore its state right after NP1 and reparse from there, choosing the passive-participle alternative of raced. Second, it would be nice not to reparse PP2 again: that word sequence will produce the same result anyway and function as an adjunct, though now inside the relative clause.

Since we're going to store the parsing state and analyze it in various ways (e.g. find slots to fill), we'd better store it as data. What kind of data? The standard answer is an (incomplete) tree. I'll show that this isn't convenient.

Let's consider mutable trees first. Nodes can be added to or removed from any position at any time. Therefore, to store anything, we first need to copy it completely and put it aside, so that subsequent changes in the active tree aren't reflected in the copy. This applies both to the parser state after NP1 and to the I'd-like-it-cached PP2. But we don't want to copy the parser state at every moment just in case we have to backtrack there in the future: we'd run out of memory very soon.

An alternative approach is not to copy anything but to thoroughly log every change made to the tree in an undoable manner. When the need for backtracking arises, we can just go back through the log, undoing every single change, and restart. This solves the state-saving problem but doesn't offer caching for what's been undone: we'd have to reparse PP2 anew.

So, mutable trees don't appear to be useful for our particular parser. Let's look at immutable trees. Their nodes are created once and fixed forever; no nodes can be added or removed. The only way to attach a constituent to a tree is to produce a new tree with the constituent already in the right place. The parts of the old tree not directly dominating the attachment site can be reused in the new tree; everything else has to be recreated. Luckily, the reused part is usually much bigger, so not much memory is consumed. So we could easily save the current parsing state at every moment without fear. And we can also cache PP2 across the reparse, since no one can touch it anymore: it's immutable.
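
Here's a hedged sketch of that idea: attaching a constituent rebuilds only the spine down to the attachment site and shares everything else with the old tree. The Node class and the attach_rightmost helper are illustrative, not parser code.

from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    label: str
    children: tuple = ()

def attach_rightmost(tree, new_child, depth):
    # Return a new tree with new_child attached under the rightmost node at `depth`.
    # Only the spine down to the attachment site is rebuilt; all siblings are shared.
    if depth == 0:
        return Node(tree.label, tree.children + (new_child,))
    rebuilt = attach_rightmost(tree.children[-1], new_child, depth - 1)
    return Node(tree.label, tree.children[:-1] + (rebuilt,))

np1 = Node('NP', (Node('the'), Node('horse')))
s   = Node('S', (np1, Node('VP', (Node('raced'),))))
pp2 = Node('PP', (Node('past'), Node('NP', (Node('the'), Node('barn')))))

s2 = attach_rightmost(s, pp2, depth=1)   # attach PP2 under the VP
assert s2.children[0] is np1             # NP1 is shared with the old tree, not copied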

But all this is not free. We should keep track of all the places in the partial tree where it's possible to attach. For example, in 'I gave the ...' there are two slots waiting to be filled: the noun after the and the second argument of gave. They should somehow be represented in the tree, and at least one of them won't be at the top level.

The slots to be filled may also not be marked explicitly. In I saw a girl with a telescope, the PP with a telescope may be attached either to the noun girl or to the verb saw. Since it isn't required by either of them, the parser has to walk down the whole tree, find all the constituents touching the right boundary, and then try to combine them with the PP. Moreover, psycholinguists observe that such an adjunct PP is most often attached to the lowest suitable constituent. That is probably easier for humans, but it would be a very resource-consuming operation in a computational parser.

Finally, the attachment site may not be on the right boundary at all. This is the infrequent but still important case of discontinuous constituents, usually occurring in free word order languages like Russian, Latin or Warlpiri. English also has such examples:

The question has been asked, if there's life on Mars.
"Tomorrow," he said, "I'll return".
Yesterday, however, they left.

In the first example, we should attach the subordinate clause if there's life on Mars to the NP the question, which doesn't touch the right boundary at all. The tree we have at that moment is something like

[S: [NP: the question] [VP: has been asked]]

and represents a complete grammatical sentence. So, do we have to search the whole tree to find the constituent to attach to? Nonsense. But I have no idea what a reasonable attachment strategy that uses immutable trees and covers the presented data would look like.

Of course, all these problems are solvable. I just don't want to implement those solutions because of their clumsiness, complexity and likely bug-riddenness. I'd rather use a different intermediate parsing representation for the serial parser. Which one? That's another story.

Wednesday, December 30, 2009

Separate sentence meaning

Many natural language semantic formalisms divide meaning into predicates and their arguments. They may go by different names, but the representation of John loves Mary is in any case something like LOVES(John, Mary). Almost everyone seems to agree that this captures the sentence's meaning quite well, though the details of the representation may vary significantly.

What's wrong with it? Imagine this idea being uttered in different contexts (bold indicates logical stress):

(1) Who loves Mary? John loves Mary.
(2) Whom does John love? John loves Mary.
(3) Does John like Mary? John loves Mary.
(4) Alice loves Bob, John loves Mary.
(5) Why is John so happy? John loves Mary.
(6) John loves Mary. //the very beginning of a text
(7) Do you know who loves Mary? It's John!

After hearing any of these 7 examples the listener knows that LOVES(John, Mary). But does it mean that in each example there's a sentence with that meaning? Actually, only (6) has exactly that meaning, while in the other examples it's split across two clauses in various ways.

A natural definition of sentence semantics would be the difference between the listener's knowledge before and after hearing the sentence. In this case, the meanings of John loves Mary are completely different, because we hear this clause with different background knowledge:

(1) We know LOVES(X, Mary). X := John
(2) We know LOVES(John, X). X := Mary
(3) We know X(John, Mary) and even wonder if X=LIKES. But X := LOVES
(4) We know LOVES(X, Y). X := John, Y := Mary.
(5) We know X(John). X := λy LOVES(y, Mary).
(6) We know nothing. LOVES(John, Mary).
(7) We know LOVES(X, Mary). X := John. //same as (1)

We now see 6 very different semantics for just one sentence, pronounced with different intonation. And only (6) is canonical, where we just have no background (although the listener probably knows John and Mary). So it appears that the traditional logical approach describes just the sentences that start a text/discourse. But there are very few of them compared to the number of all the sentences in the language! What's the point of analyzing only a small fraction of the material?

So, to describe a sentence's meaning, you should always pay attention to what the reader/listener knew before encountering it. Otherwise you just can't call it sentence meaning. Isn't that obvious? Fortunately, modern dynamic semantics approaches seem to understand the problem. It's just a pity that for such a long time it wasn't widely appreciated.

Saturday, December 26, 2009

Natural language programming

Programming languages (PL) are for computers, natural languages (NL) for humans. Computers execute programs; we execute texts. So perhaps it would be useful for NL processing to look for more inspiration in PL processing? I don't mean syntax, which seems much more complicated in NL. I'm talking about semantics and pragmatics.

In NL, semantics is the literal meaning of what's written, and pragmatics is how a human will actually understand it in discourse, what the response will be, and what changes the text will cause in the world. There's something similar in PL. Semantics is about how each construction (assignment, addition, loop, etc.) is executed; it's usually found in the language specification and can be formalized. Pragmatics, on the other hand, is what the program does: is it quicksort, a tic-tac-toe game or Linux? PL pragmatics explores how the separate statement meanings work together to get the job done.

PL semantics is clearly defined and expressible in terms of the same PL (assuming it's Turing-complete); there are plenty of interpreters and compilers proving this. PL pragmatics is different: it can't be captured by means of any PL. Generally speaking, you can't tell what a program does just by looking at it. You can't even tell whether it will ever stop. Nevertheless, there are static code analysis tools that do capture some pragmatic aspects of a program, and they actually help find bugs.

So, if we believe Church and Turing, there are two pieces of news for NLP. The good one is that NL semantics can be fully defined in terms of that NL, by human beings. The bad one is that you can write tons of papers analyzing the hidden meanings and ideas in Hamlet and never come up with a complete description. That's pragmatics.

Monday, November 9, 2009

Communicative fragments by example

Communicative fragments (CF) are presented as overlapping template sequences that cover the text. They consist of fixed parts (e.g. words) and placeholders which may be filled by certain expressions. A limerick example:

There was an Old Man in a pew,
Whose waistcoat was spotted with blue;
But he tore it in pieces,
To give to his nieces,--
That cheerful Old Man in a pew.


The resulting fragments will be:

there was X[NP] in Y[NP] -> Clause
//variables are uppercase
//sample placeholder constraints are in brackets
//NP=Noun Phrase
//substituted template forms a clause

an X[NP, Starts with a vowel] -> NP
old X[NP] -> NP
man -> NP
//one-word fragment
X[NP] in Y[NP] -> NP
//no one has promised that CF will form a tree
//cf. the first fragment

a X[NP, Starts with a consonant] -> NP
pew -> NP
X[NP] whose Y[NP] Z[Clause, finite] -> NP
//V=Verb
//note a non-projective construct
was X[Participle] -> Clause
//a rule for passive
spotted with X[Adj] -> Participle
but X[Clause] -> Clause
X[NP] tore Y[NP] -> Clause
X[=tear] in pieces -> Clause
//any form of 'tear'
X[Clause] to Y[Clause, infinitive] -> Clause
X[=give] to Y[NP] -> Clause, finite
his X[NP] -> NP
that X[NP] -> NP
cheerful X[NP] -> NP

That suggests a way of parsing: treat syntax as a set of CF patterns where every pattern contains at least one lexical entry that triggers the rule. It should also be much easier to extract such a set from a corpus than to induce a typical generative grammar with movements.

It isn't specified whether anything can occur between template components, so I assume that anything can. Hence free word order languages are supported, but English syntax rules are weakened a great deal, allowing ungrammatical sentences. So there needs to be tight parser integration constraining the fragments' freedom.
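
Just to make the template idea concrete, here's one fragment rendered as a crude regular expression; the NP placeholders are approximated by lazy word runs, which is of course far too permissive, and the whole thing is only an illustration.

import re

# there was X[NP] in Y[NP] -> Clause
PATTERN = re.compile(r"there was (?P<X>.+?) in (?P<Y>.+)", re.IGNORECASE)

m = PATTERN.match("There was an Old Man in a pew")
if m:
    print("Clause with X =", m.group('X'), "and Y =", m.group('Y'))
    # Clause with X = an Old Man and Y = a pew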

Sunday, October 25, 2009

A solution for John

It appears that I was too pessimistic in claiming that I couldn't assemble the meaning of 'John has two sisters' from the word meanings. Blackburn & Bos have taught me:

John: λv (v JOHN)
has: λsubj λobj (subj (λs (obj s id)))
two: λn λy λv (v (∃S (|S|=2 & ∀x∊S (n x y))))
sisters: λx λy SISTER(x,y)

Let's evaluate:

(two sisters) = λy λv (v (∃S (|S|=2 & ∀x∊S SISTER(x,y))))
(has John) = λobj (obj JOHN id)
((has John) (two sisters)):
   ∃S (|S|=2 & ∀x∊S SISTER(x,JOHN))

Note that this semantics also doesn't assume that John has only 2 sisters, so it really looks like an appropriate one.

So the applicative text structure is restored to its rightful place. Still, I don't much like this kind of solution, because the semantics of 'has' and 'two' depend too much on the fact that the SISTER predicate takes two arguments. I'd have to change them both to express 'John has two dogs', thus making 'two' (and every other numeric quantifier) semantically ambiguous. But it's still better than nothing.
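
Here's a minimal executable rendering of that derivation, with Python lambdas standing in for the semantic terms and a string for the final formula; this is my paraphrase of the Blackburn & Bos style analysis, not their code.

identity = lambda x: x

john    = lambda v: v("JOHN")                                   # λv (v JOHN)
has     = lambda subj: lambda obj: subj(lambda s: obj(s)(identity))
two     = lambda n: lambda y: lambda v: v(
              f"∃S (|S|=2 & ∀x∊S {n('x')(y)})")                 # λn λy λv (...)
sisters = lambda x: lambda y: f"SISTER({x},{y})"                # λx λy SISTER(x,y)

print(has(john)(two(sisters)))
# ∃S (|S|=2 & ∀x∊S SISTER(x,JOHN))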

Friday, October 23, 2009

Asking right questions

When I said that the semantics of 'John has two sisters' was |{ x | SISTER(x, JOHN) }|=2, I wasn't quite correct. In fact, there's nothing in the text preventing John from having 5 or 42 sisters. It's the Maxim of Quantity that may limit the sister count to 2. Not being an absolute rule, this maxim can easily be flouted, and in the right context the sentence could actually mean that John has more than 2 sisters.

Things get even more interesting if we just add one word: John has two beautiful sisters. There just isn't a default meaning here! John may have 2 sisters who are beautiful, but he may also have 2 beautiful sisters and another 3 who are not so beautiful.

The question is what a computer should do in such situations. Should it apply pragmatic knowledge and disambiguate everything immediately after syntactic analysis, using the whole context? Or should it maintain an intermediate semantic representation and hand it to some pragmatics module that could infer everything from the semantics? I clearly prefer modularization, i.e. the latter possibility. Of course, I don't assume any sequentiality; the modules may run in parallel, interactively.

If we separate semantics from pragmatics, the representation problem arises again, an even harder one now. The semantic structure should be very generic: it should be interpretable in all the ways the original text allowed (minus the resolved lexical/syntactic ambiguities), and at the same time there should be no way to understand it in any other way. If we just replace = with >= in the John has two sisters meaning, the pragmatics module still won't be able to apply the Maxim of Quantity: such a meaning could just as well have been produced from John has at least two sisters, which is unambiguous with respect to sister count. So it should still be some kind of =2, but in a form open to interpretation. What format could that be? I don't know. Yet.

Thursday, October 22, 2009

Shallow vs. structural analysis

Let's look at a very simple sentence, namely 'John has two sisters'. I'm now interested in its semantics or, more precisely, its computer representation. The truth condition is actually very simple: it says that the number of those who happen to be sisters of John equals 2:

|{ x | SISTER(x, JOHN) }|=2

(let the uppercase letters denote some semantic meaning here).

A question arises, how can we assemble this semantics from meanings of sentence components? The constituent structure for this sentence would be:

[S [NP John] [VP has [QP two sisters]]]

The dependency structure:

John <- has -> two -> sisters

The beloved one, applicative structure:

(has John (two sisters))

Lexical Functional Grammar-style:

[ PRED 'has'
  SUBJ [ PRED 'John' ]
  OBJ  [ PRED 'sisters'
         SPEC [ NUM 2 ] ] ]

In all of these variants, has has two arguments: John and the combined two sisters. So it appears that we should combine the word meanings in this order, getting something like f(HAS, JOHN, g(2, SISTER)), and this formula should somehow be equivalent to |{ x | SISTER(x, JOHN) }|=2. The question is, what are f and g? I see no direct structural answer. The best option I've come up with is to change the structure and replace it with another one which contains only one predicate:

HAS_N_SISTERS(Who,N)

which would translate to

|{ x | SISTER(x, Who) }|=N

This can be generalized a bit (take sibling instead of sister), but not further. A similar sentence 'John has two dogs' would have a different semantics, e.g. |{x | DOG(x) & BELONGS(x, John)}|=2. A two-place 'sister'-like 'dog' predicate would be funny.

So it seems that all the structures I know of are of no use with this sentence. That's one of the reasons I prefer shallow parsing based on patterns with wildcards: it appears to map better onto semantics. And a probable sad consequence is that the applicative structure, beautiful as it is, will remain unapplied.

Wednesday, October 21, 2009

What Science Underlies Natural Language Engineering?

Quote:

A superficial look at the papers presented in our main conferences reveals that the vast majority of them are engineering papers, discussing engineering solutions to practical problems. Virtually none addresses fundamental issues in linguistics.


Couldn't have said it better.

Tuesday, October 20, 2009

Linear Unit Grammar

Predominant natural language syntax theories usually deal with a single sentence, carefully selected to be well-formed, so-called 'grammatical'. But live language seems to be different and more complex, especially the spoken variety. It contains lots of repetitions, false starts, hesitations, reformulations, speaker changes and so on.

I therefore like encountering theories that aim to describe spoken discourse 'as is' instead of labelling it as incorrect. One such theory is Linear Unit Grammar by John Sinclair and Anna Mauranen.

The main notion here is the 'chunk'. It's a fuzzy, pre-theoretical concept with no precise definition. Basically, it's a linear fragment of the input signal (text, speech, etc.) which people tend to comprehend all at once. It may be unfinished. Usually it's formed of closely connected words, like a noun phrase with adjectives (a small dog) or a verb with its obligatory arguments (to love Mary). Relative clauses, of course, form separate chunks. Moreover, auxiliary words (like 'that' in 'the dog that loves Mary') are separated from everything else and go into single-word chunks. The chunks are very small; their size rarely exceeds 5 words.

Here's a sample chunking made by me from a spoken English corpus. The analyzed fragment is:

so an example of this s- s- second rule i mean the second rule probably is the easiest one to look at cuz you have the, the f- the the four-six-seven type of relationship

I divide it according to my intuition only; whenever I have doubts, I put a chunk boundary. So the result is:

1. so
2. an example of this
3. s- s-
4. second rule
5. i mean
6. the second rule
7. probably
8. is
9. the easiest one
10. to look at
11. cuz
12. you have the
13. the f-
14. the
15. the four-six-seven type
16. of relationship

Sinclair & Mauranen classify the chunks into organizational fragments (1,5,11) and message fragments (the others). These groups are also divided into subgroups according to organizational function or message completeness. There's a non-deterministic algorithm that translates any kind of text into a well-formed one. In this example it would be something like 'an example of this second rule is probably the easiest one to look at cuz you have the four-six-seven type of relationship'.

That's a bit surprising! How can anyone claim that 'grammatical sentences are unnatural, let's analyze the real ones' and then analyze spoken discourse by first making it grammatical? The answer is that the authors don't actually aim to compete with the major syntactic theories; they strive to co-exist with them, at least in the beginning. The described algorithm may be just a first step in a more complex language analysis. The authors also suggest chunking could help in second language teaching and learning.

What I personally like about Linear Unit Grammar is precisely the chunking. It's so simple! And, in contrast to Gasparov's approach, where the text is divided into communicative fragments, the LUG chunks are contiguous and non-overlapping. Therefore LUG chunking can be done with simple regular expressions or Markov processes. A great part of the syntax lies inside the chunks, so there's no need to analyze it the same way as the 'bigger' structures like clause subordination. NLTK seems to provide chunking out of the box, so I guess I've got to try it.
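
For the record, here's roughly what NLTK's out-of-the-box regexp chunker looks like on a piece of the example above; these are ordinary NP/VP chunks under a grammar I made up, not LUG chunks proper, and it needs the 'punkt' and 'averaged_perceptron_tagger' NLTK data packages.

import nltk

grammar = r"""
  NP: {<DT>?<JJ>*<NN.*>+}     # determiner + adjectives + nouns
  VP: {<MD>?<VB.*>+}          # optional modal + verbs
"""
chunker = nltk.RegexpParser(grammar)

sentence = "the second rule probably is the easiest one to look at"
tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
print(chunker.parse(tagged))   # prints a shallow tree with NP/VP chunks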

Sunday, March 22, 2009

Topic-Focus articulation

As is well known, every sentence can be divided into two logical parts. One (the Topic) is what the sentence is about; the other (the Focus) is what the sentence actually says about its Topic. The Topic is usually something given or presupposed to be given; the Focus is something new or emphasized. To me the Topic looks like data from the knowledge base, while the Focus looks like an operation on this data, a function on the Topic. Since the Topic-Focus opposition is one of the most important distinctions in text structure for me, I can't omit it from the functional representation.

We know one thing that can separate anything from anything in functional composition: lambda abstraction. So we'll use it to separate Topic from Focus as well.

Let's introduce a special auxiliary function (actually it's more like a macro) SENT, which will take both Topic and Focus as its arguments and then apply function-Focus to argument-Topic, probably marking them in some special problem-dependent way. Thus every sentence would look like a call of SENT function with Focus and Topic arguments.

Example sentence:

Father loves son.

Its core functional structure:

(loves father son)

By default Topic is 'father', Focus is 'loves son'.

(SENT (λx loves x son) father)

We can stress FATHER, promoting it to the Focus and 'loves son' - to the Topic:

(SENT (λx x father) (λy loves y son))

Here we have two abstractions. If we suppose that SENT does nothing besides applying its first argument (Focus) to its second argument (Topic), we'll get

((λx x father) (λy loves y son))
<= replace x =>
((λy loves y son) father)
<= replace y =>
(loves father son)

We can stress SON:

(SENT (λx x son) (λy loves father y))
which can be replaced by shorter
(SENT (λx x son) (loves father))

We can stress LOVES:

(SENT loves father son)

Here the Topic consists of two entities rather than one. This is perfectly normal: we'll just apply 'loves' to 'father', and then the result to 'son'. We'll get ((loves father) son), which by the definition of currying equals the core structure.
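
All three stress variants reduce to the same core structure, which is easy to check with curried Python lambdas standing in for the semantic terms (the names and the string representation are mine, purely for illustration):

loves = lambda x: lambda y: f"LOVES({x},{y})"
SENT  = lambda focus: lambda topic: focus(topic)   # just apply Focus to Topic

core          = loves("father")("son")                               # (loves father son)
default       = SENT(lambda x: loves(x)("son"))("father")            # Topic = father
stress_father = SENT(lambda x: x("father"))(lambda y: loves(y)("son"))
stress_son    = SENT(lambda x: x("son"))(loves("father"))
stress_loves  = SENT(loves)("father")("son")                         # two-part Topic

assert core == default == stress_father == stress_son == stress_loves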

I don't know yet how to represent multiple parallel contrastive sequences, like 'Father loves son, while mother hates him'. But such an approach feels promising to me.

Lambda abstraction in text structure

Just like Chomsky, we'll introduce exactly one transformation. In lambda calculus it's known as lambda abstraction. Let's say (λx E1 E2) is an application of the function (λx E1) to E2, and the result of this application is E3, which is E1 with all occurrences of x replaced with E2. Abstraction is the reverse operation: given a complex expression E3 in which E2 occurs, we replace the occurrences of E2 with x and prepend λx. Of course, any variable name could be used in place of x.

This abstraction may be used when several functions are applied to one argument. The most frequent example here is predicate coordination:

They stopped and looked across the river.

Here we have both 'stopped' and 'looked' applied to 'they'. Plain functional composition can't apply two functions to one argument simultaneously, so we need something that can. Let's suppose the 'and' function does it: it receives several component functions, merges them somehow and produces a single function as a result. This function is a Composite, meaning that its action on an argument is basically to apply all components to that argument. Here the merged functions are 'stopped' and '(λx looked x across (the river))'. The second one receives the argument and puts it into the correct position in its argument structure. So the result will be

((and stopped (λx looked x across (the river))) they)
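
A hedged sketch of such a Composite 'and', with Python lambdas standing in for the semantic functions; the representation is mine, chosen only to show the idea.

def and_(*components):
    # Merge several one-argument functions into one that applies them all.
    return lambda arg: tuple(f(arg) for f in components)

stopped = lambda x: f"STOPPED({x})"
looked  = lambda x: f"LOOKED({x}, across(the river))"   # (λx looked x across (the river))

print(and_(stopped, looked)("they"))
# ('STOPPED(they)', 'LOOKED(they, across(the river))')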