Monday, July 26, 2010

Frames, constructs and the log

Some time has passed. My ideas of the internal representation of natural language have changed. Here are the new ones.

Every word in the input is taken as a whole and all the information about it is stored in a single frame. The frame may have different aspects: morphological, syntactic, semantic, discourse, etc. The aspect names are called roles; the objects corresponding to these roles are called constructs.

Lexical ambiguities are handled frame-internally. If a word is syntactically ambiguous, its frame has several possible syntactic constructs. A polysemous word may have different semantic constructs. The frame should eventually choose one construct for each role, and the chosen constructs must not contradict each other. Here's what a frame for the famously ambiguous bank could look like:
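
A frame of this kind can also be sketched in code. The following is only a minimal illustration, not the representation actually used: the class name Frame, its methods, and the particular candidate constructs listed for bank are all assumptions, and the check that chosen constructs do not contradict each other is left out.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One input word; every role maps to its candidate constructs."""
    word: str
    candidates: dict[str, list[str]] = field(default_factory=dict)
    chosen: dict[str, str] = field(default_factory=dict)

    def add(self, role: str, construct: str) -> None:
        # Register one more possible construct for the given role.
        self.candidates.setdefault(role, []).append(construct)

    def choose(self, role: str, construct: str) -> None:
        # Commit to one of the registered candidates for the role.
        if construct not in self.candidates.get(role, []):
            raise ValueError(f"{construct!r} is not a candidate for {role!r}")
        self.chosen[role] = construct

    def is_resolved(self) -> bool:
        # True once every role that has candidates also has a choice.
        return all(role in self.chosen for role in self.candidates)


# Hypothetical candidates for "bank"; the labels are illustrative only.
bank = Frame("bank")
bank.add("syn", "noun")
bank.add("syn", "verb")
bank.add("sem", "FINANCIAL_INSTITUTION")
bank.add("sem", "RIVER_SIDE")

unresolved_at_first = not bank.is_resolved()
bank.choose("syn", "noun")
bank.choose("sem", "RIVER_SIDE")
```

Keeping the candidates next to the final choice means the ambiguity stays inside the frame until enough context is available to resolve it.
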

Once frames with embedded constructs appear in the model, they remain activated for some time and may perform various actions. They may add constructs to their own frame or to another one. They may establish links to constructs of other roles in the same frame. Or they may create connections to other frames.

Establishing a connection between frames means creating a new frame whose child frames are the ones that are connected. The new frame may also have several aspects. For example, in the usual John loves Mary, two extra frames are created to link the predicate with the subject and the object respectively. Each of these frames hosts two constructs: one representing the syntactic relation, the other the semantic one (in terms of experiencer, state and theme):
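
In code, such a connection could be sketched roughly as follows. This is an assumption-laden illustration: the Frame class and the connect helper are hypothetical names, and the construct values are simply copied from the log shown later in this post.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Frame:
    constructs: dict[str, str]        # role name -> construct
    children: tuple[Frame, ...] = ()  # empty for plain word frames


def connect(first: Frame, second: Frame, **constructs: str) -> Frame:
    """Connect two frames by creating a new frame with them as children."""
    return Frame(dict(constructs), (first, second))


# Word frames for "John loves Mary".
john = Frame({"syn": "noun", "sem": "JOHN"})
loves = Frame({"syn": "verb;transitive", "sem": "love"})
mary = Frame({"syn": "noun", "sem": "MARY"})

# Two extra frames link the predicate with the subject and with the object.
# Each hosts a syntactic and a semantic construct.
subject_link = connect(john, loves, syn="subject+predicate", sem="experiencer+state")
object_link = connect(loves, mary, syn="predicate+object", sem="state+object")
```

Note that the verb frame is shared: it is a child of both linking frames, so nothing is copied.
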

As I've already stated, it is not only the final structure that is important, but also the sequence in which it was constructed. For this, a simple program-like log can be used:

frame: syn=noun, sem=JOHN
frame: syn=verb;transitive, sem=love
frame ^2 ^1: syn=subject+predicate, sem=experiencer+state
frame: syn=noun, sem=MARY
frame ^3 ^1: syn=predicate+object, sem=state+object

Each line here describes a new frame. The frame's children are referred to as ^x, where x is the number of lines back at which the child frame was introduced: ^1 means the previous line, ^2 the one before that. When a frame is created, it has no aspects; their subsequent assignments are reflected in the log.
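
To make the ^x convention concrete, here is a small sketch of a reader for this log format. It is an illustration under assumptions rather than an actual implementation: the names Frame and parse_log are hypothetical, and it assumes that every non-empty line introduces exactly one frame with all its constructs listed on that line.

```python
from __future__ import annotations
from dataclasses import dataclass, field

LOG = """\
frame: syn=noun, sem=JOHN
frame: syn=verb;transitive, sem=love
frame ^2 ^1: syn=subject+predicate, sem=experiencer+state
frame: syn=noun, sem=MARY
frame ^3 ^1: syn=predicate+object, sem=state+object
"""


@dataclass
class Frame:
    constructs: dict[str, str]
    children: list[Frame] = field(default_factory=list)


def parse_log(text: str) -> list[Frame]:
    """Rebuild the frames in creation order, resolving ^x back-references."""
    frames: list[Frame] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        head, _, body = line.partition(":")
        tokens = head.split()
        if not tokens or tokens[0] != "frame":
            raise ValueError(f"unexpected log line: {line!r}")
        children = []
        for ref in tokens[1:]:
            back = int(ref.removeprefix("^"))  # ^1 is the previous line
            if not 1 <= back <= len(frames):
                raise ValueError(f"reference {ref} points outside the log")
            children.append(frames[-back])
        constructs = {}
        for pair in body.split(","):
            role, _, value = pair.partition("=")
            constructs[role.strip()] = value.strip()
        frames.append(Frame(constructs, children))
    return frames


frames = parse_log(LOG)
for frame in frames:
    child_names = [child.constructs["sem"] for child in frame.children]
    print(frame.constructs, "children:", child_names)
```

Because the references are relative, a line can be read knowing only the frames created before it, which preserves the order of construction.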

This log reflects the dynamics of the parsing process and can thus be used in applications where the order of operations is important, such as machine translation.

2 comments:

Muhammad Alzaidi said...

Hi Peter,

This approach seems interesting; at least it conveys somewhat new ideas. One of them is the cooperation of syntax, semantics, etc.

But I am wondering how this approach deals with one constituent that plays two different thematic roles in a structure, because you mentioned that 'The frame should eventually choose one construct for each role'.
Suppose that we have this sort of coordination:
{[[A B ]and[C D]F]}
Suppose that we have a coordination of two conjuncts, and the two conjuncts share a constituent F, which plays a different thematic role in each conjunct. Now if we assume that the frame should choose one construct for one role (= thematic role), then we are forced to have two different constructs in the surface structure, which are phonologically the same but syntactically (i.e. in terms of thematic roles) different. However, in the surface structure above there is only one overt construct, so...
What that indicates to me is that the frame approach will have to stipulate a 'deletion process' occurring somewhere in the frame, which deletes one of the phonologically identical constructs?

Peter Gromov said...

Muhammad,

I didn't quite understand your point about {[[A B ]and[C D]F]} and different thematic roles. Could you provide a concrete example in some real language (preferably, English)?

And, to clarify: frames are not constituents. Frames are the heads of constituents, together with the connections between the heads and their dependents. This is closer to Dependency Grammar than to Phrase Structure Grammar. And I'm not sure there will be any Minimalism-like deletion, copying, movement, etc.