next up previous
Next: A Systemic Grammar Up: Reducing Complexity in A Previous: Introduction

Unification with Disjunctive Descriptions

Parsing with a systemic grammar involves much unification of disjunctive descriptions. The usual way to unify such is as follows:

  1. Expand out the disjunctive descriptions to Disjunctive Normal Form (DNF) -- a form with all disjunction at the top level of the description -- a disjunction of non-disjunctive forms.

  2. Unify each term of the first DNF form with each term of the other.

DNF expansion of a description is however an expensive task -- the process takes exponential time in the worst case (Kasper and Rounds, 1986). Space is also a problem -- DNF expansion is a transformation whereby a disjunctive description is replaced with a set of descriptions each of which contains no disjunction. For a description containing a high level of disjunction, the size of the DNF form can be excessive.

Space has not however been a problem in our processing, but time has. Systemic parsing is very slow. We thus focus on means for speeding up, or avoiding, the unification process.

Avoiding Expansion

There have been proposals for unification without DNF expansion. Karttunen, for instance, has proposed an algorithm which ``uses constraints on disjuncts which must be checked whenever the disjunct is modified'' (Kasper, 1987, p81). However, as noted by Kasper (1987, p61), Karttunen's unification algorithm works only for a limited type of disjunctive description, and not for general disjunction as is needed in the present work.

Kasper has proposed a method of re-representing disjunctive descriptions which in some cases avoids the need for expansion. His approach separates a disjunctive description into two parts -- a definite component (which contains no disjunction), and an indefinite component (containing the disjunctive information of the description). A unification process can first check whether the definite components of two descriptions unify, and only proceeds to unify the indefinite components if the definite components unify successfully. The unification of the indefinites is avoided if the unification of the definites fails.

Delaying disjunctive expansion until necessary

The Kasper-Rounds form also allows us to delay expansion until a later time. When two descriptions are unified, only the definite components need to be checked for compatibility. The result of a Kasper-Rounds unification contains the indefinite descriptions from both descriptions without expansion. At some point in the processing it may be necessary to resolve the indefiniteness, and the disjunctive components are then expanded. However, in many cases, the definite component of the description may become inconsistent before this is necessary, expansion is thus avoided.

When expansion is necessary, expand efficiently

If DNF-expansion is required, then it should be performed as efficiently as possible. We here discuss some methods to achieve this goal:

  1. Reducing the disjunctiveness of the description: By reducing the extent of the description, we reduce the amount of disjunction to be expanded, and thus speed up the expansion process. We use two methods to reduce the size of descriptions:

    1. Extracting descriptions for special-purpose: we segment the grammar description into sub-descriptions for particular purposes. We found that different parsing processes drew upon only subsets of the grammar. Rather than working with the full grammar, sub-descriptions tailored for particular purposes can be compiled-out. These sub-descriptions are less complex to expand than the full description

    2. Register Specific Pruning: parts of the grammar which are not expected to be used in a particular set of target texts are `pruned-out' before processing begins.

  2. Expanding Disjunctions Efficiently: a disjunctive description may contain a number of disjunctions. Ordering the expansion of these disjunctions in particular ways can result in improved expansion times:

    1. Multiplying together disjunctions with high likelihood of inconsistency first, thus reducing the number of terms which we continue with.
    2. Spotting inconsistent unifications with minimum of work e.g., checking for inconsistencies between single terms before checking for inconsistencies between combinations of terms.
    3. Using some form of structure sharing in the expansion process: in the expansion process, the same terms may be multiplied together a number of times. A form of structure-sharing, such as a parse chart, can reduce the redundancy in the expansion process.

Caching and precompilation: avoiding repeating the same expansion.

The parser makes extensive use of caching -- when any expansion is calculated which is likely to be used again, the result is stored away for later re-use.

Precompilation has also been a useful technique to improve parsing efficiency. Precompilation is basically a pre-caching of all the values which might be used in the parsing process. By performing most of the DNF expansion of the grammar as a precompilation step, we avoid doing that calculation during the parsing of a sentence.

next up previous
Next: A Systemic Grammar Up: Reducing Complexity in A Previous: Introduction

Mick O'Donnell
Fri Jan 26 19:21:43 GMT 1996