Higher-Order Grammar: A Categorical Foundation for Type-Logical
Constraint-Based Grammar*

Carl Pollard
The Ohio State University
FGMOL Conference, Helsinki, August 2001

TALKSCRIPT

1. Introduction

The work I am going to talk about is part of a long-term project whose goal is to develop a formal grammar framework that allows the linguist the freedom to analyze phenomena using either feature-constraint techniques or type-theoretic techniques, or even a combination of the two. For example, one may want an HPSG-style analysis of parasitic gaps and of binding principles A and B, but a categorial-style analysis of syntactic valence and of nonconstituent coordination. In logical terms, the idea is to have a logic for syntax that encompasses both feature logic and type logic. Of course many people, from Uszkoreit and Zeevat et al. in the mid-80's down through more recent investigators like Doerre and Manandhar and Heylen, have had similar ideas, so the burden is on me to show why my proposal for how to do this is worth looking at.

The basic idea behind my proposal is that the right logic has been staring us in the face for a long time, namely higher-order logic along the lines of Church 1940, Henkin 1950, and Gallin 1975. More specifically, I use the elaboration of higher-order logic due to Lambek and Scott 1986. This is a nice logic for formal linguists, because most of them are already familiar with it via Montague, so there is no need to learn special-purpose formalisms like multimodal Lambek calculus or relational speciated reentrant logic. Also the model theory---basically Henkin models like the ones used in Montague semantics---is much more familiar and intuitive than totally well-typed feature structures or the residuated lattice of sets of strings of phonological words. It's pure serendipity that what I'm proposing is an exemplar par excellence of model-theoretic syntax. I swear I never heard this term till I read the program for this conference!

Of course, you can object that higher-order grammar is too expressive. I think worrying about that is a little like refusing to use a programming language that can encode undecidable problems. As Bob Carpenter (1999) has pointed out, there is a marked historical tendency for pristine linguistic formalisms to acquire full Turing power once they get beefed up enough to be of any practical use.

In this talk I will focus on purely syntactic aspects, leaving the interfaces to phonology and semantics for another time. The full paper gives a detailed account of the subject matter of this talk, including its historical and philosophical background and the mathematical formalism, in terms of both logic and category theory, but in 20 minutes I can only hope to convey something of the flavor of the framework, in the hope of enticing you to look at the paper.

This line of work started out as an attempt to rescue HPSG from an array of foundational problems, alas most of them my own damned fault. For those of you familiar with HPSG, a nonexhaustive list of the kinds of problems I've been worried about is given in (1):

(1) Some foundational problems for the HPSG framework

 o Under total well-typing, what values should be given to selectional features that never get saturated?
 o Do lexical rules exist, and if so, how should they be formalized?
 o How can we eliminate spurious ambiguity as to whether isomorphic substructures of a common structure are token-identical?
 o How can the implementation of set values be made simpler?
 o How can the constraint logic and its model theory be made less idiosyncratic and more familiar?
 o What is the best way to handle the distinction between ambiguity and neutrality?
 o What is the best way to handle coordination of unlikes?
 o How should we analyze constructions such as gapping, right node raising, and comparative ellipsis, which can be seen as essentially involving lambda-abstraction in the syntax?
 o How should we analyze constructions such as WH-questions and TOO/ENOUGH constructions, which essentially require lambda-abstraction in the semantics?

Of course I realize that the majority of you are not HPSG practitioners and so aren't overly concerned with repairing its foundational problems, so I won't go into them here. Those of you who are Lambek categorial grammarians doubtless wonder why some linguists aren't. In some ways I think I am typical of formal linguists who admire Lambek categorial grammar for what it can do, but don't find the intuition behind it compelling. For many years I was unable to articulate why this might be the case, but last spring I stumbled across a paper that seemed to me to really hit the nail on the head. The paper I'm referring to, "Some logical aspects of grammatical structure", is by Haskell Curry, and it appeared in 1961, in fact in the same volume as Lambek's celebrated paper "On the calculus of syntactic types." In this paper, Curry introduced the terms PHENOGRAMMATICS (basically, phonology) and TECTOGRAMMATICS (his term for categorial combinatorics), and politely took Lambek to task for failing to distinguish between them.

Curry made his own proposal for handling natural language syntax, focusing on the tectogrammatics. The type theory Curry used was not Lambek's, but in a different tradition dating back to work done in the 1920's, 1930's, and 1940's by Schoenfinkel, Curry, and Church. This type theory, the same one that higher-order logic is an elaboration of, resembles Lambek's in the respect that complex types are generated from primitive types by a binary type constructor. The two kinds of type theory differ, however, in a fundamental way. The difference is best understood in terms of relationships between type theories and categories (in the sense of mathematical category theory) discovered in the 1970's by such mathematicians as D. Scott (1980) and Lambek (1980), building in large part on pioneering work by Lawvere (1964, 1971). In both kinds of type theory, the type constructors for forming so-called functional types are right-adjoint to a product type constructor. A practical consequence of this abstract similarity is that in Curry grammar, as in Lambek grammar, syntactic selection can be handled, roughly speaking, in terms of cancellation. But in Curry-style type theories, the product in question is the categorical product, so that the functional types are actually exponentials; while in Lambek's type theory the product in question is a tensor product. Correspondingly, the categorical models of the two kinds of type theory are different: for Curry-style type theories they are essentially the same thing as the models for the typed lambda calculi, namely cartesian closed categories (Lawvere 1964). By contrast, the categorical models of Lambek-style type theory are monoidal closed bicartesian (but not cartesian closed) categories, which are the nonsymmetric generalization of Troelstra's (1992) intuitionistic linear categories.
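To make the cartesian side of this difference concrete, here is a minimal sketch (mine, not part of the talk) in Haskell, whose types form, roughly speaking, a cartesian closed category. The adjunction behind Curry-style functional types says that arrows C x A -> B correspond one-to-one to arrows C -> (A => B), and in Haskell the correspondence is witnessed by the standard functions curry and uncurry:

```haskell
-- Sketch: in a cartesian closed category, the exponential (=>) is
-- right-adjoint to the product. In Haskell the product is (,), the
-- exponential is (->), and the two directions of the isomorphism
-- Hom(C x A, B) ~ Hom(C, A => B) are curry and uncurry.

toExp :: ((c, a) -> b) -> (c -> (a -> b))
toExp = curry

fromExp :: (c -> (a -> b)) -> ((c, a) -> b)
fromExp = uncurry

-- The two maps are mutually inverse:
--   toExp (fromExp g) = g   and   fromExp (toExp f) = f

main :: IO ()
main = do
  let f :: (Int, Int) -> Int
      f (x, y) = x + y
  print (toExp f 2 3)               -- 5
  print (fromExp (toExp f) (2, 3))  -- 5, round trip
```

No such set-of-functions reading is available for Lambek's tensor and its residuals, which is the point of the contrast drawn above.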
In terms of logic and algebra, the difference can be understood by limiting attention to the categorical models which are also preorders. (In fact, in Lambek categorial grammar these are the only ones of interest, since there all deductions from one fixed type to another are considered equivalent.) For Curry type theory, these are the heyting semilattices, which correspond to positive (conjunctive-implicative) intuitionistic propositional logic; while for Lambek type theory they are residuated lattices, which correspond to the multiplicative-additive fragment of noncommutative intuitionistic propositional linear logic. These relationships are summarized in (2), with the last column anticipating my own proposal.

(2) Comparison of Type-Theoretical Bases of Grammar Formalisms

              Lambek                Curry                 This Proposal

  Logics      noncommutative        conjunctive-          higher-order
              multiplicative-       implicative           two-valued
              additive              intuitionistic        classical logic
              intuitionistic        propositional logic
              propositional
              linear logic

  Algebras    residuated            heyting               heyting algebras
              lattices              semilattices          with additional
                                                          structure

  Categories  monoidal-closed       cartesian-closed      two-valued
              bicartesian           categories            boolean topoi
              categories

The connections with category theory aren't necessary for an understanding of the main linguistic ideas of this paper, though they are essential for a full understanding of the mathematical underpinnings. For present purposes, though, it is important to be aware that Curry-style type theories have set-theoretic models of the familiar kind, in which types are modelled by sets and functional types are modelled as sets of functions: these are the so-called Henkin models. But this is not the case with Lambek type theory. However much we may speak informally of "functor" categories in Lambek categorial grammar, model-theoretically they are not really functions but rather residuals of sets of strings. This is really at the heart of Curry's objection to Lambek-style syntax, namely that it mixes the syntax up with the phonology (this is made more precise in the full paper). By contrast, in Curry-style grammar we can literally model syntactic entities of nonprimitive types set-theoretically as functions, or, in more general categorical models, as elements of exponential types.

The full paper gives a detailed logical and categorical comparison of Lambek's syntactic calculus vs. Lambek and Scott's higher-order logic. Since most of you are familiar with Lambek categorial grammar, I'll not say much about it here other than to note differences from higher-order grammar in passing. On the other hand, the similarities will be pretty obvious. For easy reference, a comparison of the type constructors is given in (3).

(3) The type constructors in three type theories

                  Lambek     Curry     This Proposal

  Categoricals
                   _
    Product       | |        x         x

    Right
    Adjoint       none       =>        =>

    Unit          _|_        1         1

    Coproduct     |_|        none      +
                  ___
    Counit         |         none      0

    Subobject     none       none      true: 1 -> _bool_
    Classifier                         (_bool_ = 1 + 1)

  Tensors
    Product       o          none      none

    Right
    Adjoints      / and \    none      none
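As a rough programming-language gloss of the third column (my own illustration, not from the paper), the constructors of the proposal line up with familiar Haskell types:

```haskell
-- Sketch: the "This Proposal" column of (3), rendered as Haskell
-- types, which form (ignoring nontermination) a bicartesian closed
-- category.

import Data.Void (Void, absurd)

type Product a b   = (a, b)      -- x  : categorical product
type Exp a b       = a -> b      -- => : right adjoint to product
type Unit          = ()          -- 1  : 0-ary product
type Coproduct a b = Either a b  -- +  : coproduct
type Counit        = Void        -- 0  : 0-ary coproduct

-- _bool_ = 1 + 1, with true and false as the two injections:
type B = Either () ()

true, false :: B
true  = Left ()
false = Right ()

-- The unique arrow out of the empty type is Haskell's absurd:
fromZero :: Counit -> a
fromZero = absurd

main :: IO ()
main = print (true, false)
```

The subobject classifier has no direct Haskell analog; it is what upgrades the bicartesian closed category to a topos, as discussed in section 3 below.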
Now I mentioned that Curry limited his attention to tectogrammatics. The best-known example of Curry-style tectogrammatics is the system of syntactic categories employed in Montague's PTQ, which was based on the three primitive types t, CN, and IV. Note that I'm speaking here of Montague's syntactic categories, not the types of the intensional logic IL! In the type theory for Montague's syntax, t just counts as another primitive type on a par with IV and CN, not as a logical type of truth-values. In fact, as we will see below, one of the key ingredients of higher-order grammar is the addition of a logical type of truth-values (categorically, a subobject classifier) to the type theory for the syntax. Essentially the same move was proposed in the mid-1990's in a series of papers by Drew Moshier, which have been a major source of inspiration for the present proposal. However, Moshier's approach differs from Curry's and from the one proposed here in not employing exponential types. As a consequence, Moshier's setup cannot treat syntactic selection in terms of cancellation, but instead retains certain constraints that have been HPSG mainstays, such as the Head Feature Principle and the Subcategorization Principle.

As for phenogrammatics, Curry apparently took no interest in it. At the opposite extreme, Montague grammar is notorious for the evident lack of constraint with respect to what counts as a permissible phenogrammatic operation, and as a result Montague's syntax was never taken very seriously by linguists. In fact, it seems fair to say that not even Montague took it very seriously, since for him the real action was in the semantics. Montague's cavalier attitude toward phenogrammatics had the unfortunate side effect of discrediting the use of Curry-style type theory in syntax, and in subsequent categorial grammar, Curry's approach was almost eclipsed by Lambek's.

Almost, but not completely. Among post-Montague linguists, I think the first to take Curry's critique seriously was David Dowty in his 1989 paper at the Tilburg Conference on Discontinuous Constituency. Some other proposals that maintain Curry's distinction between tectogrammatics and phenogrammatics are those due to Mike Reape and to Andreas Kathol. And I think the distinction in LFG between f-structure and c-structure is essentially the same. However, none of this work uses Curry's type theory. The present proposal seeks to vindicate Curry by recasting syntax within an elaborated version of his type theory. Yet Lambek is vindicated too, since the version of Curry-style type theory we use is Lambek and Scott's (1986) higher-order intuitionistic logic, with the addition of the boolean axiom.

2. HOG Basics

In higher-order grammar, we're concerned with specifying a set of signs, where a sign is an ordered triple such as the lexical sign in (4)a:

(4) Architecture of Higher-Order Grammar

 a. Signs are ordered triples, e.g. the lexical sign <pIg, PIG, pig'>, where
      pIg is a string of phonological words
      PIG is a closed term of the syntactic logic
      pig' is a closed term of the semantic logic

 b. The syntactic logic is that of Lambek and Scott 1986 plus the boolean axiom and two-valuedness, with primitive types being the types in which atomic morphosyntactic features take their values.

 c. The semantic logic is similar to Lambek and Scott's higher-order intuitionistic logic plus the boolean axiom and minus the axiom of extensionality of entailment, with primitive types being e and p. (cf. Lappin and Pollard 1999, 2000, Fox and Lappin 2001)

 d. Other signs are deduced from the lexical signs by labelled schemata whose instances have as their syntactic component a deduction from one type to another. The other components of the schemata constitute the syntax-phonology and syntax-semantics interfaces.
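As a concrete gloss of (4)a (the names and stand-in types here are mine, not the paper's formalism), a sign can be pictured as a record with three components:

```haskell
-- Sketch: a sign as an ordered triple of a phonological string, a
-- closed term of the syntactic logic, and a closed term of the
-- semantic logic. Syn and Sem are toy stand-ins for the two logics.

type Phon = [String]            -- a string of phonological words

data Syn = PIG  deriving Show   -- stand-in for closed syntactic terms
data Sem = Pig' deriving Show   -- stand-in for closed semantic terms

data Sign = Sign { phon :: Phon, syn :: Syn, sem :: Sem }
  deriving Show

-- The lexical sign <pIg, PIG, pig'> of (4)a:
pigSign :: Sign
pigSign = Sign { phon = ["pig"], syn = PIG, sem = Pig' }

main :: IO ()
main = print pigSign
```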
(5) Interfaces

 a. The syntax-phonology (= tectogrammar-phenogrammar) interface specifies a relation between word-strings and terms of the syntactic logic; neither is derived from the other. [By comparison, in Lambek categorial grammar this interface is a homomorphism of residuated lattices, viz. the "frame semantics." This in essence is what Curry objected to.]

 b. The syntax-semantics interface, however specified, e.g. via an underspecification scheme (Richter and Sailer 1999) or a glue semantics (Asudeh 2001), will specify a relation between word-string/syntactic-term pairs and semantic terms, where both kinds of terms are in (different) higher-order logics. [By comparison, semantic interpretation in Lambek categorial grammar is a functor from a monoidal closed category to a topos (Lambek 1988), despite the misleading slogan (Dowty 1997) that in categorial grammar syntactic form = logical form.]

 c. In both the proposed architecture and in Lambek categorial grammar, there is a pronounced asymmetry between phonology and semantics.

(6) The underlying deductive system of HOG consists of a set of types, and for any two types A and B, a set of arrows (deductions, morphisms) from A to B.

(7) We require that the deductive system be a CATEGORY, i.e.

 a. for each type A there is a distinguished IDENTITY arrow

      id_A: A -> A;

 b. for any two arrows f: A -> B, g: B -> C there is a distinguished COMPOSITION arrow

      g o f: A -> C

    such that the laws of identity hold:

      id_B o f = f = f o id_A

    and if moreover h: C -> D, then the associative law of composition holds:

      h o (g o f) = (h o g) o f: A -> D

 c. In set-theoretic interpretations of the syntactic logic (Henkin models), the types are sets and the arrows from A to B are a subset of the functions from A to B.

 d. In the Curry-Howard isomorphism, identities are deductions of a formula from itself, and composition is concatenation of deductions.

(8) Primitive types correspond to sorts for values of atomic morphosyntactic features in HPSG, e.g. _case_, _agr_, _vform_.

In most of the remainder of this talk, I'll survey the type constructors of the syntactic logic and illustrate some of their uses. For easy reference, the table in (9) summarizes these constructors in terms of the Curry-Howard isomorphism and their interpretation in the Henkin models.

(9) Type Constructors of Two-Valued Classical Higher-Order Logic

                Curry-Howard        Interpretation in
  Constructor   Isomorphism         Henkin Models

  x             conjunction         cartesian product
  =>            implication         subset of the functions from
                                    antecedent to consequent
  1             tautology           1 (i.e. {0})
  +             disjunction         disjoint union
  0             contradiction       0 (empty set)
  _bool_        two-valued logic    2 (i.e. {0,1})

(10) The Product Constructor X

 a. There are arbitrary finite products, i.e. for any finite indexing set I and any indexed family of types <A_i>_i_in_I there is an indexed product type

      X_i_in_I A_i

    In the binary case, i.e. I = {0, 1}, we write A_0 x A_1.

 b. In Henkin models, an indexed product is a cartesian product, the set of functions f with domain I such that for each i in I, f(i) is in A_i. We can think of such functions as feature structures, as explained below.

 c. For each index i in I there is a PROJECTION arrow

      p_i: X_i_in_I A_i -> A_i

    Thinking of indexed products as feature structure types, the projections are the features (but usually we abuse notation and write i rather than p_i as the name of the feature).

 d. In the Curry-Howard isomorphism, projections are deductions from a conjunction to a conjunct.
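To anticipate the feature-structure reading developed in (11)-(15), here is a minimal sketch (my own encoding, not the paper's formalism) in which a finite indexed product is a record type and the projections are the field accessors:

```haskell
-- Sketch: _noun_ = (CASE _case_, AGR _agr_) as an indexed product
-- with I = {CASE, AGR}; the projections p_CASE and p_AGR are the
-- record field accessors, i.e. the features.

data Case = Nom | Acc | Gen | Dat deriving (Show, Eq)
data Agr  = ThirdFemSg | ThirdPl  deriving (Show, Eq)

data Noun = Noun { caseF :: Case, agrF :: Agr } deriving Show

-- A feature structure is an element of the product type, i.e. an
-- indexed tuple of arrows from 1; in Haskell, simply a value:
she :: Noun
she = Noun { caseF = Nom, agrF = ThirdFemSg }

main :: IO ()
main = print (caseF she, agrF she)   -- (Nom, ThirdFemSg)
```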
(11) a. We model nonatomic feature structure types as indexed products.

 b. Example: Parts of Speech. We model parts of speech as indexed products of morphosyntactic feature types, e.g.

    i.  _noun_ = X_i_in_I A_i where I = {CASE, AGR};
          A_CASE = _case_; A_AGR = _agr_.

    ii. _verb_ = X_j_in_J B_j where J = {VFORM, AUX, INV};
          B_VFORM = _vform_; B_AUX = _bool_; B_INV = _bool_.

 c. We can use AVM notation for indexed products, e.g.

      _noun_ = (CASE _case_, AGR _agr_)
      _verb_ = (VFORM _vform_, AUX _bool_, INV _bool_)

    with types as the "values".

(12) Features

 We model features by the projection arrows of indexed products onto their factors, e.g.

      p_CASE: _noun_ -> _case_

 By a mild abuse of notation, we usually just write CASE instead of p_CASE.

(13) Indexed Tuples

 a. For any finite indexed family of arrows <f_i: A -> B_i>_i_in_I there is an arrow

      f = (f_i)_i_in_I: A -> X_i_in_I B_i

    called the indexed tuple of the family, such that for each i in I, the following diagram commutes (where P = X_i_in_I B_i):

          A
          | .
          |   .  f_i            p_i o f = f_i
        f |     .
          |       .
         \|/      _|
          P ------> B_i
              p_i

 b. As explained below, feature structures will be modelled by indexed tuples.

 c. In the binary case, we have the familiar ordered pairs:

                   A
                 . | .
               .   |   .            p_0 o (f,g) = f
           f .     |     . g
           .       |(f,g)  .        p_1 o (f,g) = g
         |_       \|/       _|
         B <---- B x C ----> C
             p_0        p_1

 d. In the Curry-Howard isomorphism, an ordered pair is the deduction of a conjunction from its conjuncts.

 e. As explained below, ambiguity is modelled by ordered pairs.

(14) 0-ary Product

 a. The product of the empty family of types is the unit type 1.

 b. For every type A there is a unique arrow

      []_A : A -> 1

 c. An element f of type A is defined to be an arrow

      f: 1 -> A

    In a Henkin model these are just functions from {0} to A, which are in one-to-one correspondence with the members of A.

 d. Examples:

      nom: 1 -> _case_
      fin: 1 -> _vform_
      3fs: 1 -> _agr_

(15) Feature structures

 a. Feature structures are modelled as elements of indexed product types, i.e. as indexed tuples of arrows with source type 1.

 b. We can notate these using AVM notation, with elements (rather than types) in the "value" positions, e.g. SHE = (CASE nom, AGR 3fs), the syntactic component of the pronoun:

                     1
                   . | .
                 .   |   .            CASE o SHE = nom
           nom .     |     . 3fs
             .       |SHE    .        AGR o SHE = 3fs
           |_       \|/       _|
       _case_ <--- _noun_ ---> _agr_
               CASE        AGR

(16) The Coproduct Constructor +

 a. There are arbitrary finite coproducts, i.e. for any finite indexing set I and any indexed family of types <A_i>_i_in_I there is an indexed coproduct type

      +_i_in_I A_i

    In the binary case, i.e. I = {0, 1}, we write A_0 + A_1.

 b. In Henkin models, an indexed coproduct is a disjoint union.

 c. For each index i in I there is an INJECTION arrow

      i_i: A_i -> +_i_in_I A_i

 d. In the Curry-Howard isomorphism, injections are deductions to a disjunction from one of its disjuncts.

 e. Indexed cotuples are the dual notion to indexed tuples. The general case is omitted here; the binary case gives ordered copairs:

                    A
                   /|\
                .   |   .           [f,g] o i_0 = f
            f .     |     . g
            .       |[f,g]  .       [f,g] o i_1 = g
          .         |         .
          B ----> B + C <---- C
              i_0        i_1

 f. In the Curry-Howard isomorphism, an ordered copair is a proof by cases (disjunction elimination).

(17) a. HPSG type partitions, and by extension, type hierarchies, are modelled as coproducts.

 b. For example, if _noun_ and _verb_ were the only parts of speech, we could model the HPSG type _head_ as

      _head_ = _noun_ + _verb_

(18) a. The type _bool_ is defined as the coproduct

      _bool_ = 1 + 1

    with the injections being

          true          false
        1 ----> _bool_ <---- 1

 b. _bool_ provides values for binary features; by convention true and false are then written + and - respectively.
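A minimal sketch (again my own encoding) of (16)-(17): binary coproducts are Haskell's Either, the injections are Left and Right, and the ordered copair [f,g] is the standard combinator either:

```haskell
-- Sketch: a type partition modelled as a coproduct, and copairing as
-- case analysis.

data Noun = Noun deriving Show
data Verb = Verb deriving Show

-- (17)b: if _noun_ and _verb_ were the only parts of speech,
-- _head_ = _noun_ + _verb_:
type Head = Either Noun Verb

-- Copairing: given f: B -> A and g: C -> A, the copair [f,g] is
-- 'either f g': it satisfies [f,g] o i_0 = f and [f,g] o i_1 = g.
describe :: Head -> String
describe = either (\Noun -> "a noun") (\Verb -> "a verb")

main :: IO ()
main = mapM_ (putStrLn . describe) [Left Noun, Right Verb]
```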
(19) 0-ary Coproduct

 a. The coproduct of the empty family of types is the counit 0.

 b. For every type A there is a unique arrow

      <>_A : 0 -> A

(20) The exponential constructor =>

 a. For any three types A, B, and C there is a natural isomorphism from the set of arrows from C x A to B to the set of arrows from C to (A => B). Thus we can curry and uncurry:

            C x A
              |     .
              |       .              g = curry(f)
     g x id_A |         . f
              |           .          f = uncurry(g)
             \|/           _|
        (A => B) x A ------> B       f(c,a) = (g(c))(a)
                      eval

 b. An important special case of this is that any arrow f from A to B can be replaced by an element of A => B, called the (LAWVERE) NAME of f, here notated f':

            1 x A
              |     .
              |       .  f o p_1     f' = curry(f o p_1)
    f' x id_A |         .
              |           .          f = eval o (f' o []_A, id_A)
             \|/           _|
        (A => B) x A ------> B       (where p_1: 1 x A -> A)
                      eval

 c. Notational convention: If h is an element of A => B and a is an element of A, we write h(a) for

      eval o (h,a): 1 -> B

    Thus f'(a) = eval o (f',a) = f o a, i.e. so-called "function application" of f' to a is the composition of f with a, where f is the arrow named by f'.

(21) We use exponentials to define the analogs of HPSG's unsaturated types or categorial grammar's slash categories. For example (a programming-language sketch is given below, following (25)):

 a. Define S and NP to be _verb_ and _noun_ respectively.

 b. Define VP to be NP => S
           TVP to be NP => VP
    (This will be revised below to allow for coordinate arguments.)

 c. In the Henkin models such types denote sets of functions. This has nothing to do with strings or residuation.

 d. A possible elaboration is to model grammatical functions as different "modes" of exponential (analogous to different modes of slash in multimodal categorial grammar), e.g. define

      VP to be NP =>_SUBJ S
      N' to be DetP =>_SPR NP

(22) The syntactic language

 a. The deductive system thus far is a bicartesian closed category.

 b. Thus it is naturally associated with a typed lambda calculus (with pairing and copairing) with variables of all types, where the closed terms of type A are (lambda-equivalence classes of) elements of type A. In particular, the name of any arrow can now be thought of as a closed term.

(23) The syntactic component of an unsaturated sign is modelled as a lambda term abstracting on the argument type, e.g.

      CAT = lambda x:DetP . (AGR 3s, CASE nom_acc)

    is an element of type DetP => NP. Thus there is no problem with instantiating the features of unrealized arguments.

(24) The syntactic components of phrase structure rules are modelled as arrows, e.g. unary rules:

 a. a: VP_prp -> (PossP => NP) takes a present-participial VP to a nominal gerundive;

 b. b: (NP => S) -> (N' => N') takes a gappy S to a THAT-less relative clause;

 c. c: N'_pl -> NP_pl takes a plural N' to a plural NP.

 (Note: the subscripted types are defined subtypes, described below.)

(25) Value types for HPSG's set-valued features are defined as powertypes (exponentials into _bool_), i.e.

      Pow(A) = (A => _bool_)
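Before moving on to the syntactic logic, here is the promised sketch of (20)-(21) (my encoding, with toy one-element types standing in for _noun_ and _verb_): exponentials become ordinary function types, so unsaturated signs are literally functions, and curry/uncurry witness the isomorphism of (20)a:

```haskell
-- Sketch: slash categories as function types; syntactic selection as
-- cancellation, i.e. function application.

data S  = S  deriving Show   -- stand-in for _verb_ (see (21)a)
data NP = NP deriving Show   -- stand-in for _noun_

-- (21)b: VP = NP => S, TVP = NP => VP:
type VP  = NP -> S
type TVP = NP -> VP

-- A transitive verb as an element of TVP; applying it to two NP
-- arguments "cancels" them, yielding an S:
devours :: TVP
devours _object _subject = S

-- (20)a: uncurrying moves between C -> (A => B) and (C x A) -> B:
devoursUncurried :: (NP, NP) -> S
devoursUncurried = uncurry devours

main :: IO ()
main = do
  print (devours NP NP)            -- S
  print (devoursUncurried (NP, NP))  -- S
```

In a Henkin model these function types really do denote sets of functions, which is the point of (21)c.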
3. The Syntactic Logic

(26) a. The fact that true: 1 -> _bool_ is a subobject classifier means that our category is not just bicartesian closed but in fact a two-valued boolean topos.

 b. It follows that every monic arrow (in Henkin models, injective function) f: B >-> A has a characteristic arrow (arrow from A to _bool_) chi. (Categorically, that means the diagram

                true
           1 ------> _bool_
          /|\         /|\
           |           |
      []_B |           | chi
           |           |
           B >-------> A
                f

    is a pullback.)

 c. From this it follows that all the usual logical connectives are definable, and our syntactic language becomes a higher-order logic.

 d. By the Axiom of Comprehension, for any type A and any term phi with at most the variable x of type A free, there exists a subtype of A, i.e. a monic f: B >-> A, written

      B = {x in A: phi}

    the name of whose characteristic arrow is lambda x:A . phi:

                true
           1 ------> _bool_
          /|\         /|\
           |           |          where chi' =
      []_B |           | chi      lambda x:A . phi
           |           |
           B >-------> A
                f

(27) Partially specified feature structures (as in Pollard and Sag 1987) are modelled as defined subtypes, e.g.

      NP_acc = {x in NP: CASE'(x) = acc}

(28) The syntactic logic can be used in place of feature logics or fibered categorial languages to impose syntactic constraints. For example, in English the only verbs that invert are finite auxiliaries. This means we should have defined _verb_ not as

      (VFORM _vform_, AUX _bool_, INV _bool_)

    but rather as

      {x in (VFORM _vform_, AUX _bool_, INV _bool_):
         INV'(x) -> (AUX'(x) & VFORM'(x) = fin)}

(29) a. For any A, the elements of Pow(A) form a boolean algebra. Example: if f, g: 1 -> Pow(A), then their union is defined by

      f union_A g =def lambda x:A . (f'(x) \/ g'(x))

 b. Viewing the elements of Pow(A) as the names of characteristic arrows of subtypes of A, we can transfer the boolean structure to the set of subtypes of A.

 c. In the Henkin models the boolean operations have their usual meanings in Pow(A).

4. Syncretism

(30) Syncretism, also called feature value neutrality or indeterminacy, consists in the simultaneous satisfaction of seemingly inconsistent morphosyntactic selectional requirements by a single form.

(31) Examples of case syncretism

 a. Er findet und hilft FRAUEN/*MAENNER/*KINDERN. (coordination)
 b. WAS/*WER/*WEN du mir gegeben hast, ist praechtig. (free relative)
 c. This is someone WHO/*WHOM even close friends of __ believe __ must be watched like a hawk. (parasitic gap)

(32) Various accounts have been given, e.g. Bayer and Johnson 1995, Dalrymple and Kaplan 2000, Levine et al. 2001, Levy 2001, Daniels 2001. Our account abstracts out in categorical form what is common to the last four of these, and simulates the first, using the (truly boolean) intersection as in (29) instead of the (putatively but not really boolean) categorical product (additive meet) of Lambek categorial grammar (which is interpreted as set intersection in the frame semantics).

(33) a. Instead of treating (German) _case_ as a primitive type with four members

      nom, acc, gen, dat: 1 -> _case_

    we treat it as the powertype

      _case_ = Pow(_bcase_) = (_bcase_ => _bool_)

    of a primitive type of "basic cases"

      bnom, bacc, bgen, bdat: 1 -> _bcase_

 b. The singleton elements of _case_ correspond to Levine et al.'s (and Daniels') "pure cases", e.g.

      pnom = lambda x:_bcase_ . (x = bnom)

 c. The nonempty nonsingleton members of _case_ are syncretic values, e.g.

      acc_dat = lambda x:_bcase_ . ((x = bacc) \/ (x = bdat))

 d. A verb that governs (say) accusative will accept not just a pure accusative but also any syncretic NP whose case value contains bacc. Thus we must revise the definition of NP_acc from the one given in (27) to

      NP_acc = {x in NP: CASE'(x)'(bacc)}

    In a Henkin model this is just the subset of the set NP each of whose members' CASE value contains the basic case bacc.

 e. If two basic cases (say bnom and bacc in Polish) do not syncretize, we can either

    i. treat it as an accident by simply having no lexical signs that syncretize bnom and bacc; or

    ii. treat it as systematic by adding to the grammar the constraint

      forall x:NP . not (CASE'(x)'(bnom) & CASE'(x)'(bacc))

    (In this case, the type NP_nom_acc is a type other than 0 with no elements, so that the grammar has no Henkin model. This is one of many reasons to consider not insisting on modelling syntax set-theoretically.)
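A minimal sketch (my encoding) of the analysis in (33), with _case_ as a characteristic function on basic cases:

```haskell
-- Sketch: _case_ = Pow(_bcase_), i.e. a characteristic function
-- _bcase_ -> _bool_; pure cases are singletons, syncretic values are
-- larger sets, and case government tests for membership of a basic case.

data BCase = BNom | BAcc | BGen | BDat deriving (Show, Eq)

type Case = BCase -> Bool   -- _case_ = Pow(_bcase_)

-- Pure cases (singletons), as in (33)b:
pnom, pacc :: Case
pnom = (== BNom)
pacc = (== BAcc)

-- A syncretic accusative/dative value, as in (33)c:
accDat :: Case
accDat x = x == BAcc || x == BDat

-- (33)d: a verb governing accusative accepts any case value that
-- contains bacc, pure or syncretic:
governsAcc :: Case -> Bool
governsAcc c = c BAcc

main :: IO ()
main = print (map governsAcc [pacc, accDat, pnom])  -- [True,True,False]
```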
(34) a. The proposed analysis of syncretism has as consequences such basic relationships (in English) as

      NP_acc = NP_pacc union_NP NP_nom_acc
      NP_nom_acc = NP_nom intersect_NP NP_acc

 b. However, unlike the Bayer-Johnson approach, it does not suffer from the conceptual problem that syncretism cannot be formally distinguished from "property conjunction", e.g. being both nominative and third-plural-feminine. The difference is that case syncretism imposes two constraints on a single feature:

      NP_nom_acc = {x in NP: CASE'(x)'(bnom) & CASE'(x)'(bacc)}

    as opposed to constraining two features:

      NP_nom_3pf = {x in NP: CASE'(x)'(bnom) & AGR'(x) = 3pf}

5. Coordination

(35) Coordination

 a. The fundamental pretheoretical generalization about coordination is Wasow's Generalization (Pullum and Zwicky 1986) that, leaving aside cases of "principled resolution" (Corbett 1983), the conjuncts of a coordinate structure are subject to whatever constraints are imposed on the coordinate structure as a whole.

 b. HPSG accounts (prior to Levy 2001 and Daniels 2001) are unable to account for the difference between neutralizable polyvalency (a single sign satisfying mutually inconsistent valence requirements by taking a coordinate argument) and "functor ambiguity" (homophonous signs with inconsistent valence requirements):

    i. Kim is a Republican and proud of it.
    ii. *You can tuna and say that again.

 c. Our account preserves in abstract categorical form the fundamental properties of Dalrymple and Kaplan's (2000) account, but:

    i. presents no problem about how to formulate or interpret the property distribution schema (there is none); and

    ii. does not require any stipulation about what counts as a legitimate "property" (beyond being definable in the syntactic logic).

 d. Our account also simulates the account of Bayer and Johnson 1995, using the (truly boolean) union operation instead of the (putatively but not really boolean) categorical coproduct (additive join) of Lambek categorial grammar (which is interpreted as set union in the frame semantics).

 e. Our account shares with those of Bayer and Johnson 1995, Levy 2001, and Daniels 2001 the view that neutralization and coordination are algebraic duals.

(36) Proposed Analysis of Coordination

 a. If the conjuncts of a (binary) coordinate structure have syntactic components a: 1 -> A and b: 1 -> B, and the injections into the coproduct are

           i           j
        A ---> A + B <--- B

    then the syntactic component of the coordinate structure is the doubleton element of Pow(A+B)

      lambda x:A+B . ((x = i o a) \/ (x = j o b))

 b. In a Henkin model this is just the set {a,b}, e.g. A REPUBLICAN AND PROUD OF IT has syntactic component {NP, AP}.

 c. Thus the binary coordination schema has as its syntactic component the (type-schematized) deduction

      lambda (y,z):A x B . lambda x:A+B . ((x = i o y) \/ (x = j o z))

 d. If a and b are identified with their respective singletons in A + B, then coordination amounts to union in A + B.
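A minimal sketch (my encoding) of (36): the syntactic component of a binary coordinate structure is the characteristic function of the doubleton {i o a, j o b} in Pow(A + B):

```haskell
-- Sketch: coordination of unlikes as a doubleton subset of a coproduct.

data NP = NP deriving (Show, Eq)
data AP = AP deriving (Show, Eq)

-- Pow(X) as characteristic functions:
type Pow x = x -> Bool

-- The binary coordination schema of (36)c, with Left and Right as the
-- injections i and j:
coord :: (Eq a, Eq b) => a -> b -> Pow (Either a b)
coord a b x = x == Left a || x == Right b

-- "a Republican and proud of it": the doubleton {NP, AP} in Pow(NP + AP):
republicanAndProud :: Pow (Either NP AP)
republicanAndProud = coord NP AP

main :: IO ()
main = print ( republicanAndProud (Left NP)
             , republicanAndProud (Right AP) )  -- (True,True)
```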
6. Ambiguity and syntactic selection

(37) Ambiguity

 a. We treat ambiguous signs as pairs, e.g.

      PIGS: 1 -> N_3p x NP_3p

    Thus ambiguity is product, while neutralization (of which syncretism is a special case) is intersection. No analog of this analysis is available in Lambek categorial grammar, because in the frame semantics there is no intersection operation distinct from the categorical product (meet).

 b. The same applies to ambiguous unsaturated signs (as in (35)b.ii above):

      CAN: 1 -> (NP => VP) x (VP => VP)

 c. The Law of Exponents holds in any bicartesian closed category: for any three types A, B, C, there is a natural isomorphism

      n: (A => C) x (B => C) -> ((A + B) => C)

    and we could just as well have said:

      CAN: 1 -> ((NP + VP) => VP)

    that is, functor ambiguity is equivalent to disjunctive subcategorization (being of an exponential type whose antecedent type is a coproduct). (A programming-language sketch of this isomorphism is given at the end of this section.)

(38) Coordinate structures as syntactic arguments

 a. The above treatment of functor types (e.g. VP = NP => S) does not permit coordinate structures to be arguments.

 b. We therefore revise the treatment of functor types as follows:

      VP = Pow(NP) => S
      TVP = Pow(NP) => VP
      etc.

    That is, predicates syntactically select ONLY coordinate arguments; noncoordinate arguments are treated as singletons.

 c. The same thing applies to ambiguous functors (non-neutralizable polyvalency):

      CAN: 1 -> ((Pow(NP) + Pow(VP)) => VP)

    Note that the antecedent type here is a coproduct of powertypes.

(39) Neutralizable Polyvalency

 a. By contrast with functor ambiguity, where the antecedent type is a coproduct of powertypes, in cases of neutralizable polyvalency the antecedent type is the powertype of a coproduct.

 b. Example: making the simplifying assumption that English BE selects only NP and AP (but not PP, VP_pas, or VP_prp), we specify the syntactic component of BE as:

      BE: 1 -> (Pow(NP + AP) => VP)

    Therefore BE takes as complement a (possibly singleton) set of syntactic entities each of which is either an NP or an AP. Thus the effect of Wasow's Generalization (in this case, the distribution of the disjunctive requirement onto each of the conjuncts) is achieved type-theoretically, with no need for a Dalrymple-Kaplan style distribution schema.

(40) Neutralizably polyvalent functors are also ambiguous.

 a. In any topos, as in set theory, for any two types A and B there is a natural embedding (i.e. monic)

      Pow(A) + Pow(B) -> Pow(A + B)

 b. By contravariance of the exponential relative to the antecedent, for any three types A, B, and C there is a natural embedding

      (Pow(A + B) => C) -> ((Pow(A) + Pow(B)) => C)

 c. Thus neutralizable polyvalency implies functor ambiguity (but not the converse).

(41) Conjoined functors with differing (but overlapping) valences

 a. Example (Dyla 1984): A coordination of two Polish verbs that govern accusative and genitive respectively can take an accusative-genitive syncretic NP (e.g. KOGO) as its object:

      Kogo  Janek  lubi   a    Jerzy  nienawidzi?
      who          likes  and         hates
      'Who does Janek like and Jerzy hate?'

 b. We model the syntactic components of the verbs as:

      l: 1 -> (NP_acc => VP)
      n: 1 -> (NP_gen => VP)

 c. As discussed in (34), we have the following relationship among certain subtypes of NP:

      NP_acc_gen = NP_acc intersect_NP NP_gen

    Let a and b be the embeddings of NP_acc_gen into NP_acc and NP_gen respectively:

             a                  b
      NP_acc <---< NP_acc_gen >---> NP_gen

 d. The syntactic component of the coordinate functor is

      lambda x:NP_acc_gen . lambda y:VP . ((y = l(a'(x))) \/ (y = n(b'(x))))

    This takes a syncretic object and returns a doubleton set of VP's.

 e. More generally, the schema for binary coordination of functors has as its syntactic component the type-schematized deduction

      lambda (u,v):(A => C) x (B => C) . lambda x:(A intersect B) . lambda y:C .
        ((y = u(a'(x))) \/ (y = v(b'(x))))

    where a and b are the embeddings of A intersect B into A and B respectively.

 f. We leave as an exercise the further generalization of this schema to coordination of functors with different consequent types.
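Here is the sketch of the Law of Exponents and of the polyvalency distinction promised above (my encoding; the lexical entries are toys): toExp and fromExp witness the isomorphism of (37)c, and embed is the monic of (40)a:

```haskell
-- Sketch: functor ambiguity vs. neutralizable polyvalency.

data NP = NP deriving (Show, Eq)
data AP = AP deriving (Show, Eq)
data VP = VP deriving Show

type Pow x = x -> Bool

-- (37)c Law of Exponents: (A => C) x (B => C) ~ (A + B) => C,
-- witnessed in both directions:
toExp :: (a -> c, b -> c) -> (Either a b -> c)
toExp (f, g) = either f g

fromExp :: (Either a b -> c) -> (a -> c, b -> c)
fromExp h = (h . Left, h . Right)

-- (39)b: BE selects the powertype of a coproduct, i.e. a (possibly
-- singleton) set of entities each an NP or an AP:
be :: Pow (Either NP AP) -> VP
be _complement = VP

-- (40)a: the natural embedding Pow(A) + Pow(B) -> Pow(A + B), which
-- sends a set of A's (or of B's) to the corresponding set in A + B:
embed :: Either (Pow a) (Pow b) -> Pow (Either a b)
embed (Left p)  = either p (const False)
embed (Right q) = either (const False) q

main :: IO ()
main = print (be (embed (Left (== NP))))  -- VP
```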
7. Conclusion

(42) a. We propose a framework with an HPSG-like treatment of features and constraints but a categorial-grammar-like treatment of predicate-argument combination.

 b. The proposed framework eliminates numerous foundational problems with the existing HPSG formalism.

 c. The proposed framework is type-theoretical in nature, but by using (Lambek and Scott's elaboration of) Curry's type theory rather than Lambek's syntactic calculus, it addresses Curry's critique of Lambek's failure to distinguish phenostructure from tectostructure.

 d. The syntactic language is a higher-order logic "fibred over" the underlying type-logic provided by the Curry-Howard isomorphism, and can be used for defining subtypes and stating syntactic constraints.

 e. The architecture is more like non-Lambek frameworks (CCG, HPSG, LFG, Montague grammar, etc.) than like Lambek categorial grammar in deriving nonlexical signs by schemata whose syntactic components are (type-schematized) deductions in the syntactic logic, but the grammar itself is not the same thing as the logic.

(43) Various notions of the syntactic logic correspond directly to linguistic notions, e.g.:

  elements of types              syntactic components of signs
  other arrows                   syntactic components of phrasal schemata
  primitive types                types for values of atomic features
  indexed products               complex feature structure types
  projections                    features
  coproducts                     partitions in type hierarchies
  exponential modes              grammatical functions
  elements of powertypes         syntactic components of coordinate structures
  binary products                ambiguity
  intersection                   neutrality/syncretism
  union                          coordination
  antecedent type is             neutralizable polyvalence
    powertype of coproduct
  antecedent type is             non-neutralizable polyvalence
    coproduct of powertypes        (= functor ambiguity)

(44) The foregoing represents only a point of departure in the development of a higher-order grammar. Among the tasks that remain, some of the highest-priority ones are:

 a. determining what further features and type constructors are needed to handle a more robust range of constructions (e.g. there-insertion, raising, unbounded dependency constructions, gapping, comparative ellipsis, etc.) and constraints (e.g. binding principles A and B, island constraints, crossover constraints);

 b. determining the range of necessary phrasal schemata (i.e. developing a theory of constructions);

 c. establishing a syntax-phonology interface (i.e. determining the range of phenogrammatic operations and how to specify them);

 d. deciding among numerous options for the syntax-semantics interface (any of various underspecification schemes, glue semantics, etc.).