Higher-Order Grammar: A Categorical Foundation for Type-Logical
Constraint-Based Grammar*

Carl Pollard
The Ohio State University
FGMOL Conference, Helsinki, August 2001

TALKSCRIPT

1. Introduction

The work I am going to talk about is part of a long-term project whose goal is to develop a formal grammar framework that allows the linguist the freedom to analyze phenomena using either feature-constraint techniques or type-theoretic techniques, or even a combination of the two. For example, one may want an HPSG-style analysis of parasitic gaps and of binding principles A and B, but a categorial-style analysis of syntactic valence and of nonconstituent coordination. In logical terms, the idea is to have a logic for syntax that encompasses both feature logic and type logic. Of course many people, from Uszkoreit and Zeevat et al. in the mid-80's down through more recent investigators like Doerre and Manandhar and Heylen, have had similar ideas, so the burden is on me to show why my proposal for how to do this is worth looking at.

The basic idea behind my proposal is that the right logic has been staring us in the face for a long time, namely higher-order logic along the lines of Church 1940, Henkin 1950, and Gallin 1975. More specifically, I use the elaboration of higher-order logic due to Lambek and Scott 1986. This is a nice logic for formal linguists, because most of them are already familiar with it via Montague, so there is no need to learn special-purpose formalisms like multimodal Lambek calculus or relational speciated reentrant logic. Also the model theory---basically Henkin models like the ones used in Montague semantics---is much more familiar and intuitive than totally well-typed feature structures or the residuated lattice of sets of strings of phonological words. It's pure serendipity that what I'm proposing is an exemplar par excellence of model-theoretic syntax. I swear I never heard this term till I read the program for this conference!

Of course, you can object that higher-order grammar is too expressive. I think worrying about that is a little like refusing to use a programming language that can encode undecidable problems. As Bob Carpenter (1999) has pointed out, there is a marked historical tendency for pristine linguistic formalisms to acquire full Turing power once they get beefed up enough to be of any practical use.

In this talk I will focus on purely syntactic aspects, leaving the interfaces to phonology and semantics for another time. The full paper gives a detailed account of the subject matter of this talk, including its historical and philosophical background and the mathematical formalism, in terms of both logic and category theory, but in 20 minutes I can only hope to convey something of the flavor of the framework, in the hope of enticing you to look at the paper.

This line of work started out as an attempt to rescue HPSG from an array of foundational problems, alas most of them my own damned fault. For those of you familiar with HPSG, a nonexhaustive list of the kinds of problems I've been worried about is given in (1):

(1) Some foundational problems for the HPSG framework

 o Under total well-typing, what values should be given to selectional features that never get saturated?
 o Do lexical rules exist, and if so, how should they be formalized?
 o How can we eliminate spurious ambiguity as to whether isomorphic substructures of a common structure are token-identical?
 o How can the implementation of set values be made simpler?
 o How can the constraint logic and its model theory be made less idiosyncratic and more familiar?
 o What is the best way to handle the distinction between ambiguity and neutrality?
 o What is the best way to handle coordination of unlikes?
 o How should we analyze constructions such as gapping, right node raising, and comparative ellipsis, which can be seen as essentially involving lambda-abstraction in the syntax?
 o How should we analyze constructions such as WH-questions and TOO/ENOUGH constructions, which essentially require lambda-abstraction in the semantics?

Of course I realize that the majority of you are not HPSG practitioners and so aren't overly concerned with repairing its foundational problems, so I won't go into them here. Those of you who are Lambek categorial grammarians doubtless wonder why some linguists aren't. In some ways I think I am typical of formal linguists who admire Lambek categorial grammar for what it can do, but don't find the intuition behind it compelling. For many years I was unable to articulate why this might be the case, but last spring I stumbled across a paper that seemed to me to really hit the nail on the head. The paper I'm referring to, "Some logical aspects of grammatical structure", is by Haskell Curry, and it appeared in 1961, in fact in the same volume as Lambek's celebrated paper "On the calculus of syntactic types." In this paper, Curry introduced the terms PHENOGRAMMATICS (basically, phonology) and TECTOGRAMMATICS (his term for categorial combinatorics), and politely took Lambek to task for failing to distinguish between them.

Curry made his own proposal for handling natural language syntax, focusing on the tectogrammatics. The type theory Curry used was not Lambek's, but in a different tradition dating back to work done in the 1920's, 1930's, and 1940's by Schoenfinkel, Curry, and Church. This type theory, the same one that higher-order logic is an elaboration of, resembles Lambek's in the respect that complex types are generated from primitive types by a binary type constructor. The two kinds of type theory differ, however, in a fundamental way. The difference is best understood in terms of relationships between type theories and categories (in the sense of mathematical category theory) discovered in the 1970's by such mathematicians as D. Scott (1980) and Lambek (1980), building in large part on pioneering work by Lawvere (1964, 1971). In both kinds of type theory, the type constructors for forming so-called functional types are right-adjoint to a product type constructor. A practical consequence of this abstract similarity is that in Curry grammar, as in Lambek grammar, syntactic selection can be handled, roughly speaking, in terms of cancellation. But in Curry-style type theories, the product in question is the categorical product, so that the functional types are actually exponentials; while in Lambek's type theory the product in question is a tensor product. Correspondingly, the categorical models of the two kinds of type theory are different: for Curry-style type theories they are essentially the same thing as the models for the typed lambda calculi, namely cartesian closed categories (Lawvere 1964). By contrast, the categorical models of Lambek-style type theory are monoidal closed bicartesian (but not cartesian closed) categories, which are the nonsymmetric generalization of Troelstra's (1992) intuitionistic linear categories.
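To make the cartesian side of this difference concrete, here is a minimal sketch (mine, not part of the talk) in Haskell, whose types form, roughly speaking, a cartesian closed category. The adjunction behind Curry-style functional types says that arrows C x A -> B correspond one-to-one to arrows C -> (A => B), and in Haskell the correspondence is witnessed by the standard functions curry and uncurry:

```haskell
-- Sketch: in a cartesian closed category, the exponential (=>) is
-- right-adjoint to the product. In Haskell the product is (,), the
-- exponential is (->), and the two directions of the isomorphism
-- Hom(C x A, B) ~ Hom(C, A => B) are curry and uncurry.

toExp :: ((c, a) -> b) -> (c -> (a -> b))
toExp = curry

fromExp :: (c -> (a -> b)) -> ((c, a) -> b)
fromExp = uncurry

-- The two maps are mutually inverse:
--   toExp (fromExp g) = g   and   fromExp (toExp f) = f

main :: IO ()
main = do
  let f :: (Int, Int) -> Int
      f (x, y) = x + y
  print (toExp f 2 3)               -- 5
  print (fromExp (toExp f) (2, 3))  -- 5, round trip
```

No such set-of-functions reading is available for Lambek's tensor and its residuals, which is the point of the contrast drawn above.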
In terms of logic and algebra, the difference can be understood by limiting attention to the categorical models which are also preorders. (In fact, in Lambek categorial grammar these are the only ones of interest, since there all deductions from one fixed type to another are considered equivalent.) For Curry type theory, these are the heyting semilattices, which correspond to positive (conjunctive-implicative) intuitionistic propositional logic; while for Lambek type theory they are residuated lattices, which correspond to the multiplicative-additive fragment of noncommutative intuitionistic propositional linear logic. These relationships are summarized in (2), with the last column anticipating my own proposal.

(2) Comparison of Type-Theoretical Bases of Grammar Formalisms

              Lambek                Curry                 This Proposal

  Logics      noncommutative        conjunctive-          higher-order
              multiplicative-       implicative           two-valued
              additive              intuitionistic        classical logic
              intuitionistic        propositional logic
              propositional
              linear logic

  Algebras    residuated            heyting               heyting algebras
              lattices              semilattices          with additional
                                                          structure

  Categories  monoidal-closed       cartesian-closed      two-valued
              bicartesian           categories            boolean topoi
              categories

The connections with category theory aren't necessary for an understanding of the main linguistic ideas of this paper, though they are essential for a full understanding of the mathematical underpinnings. For present purposes, though, it is important to be aware that Curry-style type theories have set-theoretic models of the familiar kind, in which types are modelled by sets and functional types are modelled as sets of functions: these are the so-called Henkin models. But this is not the case with Lambek type theory. However much we may speak informally of "functor" categories in Lambek categorial grammar, model-theoretically they are not really functions but rather residuals of sets of strings. This is really at the heart of Curry's objection to Lambek-style syntax, namely that it mixes the syntax up with the phonology (this is made more precise in the full paper). By contrast, in Curry-style grammar we can literally model syntactic entities of nonprimitive types set-theoretically as functions, or, in more general categorical models, as elements of exponential types.

The full paper gives a detailed logical and categorical comparison of Lambek's syntactic calculus vs. Lambek and Scott's higher-order logic. Since most of you are familiar with Lambek categorial grammar, I'll not say much about it here other than to note differences from higher-order grammar in passing. On the other hand, the similarities will be pretty obvious. For easy reference, a comparison of the type constructors is given in (3).

(3) The type constructors in three type theories

                  Lambek     Curry     This Proposal

  Categoricals
                   _
    Product       | |        x         x

    Right
    Adjoint       none       =>        =>

    Unit          _|_        1         1

    Coproduct     |_|        none      +
                  ___
    Counit         |         none      0

    Subobject     none       none      true: 1 -> _bool_
    Classifier                         (_bool_ = 1 + 1)

  Tensors
    Product       o          none      none

    Right
    Adjoints      / and \    none      none
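As a rough programming-language gloss of the third column (my own illustration, not from the paper), the constructors of the proposal line up with familiar Haskell types:

```haskell
-- Sketch: the "This Proposal" column of (3), rendered as Haskell
-- types, which form (ignoring nontermination) a bicartesian closed
-- category.

import Data.Void (Void, absurd)

type Product a b   = (a, b)      -- x  : categorical product
type Exp a b       = a -> b      -- => : right adjoint to product
type Unit          = ()          -- 1  : 0-ary product
type Coproduct a b = Either a b  -- +  : coproduct
type Counit        = Void        -- 0  : 0-ary coproduct

-- _bool_ = 1 + 1, with true and false as the two injections:
type B = Either () ()

true, false :: B
true  = Left ()
false = Right ()

-- The unique arrow out of the empty type is Haskell's absurd:
fromZero :: Counit -> a
fromZero = absurd

main :: IO ()
main = print (true, false)
```

The subobject classifier has no direct Haskell analog; it is what upgrades the bicartesian closed category to a topos, as discussed in section 3 below.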
Now I mentioned that Curry limited his attention to tectogrammatics. The best-known example of Curry-style tectogrammatics is the system of syntactic categories employed in Montague's PTQ, which was based on the three primitive types t, CN, and IV. Note that I'm speaking here of Montague's syntactic categories, not the types of the intensional logic IL! In the type theory for Montague's syntax, t just counts as another primitive type on a par with IV and CN, not as a logical type of truth-values. In fact, as we will see below, one of the key ingredients of higher-order grammar is the addition of a logical type of truth-values (categorically, a subobject classifier) to the type theory for the syntax. Essentially the same move was proposed in the mid-1990's in a series of papers by Drew Moshier, which have been a major source of inspiration for the present proposal. However, Moshier's approach differs from Curry's and from the one proposed here in not employing exponential types. As a consequence, Moshier's setup cannot treat syntactic selection in terms of cancellation, but instead retains certain constraints that have been HPSG mainstays, such as the Head Feature Principle and the Subcategorization Principle.

As for phenogrammatics, Curry apparently took no interest in it. At the opposite extreme, Montague grammar is notorious for the evident lack of constraint with respect to what counts as a permissible phenogrammatic operation, and as a result Montague's syntax was never taken very seriously by linguists. In fact, it seems fair to say that not even Montague took it very seriously, since for him the real action was in the semantics. Montague's cavalier attitude toward phenogrammatics had the unfortunate side effect of discrediting the use of Curry-style type theory in syntax, and in subsequent categorial grammar, Curry's approach was almost eclipsed by Lambek's.

Almost, but not completely. Among post-Montague linguists, I think the first to take Curry's critique seriously was David Dowty in his 1989 paper at the Tilburg Conference on Discontinuous Constituency. Some other proposals that maintain Curry's distinction between tectogrammatics and phenogrammatics are those due to Mike Reape and to Andreas Kathol. And I think the distinction in LFG between f-structure and c-structure is essentially the same. However, none of this work uses Curry's type theory. The present proposal seeks to vindicate Curry by recasting syntax within an elaborated version of his type theory. Yet Lambek is vindicated too, since the version of Curry-style type theory we use is Lambek and Scott's (1986) higher-order intuitionistic logic, with the addition of the boolean axiom.

2. HOG Basics

In higher-order grammar, we're concerned with specifying a set of signs, where a sign is an ordered triple such as the lexical sign in (4)a:

(4) Architecture of Higher-Order Grammar

 a. Signs are ordered triples, e.g. the lexical sign <pIg, PIG, pig'>, where
      pIg is a string of phonological words
      PIG is a closed term of the syntactic logic
      pig' is a closed term of the semantic logic

 b. The syntactic logic is that of Lambek and Scott 1986 plus the boolean axiom and two-valuedness, with primitive types being the types in which atomic morphosyntactic features take their values.

 c. The semantic logic is similar to Lambek and Scott's higher-order intuitionistic logic plus the boolean axiom and minus the axiom of extensionality of entailment, with primitive types being e and p. (cf. Lappin and Pollard 1999, 2000, Fox and Lappin 2001)

 d. Other signs are deduced from the lexical signs by labelled schemata whose instances have as their syntactic component a deduction from one type to another. The other components of the schemata constitute the syntax-phonology and syntax-semantics interfaces.
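As a concrete gloss of (4)a (the names and stand-in types here are mine, not the paper's formalism), a sign can be pictured as a record with three components:

```haskell
-- Sketch: a sign as an ordered triple of a phonological string, a
-- closed term of the syntactic logic, and a closed term of the
-- semantic logic. Syn and Sem are toy stand-ins for the two logics.

type Phon = [String]            -- a string of phonological words

data Syn = PIG  deriving Show   -- stand-in for closed syntactic terms
data Sem = Pig' deriving Show   -- stand-in for closed semantic terms

data Sign = Sign { phon :: Phon, syn :: Syn, sem :: Sem }
  deriving Show

-- The lexical sign <pIg, PIG, pig'> of (4)a:
pigSign :: Sign
pigSign = Sign { phon = ["pig"], syn = PIG, sem = Pig' }

main :: IO ()
main = print pigSign
```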
(5) Interfaces

 a. The syntax-phonology (= tectogrammar-phenogrammar) interface specifies a relation between word-strings and terms of the syntactic logic; neither is derived from the other. [By comparison, in Lambek categorial grammar this interface is a homomorphism of residuated lattices, viz. the "frame semantics." This in essence is what Curry objected to.]

 b. The syntax-semantics interface, however specified, e.g. via an underspecification scheme (Richter and Sailer 1999) or a glue semantics (Asudeh 2001), will specify a relation between word-string/syntactic-term pairs and semantic terms, where both kinds of terms are in (different) higher-order logics. [By comparison, semantic interpretation in Lambek categorial grammar is a functor from a monoidal closed category to a topos (Lambek 1988), despite the misleading slogan (Dowty 1997) that in categorial grammar syntactic form = logical form.]

 c. In both the proposed architecture and in Lambek categorial grammar, there is a pronounced asymmetry between phonology and semantics.

(6) The underlying deductive system of HOG consists of a set of types, and for any two types A and B, a set of arrows (deductions, morphisms) from A to B.

(7) We require that the deductive system be a CATEGORY, i.e.

 a. for each type A there is a distinguished IDENTITY arrow

      id_A: A -> A;

 b. for any two arrows f: A -> B, g: B -> C there is a distinguished COMPOSITION arrow

      g o f: A -> C

    such that the laws of identity hold:

      id_B o f = f = f o id_A

    and if moreover h: C -> D, then the associative law of composition holds:

      h o (g o f) = (h o g) o f: A -> D

 c. In set-theoretic interpretations of the syntactic logic (Henkin models), the types are sets and the arrows from A to B are a subset of the functions from A to B.

 d. In the Curry-Howard isomorphism, identities are deductions of a formula from itself, and composition is concatenation of deductions.

(8) Primitive types correspond to sorts for values of atomic morphosyntactic features in HPSG, e.g. _case_, _agr_, _vform_.

In most of the remainder of this talk, I'll survey the type constructors of the syntactic logic and illustrate some of their uses. For easy reference, the table in (9) summarizes these constructors in terms of the Curry-Howard isomorphism and their interpretation in the Henkin models.

(9) Type Constructors of Two-Valued Classical Higher-Order Logic

                Curry-Howard        Interpretation in
  Constructor   Isomorphism         Henkin Models

  x             conjunction         cartesian product
  =>            implication         subset of the functions from
                                    antecedent to consequent
  1             tautology           1 (i.e. {0})
  +             disjunction         disjoint union
  0             contradiction       0 (empty set)
  _bool_        two-valued logic    2 (i.e. {0,1})

(10) The Product Constructor X

 a. There are arbitrary finite products, i.e. for any finite indexing set I and any indexed family of types <A_i>_i_in_I there is an indexed product type

      X_i_in_I A_i

    In the binary case, i.e. I = {0, 1}, we write A_0 x A_1.

 b. In Henkin models, an indexed product is a cartesian product, the set of functions f with domain I such that for each i in I, f(i) is in A_i. We can think of such functions as feature structures, as explained below.

 c. For each index i in I there is a PROJECTION arrow

      p_i: X_i_in_I A_i -> A_i

    Thinking of indexed products as feature structure types, the projections are the features (but usually we abuse notation and write i rather than p_i as the name of the feature).

 d. In the Curry-Howard isomorphism, projections are deductions from a conjunction to a conjunct.
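To anticipate the feature-structure reading developed in (11)-(15), here is a minimal sketch (my own encoding, not the paper's formalism) in which a finite indexed product is a record type and the projections are the field accessors:

```haskell
-- Sketch: _noun_ = (CASE _case_, AGR _agr_) as an indexed product
-- with I = {CASE, AGR}; the projections p_CASE and p_AGR are the
-- record field accessors, i.e. the features.

data Case = Nom | Acc | Gen | Dat deriving (Show, Eq)
data Agr  = ThirdFemSg | ThirdPl  deriving (Show, Eq)

data Noun = Noun { caseF :: Case, agrF :: Agr } deriving Show

-- A feature structure is an element of the product type, i.e. an
-- indexed tuple of arrows from 1; in Haskell, simply a value:
she :: Noun
she = Noun { caseF = Nom, agrF = ThirdFemSg }

main :: IO ()
main = print (caseF she, agrF she)   -- (Nom, ThirdFemSg)
```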
(11) a. We model nonatomic feature structure types as indexed products.

 b. Example: Parts of Speech. We model parts of speech as indexed products of morphosyntactic feature types, e.g.

    i.  _noun_ = X_i_in_I A_i where I = {CASE, AGR};
          A_CASE = _case_; A_AGR = _agr_.

    ii. _verb_ = X_j_in_J B_j where J = {VFORM, AUX, INV};
          B_VFORM = _vform_; B_AUX = _bool_; B_INV = _bool_.

 c. We can use AVM notation for indexed products, e.g.

      _noun_ = (CASE _case_, AGR _agr_)
      _verb_ = (VFORM _vform_, AUX _bool_, INV _bool_)

    with types as the "values".

(12) Features

 We model features by the projection arrows of indexed products onto their factors, e.g.

      p_CASE: _noun_ -> _case_

 By a mild abuse of notation, we usually just write CASE instead of p_CASE.

(13) Indexed Tuples

 a. For any finite indexed family of arrows <f_i: A -> B_i>_i_in_I there is an arrow

      f = (f_i)_i_in_I: A -> X_i_in_I B_i

    called the indexed tuple of the family, such that for each i in I, the following diagram commutes (where P = X_i_in_I B_i):

          A
          | .
          |   .  f_i            p_i o f = f_i
        f |     .
          |       .
         \|/      _|
          P ------> B_i
              p_i

 b. As explained below, feature structures will be modelled by indexed tuples.

 c. In the binary case, we have the familiar ordered pairs:

                   A
                 . | .
               .   |   .            p_0 o (f,g) = f
           f .     |     . g
           .       |(f,g)  .        p_1 o (f,g) = g
         |_       \|/       _|
         B <---- B x C ----> C
             p_0        p_1

 d. In the Curry-Howard isomorphism, an ordered pair is the deduction of a conjunction from its conjuncts.

 e. As explained below, ambiguity is modelled by ordered pairs.

(14) 0-ary Product

 a. The product of the empty family of types is the unit type 1.

 b. For every type A there is a unique arrow

      []_A : A -> 1

 c. An element f of type A is defined to be an arrow

      f: 1 -> A

    In a Henkin model these are just functions from {0} to A, which are in one-to-one correspondence with the members of A.

 d. Examples:

      nom: 1 -> _case_
      fin: 1 -> _vform_
      3fs: 1 -> _agr_

(15) Feature structures

 a. Feature structures are modelled as elements of indexed product types, i.e. as indexed tuples of arrows with source type 1.

 b. We can notate these using AVM notation, with elements (rather than types) in the "value" positions, e.g. SHE = (CASE nom, AGR 3fs), the syntactic component of the pronoun:

                     1
                   . | .
                 .   |   .            CASE o SHE = nom
           nom .     |     . 3fs
             .       |SHE    .        AGR o SHE = 3fs
           |_       \|/       _|
       _case_ <--- _noun_ ---> _agr_
               CASE        AGR

(16) The Coproduct Constructor +

 a. There are arbitrary finite coproducts, i.e. for any finite indexing set I and any indexed family of types <A_i>_i_in_I there is an indexed coproduct type

      +_i_in_I A_i

    In the binary case, i.e. I = {0, 1}, we write A_0 + A_1.

 b. In Henkin models, an indexed coproduct is a disjoint union.

 c. For each index i in I there is an INJECTION arrow

      i_i: A_i -> +_i_in_I A_i

 d. In the Curry-Howard isomorphism, injections are deductions to a disjunction from one of its disjuncts.

 e. Indexed cotuples are the dual notion to indexed tuples. The general case is omitted here; the binary case gives ordered copairs:

                    A
                   /|\
                .   |   .           [f,g] o i_0 = f
            f .     |     . g
            .       |[f,g]  .       [f,g] o i_1 = g
          .         |         .
          B ----> B + C <---- C
              i_0        i_1

 f. In the Curry-Howard isomorphism, an ordered copair is a proof by cases (disjunction elimination).

(17) a. HPSG type partitions, and by extension, type hierarchies, are modelled as coproducts.

 b. For example, if _noun_ and _verb_ were the only parts of speech, we could model the HPSG type _head_ as

      _head_ = _noun_ + _verb_

(18) a. The type _bool_ is defined as the coproduct

      _bool_ = 1 + 1

    with the injections being

          true          false
        1 ----> _bool_ <---- 1

 b. _bool_ provides values for binary features; by convention true and false are then written + and - respectively.
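A minimal sketch (again my own encoding) of (16)-(17): binary coproducts are Haskell's Either, the injections are Left and Right, and the ordered copair [f,g] is the standard combinator either:

```haskell
-- Sketch: a type partition modelled as a coproduct, and copairing as
-- case analysis.

data Noun = Noun deriving Show
data Verb = Verb deriving Show

-- (17)b: if _noun_ and _verb_ were the only parts of speech,
-- _head_ = _noun_ + _verb_:
type Head = Either Noun Verb

-- Copairing: given f: B -> A and g: C -> A, the copair [f,g] is
-- 'either f g': it satisfies [f,g] o i_0 = f and [f,g] o i_1 = g.
describe :: Head -> String
describe = either (\Noun -> "a noun") (\Verb -> "a verb")

main :: IO ()
main = mapM_ (putStrLn . describe) [Left Noun, Right Verb]
```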
(19) 0-ary Coproduct

 a. The coproduct of the empty family of types is the counit 0.

 b. For every type A there is a unique arrow

      <>_A : 0 -> A

(20) The exponential constructor =>

 a. For any three types A, B, and C there is a natural isomorphism from the set of arrows from C x A to B to the set of arrows from C to (A => B). Thus we can curry and uncurry:

            C x A
              |     .
              |       .              g = curry(f)
     g x id_A |         . f
              |           .          f = uncurry(g)
             \|/           _|
        (A => B) x A ------> B       f(c,a) = (g(c))(a)
                      eval

 b. An important special case of this is that any arrow f from A to B can be replaced by an element of A => B, called the (LAWVERE) NAME of f, here notated f':

            1 x A
              |     .
              |       .  f o p_1     f' = curry(f o p_1)
    f' x id_A |         .
              |           .          f = eval o (f' o []_A, id_A)
             \|/           _|
        (A => B) x A ------> B       (where p_1: 1 x A -> A)
                      eval

 c. Notational convention: If h is an element of A => B and a is an element of A, we write h(a) for

      eval o (h,a): 1 -> B

    Thus f'(a) = eval o (f',a) = f o a, i.e. so-called "function application" of f' to a is the composition of f with a, where f is the arrow named by f'.

(21) We use exponentials to define the analogs of HPSG's unsaturated types or categorial grammar's slash categories. For example (a programming-language sketch is given below, following (25)):

 a. Define S and NP to be _verb_ and _noun_ respectively.

 b. Define VP to be NP => S
           TVP to be NP => VP
    (This will be revised below to allow for coordinate arguments.)

 c. In the Henkin models such types denote sets of functions. This has nothing to do with strings or residuation.

 d. A possible elaboration is to model grammatical functions as different "modes" of exponential (analogous to different modes of slash in multimodal categorial grammar), e.g. define

      VP to be NP =>_SUBJ S
      N' to be DetP =>_SPR NP

(22) The syntactic language

 a. The deductive system thus far is a bicartesian closed category.

 b. Thus it is naturally associated with a typed lambda calculus (with pairing and copairing) with variables of all types, where the closed terms of type A are (lambda-equivalence classes of) elements of type A. In particular, the name of any arrow can now be thought of as a closed term.

(23) The syntactic component of an unsaturated sign is modelled as a lambda term abstracting on the argument type, e.g.

      CAT = lambda x:DetP . (AGR 3s, CASE nom_acc)

    is an element of type DetP => NP. Thus there is no problem with instantiating the features of unrealized arguments.

(24) The syntactic components of phrase structure rules are modelled as arrows, e.g. unary rules:

 a. a: VP_prp -> (PossP => NP) takes a present-participial VP to a nominal gerundive;

 b. b: (NP => S) -> (N' => N') takes a gappy S to a THAT-less relative clause;

 c. c: N'_pl -> NP_pl takes a plural N' to a plural NP.

 (Note: the subscripted types are defined subtypes, described below.)

(25) Value types for HPSG's set-valued features are defined as powertypes (exponentials into _bool_), i.e.

      Pow(A) = (A => _bool_)
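Before moving on to the syntactic logic, here is the promised sketch of (20)-(21) (my encoding, with toy one-element types standing in for _noun_ and _verb_): exponentials become ordinary function types, so unsaturated signs are literally functions, and curry/uncurry witness the isomorphism of (20)a:

```haskell
-- Sketch: slash categories as function types; syntactic selection as
-- cancellation, i.e. function application.

data S  = S  deriving Show   -- stand-in for _verb_ (see (21)a)
data NP = NP deriving Show   -- stand-in for _noun_

-- (21)b: VP = NP => S, TVP = NP => VP:
type VP  = NP -> S
type TVP = NP -> VP

-- A transitive verb as an element of TVP; applying it to two NP
-- arguments "cancels" them, yielding an S:
devours :: TVP
devours _object _subject = S

-- (20)a: uncurrying moves between C -> (A => B) and (C x A) -> B:
devoursUncurried :: (NP, NP) -> S
devoursUncurried = uncurry devours

main :: IO ()
main = do
  print (devours NP NP)            -- S
  print (devoursUncurried (NP, NP))  -- S
```

In a Henkin model these function types really do denote sets of functions, which is the point of (21)c.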
3. The Syntactic Logic

(26) a. The fact that true: 1 -> _bool_ is a subobject classifier means that our category is not just bicartesian closed but in fact a two-valued boolean topos.

 b. It follows that every monic arrow (in Henkin models, injective function) f: B >-> A has a characteristic arrow (arrow from A to _bool_) chi. (Categorically, that means the diagram

                true
           1 ------> _bool_
          /|\         /|\
           |           |
      []_B |           | chi
           |           |
           B >-------> A
                f

    is a pullback.)

 c. From this it follows that all the usual logical connectives are definable, and our syntactic language becomes a higher-order logic.

 d. By the Axiom of Comprehension, for any type A and any term phi with at most the variable x of type A free, there exists a subtype of A, i.e. a monic f: B >-> A, written

      B = {x in A: phi}

    the name of whose characteristic arrow is lambda x:A . phi:

                true
           1 ------> _bool_
          /|\         /|\
           |           |          where chi' =
      []_B |           | chi      lambda x:A . phi
           |           |
           B >-------> A
                f

(27) Partially specified feature structures (as in Pollard and Sag 1987) are modelled as defined subtypes, e.g.

      NP_acc = {x in NP: CASE'(x) = acc}

(28) The syntactic logic can be used in place of feature logics or fibered categorial languages to impose syntactic constraints. For example, in English the only verbs that invert are finite auxiliaries. This means we should have defined _verb_ not as

      (VFORM _vform_, AUX _bool_, INV _bool_)

    but rather as

      {x in (VFORM _vform_, AUX _bool_, INV _bool_):
         INV'(x) -> (AUX'(x) & VFORM'(x) = fin)}

(29) a. For any A, the elements of Pow(A) form a boolean algebra. Example: if f, g: 1 -> Pow(A), then their union is defined by

      f union_A g =def lambda x:A . (f'(x) \/ g'(x))

 b. Viewing the elements of Pow(A) as the names of characteristic arrows of subtypes of A, we can transfer the boolean structure to the set of subtypes of A.

 c. In the Henkin models the boolean operations have their usual meanings in Pow(A).

4. Syncretism

(30) Syncretism, also called feature value neutrality or indeterminacy, consists in the simultaneous satisfaction of seemingly inconsistent morphosyntactic selectional requirements by a single form.

(31) Examples of case syncretism

 a. Er findet und hilft FRAUEN/*MAENNER/*KINDERN. (coordination)
 b. WAS/*WER/*WEN du mir gegeben hast, ist praechtig. (free relative)
 c. This is someone WHO/*WHOM even close friends of __ believe __ must be watched like a hawk. (parasitic gap)

(32) Various accounts have been given, e.g. Bayer and Johnson 1995, Dalrymple and Kaplan 2000, Levine et al. 2001, Levy 2001, Daniels 2001. Our account abstracts out in categorical form what is common to the last four of these, and simulates the first, using the (truly boolean) intersection as in (29) instead of the (putatively but not really boolean) categorical product (additive meet) of Lambek categorial grammar (which is interpreted as set intersection in the frame semantics).

(33) a. Instead of treating (German) _case_ as a primitive type with four members

      nom, acc, gen, dat: 1 -> _case_

    we treat it as the powertype

      _case_ = Pow(_bcase_) = (_bcase_ => _bool_)

    of a primitive type of "basic cases"

      bnom, bacc, bgen, bdat: 1 -> _bcase_

 b. The singleton elements of _case_ correspond to Levine et al.'s (and Daniels') "pure cases", e.g.

      pnom = lambda x:_bcase_ . (x = bnom)

 c. The nonempty nonsingleton members of _case_ are syncretic values, e.g.

      acc_dat = lambda x:_bcase_ . ((x = bacc) \/ (x = bdat))

 d. A verb that governs (say) accusative will accept not just a pure accusative but also any syncretic NP whose case value contains bacc. Thus we must revise the definition of NP_acc from the one given in (27) to

      NP_acc = {x in NP: CASE'(x)'(bacc)}

    In a Henkin model this is just the subset of the set NP each of whose members' CASE value contains the basic case bacc.

 e. If two basic cases (say bnom and bacc in Polish) do not syncretize, we can either

    i. treat it as an accident by simply having no lexical signs that syncretize bnom and bacc; or

    ii. treat it as systematic by adding to the grammar the constraint

      forall x:NP . not (CASE'(x)'(bnom) & CASE'(x)'(bacc))

    (In this case, the type NP_nom_acc is a type other than 0 with no elements, so that the grammar has no Henkin model. This is one of many reasons to consider not insisting on modelling syntax set-theoretically.)
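A minimal sketch (my encoding) of the analysis in (33), with _case_ as a characteristic function on basic cases:

```haskell
-- Sketch: _case_ = Pow(_bcase_), i.e. a characteristic function
-- _bcase_ -> _bool_; pure cases are singletons, syncretic values are
-- larger sets, and case government tests for membership of a basic case.

data BCase = BNom | BAcc | BGen | BDat deriving (Show, Eq)

type Case = BCase -> Bool   -- _case_ = Pow(_bcase_)

-- Pure cases (singletons), as in (33)b:
pnom, pacc :: Case
pnom = (== BNom)
pacc = (== BAcc)

-- A syncretic accusative/dative value, as in (33)c:
accDat :: Case
accDat x = x == BAcc || x == BDat

-- (33)d: a verb governing accusative accepts any case value that
-- contains bacc, pure or syncretic:
governsAcc :: Case -> Bool
governsAcc c = c BAcc

main :: IO ()
main = print (map governsAcc [pacc, accDat, pnom])  -- [True,True,False]
```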
(34) a. The proposed analysis of syncretism has as consequences such basic relationships (in English) as

      NP_acc = NP_pacc union_NP NP_nom_acc
      NP_nom_acc = NP_nom intersect_NP NP_acc

 b. However, unlike the Bayer-Johnson approach, it does not suffer from the conceptual problem that syncretism cannot be formally distinguished from "property conjunction", e.g. being both nominative and third-plural-feminine. The difference is that case syncretism imposes two constraints on a single feature:

      NP_nom_acc = {x in NP: CASE'(x)'(bnom) & CASE'(x)'(bacc)}

    as opposed to constraining two features:

      NP_nom_3pf = {x in NP: CASE'(x)'(bnom) & AGR'(x) = 3pf}

5. Coordination

(35) Coordination

 a. The fundamental pretheoretical generalization about coordination is Wasow's Generalization (Pullum and Zwicky 1986) that, leaving aside cases of "principled resolution" (Corbett 1983), the conjuncts of a coordinate structure are subject to whatever constraints are imposed on the coordinate structure as a whole.

 b. HPSG accounts (prior to Levy 2001 and Daniels 2001) are unable to account for the difference between neutralizable polyvalency (a single sign satisfying mutually inconsistent valence requirements by taking a coordinate argument) and "functor ambiguity" (homophonous signs with inconsistent valence requirements):

    i. Kim is a Republican and proud of it.
    ii. *You can tuna and say that again.

 c. Our account preserves in abstract categorical form the fundamental properties of Dalrymple and Kaplan's (2000) account, but:

    i. presents no problem about how to formulate or interpret the property distribution schema (there is none); and

    ii. does not require any stipulation about what counts as a legitimate "property" (beyond being definable in the syntactic logic).

 d. Our account also simulates the account of Bayer and Johnson 1995, using the (truly boolean) union operation instead of the (putatively but not really boolean) categorical coproduct (additive join) of Lambek categorial grammar (which is interpreted as set union in the frame semantics).

 e. Our account shares with those of Bayer and Johnson 1995, Levy 2001, and Daniels 2001 the view that neutralization and coordination are algebraic duals.

(36) Proposed Analysis of Coordination

 a. If the conjuncts of a (binary) coordinate structure have syntactic components a: 1 -> A and b: 1 -> B, and the injections into the coproduct are

           i           j
        A ---> A + B <--- B

    then the syntactic component of the coordinate structure is the doubleton element of Pow(A+B)

      lambda x:A+B . ((x = i o a) \/ (x = j o b))

 b. In a Henkin model this is just the set {a,b}, e.g. A REPUBLICAN AND PROUD OF IT has syntactic component {NP, AP}.

 c. Thus the binary coordination schema has as its syntactic component the (type-schematized) deduction

      lambda (y,z):A x B . lambda x:A+B . ((x = i o y) \/ (x = j o z))

 d. If a and b are identified with their respective singletons in A + B, then coordination amounts to union in A + B.
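A minimal sketch (my encoding) of (36): the syntactic component of a binary coordinate structure is the characteristic function of the doubleton {i o a, j o b} in Pow(A + B):

```haskell
-- Sketch: coordination of unlikes as a doubleton subset of a coproduct.

data NP = NP deriving (Show, Eq)
data AP = AP deriving (Show, Eq)

-- Pow(X) as characteristic functions:
type Pow x = x -> Bool

-- The binary coordination schema of (36)c, with Left and Right as the
-- injections i and j:
coord :: (Eq a, Eq b) => a -> b -> Pow (Either a b)
coord a b x = x == Left a || x == Right b

-- "a Republican and proud of it": the doubleton {NP, AP} in Pow(NP + AP):
republicanAndProud :: Pow (Either NP AP)
republicanAndProud = coord NP AP

main :: IO ()
main = print ( republicanAndProud (Left NP)
             , republicanAndProud (Right AP) )  -- (True,True)
```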
6. Ambiguity and syntactic selection

(37) Ambiguity

 a. We treat ambiguous signs as pairs, e.g.

      PIGS: 1 -> N_3p x NP_3p

    Thus ambiguity is product, while neutralization (of which syncretism is a special case) is intersection. No analog of this analysis is available in Lambek categorial grammar, because in the frame semantics there is no intersection operation distinct from the categorical product (meet).

 b. The same applies to ambiguous unsaturated signs (as in (35)b.ii above):

      CAN: 1 -> (NP => VP) x (VP => VP)

 c. The Law of Exponents holds in any bicartesian closed category: for any three types A, B, C, there is a natural isomorphism

      n: (A => C) x (B => C) -> ((A + B) => C)

    and we could just as well have said:

      CAN: 1 -> ((NP + VP) => VP)

    that is, functor ambiguity is equivalent to disjunctive subcategorization (being of an exponential type whose antecedent type is a coproduct). (A programming-language sketch of this isomorphism is given at the end of this section.)

(38) Coordinate structures as syntactic arguments

 a. The above treatment of functor types (e.g. VP = NP => S) does not permit coordinate structures to be arguments.

 b. We therefore revise the treatment of functor types as follows:

      VP = Pow(NP) => S
      TVP = Pow(NP) => VP
      etc.

    That is, predicates syntactically select ONLY coordinate arguments; noncoordinate arguments are treated as singletons.

 c. The same thing applies to ambiguous functors (non-neutralizable polyvalency):

      CAN: 1 -> ((Pow(NP) + Pow(VP)) => VP)

    Note that the antecedent type here is a coproduct of powertypes.

(39) Neutralizable Polyvalency

 a. By contrast with functor ambiguity, where the antecedent type is a coproduct of powertypes, in cases of neutralizable polyvalency the antecedent type is the powertype of a coproduct.

 b. Example: making the simplifying assumption that English BE selects only NP and AP (but not PP, VP_pas, or VP_prp), we specify the syntactic component of BE as:

      BE: 1 -> (Pow(NP + AP) => VP)

    Therefore BE takes as complement a (possibly singleton) set of syntactic entities each of which is either an NP or an AP. Thus the effect of Wasow's Generalization (in this case, the distribution of the disjunctive requirement onto each of the conjuncts) is achieved type-theoretically, with no need for a Dalrymple-Kaplan style distribution schema.

(40) Neutralizably polyvalent functors are also ambiguous.

 a. In any topos, as in set theory, for any two types A and B there is a natural embedding (i.e. monic)

      Pow(A) + Pow(B) -> Pow(A + B)

 b. By contravariance of the exponential relative to the antecedent, for any three types A, B, and C there is a natural embedding

      (Pow(A + B) => C) -> ((Pow(A) + Pow(B)) => C)

 c. Thus neutralizable polyvalency implies functor ambiguity (but not the converse).

(41) Conjoined functors with differing (but overlapping) valences

 a. Example (Dyla 1984): A coordination of two Polish verbs that govern accusative and genitive respectively can take an accusative-genitive syncretic NP (e.g. KOGO) as its object:

      Kogo  Janek  lubi   a    Jerzy  nienawidzi?
      who          likes  and         hates
      'Who does Janek like and Jerzy hate?'

 b. We model the syntactic components of the verbs as:

      l: 1 -> (NP_acc => VP)
      n: 1 -> (NP_gen => VP)

 c. As discussed in (34), we have the following relationship among certain subtypes of NP:

      NP_acc_gen = NP_acc intersect_NP NP_gen

    Let a and b be the embeddings of NP_acc_gen into NP_acc and NP_gen respectively:

             a                  b
      NP_acc <---< NP_acc_gen >---> NP_gen

 d. The syntactic component of the coordinate functor is

      lambda x:NP_acc_gen . lambda y:VP . ((y = l(a'(x))) \/ (y = n(b'(x))))

    This takes a syncretic object and returns a doubleton set of VP's.

 e. More generally, the schema for binary coordination of functors has as its syntactic component the type-schematized deduction

      lambda (u,v):(A => C) x (B => C) . lambda x:(A intersect B) . lambda y:C .
        ((y = u(a'(x))) \/ (y = v(b'(x))))

    where a and b are the embeddings of A intersect B into A and B respectively.

 f. We leave as an exercise the further generalization of this schema to coordination of functors with different consequent types.
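Here is the sketch of the Law of Exponents and of the polyvalency distinction promised above (my encoding; the lexical entries are toys): toExp and fromExp witness the isomorphism of (37)c, and embed is the monic of (40)a:

```haskell
-- Sketch: functor ambiguity vs. neutralizable polyvalency.

data NP = NP deriving (Show, Eq)
data AP = AP deriving (Show, Eq)
data VP = VP deriving Show

type Pow x = x -> Bool

-- (37)c Law of Exponents: (A => C) x (B => C) ~ (A + B) => C,
-- witnessed in both directions:
toExp :: (a -> c, b -> c) -> (Either a b -> c)
toExp (f, g) = either f g

fromExp :: (Either a b -> c) -> (a -> c, b -> c)
fromExp h = (h . Left, h . Right)

-- (39)b: BE selects the powertype of a coproduct, i.e. a (possibly
-- singleton) set of entities each an NP or an AP:
be :: Pow (Either NP AP) -> VP
be _complement = VP

-- (40)a: the natural embedding Pow(A) + Pow(B) -> Pow(A + B), which
-- sends a set of A's (or of B's) to the corresponding set in A + B:
embed :: Either (Pow a) (Pow b) -> Pow (Either a b)
embed (Left p)  = either p (const False)
embed (Right q) = either (const False) q

main :: IO ()
main = print (be (embed (Left (== NP))))  -- VP
```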
7. Conclusion

(42) a. We propose a framework with an HPSG-like treatment of features and constraints but a categorial-grammar-like treatment of predicate-argument combination.

 b. The proposed framework eliminates numerous foundational problems with the existing HPSG formalism.

 c. The proposed framework is type-theoretical in nature, but by using (Lambek and Scott's elaboration of) Curry's type theory rather than Lambek's syntactic calculus, it addresses Curry's critique of Lambek's failure to distinguish phenostructure from tectostructure.

 d. The syntactic language is a higher-order logic "fibred over" the underlying type-logic provided by the Curry-Howard isomorphism, and can be used for defining subtypes and stating syntactic constraints.

 e. The architecture is more like non-Lambek frameworks (CCG, HPSG, LFG, Montague grammar, etc.) than like Lambek categorial grammar in deriving nonlexical signs by schemata whose syntactic components are (type-schematized) deductions in the syntactic logic, but the grammar itself is not the same thing as the logic.

(43) Various notions of the syntactic logic correspond directly to linguistic notions, e.g.:

  elements of types              syntactic components of signs
  other arrows                   syntactic components of phrasal schemata
  primitive types                types for values of atomic features
  indexed products               complex feature structure types
  projections                    features
  coproducts                     partitions in type hierarchies
  exponential modes              grammatical functions
  elements of powertypes         syntactic components of coordinate structures
  binary products                ambiguity
  intersection                   neutrality/syncretism
  union                          coordination
  antecedent type is             neutralizable polyvalence
    powertype of coproduct
  antecedent type is             non-neutralizable polyvalence
    coproduct of powertypes        (= functor ambiguity)

(44) The foregoing represents only a point of departure in the development of a higher-order grammar. Among the tasks that remain, some of the highest-priority ones are:

 a. determining what further features and type constructors are needed to handle a more robust range of constructions (e.g. there-insertion, raising, unbounded dependency constructions, gapping, comparative ellipsis, etc.) and constraints (e.g. binding principles A and B, island constraints, crossover constraints);

 b. determining the range of necessary phrasal schemata (i.e. developing a theory of constructions);

 c. establishing a syntax-phonology interface (i.e. determining the range of phenogrammatic operations and how to specify them);

 d. deciding among numerous options for the syntax-semantics interface (any of various underspecification schemes, glue semantics, etc.).