Formal Grammars
The first endeavours to give a formal notation for
linguistic strings were due to Axel Thue and Emil Post1,
who adapted their theory to data processing in Turing machines. An
epoch-making attempt at their formalisation was launched by Y. Bar-Hillel, who devised a quasi-arithmetical
notation for deciphering syntactic phrases.2 In the mid-1950s it developed into
categorial grammars, which offered a recognition apparatus for evaluating
the grammatical correctness of sentences and lexical strings. His analysis of language structures arose as a
by-product of the earliest research in rewriting systems designed for
machine processing. It provided a decision-making counterpart to Noam Chomsky’s generative
phrase-structure grammars, which made the greatest contribution to modern
techniques of artificial intelligence. His results influenced generations of
young researchers and became a cornerstone of theoretical computer science.

Formal Presuppositions of Semantic Description

Linguistic studies wend their way in two directions, one
devoted to visible language form and another to the invisible sphere of
meaning-oriented semantics. Their interrelations are not linked by strict
mathematical homomorphism but allow us to speak informally about approximate
mappings. Let there be a natural language L composed of its alphabet
A, vocabulary V and the apparatus G of grammatical
rules. Then it is possible to define algebraic semantics as a formal system
dealing with their mapping f into the realm of semantic referents. The
vocabulary is mapped into the set S of semantic meanings (sememes), while the
grammatical apparatus G is projected upon the schematic layout C of logical
and ontological categories.

f(words) = meanings, f: V → S
f(grammar) = categories, f: G → C

Natural languages involve much polysemy, so it is necessary
to restrict the reference of words to the kernel vocabulary of primary
literal meanings. This methodological step presupposes abstracting from
the infinite variety of figurative meanings implied by numerous secondary
connotations. This is why algebraic semantics switches directly from
linguistic form to the realm of meaning. For simplicity’s sake it treats words
as basic sememes in their primary elementary sense. When dealing with
the modal meaning of must, may, will, shall, it
considers them right away as sememes and refrains from mentioning the
irrelevant intricacies of their formal lexemes.

The present state of human cognition may be summed up by
concluding that theoretical logic and mathematics give an exact formalisation
of the most essential fields of human thought but cover only a small part of
semantic fields. Even if they give a precise logical treatment of basic
elementary concepts, they do not care to render an integral description of the
whole layout of a given semantic area. They are too engrossed in their
special internal technicalities, which hinder them from joining their
subtheories into an all-inclusive picture of the outer world. Algebraic
semantics works with a less rigorous theoretical apparatus but relentlessly
strives to ensure mutual convertibility between semantic, logical,
mathematical and algebraic calculi.

Constituency and Dependency Grammars

Modern advances in formal grammars have devised two elementary types of formal linguistic analysis. One was based on
devised two elementary types of formal linguistic analysis. One was based on
Chomsky’s phrase-structure grammars and their close predecessor, the immediate
constituent analysis proposed by Rulon Wells3.
Both approaches treated linguistic structures as linear sequences of words
made up from the vocabulary of a natural language and put forward useful
methods of their hierarchical segmentation. Their chief weakness lay in their
low sensitivity to the mutual subordination of constituents. This drawback
was partly removed by L. Tesnière’s project of dependency grammars4. His verb-centred system focused on
semantic actants and syntactic pairs relating heads and dependents. Their
mutual advantages are elucidated by the comparison5
of two ways of analysing the sentence We are trying to understand the
difference given below.

Table 1. Dependency and constituency grammars

The chief asset of grammatical trees is that they give a vivid, illustrative
representation of syntactic structures for common lay observers, but this
advantage is debased by the difficulties they bring about in automatic word
processing. Hence, a convenient remedy is provided by parenthetical and
fractional grammars.

Parenthetical Grammar

In formal linguistics it is essential to realise
that the laws of associativity hold neither in lexical nor in syntactic
strings. Their absence advances a strong argument for parenthetisation.
The structuring and inner hierarchy in the following German and English expressions
is much easier to understand from the use of parentheses. Parenthetical grammar
is a formal rewriting system that applies parentheses for expressing the
grammatical relations of dependency and semantic subordination. It provides
the simplest method of syntactic parsing without requiring very demanding
means of visual representation. It employs a simple apparatus of left
brackets (‘{’, ‘[’ or ‘(’) in order to demarcate the initial boundary of
linguistic expressions and right brackets (‘}’, ‘]’ or ‘)’) that delimit
their end. As seen in the phrase a ladies’ dress, parenthetisation induces
considerable differences in meaning:

a ladies’ dress = a (ladies’ dress) ≠ (a lady’s) dress = a lady’s dress.

The expression on the left describes a dress for ladies, whereas the phrase
structure on the right refers to a particular lady’s garment.

A simple example of sentence analysis is given
by the collocation Such an extremely long journey exhausted our energy. Its
parenthetical articulation segments couples of heads and dependents
into the ensuing hierarchy:

((Such (an ((extremely long) journey))) (exhausted (our energy))).

When rendered in terms of phrase structures, its decomposition proceeds as follows:

S → NP VP → ((AP NP) VP) → ((Adv AP NP) VP) → ((D A NP) VP) → ((D A NP) (V NP)).

Another telling
illustration is supplied by the string
Little Red Riding-Hood went to her grandmother in another village:

((Little (Red Riding-Hood)) (went (to ((her grandmother) (in (another village)))))).

The main reason for
introducing such adjustments in syntactic theory is not only that it saves
space and simplifies analysis. Its most important
theoretical asset consists in opening a second dimension of syntactic
hierarchy. Parenthetical grammars turn linear sequences into 2D patterns,
embedding strings into a two-dimensional Cartesian space. The basic
horizontal axis x depicts the linear sequencing of symbols, while the
second vertical axis y plots strings with the scaled hierarchy of
phrase structures according to different levels of syntactic validity.6
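This two-dimensional reading can be made concrete with a short sketch. The parser below is my own minimal illustration, not part of the apparatus described above: it turns a parenthesised string into nested lists and reports each word’s nesting depth, i.e. its coordinate on the vertical axis y.

```python
def parse(s):
    """Read a parenthesised string into nested Python lists."""
    tokens = s.replace('(', ' ( ').replace(')', ' ) ').split()
    stack = [[]]
    for tok in tokens:
        if tok == '(':
            stack.append([])          # open a new constituent
        elif tok == ')':
            group = stack.pop()       # close it and attach it to its parent
            stack[-1].append(group)
        else:
            stack[-1].append(tok)     # an ordinary word
    top = stack[0]
    return top[0] if len(top) == 1 else top

def depths(tree, level=0):
    """Yield (word, depth) pairs: the y coordinate of each symbol."""
    for node in tree:
        if isinstance(node, list):
            yield from depths(node, level + 1)
        else:
            yield (node, level)

tree = parse("((extremely long) journey)")
print(tree)                # [['extremely', 'long'], 'journey']
print(list(depths(tree)))  # [('extremely', 1), ('long', 1), ('journey', 0)]
```

The x axis is simply the left-to-right order in which the pairs are yielded, so a single traversal recovers both coordinates.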
Concatenative and Decatenative Grammar

In current string theory individual symbols and strings are
treated as immediate constituents linked by the binary operation of concatenation.
Informally speaking, it is a procedure joining two shorter strings
into a concatenation whose length is the sum of both segments. Given two
arbitrary strings S1 = x1...xn and S2 = y1...ym, their concatenation S1S2
results in the following formula:

S1S2 = x1...xn y1...ym.

If x and y are basic symbols, their concatenation is
written with different algebraic symbols such as xy = x * y = x × y. Bar-Hillel’s theoretical apparatus made use of analogies
to arithmetical multiplication, division and cancellation, but such
conventions represented only a formal and artificial apparatus. In fact, they
have little to do with the properties of rational numbers featuring in
arithmetical fractions. He might also have applied an additive formalism that
renders the concatenation of strings as a sum of two addends. An elementary case of
additive binary concatenation can be illustrated by joining two
lexical strings composed of several letters, as in the formula below:

town + hall = townhall.

An inverse operation to concatenation may be denoted as decatenation
and defined as unlinking chains into shorter fragments. A simple illustration
of decatenative cancellation is provided by

townhall – hall = town.

Neither concatenation nor decatenation is a commutative operation. This means that
the order of addends and subtrahends cannot be switched:

town + hall = townhall ≠ hall + town.

This inconvenience makes us introduce a special symbol Ø for left subtraction:

–town + townhall = town Ø townhall = hall.

The chief argument for
giving preference to additive notation for concatenative
strings is that the slash sign for right and left division can be employed
for other purposes such as syntactic dependence. Some theoretical
contributions have developed the idea of ‘right cancellation’ conceived as a
string operation that deletes some symbols on the right end of the string:
“The right cancellation of a letter a from a string s is the removal
of the first occurrence of the letter a in the string s, starting from
the right-hand side. The empty string is always cancellable: [...] Clearly,
right cancellation and projection commute.”7
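The operations introduced in this section can be sketched in a few lines. The following code is my own hedged illustration (the function names and error handling are assumptions, not the source’s notation): it implements concatenation, right decatenation, the left subtraction Ø, and the right cancellation of a single letter.

```python
def concatenate(s1, s2):
    """Concatenation: the length of the result is the sum of both segments."""
    return s1 + s2

def decatenate_right(s, suffix):
    """Right decatenation: townhall - hall = town."""
    if not s.endswith(suffix):
        raise ValueError(f"{suffix!r} is not a right segment of {s!r}")
    return s[:len(s) - len(suffix)]

def subtract_left(prefix, s):
    """Left subtraction Ø: town Ø townhall = hall."""
    if not s.startswith(prefix):
        raise ValueError(f"{prefix!r} is not a left segment of {s!r}")
    return s[len(prefix):]

def cancel_right(s, letter):
    """Right cancellation: delete the first occurrence of `letter`,
    counted from the right-hand side of the string."""
    i = s.rfind(letter)
    return s if i == -1 else s[:i] + s[i + 1:]

assert concatenate("town", "hall") == "townhall"
assert decatenate_right("townhall", "hall") == "town"
assert subtract_left("town", "townhall") == "hall"
assert cancel_right("banana", "a") == "banan"
# Non-commutativity: town + hall differs from hall + town.
assert concatenate("town", "hall") != concatenate("hall", "town")
```

Note that decatenate_right and subtract_left are partial operations: they are defined only when the given segment actually occurs at the respective end of the string.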
However, it cannot be regarded as identical to the concept of right decatenation.

Fractional Grammars

The formal apparatus of parenthetical grammars shares many
inadequacies encountered in immediate constituent analysis. It chains
subsequent neighbouring words into pairs but does not specify their
grammatical interrelations expressed by their mutual syntactic dependency. A
convenient solution is offered by the so-called fractional grammars. They combine the advantages of
constituency and dependency by indicating the subordinate position of
dependents by slash signs ‘/’ and ‘\’. This is how
it is possible to analyse the simple sentence The extremely long
journey exhausted our energy:

((The\((extremely\long)\journey))\(exhausted/(our\energy))).

S → NP\VP → ((AP\NP)\VP) → (((Adv\AP)\NP)\VP) → ((D\(A\NP))\VP) → ((D\((Adv\A)\NP))\(V/(D\NP))).

The right slash in V/NP means that in
accusative object constructions the noun phrase NP
functions as a dependent of the head V (verb). The notation is especially efficient in
indicating the syntactic status of incongruent attributes following the
governing nominal head. Its treatment of attribute constructions is
illustrated by the phrase structure the flower of many colours:

(the\flower)/(of(many\colours)).

NP → (D\N)/NP → (D\N)/(A\N).

The replacement
of cancellation by subtraction seems convenient since it permits exploiting
slash marks for other important string operations. One possible
use is designating relations of syntactic dependency. The
inner structure of a word becomes comprehensible if we combine dependency with
parenthetisation. The afore-mentioned phrases would gain in clarity and
explicitness if they were segmented neatly by parentheses determining the
hierarchy of terms:

Rücksichtslosigkeit ≈ ‘inconsiderateness’,
((((Rück\sichts)\los)\ig)\keit) ≈ ‘((in\((consider)\ate))\ness)’.
. In such lexical derivations suffixes act as the
governing head because they explicitly give the whole expression its categorial and part-of-speech standing. If a lexical root
is preceded by a few prefixes and followed by several suffixes, we do not
consider the order of its etymological composition but the hierarchy of
syntactic values. Etymologically speaking, in ‘boldness’ the adjective
‘bold’ is primary but in lexical analysis it is secondary because the
part-of-speech value of ‘boldness’ is determined by the suffix ‘-ness’.
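The principle that the outermost suffix determines the categorial standing of the whole derivation can be sketched as follows. The suffix table and function below are my own illustrative assumptions, not part of the original analysis:

```python
# Illustrative, assumed suffix-to-category table (not from the source text).
SUFFIX_CATEGORIES = {
    "ness": "noun",      # bold -> boldness
    "keit": "noun",      # German: Rücksichtslosig- -> Rücksichtslosigkeit
    "los": "adjective",  # German privative suffix
    "ig": "adjective",   # German adjectival suffix
}

def head_category(word):
    """Return the part-of-speech value contributed by the outermost
    (rightmost) suffix, which acts as the governing head."""
    # Try longer suffixes first so 'ness' wins over a shorter match.
    for suffix in sorted(SUFFIX_CATEGORIES, key=len, reverse=True):
        if word.endswith(suffix):
            return SUFFIX_CATEGORIES[suffix]
    return None  # no listed suffix: the category is decided by the root itself

print(head_category("boldness"))             # noun
print(head_category("Rücksichtslosigkeit"))  # noun
```

The etymological order of derivation plays no role here: only the rightmost suffix is consulted, mirroring the claim that ‘bold’ is primary etymologically but secondary in lexical analysis.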
1 Emil Post: Recursive Unsolvability of a Problem of Thue. The Journal of Symbolic Logic, vol. 12, 1947: 1–11.
2 Y. Bar-Hillel: A quasi-arithmetical notation for syntactic description. Language, vol. 29 (1), 1953: 47–58.
3 Rulon S. Wells: Immediate Constituents. Language, vol. 23, 1947: 81–117.
4 L. Tesnière: Éléments de syntaxe structurale.
5 https://en.wikipedia.org/wiki/Dependency_grammar#Dependency_vs._constituency.
6 Pavel Bělíček: Systematic Poetics III. Formal Poetics and Rhetoric. Prague 2017, 357 p., pp. 36, 40.
7 https://en.wikipedia.org/wiki/String_operations#String_substitution.