Some musings on corpora

🕑 9 min • 👤 Thomas Graf • 📆 September 05, 2019 in Discussions • 🏷 syntax, corpus linguistics, Minimalist grammars, Combinatory categorial grammar

Pro tip: Don’t start a multi-part series of posts on locality right before the beginning of the semester and when you have a pile of papers to review. On the upside, this will give you guys some extra time to digest all the concepts in the three previous posts. In the meantime, here’s a quick and dirty post on corpus linguistics and why it should be part of our syntax curriculum. I didn’t even proofread it, so beware.


Continue reading

KISSing semantics: Subregular complexity of quantifiers

🕑 9 min • 👤 Thomas Graf • 📆 July 26, 2019 in Discussions • 🏷 subregular, strictly local, tier-based strictly local, monotonicity, quantifiers, semantics, typology

I promised, and you shall receive: a KISS account of a particular aspect of semantics. Remember, KISS means that the account covers a very narrowly circumscribed phenomenon, makes no attempt to integrate with other theories, and instead aims to be maximally simple and self-contained. And now for the actual problem:

It has been noted before that not every logically conceivable quantifier can be realized by a single “word”. Those are very deliberate scare quotes around word as that isn’t quite the right notion — if it can even be defined. But let’s ignore that for now and focus just on the basic facts. We have every for the universal quantifier \(\forall\), some for the existential quantifier \(\exists\), and no, which corresponds to \(\neg \exists\). English is not an outlier: these three quantifiers are very common across languages. But there seems to be no language with a single word for not all, i.e. \(\neg \forall\). Now why the heck is that? If language is fine with stuffing \(\neg \exists\) into a single word, why not \(\neg \forall\)? Would you be shocked if I told you the answer is monotonicity? Actually, the full answer is monotonicity + subregularity, but one thing at a time.
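To make the four candidates concrete, here is a minimal sketch that treats each quantifier as a relation between two sets (a restrictor and a scope). The function names and the toy sets are assumptions for illustration only; the sketch just spells out the truth conditions of \(\forall\), \(\exists\), \(\neg \exists\), and \(\neg \forall\), and says nothing yet about why the last one is never a single word.

```python
# Quantifiers as relations between a restrictor set and a scope set.
# Names and example data are hypothetical, chosen only for illustration.

def every(restrictor, scope):
    """Universal: all members of the restrictor are in the scope."""
    return restrictor <= scope

def some(restrictor, scope):
    """Existential: at least one member of the restrictor is in the scope."""
    return bool(restrictor & scope)

def no(restrictor, scope):
    """Negated existential: restrictor and scope do not overlap."""
    return not some(restrictor, scope)

def not_all(restrictor, scope):
    """Negated universal: the quantifier that seems to lack a single word."""
    return not every(restrictor, scope)

students = {"ann", "bob"}
sleepers = {"ann"}

print(every(students, sleepers))    # False
print(some(students, sleepers))     # True
print(no(students, sleepers))       # False
print(not_all(students, sleepers))  # True
```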


Continue reading

I'm not done KISSing yet

🕑 3 min • 👤 Thomas Graf • 📆 July 25, 2019 in Discussions • 🏷 methodology, semantics

I just got back from MOL (Mathematics of Language) in Toronto, and much to my own surprise I actually got to talk some more about KISS theories there. As you might recall, my last post tried to make a case for simpler accounts that try to handle only one phenomenon but do so exceedingly well without the burden of machinery that is needed for other phenomena. My post only listed two examples for syntax as I was under the impression that this is a rare approach in linguistics, so I didn’t dig much deeper for examples. But at MOL I saw Yoad Winter give a beautiful KISS account of presupposition projection (here’s the paper version). That’s when it hit me: in semantics, KISS is pretty much the norm!


Continue reading

KISSing syntax

🕑 7 min • 👤 Thomas Graf • 📆 July 12, 2019 in Discussions • 🏷 methodology, syntax

Here’s a question I first heard from Hans-Martin Gärtner many years ago. I don’t remember the exact date, but I believe it was in 2009 or 2010. We both happened to be in Berlin, chowing down on some uniquely awful sandwiches. Culinary cruelties notwithstanding, the conversation was very enjoyable, and we quickly got to talk about linguistics as a science, at which point Hans-Martin offered the following observation (not verbatim):

It’s strange how linguistic theories completely lack modularity. In other sciences, each phenomenon gets its own theory, and the challenge lies in unifying them.

Back then I didn’t share his sentiment. After all, phonology, morphology, and syntax each have their own theory, and eventually we might try to unify them (an issue that’s very dear to me). But the remark stuck with me, and the more I’ve thought about it in the last few years the more I have to side with Hans-Martin.


Continue reading

Vision on P vs. NP

🕑 3 min • 👤 Thomas Graf • 📆 July 03, 2019 in Discussions • 🏷 fun allowed, complexity theory

Come and listen to the Vision of the Avengers, who has saved this planet thirty-seven times. Listen to his story of P vs. NP. No, seriously, the following is an excerpt on complexity theory from Tom King’s Vision.


Continue reading

News from the MG frontier

🕑 3 min • 👤 Aniello De Santo • 📆 June 24, 2019 in Discussions • 🏷 MGs, parsing, NLP

True to my academic lineage, I’m a big fan of Minimalist grammars (MGs): they are a pretty malleable formalism, their core mechanisms are very easy to grasp on an intuitive level, and they are close enough to current minimalist syntax to allow for interesting computational insights into mainstream syntax. However, I often find that MGs’ charms don’t work that well on my more NLP-oriented colleagues — especially when compared to some very close cousins like TAGs or CCGs. There are very practical reasons for this, of course, but two in particular come to mind right away: the lack of any large MG corpus (and/or automatic ways to generate such corpora) and, relatedly, the lack of efficient, state-of-the-art, probabilistic parsers.

This is why I’m very excited about this upcoming paper by John Torr and co-authors (henceforth TSSC), on a (the first ever?) wide-coverage MG parser. The parser is implemented by smartly adapting the \(A^*\) search strategy developed by Lewis and Steedman (2014) for CCGs to MGs (basically, a CKY chart + a priority queue), and coupling it with a complex neural network supertagger trained on an MG treebank.


Continue reading

The anti anti missile missile argument argument

🕑 7 min • 👤 Thomas Graf • 📆 June 21, 2019 in Discussions • 🏷 formal language theory, generative capacity, morphology, semantics

Computational linguists overall agree that morphology, with the exception of reduplication, is regular. Here regular is meant in the sense of formal language theory. For any given natural language, the set of well-formed surface forms is a regular string set, which means that it is recognized by a finite-state automaton, is definable in monadic second-order logic, is a projection of a strictly 2-local string set, has a right congruence relation of finite index, yada yada yada. There’s a million ways to characterize regularity, but the bottom line is that morphology defines string sets of fairly limited complexity. The mapping from underlying representations to surface forms is also very limited as everything (again modulo reduplication) can be handled by non-deterministic finite-state transducers. It’s a pretty nifty picture, though somewhat loose in my subregular eyes that immediately pick up on all the regular things you don’t find in morphology. Still, it’s a valuable result that provides a rough approximation of what morphology is capable of; a decent starting point for further inquiry. However, there is one empirical argument that is inevitably brought up whenever I talk about the regularity of morphology. It’s like an undead abomination that keeps rising from the grave, and today I’m here to hose it down with holy water.
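For readers who want something tangible for "recognized by a finite-state automaton", here is a minimal sketch of a deterministic automaton over morpheme symbols. Everything in it is an assumption made up for illustration (the morphemes `re`, `STEM`, `s`, the state names, and the toy pattern of any number of prefixes, one stem, one optional suffix); it is not an analysis of any actual language.

```python
# A toy finite-state automaton over morpheme symbols.
# States, morphemes, and the pattern itself are hypothetical.

TRANSITIONS = {
    ("start", "re"): "start",   # any number of prefixes
    ("start", "STEM"): "stem",  # exactly one stem
    ("stem", "s"): "done",      # at most one suffix
}
ACCEPTING = {"stem", "done"}

def accepts(morphemes):
    """Return True if the morpheme sequence is well-formed in the toy pattern."""
    state = "start"
    for morpheme in morphemes:
        state = TRANSITIONS.get((state, morpheme))
        if state is None:       # no transition defined: reject
            return False
    return state in ACCEPTING

print(accepts(["re", "re", "STEM", "s"]))  # True
print(accepts(["STEM", "re"]))             # False
```

The point of the sketch is only that the recognizer needs a fixed, finite amount of memory (the current state), no matter how long the word gets.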


Continue reading

More observations on privative features

🕑 7 min • 👤 Thomas Graf • 📆 June 17, 2019 in Discussions • 🏷 features, privativity, phonology, syntax, transductions

In an earlier post I looked at privativity in the domain of feature sets: given a collection of features, what conditions must be met by their extensions in order for these features to qualify as privative. But that post concluded with the observation that looking at the features in isolation might be a case of the dog barking up the wrong tree. Features are rarely of interest on their own; what matters is how they interact with the rest of the grammatical machinery. This is the step from a feature set to a feature system. Naively, one might expect that a privative feature set gives rise to a privative feature system. But that’s not at all the case. The reason for that is easy to explain yet difficult to fix.


Continue reading

Who watches the NEG-raisers?

🕑 2 min • 👤 Thomas Graf • 📆 June 13, 2019 in Discussions • 🏷 fun allowed, negative concord, NEG raising

I reread Alan Moore’s Watchmen today. Still amazing, not one bit overrated, and whenever I pick it up I can’t help but finish it in one sitting. But did you know that Watchmen actually challenges the very foundations of syntactic theory?


Continue reading

Some observations on privative features

🕑 9 min • 👤 Thomas Graf • 📆 June 11, 2019 in Discussions • 🏷 features, privativity, phonology, syntax

One topic that came up at the feature workshop is whether features are privative or binary (aka equipollent). Among mathematical linguists it’s part of the general folklore that there is no meaningful distinction between the two. Translating from a privative feature specification to a binary one is trivial. If we have three features \(f\), \(g\), and \(h\), then the privative bundle \(\{f, g\}\) is equivalent to \([+f, +g, -h]\). In the other direction, we can make binary features privative by simply interpreting the \(+\)/\(-\) as part of the feature name. That is to say, \(-f\) isn’t a feature \(f\) with value \(-\), it’s simply the privative feature \(\text{minus} f\). Some arguments add a bit of sophistication to this, e.g. the Boolean algebra perspective in Keenan & Moss’s textbook Mathematical Structures in Language. So far so unsatisfactory.
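The folklore translation is short enough to write down. The following is a minimal sketch under my own assumptions: the function names are hypothetical, a privative bundle is a Python set of feature names, a binary specification is a dict from feature name to `+` or `-`, and the privative feature \(\text{minus} f\) is spelled as the string `-f`.

```python
# Folklore translation between privative and binary feature specifications.
# Function names and data representations are assumptions for illustration.

def privative_to_binary(bundle, inventory):
    """Mark every feature of the inventory as '+' if present in the bundle, else '-'."""
    return {feat: ("+" if feat in bundle else "-") for feat in sorted(inventory)}

def binary_to_privative(spec):
    """Fuse sign and name into one atomic feature, so '-' on h becomes the feature '-h'."""
    return {sign + feat for feat, sign in spec.items()}

inventory = {"f", "g", "h"}
binary = privative_to_binary({"f", "g"}, inventory)

print(binary)                               # {'f': '+', 'g': '+', 'h': '-'}
print(sorted(binary_to_privative(binary)))  # ['+f', '+g', '-h']
```

Note that the round trip does not return the original bundle \(\{f, g\}\) but a bundle over the renamed features `+f`, `+g`, `-h`, which is exactly the "sign as part of the name" move described above.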


Continue reading