Mellow musings on peer review

🕑 8 min • 👤 Thomas Graf • 📆 June 07, 2021 in Discussions • 🏷 publishing, journals, peer review

My last post (yes, ages ago) reflected on two issues that come up quite a bit on Martin Haspelmath’s blog, and in both cases I did not really agree with his conclusions. But there is a third issue he mentions a lot: peer review, be it for conferences, journals, or grants. Haspelmath has an impressive number of posts on the topic. The tldr is that reviews are a waste of time for reviewers, do not improve the final paper or proposal (e.g. because authors have to tack on extraneous stuff to please reviewers), and incentivize flashy presentation over substance. Moreover, reviewers are frequently forced into the role of gatekeepers who have to defend the fair maidens of Publisher Island and Conference Valley from the ravaging hordes of sub-par submissions. I do not want to directly argue for or against these points; I’m sure the prestigious outdex readership can make up its own mind. But I will say that my own experience has been a lot more positive, largely because of the field I’m in. So the following are some reflections on what I think mathematical linguistics as a field gets right with peer review (and there’s also a tiny bit about how it can sometimes go wrong).

Peer review in mathematical linguistics is pleasantly mellow

I have published many computational papers, and the few times I got a nasty or unhelpful review were when the paper was submitted to a theoretical linguistics venue. The reviews from mathematical linguists, on the other hand, are among the most professional and level-headed I’ve ever seen. There are four reasons for that, and I’m gonna talk about each one in painstaking detail.

Nah, not really, we can mostly ignore the first two because they don’t provide much of a learning opportunity for theoretical linguistics. First, mathematical linguistics is a small field, and that instills a very personal sense of community. Second, mathematical work is easier to evaluate objectively: are the definitions internally consistent, are the proofs correct, does the paper provide enough motivation for the work done, can you read it without going stark raving mad. Suggestions for improvement are straightforward, too: tighten up the writing here, change the notation a bit there, rework proof 3 to account for this special case. It’s all very cut-and-dried and, crucially, devoid of emotion or dogma. Math provides a framework where every paper could have plenty that’s useful in it even if its priors do not match yours. You just won’t be inclined to, say, dismiss a TAG paper because you know for a fact that CCG is the only true formalism. It’s all very much in the spirit of letting a thousand flowers bloom because we know how to pick parts from one flower and crossbreed them with another one.

That’s all nice and dandy, and I wouldn’t wanna have it any other way, but it isn’t really something that theoretical linguistics can hope to emulate. We can’t really kick out 90% of all linguists to make the field feel a bit cozier (it’s still pretty cozy compared to most other fields). And as I’ve said many times before, many linguists believe that there is one right theory, usually their own, in which case any deviation from that is problematic. And that creates a very different reviewing dynamic. But there are two other factors that make mathematical linguistics reviewing different, and those can be emulated in other fields: the length of submissions, and the streamlined review process.

Submission length

Mathematical linguistics largely avoids 2-page abstracts or papers with 30+ pages, two formats that are very popular in other subfields and, coincidentally, a total pain to review. One is too short, the other too long, although journal papers at least aren’t as pointless as abstracts. Abstracts are a complete waste of everybody’s time:

The authors have to squeeze tons of information into two pages or less, and then spend hours tweaking and polishing. All that effort is single-use because abstracts cannot be easily retooled into more useful formats, like a poster, slides, or a paper. An abstract can only produce more abstracts.
The reviewers, on the other hand, have to read the damn thing at least three times to make sense of it. Basically, they have to spend time and energy decompressing all the compressing the authors had to do because for some reason the abstract had to be two pages instead of four. And even after all that decompressing they still don’t have much to go on for feedback. Maybe some missing references or some problematic data points, but who knows if those references are missing because of ignorance or space constraints, and who knows whether a data point is actually problematic or just too complicated to be covered in two pages.

Quite simply, abstracts are too short to be easily written, read, or evaluated. The authors learn little more from the reviews than how to rewrite the abstract, and the reviewers learn even less from the abstract itself.

Btw, this also means that abstracts are an atrocious gatekeeping mechanism for conferences. If you know how to bullshit, an abstract is the perfect format for bullshitting your way past the reviewers, whereas the best research project won’t get a pass if you haven’t figured out how to condense it to two pages. This goes double for work that’s not part of the mainstream, for if the reviewer doesn’t already know where you’re coming from, an abstract doesn’t give you enough space to get them ready for the ride. Net gain for authors, reviewers, and the community at large: 0 at best, negative (countable) infinity at worst.

In mathematical linguistics we don’t submit abstracts, we submit papers. But those are short papers, 8 to 16 pages depending on the venue. That’s long enough to avoid all the problems with abstracts, yet without going too far in the other direction. It’s not some 50-page monster where just opening the PDF makes your reviewer heart sink. You can actually read such a short paper two or three times in a reasonable amount of time. There is plenty of substance to sink your teeth into. There is a chance for you to meaningfully improve a piece of research, a piece of academic writing that may be consumed by others many years from now. And eventually those short papers can be compiled into a long paper that makes it into a journal: a long paper that has seen tons of feedback along the way and rests on a very solid foundation, which again makes the reviewing stage a lot more pleasant.

Single round reviewing

Now all of that by itself would already be great, but the real kicker is that if you are asked to review one of those short 8- to 16-page papers, you only review it once. Just as with abstracts, there is no revise-and-resubmit.

That one review you write is your final word, and it isn’t even binding. You rate the paper on several criteria, and you provide detailed written feedback, and then the ball is in the editors’ court. They decide whether the paper makes the cut or not, and if it does, it is the authors’ judgment call how much they want to change. The editors always include some boilerplate request to keep the reviewers’ remarks in mind when producing the camera-ready version, but there are no mechanisms in place to enforce that, exactly because the quality requirements of the venue have already been met. If the paper had issues that absolutely required revisions, it would have been rejected. And hence it is up to the authors to use the feedback they got as they see fit. There is no need to please the reviewers, no need for lengthy responses where you justify why and how you incorporated some remarks while ignoring others. You made the cut, you’re given an opportunity to make the paper as good as you can based on the feedback you got, and that’s it.

What can be copied

To sum up, the secret reviewing sauce in mathematical linguistics has four ingredients:

It’s a small field, so it’s all very intimate and sociable.
Reviewing math-heavy papers, albeit challenging on a technical level, is fairly straightforward. It’s more craft than art.
Conferences are paper-reviewed, not abstract-reviewed.
Reviewing is a one-round affair for conference papers. Since most papers are conference papers, not journal papers, most reviews are one-round affairs.

I think the last two points actually do most of the heavy lifting, and those are exactly the ones that need not be limited to mathematical linguistics. Of course that would take quite some convincing. Reviewers are hard to come by as is, and apparently most linguists believe that reviewing 3 short papers would take a lot more time than reviewing 3 abstracts. So if anyone wanted to bring the wonders of mathematical linguistics reviewing to other subfields, it’d be an uphill battle. And I’m not saying that this is the only right way of doing things. All I’m saying is that if you share Martin Haspelmath’s concerns about peer review and, quite frankly, find it a drag no matter which end of it you’re on, there are existing systems that can be emulated.

Where there’s light, there’s a tiny bit of shadow

Now I’m not gonna pretend that absolutely everything is all sweet and gooey over here in the cotton candy land of mathematical linguistics. If you have to review a paper that’s way outside your area of expertise, that’s more painful than a comparable abstract review because you’re looking at 8 pages of gibberish instead of 2, and you have to write more than two sentences for your review. And that’s not a hypothetical: in the neighbouring crazy funky party town of computational linguistics, where the recent NLP boom means that there are way more submissions than qualified reviewers, you will probably have to review a paper on a topic you know embarrassingly little about. In order to fix that, the ACL and other conferences have moved to increasingly elaborate, multi-stage review systems that try to make the reviewers vet each other’s reviews in order to filter out obviously sub-par reviews. It’s all very clunky and bureaucratic and, as far as I can tell, only improves the situation insofar as the organizers get to feel like they’re doing something to improve the situation. And don’t even get me started on the ACL’s rolling review initiative (well, alright, you can get me started in the comments section). So there’s no perfect solution; every system has both strengths and weaknesses. But, speaking for myself, I greatly prefer how mathematical linguistics handles things.

Next time: some actual language and computation on this blog devoted to language and computation.