Archive for grammar

Tá mé sa bhaile – Biden’s Irish

Posted in Irish Language, Politics on April 14, 2023 by telescoper

TRIGGER WARNING: CONTAINS PRONOUNS!

Yesterday the President of the United States of America, Joe Biden, addressed a joint sitting of the Houses of the Oireachtas in Dublin. Predictably, he included an attempt at Irish in his speech, to the obvious appreciation of those attending. I was a bit confused, however, by the way what he said was reported in the Irish media.

My confusion was that I didn’t think he said tá mé seo abhaile, as widely reported. For one thing, even I as a beginner could see that this phrase means “I am this home”, which doesn’t make any sense (not to me, anyway). There are various possibilities for what Joe Biden did say. For what it’s worth I thought it was tá mé sa bhaile which, loosely, means “I am at home”. I note that the news media have generally changed their accounts (e.g. here) to reflect this, although other forms of words are possible. I’m not surprised that Biden struggled with the pronunciation – most of us beginners do – but I think the writers and editors of the newspapers concerned might at least have corrected his grammar.

The phrase illustrates a couple of interesting curiosities about the Irish language. Expressing the verb “to be” in Irish isn’t as straightforward as it is in English. There are two grammatically distinct ways of doing this. The two Irish forms are bí, which is like the English verb “to be”, and the so-called copula, is, which is sometimes called a defective verb. It’s admittedly a bit confusing that the copula looks like the third-person singular of the verb “to be” in English, but there you go.

Going back to bí, it is frequently referred to as tá (its present tense form, as in the phrase above). It can be fully conjugated in all tenses and persons but it is highly irregular. Grammatically, bí is also just like any other verb, coming first in the sentence, followed by a subject (either a separate noun or pronoun or a suffix, depending on the tense and person, as shown in the conjugations), and then its predicate and any remaining adverbial information. Thus tá mé is “I am” with the pronoun mé. The accents (síneadh fada) mean that this is pronounced taw-may.

The copula, however, is not fully conjugated for different subjects, which are always expressed by separate nouns or pronouns, and it only has two forms for different tenses: is can be used for present or future meaning, and ba (with lenition) is used for past or conditional meanings.

There are specific situations in which the copula is must be used instead of bí, but the main one is when it is followed by a noun. You can’t say “I am a Professor” using bí – it has to be Is Ollamh mé – but to say “I am old” it is Tá mé sean.

One final remark. If you’re scared of pronouns please look away now. There are over 120 different pronouns in the Irish language. There’s a special version of the pronoun mé, written mise, which has two uses that I am aware of. One is when the copula is used for identification – so “I am Peter” is Is mise Peadar – and the other is for emphasis, when it is roughly equivalent to “myself” in English.

P.S. The Irish word for “pedantry” is pedantraí

The Perfect Afters

Posted in Irish Language on June 18, 2022 by telescoper

When I first arrived in Ireland, one thing I noticed about the way Irish people use the English language is a construction using the word “after” and the present participle of a verb. I first heard it in the context of a football match on the television, actually, during which the commentator said “the ball is after going out for a corner” or words to that effect.

This construction is basically an alternative way of constructing what is called in Latin the (past) perfect tense of a verb, indicating an action which is now completed. In Latin this would be formed by a particular ending of the verb, but when translated into English it would either be a simple past verb form (usually ending in -ed) or use the auxiliary verb “to have”. For instance, in the football example above you would interpret the meaning as “the ball has gone out for a corner” or “the ball went out for a corner”.

(Now I’m regretting using the irregular verb “to go” in the football example but I hope you catch my drift…)

The “after” construction is not just an alternative way of writing the past tense, however, as it can (and usually does) specifically imply an action that has been completed in the very recent past, something you might express in English by inserting the word “just”. This is sometimes called the immediate perfective. It can also be used to form the pluperfect tense (expressing an action already completed at some time in the past) by using the past of the verb “to be”, though in modern Irish it seems to be more-or-less exclusively used for actions only recently completed.

Examples include:

  • He is after writing a letter – He has (just) written a letter
  • I’m only after getting here – I’ve just got here
  • He was after walking the dog – He had walked the dog
  • I’m after reading James Joyce’s Ulysses for the second time – I have just read James Joyce’s Ulysses for the second time…

In the book English As We Speak It In Ireland, the author P.W. Joyce writes that no such form ‘would be understood by an Englishman, although they are universal in Ireland, even among the higher and educated classes’.

It’s certainly the case that I didn’t really understand it when I first heard it, but I have heard it used on countless occasions by friends and neighbours since then. I think I was initially confused because “he is after…” can appear in English in phrases such as “he is after a new job”, expressing something like “looking for” (i.e. with intent), but that is not suggested in the examples above.

I thought I’d poll my readers on this, which will probably demonstrate how few Irish readers I have: if someone were to say “I’m after getting a cup of tea”, what would you take it to mean?

It’s reasonable to wonder how this construction came about. The answer is that in Irish the verb “to be” is very peculiar, existing in two distinct forms, and there is no direct equivalent of the verb “to have” as it is used in the formation of verb tenses in English. There is a simple past in Irish that basically works like the English equivalent, but tenses involving “have” or “had” as an auxiliary verb are impossible to render word for word. For example, translating I have just done it into Irish could give you Tá mé tar éis é a dhéanamh or Tá mé i ndiaidh é a dhéanamh, both of which literally mean I am after doing it. (Tá mé means “I am” and the phrases tar éis and i ndiaidh essentially mean “after”.)

I suppose you can think of this interesting construction as being a relic of the Irish language surviving after the imposition of English on the population. Whatever its origins, though, I’m after concluding that this construction, although not standard in British English, is perfectly sound from a grammatical point of view.

Finally, and incidentally, the lack of an appropriate verb “to have” leads to some other interesting expressions in Irish. One of my favourites is exemplified by the phrase “I have a cold” which, translated into Irish, is “Tá slaghdán orm”, which means, literally, “A cold is on me”…

Anyway, I’m after finishing.

Pronouns for Yous

Posted in Biographical, GAA, Television on August 21, 2021 by telescoper

Last night I was watching a very interesting television programme on the Irish language channel TG4. It was about the origins and history of ice hockey, which began as ice hurling as a sport played by Irish immigrants in Canada. The word “puck” comes from the Irish word poc which means to stroke or hit; in hurling the “puck out” is a free hit from the goal area by the goalkeeper much like a goal kick in soccer. The programme was called Poc na nGael, which roughly translates as “The Puck of the Irish”. I think it was repeated last night because this Sunday sees the biggest event of the year in the hurling calendar: the Final of the All-Ireland Senior Hurling Championship at Croke Park, which this year is between Limerick and Cork.

While watching that programme I got thinking about Irish language lessons and whether I will have time to continue them next academic year and then onto wider issues about differences between Irish and English. One thing that struck me was the second person pronoun, so I thought I’d do the following rambling post about it.

In English the personal pronouns I (first person) and he/she (third person) are unusual in that they change depending on their grammatical role. This isn’t unusual in other languages, especially Latin, where it is the rule rather than the exception. In English we use “I” in the nominative case (“I hit the dog”) but “me” in the accusative case (“the dog bit me”) or when following a preposition (“the dog gave the stick to me”). The same goes for he/him and she/her.

In the example “the dog gave the stick to me”, “me” is really in the dative case but there is no distinct word for that in English; we can only really distinguish between the nominative (subject) and “other” (non-subject) cases. The words “my”, “our”, etc are often called pronouns but they are really of adjectival form, e.g. “this is my cat” and are more correctly called determiners. There are possessive pronouns (“mine”, “ours”, etc) which are in some sense genitive cases of the personal pronouns (meaning “of me”, “of us”, etc) but I digress.

Notice also that the first person and third person have distinct plural forms (we/us and they/them).

The funny one is the second person “you”, which has neither an accusative nor prepositional form nor a distinct plural: “You hit the dog”, “the dog bit you” and “the dog gave the stick to you” all employ the same word although each is in a different grammatical case.

This is by no means the only oddity in modern English, and I have no idea why it developed. In older forms of English there were distinct forms: “thou/thee” in the singular and “ye/you” in the plural. These forms persist in dialects such as Yorkshire.

For some reason, though, as English evolved these four distinct forms merged into one, i.e. “you”. One can usually tell from the context whether “you” is singular or plural or can emphasize it by adding extra words (e.g. in the American “y’all” which is a contraction of “you all”) but there is no single word in standard English that expresses the difference between singular and plural or between subject and non-subject.

Incidentally, in Irish the second person singular is tú in the nominative case and thú in the non-nominative cases; the second person plural is sibh, which is like “ye” in that it has no distinct non-nominative form.

I was brought up on Tyneside and it is a feature of the Geordie dialect that people use the word “yous” to denote the second person plural. It’s definitely working-class slang, and was very much frowned upon at school, but it was very commonplace when and where I grew up. I thought it was only in Newcastle that people used this form, but when I worked at Sussex a while ago my boss, originally from Glasgow, also on occasion used “yous”. When I asked her about it she explained that it was common usage in Glasgow but didn’t think it was widespread in other parts of Scotland. Geordie and Glaswegian are thus two regional dialects I know that use this form but there may be others. I’d be interested to know so please feel free to comment via the box below!

Anyway, the reason for going off on this tangent was that, although I’d already noticed that a few Irish people use “ye” in Hiberno-English for the second person plural, it was only yesterday that I noticed some using “yous”. I wonder how widespread that is in Ireland, and whether it is regional or more of a class divide?

Would any of yous like to comment?

Singular Shenanigans

Posted in Pedantry on March 31, 2019 by telescoper

I used the word ‘shenanigans’ in a recent post, after which I wondered to myself whether there’s such a thing as a single ‘Shenanigan’. The Oxford English Dictionary says yes, defining it thusly:

I was a little surprised by this as I’ve only ever heard this word in the plural, shenanigans, but there we are. Another thing that surprised me is the ‘Origin obscure’; even the One True Chambers says ‘Origin unknown’. I’d always assumed that this was a word of Irish origin like, e.g., ‘slogan’. The oldest uses given in the OED are all American, from the mid-19th Century, which does not refute the possibility that it is based on an Irish word because of the huge Irish diaspora in the United States, especially after the Great Famine of the 1840s, but I’m surprised the main English dictionaries have been unable to locate the connection.

The best I’ve been able to do using Google is the Irish word sionnachuighim, meaning ‘to play tricks, to be foxy’ (from sionnach, ‘fox’). That seems to me to be a plausible idea, but it’s not conclusive. If anyone has any further thoughts on the origin of shenanigans I’d be very interested to hear them through the comments box below.

To return to my original thought that shenanigans was a noun that only exists in the plural, if it were so it would belong to the class of Plurale Tantum (which I blogged about a long time ago, here in the context of whether ‘data’ is singular or plural). Other examples of English nouns that exist in the ‘plural only’ include: suds, entrails, outskirts, odds, tropics, riches, surroundings, thanks, heroics, faeces and genitalia.

To my mind you should treat your data the same way you treat your genitalia. Grammatically speaking, I mean.

Synesis, Metonymy and the World Cup

Posted in Football on June 27, 2018 by telescoper

The shock defeat of Germany by South Korea this afternoon means that the world champions fail to progress from the group stage and are eliminated from the competition. In other words, Germany are out. Or should that be Germany is out?

Strictly speaking, the singular form is correct (as was Nelson with his “England expects..” message at Trafalgar) but that doesn’t mean that the English plural is necessarily wrong. This is an example of a figure of speech called a metonymic shift, whereby a thing or concept is referred to not by its own name but by the name of something associated with it. An example is found in the phrase “to boil a kettle”: obviously it is not the kettle that gets boiled, but the water within it, but this isn’t an error as such, merely a grammatical device. Metonymic shifts also take place when we refer to the Government as “Westminster” or the film-making industry as “Hollywood”.

When we come to the “Germany is” versus “Germany are” debate, the noun “Germany” can be taken to mean “The German team” (singular) but in British English the metonymic shift takes this to mean a collection of individual players (plural), i.e. the meaning is transferred from the “German team” to the “German players”. The use of a verb indicating a singular subject constitutes “formal agreement” with “team” whereas the plural form would be “notional agreement”.

I know that this usage is regarded as incorrect by American colleagues I have discussed it with, to the extent that it actually grates on them a bit. But I think “the team are fighting amongst themselves” is a better construction than any I can think of that includes formal rather than notional agreement. Moreover this kind of construction is correct in languages with more precise grammatical rules than English.

The Greek term synesis refers to a grammatical alteration in which a word takes the gender or number not of the word with which it should regularly agree, but of some other word implied by that word, a device much used in both Greek and Roman poetry and also in rhetoric. The distinction between “the Government is united” and “the Government are divided” offers a particularly interesting example.

Related to this difference is the fact that American sports teams tend to have names that are themselves plural, e.g. the Cubs, the Dolphins, the Jets, the Broncos etc, whereas in Britain they are more often singular (though with exceptions, such as Wolverhampton Wanderers).

Anyway, here’s a quick poll to see what you think:

UPDATE: Just to prove, as if it were needed, that I don’t have a life, I had a look at the English Football League teams for the 2018/9 season, with the following results as to how many names are plural:

Premiership: 1/20 (Wolverhampton Wanderers)

Championship: 3/24 (Blackburn Rovers, Bolton Wanderers, Queens Park Rangers)

League One: 3/24 (Bristol Rovers, Wycombe Wanderers, Doncaster Rovers)

League Two: 3/24 (MK Dons, Forest Green Rovers, Tranmere Rovers)

In Scotland there are:

Premiership: 1/12 (Rangers)

In the lower divisions there are a further four out of thirty teams: Airdrieonians, Raith Rovers, Albion Rovers, Berwick Rangers.

Synesis, Metonymy and the FIFA World Cup

Posted in Football on June 23, 2014 by telescoper

I was asleep during last night’s dramatic World Cup game between Portugal and USA which ended in a 2-2 draw thanks to an equaliser in injury time from Portugal. That’s why I found out about the result from Twitter when I woke up this morning. I was struck by the fact that virtually all comments from Americans talked about their team in the singular (e.g. “USA has drawn against Portugal”) whereas on this side of the Atlantic we almost always refer to a team in the plural (e.g. “England have lost against everyone”).

Strictly speaking, the singular form is correct (as was Nelson with his “England expects..” message at Trafalgar) but that doesn’t mean that British English is necessarily wrong. This is an example of a figure of speech called a metonymic shift, whereby a thing or concept is referred to not by its own name but by the name of something associated with it. An example is found in the phrase “to boil a kettle”: obviously it is not the kettle that gets boiled, but the water within it, but this isn’t an error as such, merely a grammatical device. Metonymic shifts also take place when we refer to the Government as “Westminster” or the film-making industry as “Hollywood”.

When we come to the “England lose” versus “England loses” debate, the noun “England” can be taken to mean “The England team” (singular) but in British English the metonymic shift takes this to mean a collection of individual players (plural), i.e. the meaning is transferred from the “England team” to the “England players”. The use of a verb indicating a singular subject constitutes “formal agreement” with “team” whereas the plural form would be “notional agreement”.

I know that this usage is regarded as incorrect by American colleagues I have discussed it with, to the extent that it actually grates on them a bit. But I think “the team are fighting amongst themselves” is a better construction than any I can think of that includes formal rather than notional agreement. Moreover this kind of construction is correct in languages with more precise grammatical rules than English. The Greek term synesis refers to a grammatical alteration in which a word takes the gender or number not of the word with which it should regularly agree, but of some other word implied by that word, a device much used in both Greek and Roman poetry and also in rhetoric. The distinction between “the Government is united” and “the Government are divided” offers a particularly interesting example.

However, having done my best to stick up for “England” as a plural, I can’t help thinking that if they ever learn how to play like a team rather than as a collection of individuals they might not be so strongly associated with the verb “to lose”…

Should the passive voice be avoided?

Posted in Education on May 1, 2013 by telescoper

It’s another very busy day (as well as another lovely one) so I thought instead of sitting indoors this lunchtime writing a typically verbose blog item I’d just pick something out of my back catalogue and give it another airing because it deals with something that’s come up a couple of times recently.

This is the time of year when final-year students are drafting their project reports. Yesterday I was back in Cardiff giving feedback on two such reports. I usually quite enjoy reading these things, in fact. They’re not too long and I’m usually pretty impressed with how the students have set about the (sometimes quite tricky) things I’ve asked them to do for their project work. I think the project report is quite a challenge for UK physics students because they generally haven’t had much practice in putting together a lengthy piece of writing before or during their university course, so they haven’t developed a style that they feel comfortable with and are often unfamiliar with various conventions (such as reference style, punctuation of equations, etc). Some of these are explained in quite a lot of detail in the instructions the students are given, of course, but we all know that only girls read instructions….

The thing that strikes me most forcibly about the strange way students write project reports is that they are nearly always phrased entirely in the passive voice, e.g.

The experiment was calibrated using a phlogiston normalisation widget….

I accept that people disagree about whether the passive voice is good style or not. Some journals actively encourage the passive voice while others go the opposite way entirely. I’m not completely opposed to it, in fact, but I think it’s only useful either when the recipient of the action described in the sentence is more important than the agent, or when the agent is unknown or irrelevant. There’s nothing wrong with “My car has been stolen” (passive voice) since you would not be expected to know who stole it. On the other hand “My Hamster has been eaten by Freddy Starr” would not make a very good headline.

The point is that the construction of a statement in the passive voice in English is essentially periphrastic, in that it almost inevitably involves some form of circumlocution – either using more words than necessary to express the meaning or being deliberately evasive by introducing ambiguity. Both of these failings should be avoided in scientific writing.

Apparently, laboratory instructors generally tell students to write their reports in the passive voice as a matter of course. I think this is just wrong. In a laboratory report the student should describe what he or she did. Saying what “was done” often leaves the statement open to the interpretation that somebody else did it. The whole point of a laboratory report is surely for the students to describe their own actions. “We calibrated the experiment..” or “I calibrated the experiment…” are definitely to be preferred to the form I gave above.

That brings me to the choice of pronoun in the active voice. One danger is that it can appear very bombastic, but that’s not necessarily the case. I don’t find anything particularly wrong in saying, e.g.

We improve upon the technique of Jones et al. (1848) by introducing a variable doofer in the MacGuffin control, thereby removing gremlins from the thingummy process.

But the main issue is whether to use the singular or plural form. It can be irritating to keep encountering “I did this..” and “I did that..” all the way through a journal paper, and I certainly would feel uncomfortable writing a piece like that in the first person singular. I think it feels less egotistical to use “we”, even if there is only one author (which is increasingly rare in any case). If it’s good enough for the Queen it’s good enough for me! However, I just looked “we” up in Chambers dictionary and found

..used when speaking patronizingly, esp to children, to mean ‘you’.

which wasn’t at all what I had in mind!

However, in the case of a student project that I’m assessing I actually want to know what the particular student writing the report did, not what was done by person or persons unspecified or by a group of uncertain composition. So I encourage my students to put, e.g.,

I wrote a computer program in 6502 Assembly Language to solve the Humdinger equation using the Dingbat-Schnitzelgruber algorithm.

I also (sometimes) like “we” when there’s, e.g., a complicated mathematical derivation. Going line by line through a lengthy or difficult technical argument seems friendlier if you imagine that the reader is trying to do the calculation along with you as you write it:

if we differentiate the right hand side of equation (8), use the expression for x obtained in equation (97), expand y in a power-series and take away the number we first thought of we find…

The “we” isn’t necessarily an author with delusions of grandeur (or schizophrenia), but instead denotes a joint operation between author and reader.

Anyway, to resume the thread, it seems to me that sometimes it is appropriate to use the passive voice because it is the correct grammatical construction in the circumstances. Sometimes also the text just seems to work better that way too. But having to read an entire document written in the passive voice drives me to distraction. It’s clumsy and dull.

In scientific papers, things are a little bit different but I still think using the active voice makes them easier to read and less likely to be ambiguous. In the introduction to a journal paper it’s quite acceptable to discuss the background to your work in the passive voice, e.g. “it is now generally accepted that…” but when describing what you and your co-authors have done it’s much better to use the active voice. “We observed ABC1234 using the Unfeasibly Large Telescope..” is, to my mind, much better than “Observations of ABC1234 were made using..”.

Reading back over this post I notice that I have jumped fairly freely between active and passive voice, thus demonstrating that I don’t have a dogmatic objection to its use. What I’m arguing is that it shouldn’t be the default, that’s all.

My guess is that a majority of experimental scientists won’t agree with this opinion, but a majority of astronomers and theoreticians will.

This guess will now be tested by reactivating an old poll..

Spare me the Passive Voice!

Posted in Education on December 16, 2010 by telescoper

I’ve felt a mini-rant brewing for a few days now, as I’ve been reading through some of the interim reports my project students have written. I usually quite enjoy reading these, in fact. They’re not too long and I’m usually pretty impressed with how the students have set about the sometimes tricky things I’ve asked them to do. One pair, for example, is reanalysing the measurements made at the 1919 Eclipse expedition that I blogged about here, which is not only interesting from a historical point of view but which also poses an interesting challenge for budding data analysts.

So it’s not the fact that I have to read these things that annoys me, but the strange way students write them, i.e. almost entirely in the passive voice, e.g. “The experiment was calibrated using a phlogiston normalisation widget…”.

I accept that people disagree about whether the passive voice is good style or not. Some journals actively encourage the passive voice while others go the opposite way entirely. I’m not completely opposed to it, in fact, but I think it’s only useful either when the recipient of the action described in the sentence is more important than the agent, or when the agent is unknown or irrelevant. There’s nothing wrong with “My car has been stolen” (passive voice) since you would not be expected to know who stole it. On the other hand “My Hamster has been eaten by Freddy Starr” would not make a very good headline.

The point is that the construction of a statement in the passive voice in English is essentially periphrastic in that it almost inevitably involves some form of circumlocution – either using more words than necessary to express the meaning or being deliberately evasive by introducing ambiguity. Both of these failings should be avoided in scientific writing.

Apparently our laboratory instructors tell students to write their reports in the passive voice as a matter of course. I think this is just wrong. In a laboratory report the student should describe what he or she did. Saying what “was done” often leaves the statement open to the interpretation that somebody else did it. The whole point of a laboratory report is surely for the students to describe their own actions. “We calibrated the experiment..” is definitely to be preferred to the form I gave above.

Sometimes it is appropriate to use the passive voice because it is the correct grammatical construction in the circumstances. Sometimes also the text just seems to work better that way too. But having to read an entire document written in the passive voice drives me to distraction. It’s clumsy and dull.

In scientific papers, things are a little bit different but I still think using the active voice makes them easier to read and less likely to be ambiguous. In the introduction to a journal paper it’s quite acceptable to discuss the background to your work in the passive voice, e.g. “it is now generally accepted that…” but when describing what you and your co-authors have done it’s much better to use the active voice. “We observed ABC1234 using the Unfeasibly Large Telescope..” is, to my mind, much better than “Observations of ABC1234 were made using..”.

Reading back over this post I notice that I have jumped fairly freely between active and passive voice, thus demonstrating that I don’t have a dogmatic objection to its use. What I’m arguing is that it shouldn’t be the default, that’s all.

My guess is that a majority of experimental scientists won’t agree with this opinion, but a majority of astronomers and theoreticians will.

This guess will now be tested using a poll…


Pluralia Tantum

Posted in Literature, Pedantry on December 5, 2008 by telescoper

Meanwhile, over on the e-astronomer, Andy Lawrence recently posted an item about the lamentable tendency of astronomers to abuse the English language. The focus of his venom was “extincted”, a word used by many astro-types as an adjective to describe the state of affairs when light from a source (e.g. a quasar) has suffered “extinction” by intervening matter. “Extinction” is formed from the verb “extinguish” in the same way that “distinction” is formed from “distinguish”. Nobody would describe a professor as “distincted” (certainly not if it is Andy Lawrence) so, clearly, “extincted” is inappropriate. Actually if you really want to nit-pick you could object to “extinction” being applied to an object such as a quasar, when it isn’t actually the object that is suffering from it but the light it has emitted.

But as a gripe, this is fair enough I’d say. Andy went on to encourage his legions of adoring readers to contribute their own pet hates, preferably with an astronomical orientation. My contribution was “decimate”, which means “to remove the tenth part” or “to reduce by ten percent”, from the Roman practice of punishing disobedient legions by killing every tenth man, but is often regrettably now used to mean “annihilate” or “obliterate”. You might think this hasn’t got much to do with astronomy but, sadly, it does. Indeed, a press release from STFC discussing the recent ten percent cuts to its grants budget states that the consequent reduction in PDRAs

..will not cause the decimation of physics departments as has been speculated in media reports.

I would expect a civil servant to have done a bit better, so presumably this was written by an astronomer too. At any rate, it is precisely wrong.

You might argue that things like this don’t matter. Language evolves, and if modern usage deviates from its previous meanings then we should just let it change. I fully accept the dynamic nature of language and do not by any means object to all such changes. Society changes and so must the words we use. But if a change (a) is a result of sloppiness and (b) results in the loss of a very good usage, to be replaced by a bad one, then I think educated people should stand their ground and fight it. If we don’t do that, language doesn’t just change, it decays.

Most of us practising scientists have to spend a lot of our time writing scientific papers, departmental memos, grant applications and even books. I think many astronomers see this activity as a chore, take no pleasure from it, and invest the minimum care in it. I was fortunate to have a really excellent writer, John Barrow, as my thesis supervisor and he convinced me that it was worth making the effort to write the best prose I could whatever the context. Not only does this attitude eliminate the ambiguity which is the bane of scientific writing, but taking pains over style and grammar also allows one to feel the pleasure of craftsmanship for its own sake. With John’s guidance and encouragement, I learned to enjoy writing through the satisfaction experienced by finding neat forms of words or nice turns of phrase. You never really feel good about what you do if you scrape through at the minimum acceptable level. Try to make the effort and you will be more fulfilled, and the long hours of slog you spend putting together a complicated paper will at least be enlivened by a genuine sense of delight when things fall neatly into place, and a warm glow of achievement when you read it back and it sounds not just acceptable but actually good.

But I digress.

One of the other contributors to Andy’s list of examples of bad grammar was a chap called Norman Gray who objected to astronomers’ use of the word “data” as a plural noun, as in “the data indicate” rather than “the data indicates”. I was taken aback by this because I was expecting the opposite objection.

He has a lengthy rant about this on his own blog so I won’t repeat his arguments in detail here, merely a synopsis. The word “data” is formed from the Latin plural of the word “datum” (itself formed from the past participle of the Latin verb “dare”, meaning “to give”) hence meaning “things given” or words to that effect. The sense of “data” that we use now (to refer to measurements or quantitative information) seems not to have been present in Roman or mediaeval times so Norman argues that it is a deliberate archaism to treat it as a Latin plural now. He also argues that “data” in modern usage is a “mass noun” so should on those grounds also be treated as singular.

For those of you who aren’t up with such things, English nouns can be of two forms: “count” and “non-count” (or “mass”). Count nouns are those that can be enumerated and therefore have both plural and singular forms: one eye, two eyes, etc. Non-count nouns (which is a better term than “mass nouns”) are those which describe something which is not enumerable, such as “furniture” or “cutlery”. Such things can’t be counted and they don’t have different singular and plural forms. You can have two chairs (count noun) but can’t have two furnitures (non-count noun).

Count and non-count nouns require different grammatical treatment. You can ask “how much furniture do you have?” but not “how many”. The answer to a “how much” question usually requires a unit or measure word (e.g. “a vanload of furniture”) but the answer to a “how many” question would be just a number. Next time you are in a supermarket queue where it says “ten items or less” you will appreciate that the sign is grammatically incorrect. “Item” is most definitely a count noun, so the correct form should be “ten items or fewer”.

Anyway, Norman Gray asserts that (a) “data” is a non-count noun and that (b) it should therefore be singular. Forms such as “the data are..” are out (“a vile anacoluthon”) and “the data is…” is in.

So is he right?

Not really.  Unkind though it may be to dismantle a carefully constructed obsession, I think his arguments have quite a few problems with them.

For a start, it seems clear to me that there are (at least) two distinct uses of the word “data”. One is clearly of non-count type. This is the use of “data” to describe an undifferentiated, unspecified or unlimited quantity of information such as that stored on a computer disk. Of such stuff you might well ask “how much data do you have?” and the answer would be in some units (e.g. Gbytes). This clearly identifies it as a mass noun.

But there is another meaning, which is that ascribed to specified pieces of information either given (as per the original Latin) or obtained from a measurement. Such things are precisely defined, enumerable and clearly therefore of count-noun form. Indeed one such entity could reasonably be called a datum and the plural would be data. This usage applies when the context defines the relevant quantum of information so no unit is required. This is the usage that arises in most scientific papers, as opposed to software manuals. “In Figure 1, the data are plotted…” is correct. Although it sounds clumsy you could well ask in such a situation “how many data do you have?” (meaning how many measurements do you have) and the answer would just be a number. Archaism? No. It’s just right.

To labour the point still further,  here are another two sentences that show the different uses:

“If I had less data my disk would have more free space on it.” (Non-count)

“If I had fewer data I would not be able to obtain an astrometric solution.” (Count).

Contrary to Norman’s claims, it is not unusual for the same words (if they’re nouns) to have both count and non-count forms in different contexts. I give the example of “whisky” as in “my glass is full of whisky” (non-count) versus “two whiskies, please, barman”. His objection to this was that in the second case a whisky is an artefact of a metonymic shift which takes the word “whisky” to refer to the glass containing it.

Metonymy involves using a word related to a thing rather than the word for the thing itself, as in “I have hungry mouths to feed”; it’s not really the mouths that are fed, but the people the mouths belong to. In fact there’s a bit of this going on when people talk about sources being “extincted” rather than their light.

This invalidates the example because, Norman alleges, the resulting meaning is different. This objection is a bit silly because the whole point is that the two forms should have different meanings, otherwise why have them? In any case the  example  simply involves me asking for two well-defined quantities of whisky. I’m not convinced of the relevance of metonymy here. What I care about is the whisky, not what it comes in, and when I drink the whisky I don’t drink the glass anyway. Metonymy would apply if I talked about drinking a couple of glasses. Consider “I drank two whiskies, one after the other” versus “I drank two glasses one after the other”. In both cases what has actually been drunk?

There are countless other examples (pun intended). “Fire” can be a mass noun (“fire is dangerous”) but also a count noun (“the firemen were fighting three fires simultaneously”). Another nice one is “hair” which is non-count when it is on someone’s head (“my hair is going grey”) but count when they, in the plural, are being split.

Interestingly, though, the  non-count forms of these nouns are all singular. Indeed, many non-count nouns exist only in the singular: such nouns are called singularia tantum. Examples include “dust” and “wealth”. So,  if we accept that “data” can be a non-count noun, does that mean that it should necessarily be treated as singular when it does take on that role?

An example that might be taken to support this view could be “statistics” (the field thereof) which is a non-count noun. Although it appears to be derived from a plural, you would certainly say “statistics is a hard subject”  rather than “statistics are a hard subject”.  On the other hand “statistics” can refer to a set, each element of which is a statistic (i.e. a number), thus giving another example of a noun that can be of either count or non-count form; you might reasonably say “the statistics are impressive” in the count case.  The non-count form “statistics” is a better  example of metonymy than the example above, as it refers to the study of the (count) statistics rather than to the things themselves.

In fact there are also mass nouns, described as pluralia tantum, which exist only in the plural. A (not entirely accurate) list is given here. Examples include scissors and pants, for which the normal measure  is a “pair”. Although these are technically non-count nouns (in the sense that you can’t have one scissor, etc) they don’t shed much light on the example in front of us. Perhaps more pertinent is the word “clothes” which is of non-count type but which is certainly plural. You can’t have one “clothe” (or any other number for that matter) but you would definitely say “your clothes are dirty”.

A more subtle example with relevance to the Latin root of “data” is “media” which can refer to broadcast media (non-count) or the plural of medium (count). “The media are out to get me” seems a correct construction to me, so the non-count form of this noun is a plurale tantum (singular of pluralia tantum).

So,  just because a word may be a non-count noun, it doesn’t necessarily have to be singular.

To summarise, my argument is that it is not correct to assert that “data” is a mass noun. It may or may not be, depending on the context. If it is acting as a count noun (which I contend is the case in most science writing) then it is definitely plural. Furthermore, even in cases where it is clearly a mass noun, and especially if you reject the alternative meaning as a count noun, then it is still by no means obvious that it must be treated as singular (because of the existence of the plurale tantum). In fact I would go a bit further and argue that you can only justify the singular non-count form at all if you accept that there is a count alternative. To be honest, though, I think I prefer the singular interpretation in the non-count case, as in “statistics”. It just sounds better.

If anyone has managed to read all the way through this exercise in pedantry I’d be interested to see any comments on my analysis of data.