#UKGC21
Data Is / Data Are
To be clear - I don't care about this; I just think it is interesting.
Is the word "data" a plural? On a strict reading, yes. Datum is singular, data is its plural.
But humans are spongy meatbags who evolve language. And there will always be a tension between traditionalists and modernists.
So, I took a serious, scientific, and accurate Twitter poll.
It's amazing how many people are wrong, eh?
What I want to understand is why it has evolved into a singular.
Charlie expresses it perfectly:
I understand what Charlie is saying, but I think I disagree with her. Think about cooking some chips. Would you say "the chips is ready"?
No. But chips are small individual "things". Just like rice.
What's something smaller than chips, but bigger than rice? Peas?
"How's dinner coming along?" "The peas is ready!" Nope!
Something between peas and rice? Sweetcorn?
"The sweetcorn is ready." Aha!
There seems to be some intuitive size threshold that decides when something is an individual thing, and when it is part of a whole.
If a dozen bees are flying towards you - they're a plural.
But a swarm of bees is a singular thing - despite being made of many small bees.
Data is a swarm. The individual datums are tiny compared to the mass of the dataset.
Giuseppe Sollazzo said on twitter.com:
mate, you're so vintage. The battle now is "dataset / data set".
😛
Jez said on scholar.social:
@Edent Thank goodness it's not just me! Technically "data" in modern usage is a mass noun: takes singular pronouns but is uncountable. http://erambler.co.uk/blog/language-is-like-clothing/
bob said on twitter.com:
Mostly "data are" just sounds wrong.
Alex says:
I wonder if it makes sense to consider "data" as a mass noun (https://en.m.wikipedia.org/wiki/Mass_noun) In that case, "sweetcorn" is a mass noun for kernels of corn, and "rice" is a mass noun for grains of rice.
Alex Gibson says:
The mass noun approach makes sense.
I think the battle for purism over all those Latin/Greek nouns with -um singular becoming -a plural is sadly lost in wider society, and the use of datum is so niche compared to the ubiquity of data that it's a lost cause.
I will keep using datum/data, stadium/stadia and I will cringe every time BBC presenters say stadiums, referendums, etc - I don't know whether they made a style guide choice or just the old fogeys shuffled off, but it seemed like that changed a few years ago and -ums became the rule. The worst is when people try to double-pluralise, IE I've heard criterias said very often. But I've stopped caring that much.
Edward Saperia said on twitter.com:
I’m pretty sure it’s “datas are”
Dr Amy Roberts said on twitter.com:
OMG my PhD supervisor was so harsh on this exact subject 🤣
Owen Blacker says:
This reminds me of one of my favourite quirks in French. French has imported both "medium" and "media" as singular nouns (from Latin for users of crystal balls and from English for publishers of news, respectively), so has "un médium", "les médiums", "le média" and "les médias" all as valid terms.
▶ https://en.wiktionary.org/wiki/m%C3%A9dium#French; ▶ https://en.wiktionary.org/wiki/m%C3%A9dia#French
Jeremy Keith says:
The data show that water are wet.
JK says:
Hot dogs are is sammich(es)!¡!
Mike Rose says:
I think we now use 'data' as shorthand for dataset / database...
So 'is' works... if this is where you are coming from.
It is not often in my world people talk about data from the other direction - ie multiple datum = data...
IMO