Data Is / Data Are


To be clear - I don't care about this; I just think it is interesting.

Is the word "data" a plural? On a strict reading, yes. Datum is singular, data is its plural.

But humans are spongey meatbags who evolve language. And there will always be a tension between traditionalists and modernists.

So, I took a serious, scientific, and accurate Twitter poll.

It's amazing how many people are wrong, eh?

What I want to understand is why it has evolved into a singular.

Charlie expresses it perfectly:

I understand what Charlie is saying, but I think I disagree with her. Think about cooking some chips. Would you say "the chips is ready"?

No. But chips are small individual "things". Just like rice.

What's something smaller than chips, but bigger than rice? Peas?

"How's dinner coming along?" "The peas is ready!" Nope!

Something between peas and rice? Sweetcorn?

"The sweetcorn is ready." Aha!

There seems to be some intuitive size threshold which decides whether something is an individual thing or part of a whole.

If a dozen bees are flying towards you - they're a plural.

But a swarm of bees is a singular thing - despite being made of many small bees.

Data is a swarm. The individual datums are tiny compared to the mass of the dataset.


11 thoughts on “Data Is / Data Are”

  1. Alex Gibson says:

    The mass noun approach makes sense.

    I think the battle for purism over all those Latin/Greek nouns with -um singular becoming -a plural is sadly lost in wider society, and the use of datum is so niche compared to the ubiquity of data that it's a lost cause.

    I will keep using datum/data, stadium/stadia and I will cringe every time BBC presenters say stadiums, referendums, etc - I don't know whether they made a style guide choice or just the old fogeys shuffled off, but it seemed like that changed a few years ago and -ums became the rule. The worst is when people try to double-pluralise, e.g. I've heard "criterias" said very often. But I've stopped caring that much.

  2. Mike Rose says:

    I think we now use 'data' as shorthand for dataset / database...

    So 'is' works... if this is where you are coming from.

    It is not often in my world that people talk about data from the other direction - i.e. multiple datum = data...

    IMO

