It’s quite common to see high street shops named “Somesuch and Sons”. Indeed, my grandparents ran “Eden & Sons” for many years.
Much rarer is seeing “… & daughters”.
But, of course, the plural of anecdote is not data!
The UK register of businesses – Companies House – has a pretty good search engine.
Doing a search for AND SON returns 220,000 results. We use the singular because that should also match the plural.
Instinctively, how many “AND DAUGHTER” businesses do you think there are? Fewer? By how much?
A search for AND DAUGHTER returns 206,000 results!
At first glance, they look similar. Yay gender equality! But there’s a problem. Both searches also return dissolved companies.
Additionally, “AND SON” also matches “ANDSON” – which distorts the results. How can we get all the live companies in the system? The search doesn’t offer any filters.
This CSV contains details of all the currently trading businesses in the UK. Let’s open it up!
# Import the library
import pandas as pd
# Read only the first column into a dataframe
df = pd.read_csv("BasicCompanyDataAsOneFile-2020-07-01.csv", usecols=["CompanyName"])
What are we looking for? We can’t just search for “SONS”, as that will bring back things like “HENDERSONS”. Some businesses use “AND SONS”, others use “& SONS”. Some have “&SONS”. We also have to account for the singular.
This quick-and-dirty regex will attempt to find any of the above, without also getting “AND SONGS”, for example. Do tell me if there’s a better way.
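One possible pattern, as a minimal sketch: this is an assumption for illustration, not necessarily the exact regex used to produce the results in this post. The names `SON_PATTERN` and `mentions_and_son` are hypothetical, and the last three example names are made up to exercise the edge cases described above.

```python
import re

# Hypothetical pattern (an assumption, not the post's original regex):
#   (?:\bAND\s+|&\s*)  "AND" as a whole word followed by whitespace,
#                      or "&" with optional whitespace after it
#   SONS?\b            "SON" or "SONS", ending at a word boundary
SON_PATTERN = re.compile(r"(?:\bAND\s+|&\s*)SONS?\b")


def mentions_and_son(name: str) -> bool:
    """Return True if the name contains 'AND SON(S)', '& SON(S)' or '&SON(S)'."""
    return SON_PATTERN.search(name) is not None


examples = [
    "& SON STUDIO LIMITED",
    "&SONS TRADING COMPANY LIMITED",
    "(BOWEN AND SONS) BAS MECHANICAL SERVICES LIMITED",
    "HENDERSONS LIMITED",    # invented: no "AND" or "&" before "SONS"
    "ANDSON LTD",            # invented: "AND" runs straight into "SON"
    "SMITH AND SONGS LTD",   # invented: "SON" is followed by more letters
]

for name in examples:
    print(mentions_and_son(name), name)
```

The first three names should match and the last three should not. The word boundary after `SONS?` is what keeps “AND SONGS” out, and requiring whitespace after `AND` is what keeps “ANDSON” out.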
In Pandas terms, that’s:
Which prints out:
         CompanyName
258      & SON STUDIO LIMITED
307      &SONS TRADING COMPANY LIMITED
467      (BOWEN AND SONS) BAS MECHANICAL SERVICES LIMITED
19492    1ST CLASS REMOVALS & SONS LTD
28096    24LEX & SON LTD
...      ...
4667235  ZOLEE & SON LTD
4668636  ZORAN&SONS LTD
4668689  ZORBAS & SONS LIMITED
4671244  ZYBERI & SONS CAPITAL INVESTMENTS LIMITED
4671350  ZYGMUNT CURRY & SONS LIMITED

[17950 rows x 1 columns]
Or, you can run len() on the output to get the count.
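The filter-then-count step can be sketched roughly like this. It is a hedged illustration, not the code behind the numbers in this post: the helper `filter_names` and its regex are assumptions, and the tiny DataFrame stands in for the full register (the “HENDERSONS LIMITED” row is invented as a counter-example).

```python
import re

import pandas as pd


def filter_names(df: pd.DataFrame, word: str) -> pd.DataFrame:
    """Return rows whose CompanyName contains 'AND <word>' or '& <word>',
    in the singular or the plural."""
    # Hypothetical pattern; re.escape guards against regex metacharacters in `word`.
    pattern = rf"(?:\bAND\s+|&\s*){re.escape(word)}S?\b"
    # na=False treats missing names as non-matches instead of propagating NaN.
    mask = df["CompanyName"].str.contains(pattern, regex=True, na=False)
    return df[mask]


# Small stand-in for the dataframe loaded from the CSV.
df = pd.DataFrame({"CompanyName": [
    "ZORBAS & SONS LIMITED",
    "24LEX & SON LTD",
    "WK LUMSDEN AND DAUGHTER LIMITED",
    "HENDERSONS LIMITED",  # invented counter-example
]})

sons = filter_names(df, "SON")
daughters = filter_names(df, "DAUGHTER")

# len() on a filtered dataframe gives the number of matching rows.
print(len(sons), len(daughters))
```

Passing the word as a parameter means the same function can be reused for both searches, so the two counts are produced by identical matching rules.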
Running the same for “DAUGHTER” gives:
         CompanyName
246      & DAUGHTER LIMITED
86594    A STEWART AND DAUGHTERS LIMITED
98526    A.R & DAUGHTERS LIMITED
100064   A.W.F FLETCHER AND DAUGHTERS LLP
179094   AFI AND DAUGHTERS LTD
...      ...
4568242  WILSON SON & DAUGHTERS LIMITED
4582108  WIZDOM: BY OSAGIE & DAUGHTERS LIMITED
4583025  WK LUMSDEN AND DAUGHTER LIMITED
4649690  Z.J. KUBANEK & DAUGHTERS LTD
4669405  ZS & DAUGHTERS LTD

[320 rows x 1 columns]
Oh. There are about 56x as many “AND SON” businesses as there are “AND DAUGHTER” businesses. Of course, these data don’t tell us anything about the size of the businesses or how successful they are. They don’t tell us how many companies are named “sons and daughters”. And there are a dozen other little data issues.
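For anyone who wants to check the arithmetic, or to start looking at the “sons and daughters” overlap, here is a small sketch. The row counts are the two totals printed above; the pattern `BOTH_PATTERN` and the helper `mentions_sons_and_daughters` are hypothetical and only catch the most obvious spellings.

```python
import re

# Row counts taken from the two filtered results above.
sons_count = 17950
daughters_count = 320

ratio = sons_count / daughters_count
print(f"{ratio:.1f}x as many")  # 56.1x as many

# Hypothetical pattern for names that mention both, e.g. "SON & DAUGHTERS"
# or "SONS AND DAUGHTERS". An assumption, not an exhaustive matcher.
BOTH_PATTERN = re.compile(r"\bSONS?(?:\s+AND\s+|\s*&\s*)DAUGHTERS?\b")


def mentions_sons_and_daughters(name: str) -> bool:
    """Return True if 'SON(S)' is joined directly to 'DAUGHTER(S)' by AND or &."""
    return BOTH_PATTERN.search(name) is not None


print(mentions_sons_and_daughters("WILSON SON & DAUGHTERS LIMITED"))   # True
print(mentions_sons_and_daughters("A STEWART AND DAUGHTERS LIMITED"))  # False
```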
But, I think the trend is clear. Over time, approximately the same number of “& SONS” businesses and “& DAUGHTER” businesses have been registered. But far more DAUGHTERs have been dissolved.
Why is that?