78 datasets found
  1. UK Baby Names 👶 (1996-2021)

    • kaggle.com
    zip
    Updated Aug 4, 2023
    Cite
    Jean (2023). UK Baby Names 👶 (1996-2021) [Dataset]. https://www.kaggle.com/datasets/johnsmith44/uk-baby-names-1996-2021
    Available download formats: zip (983352 bytes)
    Dataset updated
    Aug 4, 2023
    Authors
    Jean
    Area covered
    United Kingdom
    Description

    Introduction.

    Baby name statistics are compiled from first names recorded when live births are registered in England and Wales as part of civil registration, a legal requirement. The statistics are based only on live births which occurred in the calendar year, as there is no public register of stillbirths. Babies born in England and Wales to women whose usual residence is outside England and Wales are included in the statistics for England and Wales as a whole, but excluded from any sub-division of England and Wales. The statistics are based on the exact spelling of the name given on the birth certificate. Grouping names with similar pronunciation would change the rankings. Exact names are given so users can group if they wish.
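    Because ranks are based on exact spellings, grouping variants changes the order. The sketch below illustrates this, assuming a user-supplied mapping from spellings to a group label; the counts, the mapping and the function name are hypothetical and not taken from the dataset.

```python
from collections import defaultdict

# Hypothetical (name -> count) records for one year; not real figures.
counts = {"Sophia": 120, "Sofia": 80, "Olivia": 150, "Amelia": 140}

# Hypothetical user-chosen mapping from an exact spelling to a group label.
groups = {"Sofia": "Sophia"}

def group_and_rank(counts, groups):
    """Sum counts per group label, then rank labels by descending total."""
    totals = defaultdict(int)
    for name, n in counts.items():
        totals[groups.get(name, name)] += n
    # Ties are broken alphabetically, which is a simplification.
    ordered = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, name, total) for rank, (name, total) in enumerate(ordered, start=1)]

ranked = group_and_rank(counts, groups)
```

    With an empty mapping the exact-spelling ranking is returned unchanged, so the two orderings can be compared side by side.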

    The dataset contains records of around 16k boy names and 22k girl names.

    Notes and definitions.

    Baby name statistics do not include births to women usually resident in England or Wales who give birth abroad. They do include births to women whose usual residence is outside England and Wales where the birth occurred in England or Wales. Births where the name of the baby is not stated are excluded from all the ranks. Births where the usual residence of the mother was not in England and Wales are excluded from the regional ranks and from the separate England and Wales ranks. Names with a count of 2 or less in total within England and Wales have been redacted using S40 of the Freedom of Information Act in order to protect the confidentiality of individuals. This is consistent with the disclosure control methodology used for our birth statistics.

    Source of data: The ONS.

    License: Open Government License

  2. A corpus of names drawn from the local birth registers of England and Wales,...

    • dtechtive.com
    • find.data.gov.scot
    txt, xlsx, zip
    Updated Jan 25, 2018
    Cite
    University of Edinburgh (2018). A corpus of names drawn from the local birth registers of England and Wales, 1838-2014 [Dataset]. http://doi.org/10.7488/ds/2294
    Available download formats: xlsx (30.21 MB), zip (5.395 MB), txt (0.0166 MB)
    Dataset updated
    Jan 25, 2018
    Dataset provided by
    University of Edinburgh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    UNITED KINGDOM
    Description

    This dataset comprises a corpus of names, in both the first and middle position, for approximately 22 million individuals born in England and Wales between 1838 and 2014. This data is obtained from birth records made available by a set of volunteer-run genealogical resources - collectively, the 'UK local BMD project' (http://www.ukbmd.org.uk/local) - and has been re-purposed here to demonstrate the applicability of network analysis methods to an onomastic dataset. The ownership and licensing of the intellectual property constituting the original birth records is detailed at https://www.ukbmd.org.uk/TermsAndConditions. Under section 29A of the UK Copyright, Designs and Patents Act 1988, a copyright exception permits copies to be made of lawfully accessible material in order to conduct text and data mining for non-commercial research. The data included in this dataset represents the outcome of such a text-mining analysis. No birth records are included in this dataset, and nor is it possible for records to be reconstructed from the data presented herein. The data comprises an archive of tables, presenting this corpus in various forms: as a rank order of names (in both the first and middle position) by number of registered births per year, and by the total number of births across all years sampled. An overview of the data is also provided, with summary statistics such as the number of usable records registered per year, most popular names per year, and measures of forename diversity and the surname-to-forename usage ratio (an indicator of which forenames are more likely to be transferred uses of surnames). These tables are extensive but not exhaustive, and do not exclude the possibility that errors are present in the corpus. 
    Data are also presented both as '.expression' files (an input format readable by the network analysis tool Graphia Professional) and as '.layout' files, a text file format output by Graphia Professional that describes the characteristics of the network so that it may be replicated. Characteristics of the original birth records that allow the identification of individuals - for instance, full name or location of birth - have been removed.

  3. Baby names for boys in England and Wales

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Jul 31, 2025
    Cite
    Office for National Statistics (2025). Baby names for boys in England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalesbabynamesstatisticsboys
    Available download formats: xlsx
    Dataset updated
    Jul 31, 2025
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Rank and count of the top names for baby boys, changes in rank since the previous year and breakdown by country, region, mother's age and month of birth.
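    As an illustration of how a "change in rank since the previous year" column can be derived from two years of counts, here is a minimal sketch. The counts are hypothetical, the function names are invented for this example, and the alphabetical tie-break is an assumption rather than the publisher's method.

```python
def rank_names(counts):
    """Return {name: rank}, where rank 1 is the highest count.

    Ties are broken alphabetically, which is a simplification.
    """
    ordered = sorted(counts, key=lambda name: (-counts[name], name))
    return {name: i for i, name in enumerate(ordered, start=1)}

def rank_changes(prev_counts, curr_counts):
    """Positive = moved up, negative = moved down, None = absent the previous year."""
    prev, curr = rank_names(prev_counts), rank_names(curr_counts)
    return {
        name: (prev[name] - curr[name]) if name in prev else None
        for name in curr
    }

# Hypothetical counts, for illustration only.
prev_year = {"Noah": 400, "Oliver": 450, "George": 380}
curr_year = {"Noah": 430, "Oliver": 410, "Muhammad": 440}
changes = rank_changes(prev_year, curr_year)
```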

  4. Baby names for girls in England and Wales

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Jul 31, 2025
    Cite
    Office for National Statistics (2025). Baby names for girls in England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalesbabynamesstatisticsgirls
    Available download formats: xlsx
    Dataset updated
    Jul 31, 2025
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Rank and count of the top names for baby girls, changes in rank since the previous year and breakdown by country, region, mother's age and month of birth.

  5. Census 2021 - Country of birth

    • data.leicester.gov.uk
    csv, excel, geojson +1
    Updated Apr 19, 2023
    Cite
    (2023). Census 2021 - Country of birth [Dataset]. https://data.leicester.gov.uk/explore/dataset/census-2021-leicester-country-of-birth/
    Available download formats: geojson, excel, csv, json
    Dataset updated
    Apr 19, 2023
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    The census is undertaken by the Office for National Statistics every 10 years and gives us a picture of all the people and households in England and Wales. The most recent census took place in March 2021.

    The census asks every household questions about the people who live there and the type of home they live in. In doing so, it helps to build a detailed snapshot of society. Information from the census helps the government and local authorities to plan and fund local services, such as education, doctors' surgeries and roads.

    Key census statistics for Leicester are published on the open data platform to make information accessible to local services, voluntary and community groups, and residents. A dashboard is also published, showcasing various census datasets and allowing users to view data for Leicester and compare it with national statistics.

    Further information about the census and full datasets can be found on the ONS website - https://www.ons.gov.uk/census/aboutcensus/censusproducts

    Country of birth

    This dataset provides Census 2021 estimates that classify usual residents in England and Wales by their country of birth. The estimates are as at Census Day, 21 March 2021.

    Definition: The country in which a person was born. For people not born in one of the four parts of the UK, there was an option to select "elsewhere". People who selected "elsewhere" were asked to write in the current name for their country of birth.

  6. All UK Active Company Names

    • kaggle.com
    zip
    Updated Nov 14, 2017
    Cite
    Brian J (2017). All UK Active Company Names [Dataset]. https://www.kaggle.com/dalreada/all-uk-active-company-names
    Available download formats: zip (44974391 bytes)
    Dataset updated
    Nov 14, 2017
    Authors
    Brian J
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United Kingdom
    Description

    I work with UK company information on a daily basis, and I thought it would be useful to publish a list of all active companies, in a way that could be used for machine learning.

    There are 3,838,469 rows in the dataset, one for each active company. Each row has the company name, date of incorporation and the Standard Industrial Classification (SIC) code.

    The company list is from the publicly available 1st November 2017 Companies House snapshot.

    The SIC code descriptions are from the gov.uk website.

    In the file AllCompanies.csv each row is formatted as follows:

    • CompanyName - Alphanumeric company name
    • IncorporationDate - in British date format, dd/mm/yyyy
    • SIC - 5 digits or if not known, None - see separate file for description of each code.
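
    The row format above can be read along the following lines. This is a minimal sketch that assumes the file is comma-delimited with a header line naming the three fields; the sample rows are invented and `parse_row` is a hypothetical helper.

```python
import csv
import io
from datetime import datetime

def parse_row(row):
    """Convert one raw CSV row (a dict of strings) into typed values."""
    sic = row["SIC"].strip()
    return {
        "name": row["CompanyName"].strip(),
        # British date format, dd/mm/yyyy
        "incorporated": datetime.strptime(
            row["IncorporationDate"].strip(), "%d/%m/%Y"
        ).date(),
        # Keep the code as text so leading zeros survive; "None" means unknown.
        "sic": None if sic == "None" else sic,
    }

# Hypothetical sample standing in for AllCompanies.csv (header line assumed).
sample = io.StringIO(
    "CompanyName,IncorporationDate,SIC\n"
    "EXAMPLE WIDGETS LTD,03/11/2015,62012\n"
    "SAMPLE HOLDINGS LIMITED,25/12/1999,None\n"
)
companies = [parse_row(r) for r in csv.DictReader(sample)]
```

    Parsing the date explicitly with `%d/%m/%Y` avoids the day and month being swapped by tools that default to the US ordering.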

    Inspiration

    One possible use for this data is to apply ML to suggest a new, unique but suitable name for a company, based on what other companies with the same SIC code are called.

    Another is to analyse how company names have evolved over time.

    ML could also be used to determine what a typical company name looks like, or to analyse whether company names have become longer or more complicated over time.

    I am sure there are many more possible uses for this data that I cannot imagine.

    This is my second go (the first was published a few hours ago) at publishing a dataset on any medium, so any useful tips and hints would be extremely welcome.

    Links to the raw data sources are here:

  7. Name Popularity in the USA and UK

    • kaggle.com
    zip
    Updated Feb 5, 2018
    Cite
    Lauren Ackerman (2018). Name Popularity in the USA and UK [Dataset]. https://www.kaggle.com/datasets/lmackerman/name-popularity-in-the-usa-and-uk/discussion
    Available download formats: zip (1212917 bytes)
    Dataset updated
    Feb 5, 2018
    Authors
    Lauren Ackerman
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United Kingdom, United States
    Description

    Context

    In order to test how gender is linguistically encoded and represented, I have designed a series of studies that involve gender biased and equibiased names. This dataset is designed to help me calculate which names would be most appropriate for stimuli created to display to both US and UK audiences.

    Content

    These data contain name frequencies and ranks by year (1996-2013, UK) or decade (2000, US) by binary gender (male or female). I have tried to design these data to provide the most useful metrics for determining the most popular names by gender bias (masculine, feminine) and equibias (unisex).
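
    One way such a gender-bias metric could be computed from per-gender counts is sketched below. The formula, the 0.1 margin and the sample counts are assumptions made for illustration, not the dataset author's method.

```python
def bias(male_count, female_count):
    """Share of uses that are male: 1.0 = only male, 0.0 = only female."""
    total = male_count + female_count
    if total == 0:
        raise ValueError("name has no recorded uses")
    return male_count / total

def classify(male_count, female_count, margin=0.1):
    """Label a name; `margin` is an arbitrary illustrative cut-off around 0.5."""
    share = bias(male_count, female_count)
    if abs(share - 0.5) <= margin:
        return "equibiased"
    return "masculine" if share > 0.5 else "feminine"

# Hypothetical (male, female) counts for placeholder names.
sample = {"NameA": (950, 50), "NameB": (480, 520), "NameC": (10, 990)}
labels = {name: classify(m, f) for name, (m, f) in sample.items()}
```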

    Acknowledgements

    Inspiration

    I'm looking to identify popular gender-biased and popular gender neutral (equi-biased) names.

  8. Distribution of first name and last name frequencies by country

    • figshare.com
    xlsx
    Updated Feb 2, 2023
    Cite
    Mike Thelwall (2023). Distribution of first name and last name frequencies by country [Dataset]. http://doi.org/10.6084/m9.figshare.21956795.v2
    Available download formats: xlsx
    Dataset updated
    Feb 2, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Mike Thelwall
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Distribution of first and last name frequencies of academic authors by country.

    Spreadsheet 1 contains 50 countries, with names based on affiliations in Scopus journal articles 2001-2021.

    Spreadsheet 2 contains 200 countries, with names based on affiliations in Scopus journal articles 2001-2021, using a marginally updated last name extraction algorithm that is almost the same except for Dutch/Flemish names.

    From the paper: Can national researcher mobility be tracked by first or last name uniqueness?

    For example, the distribution for the UK shows a single peak for international names, with no national names; Belgium has a national peak and an international peak; and China has mainly a national peak. The 50 countries are:

    No  Code  Country
     1  SB    Serbia
     2  IE    Ireland
     3  HU    Hungary
     4  CL    Chile
     5  CO    Colombia
     6  NG    Nigeria
     7  HK    Hong Kong
     8  AR    Argentina
     9  SG    Singapore
    10  NZ    New Zealand
    11  PK    Pakistan
    12  TH    Thailand
    13  UA    Ukraine
    14  SA    Saudi Arabia
    15  RO    Romania
    16  ID    Indonesia
    17  IL    Israel
    18  MY    Malaysia
    19  DK    Denmark
    20  CZ    Czech Republic
    21  ZA    South Africa
    22  AT    Austria
    23  FI    Finland
    24  PT    Portugal
    25  GR    Greece
    26  NO    Norway
    27  EG    Egypt
    28  MX    Mexico
    29  BE    Belgium
    30  CH    Switzerland
    31  SW    Sweden
    32  PL    Poland
    33  TW    Taiwan
    34  NL    Netherlands
    35  TK    Turkey
    36  IR    Iran
    37  RU    Russia
    38  AU    Australia
    39  BR    Brazil
    40  KR    South Korea
    41  ES    Spain
    42  CA    Canada
    43  IT    Italy
    44  FR    France
    45  IN    India
    46  DE    Germany
    47  US    USA
    48  UK    UK
    49  JP    Japan
    50  CN    China

  9. Top 100 baby names in England and Wales: historical data

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Jul 31, 2025
    Cite
    Office for National Statistics (2025). Top 100 baby names in England and Wales: historical data [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalestop100babynameshistoricaldata
    Available download formats: xlsx
    Dataset updated
    Jul 31, 2025
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Historic lists of top 100 names for baby boys and girls for 1904 to 2024 at 10-yearly intervals.

  10. England and Wales Census 2021 - Characteristics of usual residents aged 16...

    • statistics.ukdataservice.ac.uk
    xlsx
    Updated Feb 10, 2023
    Cite
    Office for National Statistics; National Records of Scotland; Northern Ireland Statistics and Research Agency; UK Data Service. (2023). England and Wales Census 2021 - Characteristics of usual residents aged 16 and over by whether they have previously served in the UK armed forces, England and Wales [Dataset]. https://statistics.ukdataservice.ac.uk/dataset/england-and-wales-census-2021-characteristics-of-ur-16-and-over-by-whether-previously-served
    Available download formats: xlsx
    Dataset updated
    Feb 10, 2023
    Dataset provided by
    Northern Ireland Statistics and Research Agency
    Office for National Statistics (http://www.ons.gov.uk/)
    UK Data Service (https://ukdataservice.ac.uk/)
    Authors
    Office for National Statistics; National Records of Scotland; Northern Ireland Statistics and Research Agency; UK Data Service.
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Area covered
    England, Wales, United Kingdom
    Description

    This dataset is an analysis of the Characteristics of usual residents by whether they have previously served in the UK armed forces, with adjusted estimates for the non-veteran population, based on Census 2021.

    People who have previously served in the UK armed forces include those who have served for at least one day in HM’s Armed Forces, either regular or reserves, or Merchant Mariners who have seen duty on legally defined military operations. It does not include those who have left and since re-entered the regular or reserve UK armed forces, those who have only served in foreign armed forces, or those who have served in the UK armed forces and are currently living outside of England and Wales.

    The veteran population are older than the general population and also differ in relation to sex and where they live, and these factors interact with other personal characteristics. For example, age can be strongly related to other personal characteristics such as legal partnership status, health and religion. Because of this, veterans may also differ to the usual or non-veteran population when considering related factors such as health and legal partnership status. It is important to be aware of these differences but also to understand when these differences are not attributable to experience of having previously served in the UK armed forces.

    Country of birth

    The country in which a person was born. For people not born in one of the four parts of the UK, there was an option to select "elsewhere". People who selected "elsewhere" were asked to write in the current name for their country of birth.

    Ethnic group and high-level ethnic group

    The ethnic group that the person completing the census feels they belong to. This could be based on their culture, family background, identity or physical appearance. Respondents could choose one out of 19 tick-box response categories, including write-in response options. High-level ethnic group refers to the first stage of the two-stage ethnic group question, where the respondent identifies through one of the following options:

    • "Asian, Asian British, Asian Welsh"
    • "Black, Black British, Black Welsh, Caribbean or African"
    • "Mixed or Multiple"
    • "White"
    • "Other ethnic group"

    General health

    A person's assessment of the general state of their health from very good to very bad. This assessment is not based on a person's health over any specified period of time.

    Legal partnership status

    Classifies a person according to their legal marital or registered civil partnership status on Census Day 21 March 2021.

    Gender identity

    Gender identity refers to a person’s sense of their own gender, whether male, female or another category such as non-binary. This may or may not be the same as their sex registered at birth.

    Religion

    The religion people connect or identify with (their religious affiliation), whether or not they practice or have belief in it. This question was voluntary, and the variable includes people who answered the question, including “No religion”, alongside those who chose not to answer this question. This variable classifies responses into the eight tick-box response options. Write-in responses are classified by their "parent" religious affiliation, including “No religion”, where applicable.

    Sexual orientation

    Sexual orientation is an umbrella term covering sexual identity, attraction, and behaviour. For an individual respondent, these may not be the same. For example, someone in an opposite-sex relationship may also experience same-sex attraction, and vice versa. This means the statistics should be interpreted purely as showing how people responded to the question, rather than being about whom they are attracted to or their actual relationships. We have not provided glossary entries for individual sexual orientation categories. This is because individual respondents may have differing perspectives on the exact meaning.

    Usual resident

    A usual resident is anyone who on Census Day, 21 March 2021, was in the UK and had stayed or intended to stay in the UK for a period of 12 months or more, or had a permanent UK address and was outside the UK and intended to be outside the UK for less than 12 months.

    UK armed forces veteran

    People who have previously served in the UK armed forces. This includes those who have served for at least one day in HM’s Armed Forces, either regular or reserves, or Merchant Mariners who have seen duty on legally defined military operations. It does not include those who have left and since re-entered the regular or reserve UK armed forces, those who have only served in foreign armed forces, or those who have served in the UK armed forces and are currently living outside of England and Wales.

  11. Data from: I-CeM

    • doi.org
    • datacatalogue.ukdataservice.ac.uk
    Updated May 29, 2025
    Cite
    Schurer, K., University of Essex, Department of History; Higgs, E., University of Essex, Department of History (2025). I-CeM [Dataset]. http://doi.org/10.5255/UKDA-SN-7856-2
    Dataset updated
    May 29, 2025
    Dataset provided by
    UK Data Service (https://ukdataservice.ac.uk/)
    Authors
    Schurer, K., University of Essex, Department of History; Higgs, E., University of Essex, Department of History
    Time period covered
    Jan 1, 1851 - Jan 1, 1911
    Area covered
    England and Wales, Scotland
    Description

    This Special Licence access dataset contains names and addresses from the Integrated Census Microdata (I-CeM) dataset of the censuses of Great Britain for the period 1851 to 1911. These data are made available under Special Licence (SL) access conditions due to commercial sensitivity.

    The anonymised main I-CeM database that complements these names and addresses is available under SN 7481. It comprises the Censuses of Great Britain for the period 1851-1911; data are available for England and Wales for 1851-1861 and 1881-1911 (1871 is not currently available for England and Wales) and for Scotland for 1851-1901 (1911 is not currently available for Scotland). The database contains over 180 million individual census records and was digitised and harmonised from the original census enumeration books. It details characteristics for all individuals resident in Great Britain at each of the included Censuses. The original digital data has been coded and standardised; the I-CeM database has consistent geography over time and standardised coding schemes for many census variables.

    This dataset of names and addresses for individual census records is organised per country (England and Wales; Scotland) and per census year. Within each data file each census record contains first and last name, street address and an individual identification code (RecID) that allows linking with the corresponding anonymised I-CeM record. The data cannot be used for true linking of individual census records across census years for commercial genealogy purposes nor for any other commercial purposes. The SL arrangements are required to ensure that commercial sensitivity is protected. For information on making an application, see the Access section.
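
    Linking a names-and-addresses file to the anonymised records via RecID amounts to a join on that key. The sketch below assumes both files have been loaded as lists of dictionaries; every column name other than RecID, and all of the rows, are hypothetical.

```python
# Hypothetical rows; only the RecID field name comes from the description.
names_addresses = [
    {"RecID": 1, "first_name": "ANN", "last_name": "EXAMPLE", "address": "1 SAMPLE ST"},
    {"RecID": 2, "first_name": "JOHN", "last_name": "SAMPLE", "address": "2 SAMPLE ST"},
]
anonymised = [
    {"RecID": 1, "age": 34, "occupation_code": "X1"},
    {"RecID": 3, "age": 8, "occupation_code": "X2"},
]

def link_on_recid(left, right):
    """Inner join of two lists of dicts on their shared RecID."""
    index = {row["RecID"]: row for row in right}
    return [
        {**row, **index[row["RecID"]]}
        for row in left
        if row["RecID"] in index
    ]

linked = link_on_recid(names_addresses, anonymised)
```

    Only records present in both inputs are returned, so unmatched RecIDs can be found by comparing the input and output lengths.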

    The data were updated in February 2020, with some files redeposited with longer field length limits. Users should note that some name and address fields are truncated due to the limits set by the LDS project that transcribed the original data. No more than 10,000 records out of some 210 million across the study should be affected. Examples include:

    • England and Wales:
      • 1851 - truncated at the 24th character (maximum I-CeM field length 95 characters)
      • 1881 - truncated at the 16th character (maximum I-CeM field length 50 characters).
    • Scotland: for 1851‐71, truncations affect less than 0.01% of all addresses and for 1851 around 1% at most
      • 1851 - truncated at the 70th character
      • 1861 - truncated at the 76th character
      • 1871 - truncated at the 82nd character
      • 1881 - truncated at the 50th character.

    Further information about I-CeM can be found on the I-CeM Integrated Microdata Project and I-CeM Guide webpages.

  12. Dataset of books called The British are coming : a look at the British...

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called The British are coming : a look at the British record breaking and the people that made it happen [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=The+British+are+coming+%3A+a+look+at+the+British+record+breaking+and+the+people+that+made+it+happen
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is The British are coming : a look at the British record breaking and the people that made it happen. It features 7 columns including author, publication date, language, and book publisher.

  13. I-CeM

    • datacatalogue.ukdataservice.ac.uk
    Updated Nov 12, 2025
    Cite
    Schurer, K., University of Essex, Department of History; Higgs, E., University of Essex, Department of History (2025). I-CeM [Dataset]. http://doi.org/10.5255/UKDA-SN-7856-2
    Dataset updated
    Nov 12, 2025
    Dataset provided by
    UK Data Service (https://ukdataservice.ac.uk/)
    Authors
    Schurer, K., University of Essex, Department of History; Higgs, E., University of Essex, Department of History
    Time period covered
    Jan 1, 1851 - Jan 1, 1911
    Area covered
    England and Wales, Scotland
    Description

    This Special Licence access dataset contains names and addresses from the Integrated Census Microdata (I-CeM) dataset of the censuses of Great Britain for the period 1851 to 1911. These data are made available under Special Licence (SL) access conditions due to commercial sensitivity.

    The anonymised main I-CeM database that complements these names and addresses is available under SN 7481. It comprises the Censuses of Great Britain for the period 1851-1911; data are available for England and Wales for 1851-1861 and 1881-1911 (1871 is not currently available for England and Wales) and for Scotland for 1851-1901 (1911 is not currently available for Scotland). The database contains over 180 million individual census records and was digitised and harmonised from the original census enumeration books. It details characteristics for all individuals resident in Great Britain at each of the included Censuses. The original digital data has been coded and standardised; the I-CeM database has consistent geography over time and standardised coding schemes for many census variables.

    This dataset of names and addresses for individual census records is organised per country (England and Wales; Scotland) and per census year. Within each data file each census record contains first and last name, street address and an individual identification code (RecID) that allows linking with the corresponding anonymised I-CeM record. The data cannot be used for true linking of individual census records across census years for commercial genealogy purposes nor for any other commercial purposes. The SL arrangements are required to ensure that commercial sensitivity is protected. For information on making an application, see the Access section.

    The data were updated in February 2020, with some files redeposited with longer field length limits. Users should note that some name and address fields are truncated due to the limits set by the LDS project that transcribed the original data. No more than 10,000 records out of some 210 million across the study should be affected. Examples include:

    • England and Wales:
      • 1851 - truncated at the 24th character (maximum I-CeM field length 95 characters)
      • 1881 - truncated at the 16th character (maximum I-CeM field length 50 characters).
    • Scotland: for 1851‐71, truncations affect less than 0.01% of all addresses and for 1851 around 1% at most
      • 1851 - truncated at the 70th character
      • 1861 - truncated at the 76th character
      • 1871 - truncated at the 82nd character
      • 1881 - truncated at the 50th character.
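
    The truncation points listed above could be used to flag values that may have been cut short. This is a heuristic sketch only: it assumes a truncated value is exactly as long as the quoted character position, the function name is hypothetical, and the description does not say which fields each limit applies to.

```python
# Truncation points quoted in the description: (country, census year) -> position.
TRUNCATION_POINT = {
    ("England and Wales", 1851): 24,
    ("England and Wales", 1881): 16,
    ("Scotland", 1851): 70,
    ("Scotland", 1861): 76,
    ("Scotland", 1871): 82,
    ("Scotland", 1881): 50,
}

def possibly_truncated(value, country, year):
    """True if `value` is exactly as long as the quoted truncation point.

    Heuristic only: a value of that length may simply be complete.
    Returns False when no truncation point is listed for the file.
    """
    limit = TRUNCATION_POINT.get((country, year))
    return limit is not None and len(value) == limit
```

    Flagged values would still need manual checking, since the description estimates that no more than 10,000 records are affected.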

    Further information about I-CeM can be found on the I-CeM Integrated Microdata Project and I-CeM Guide webpages.

  14. Historically Irish Surnames Dataset

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Crymble, Adam (2020). Historically Irish Surnames Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_20985
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    University of Hertfordshire
    Authors
    Crymble, Adam
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset provides a list of surnames that are reliably Irish and that can be used for identifying textual references to Irish individuals in the London area and surrounding countryside within striking distance of the capital. This classification of the Irish necessarily includes the Irish-born and their descendants. The dataset has been validated for use on records up to the middle of the nineteenth century, and should only be used in cases in which a few mis-classifications of individuals would not undermine the results of the work, such as large-scale analyses. These data were created through an analysis of the 1841 Census of England and Wales, and validated against the Middlesex Criminal Registers (National Archives HO 26) and the Vagrant Lives Dataset (Crymble, Adam et al. (2014). Vagrant Lives: 14,789 Vagrants Processed by Middlesex County, 1777-1786. Zenodo. 10.5281/zenodo.13103). The sample was derived from the records of the Hundred of Ossulstone, which included much of rural and urban Middlesex, excluding the City of London and Westminster. The analysis was based upon a study of 278,949 adult males. Full details of the methodology for how this dataset was created can be found in the following article, and anyone intending to use this dataset for scholarly research is strongly encouraged to read it so that they understand the strengths and limits of this resource:

    Adam Crymble, 'A Comparative Approach to Identifying the Irish in Long Eighteenth Century London', _Historical Methods: A Journal of Quantitative and Interdisciplinary History_, vol. 48, no. 3 (2015): 141-152.
    

    The data here provided includes all 283 names listed in Appendix I of the above paper, but also an additional 209 spelling variations of those root surnames, for a total of 492 names.

  15. Cambridgeshire LSOA Local Names - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Aug 2, 2025
    Cite
    ckan.publishing.service.gov.uk (2025). Cambridgeshire LSOA Local Names - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/cambridgeshire-lsoa-local-names
    Dataset updated
    Aug 2, 2025
    Dataset provided by
    CKAN (https://ckan.org/)
    Area covered
    Cambridgeshire
    Description

    LSOAs, or Lower Layer Super Output Areas, are statistical geographies that cover the entirety of England and Wales; in all there are around 33,000 LSOAs. They were designed for the 2001 Census as a way to release and present data more consistently at a hyper-local level. Each LSOA has a population of between 1,000 and 3,000, which makes it easier to compare data between different areas. The issue is that they were not given recognisable names when designed; instead they have standard ONS codes (e.g. ‘E01017975’) and schematic names relating to the local authorities where they are located (e.g. ‘Cambridge 001’). This has made it difficult to present data at LSOA level in a way that is easy to understand or interpret. To solve this problem, the Cambridgeshire Policy and Insight team devised a methodology and named all 395 LSOAs in Cambridgeshire, giving each a local name that means something to local people. These names draw upon local landmarks, key roads and compass points within towns, villages and neighbourhoods. A process of consultation was also undertaken with members of the Communities service in CCC, who provided local insight and further refined the dataset. We thank all those who contributed to this process. The plan now is to integrate the local LSOA names into all CCC LSOA-based interactive reports, maps and dashboards on Cambridgeshire and Peterborough Insight in the near future.

  16. Independent living - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Apr 12, 2018
    Cite
    ckan.publishing.service.gov.uk (2018). Independent living - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/independent-living
    Dataset updated
    Apr 12, 2018
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    This data has been taken from LG Inform at http://lginform.local.gov.uk/ (data reference ID 31). It shows the percentage of vulnerable people achieving independent living in Plymouth from financial year 2006/2007 to 2010/2011. The measure is the number of service users (i.e. people who are receiving a Supporting People service) who have moved on from supported accommodation in a planned way, as a percentage of total service users who have left the service. This was previously reported as NI 141.

    Source name: Communities and Local Government
    Collection name: Supporting People Local System (SPLS)
    Polarity: High is good

    Polarity is how sentiment is measured: sentiment is usually considered to have "poles", positive and negative, which are often translated into "good" and "bad". Sentiment analysis is considered useful to tell us what is good and bad in our information stream.
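
    The indicator described above is a simple ratio, and a minimal sketch of it may make the definition concrete. The function name, parameter names and input figures below are illustrative assumptions; they are not taken from LG Inform or from the dataset itself.

```python
def independent_living_rate(planned_moves, total_leavers):
    """Percentage of departing service users who moved on in a planned way.

    planned_moves: service users who left supported accommodation in a planned way
    total_leavers: all service users who left the service in the same period
    """
    if total_leavers <= 0:
        raise ValueError("total_leavers must be positive")
    if not 0 <= planned_moves <= total_leavers:
        raise ValueError("planned_moves must be between 0 and total_leavers")
    return 100.0 * planned_moves / total_leavers


# Made-up figures, for illustration only.
print(independent_living_rate(30, 40))  # 75.0
```

    Because the polarity is "High is good", a larger value from this calculation indicates a better outcome.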

  17. Who's the Boss? People with Significant Control

    • kaggle.com
    zip
    Updated Aug 23, 2017
    Cite
    Jacob Boysen (2017). Who's the Boss? People with Significant Control [Dataset]. https://www.kaggle.com/datasets/jboysen/uk-psc
    Available download formats
    zip (639061444 bytes)
    Dataset updated
    Aug 23, 2017
    Authors
    Jacob Boysen
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context:

    The People with Significant Control (PSC) snapshot is a data snapshot containing the full list of PSCs provided to Companies House. The Prime Minister first put corporate transparency on the international agenda when he chaired the G8 summit in Lough Erne and secured commitment to action. The commitment to enhance corporate transparency in the UK was reaffirmed at London’s International Anti-Corruption Summit in May 2016. Since then the EU and G20 countries have also agreed to act. The UK is the first country in the G20 to create a public register of this kind.

    The UK has high standards of business behaviour and corporate governance. The overwhelming majority of UK companies contribute productively to the UK economy, abide by the law and make a valuable contribution to society. But there are exceptions. Some of the features of the company structure which make it good for business also make it attractive to criminals. Companies can be misused to facilitate a range of criminal activities - from money laundering to tax evasion, corruption to terrorist financing. Sometimes those individuals running companies will not conduct themselves in accordance with the high standards we expect in the UK, posing a risk to other companies and consumers alike.

    Information about the ownership and control of UK corporate entities will bring benefits for law enforcement, business, civil society and citizens. By making this information publicly available, free of charge, the government is setting a standard that we are persuading other countries to follow.

    Content:

    A person with significant control is someone who holds more than 25% of shares or voting rights in a company, has the right to appoint or remove the majority of the board of directors, or otherwise exercises significant influence or control. This is a snapshot of data in zipped JSON form, as of Aug 23 2017. Daily updated snapshots and streaming API details can be found here. The People with Significant Control (PSC) register includes information about the individuals who own or control companies, including their name, month and year of birth, nationality, and details of their interest in the company. From 30 June 2016, UK companies (except listed companies) and limited liability partnerships (LLPs) need to declare this information when issuing their annual confirmation statement to Companies House.
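
    The definition above can be read as a small rule, sketched below under stated assumptions. The `Interest` class and its field names are hypothetical and do not reflect the field names in the PSC JSON snapshot; the sketch only encodes the three conditions given in the description.

```python
from dataclasses import dataclass

# "More than 25%" is a strict threshold: exactly 25% does not qualify.
THRESHOLD_PCT = 25.0


@dataclass
class Interest:
    """Hypothetical summary of one person's interest in one company."""
    share_pct: float
    voting_pct: float
    can_appoint_board_majority: bool = False
    other_significant_control: bool = False


def is_psc(interest):
    """Return True if any of the conditions in the description is met."""
    return (
        interest.share_pct > THRESHOLD_PCT
        or interest.voting_pct > THRESHOLD_PCT
        or interest.can_appoint_board_majority
        or interest.other_significant_control
    )


print(is_psc(Interest(share_pct=25.0, voting_pct=0.0)))  # False
print(is_psc(Interest(share_pct=30.0, voting_pct=0.0)))  # True
```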

    Acknowledgements:

    Guidance here. The data is collected by the UK government.

    Inspiration:

    • Who owns the most businesses? In certain areas?
    • Any weird looking situations where ownership might be obscured?
  18. Smoking in the UK

    • kaggle.com
    zip
    Updated Jun 22, 2024
    Cite
    Gautam (2024). Smoking in the UK [Dataset]. https://www.kaggle.com/datasets/gautamdhall/uk-smoking
    Available download formats
    zip (31342 bytes)
    Dataset updated
    Jun 22, 2024
    Authors
    Gautam
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    United Kingdom
    Description

    The dataset is a comprehensive collection of smoking statistics in the UK, segmented by various demographic groups and time periods. Here's a detailed description of the columns and the overall structure of the data:

    Columns

    1. Sex: Gender of the group being analyzed (e.g., Persons, Males, Females).
    2. Country code: The code representing the country (e.g., E92000001 for England).
    3. Country: The name of the country (e.g., England).
    4. Age group: Age range of the group being analyzed (e.g., 18-24, 25-34).

    Yearly Data Columns

    Each year has several related columns:

    • Year Current smokers %: Percentage of current smokers in the given year.
    • Year Current smokers LCL: Lower Confidence Limit (LCL) for the percentage of current smokers.
    • Year Current smokers UCL: Upper Confidence Limit (UCL) for the percentage of current smokers.
    • Year Ex-smokers %: Percentage of ex-smokers in the given year.
    • Year Ex-smokers LCL: Lower Confidence Limit (LCL) for the percentage of ex-smokers.
    • Year Ex-smokers UCL: Upper Confidence Limit (UCL) for the percentage of ex-smokers.
    • Year Never smoked %: Percentage of people who have never smoked in the given year.
    • Year Never smoked LCL: Lower Confidence Limit (LCL) for the percentage of people who have never smoked.
    • Year Never smoked UCL: Upper Confidence Limit (UCL) for the percentage of people who have never smoked.
    • Year Weighted count [note 2]: Weighted count of the survey respondents for the given year.
    • Year Sample size [note 3]: Sample size of the survey respondents for the given year.

    This structure is repeated for each year included in the dataset, from 2011 to 2022.

    Data Overview

    • Data Entries: There are 105 entries in the dataset, each corresponding to a unique combination of sex, country, and age group.
    • Total Columns: 136 columns, covering various statistics for each year from 2011 to 2022.
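
    The stated total of 136 columns follows from the layout described above: 4 identifier columns plus 11 columns for each of the 12 years. The sketch below reconstructs the expected header names to check that arithmetic. It assumes each yearly column is named with the year followed by the suffix; the ordering within the real file is an assumption.

```python
# Identifier columns listed under "Columns" in the description.
ID_COLUMNS = ["Sex", "Country code", "Country", "Age group"]

# Per-year column suffixes listed under "Yearly Data Columns".
YEARLY_SUFFIXES = [
    "Current smokers %", "Current smokers LCL", "Current smokers UCL",
    "Ex-smokers %", "Ex-smokers LCL", "Ex-smokers UCL",
    "Never smoked %", "Never smoked LCL", "Never smoked UCL",
    "Weighted count [note 2]", "Sample size [note 3]",
]


def expected_columns(first_year=2011, last_year=2022):
    """Build the header names implied by the description (order assumed)."""
    columns = list(ID_COLUMNS)
    for year in range(first_year, last_year + 1):
        columns.extend(f"{year} {suffix}" for suffix in YEARLY_SUFFIXES)
    return columns


columns = expected_columns()
print(len(columns))  # 136, i.e. 4 + 12 * 11
```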

    Data Characteristics

    • Object Columns: These typically include textual information like sex, country code, country name, and age group.
    • Float Columns: These columns contain numerical data, such as percentages and confidence limits.
    • Sample Size and Weighted Count: These columns also include numerical data but are formatted as strings due to the presence of commas.
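
    Since the sample size and weighted count columns are stored as comma-formatted strings, they need converting before any arithmetic. A minimal sketch follows; the helper name `parse_count` is hypothetical. It strips every comma regardless of where the separators fall, so it does not depend on a particular digit-grouping style.

```python
def parse_count(text):
    """Convert a comma-grouped count string to an int.

    All commas are removed, so the grouping convention used in the
    source file does not matter.
    """
    cleaned = text.replace(",", "").strip()
    if not cleaned.isdigit():
        raise ValueError(f"not a plain count: {text!r}")
    return int(cleaned)


print(parse_count("16,005"))  # 16005
```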

    Example Row

    For a better understanding, here's an example of a row from the dataset:

    • Sex: Persons
    • Country code: E92000001
    • Country: England
    • Age group: 18-24
    • 2022 Current smokers %: 11.6
    • 2022 Current smokers LCL: 10.5
    • 2022 Current smokers UCL: 12.7
    • 2022 Ex-smokers %: 5.8
    • 2022 Ex-smokers LCL: 5.1
    • 2022 Ex-smokers UCL: 6.5
    • 2011 Current smokers %: 25.7
    • 2011 Ex-smokers %: 14.6
    • 2011 Never smoked %: 59.6
    • 2011 Weighted count [note 2]: 45,34,797
    • 2011 Sample size [note 3]: 16,005

    This example showcases how the dataset provides detailed smoking statistics segmented by various demographic factors and over multiple years, making it a rich source of information for analyzing smoking trends in the UK.

  19. Care Homes for Older People - Scotland - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Aug 10, 2020
    Cite
    ckan.publishing.service.gov.uk (2020). Care Homes for Older People - Scotland - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/care-homes-for-older-people-scotland
    Dataset updated
    Aug 10, 2020
    Dataset provided by
    CKAN (https://ckan.org/)
    Area covered
    Scotland
    Description

    This point location dataset of the name, address, location and unique IDs (with Unique Property Reference Number) of every older person care home in Scotland has been supplied by the Care Inspectorate.

  20. Success.ai | | US Premium B2B Emails & Phone Numbers Dataset - APIs and flat...

    • datarade.ai
    Updated Oct 25, 2024
    Cite
    Success.ai (2024). Success.ai | | US Premium B2B Emails & Phone Numbers Dataset - APIs and flat files available – 170M+, Verified Profiles - Best Price Guarantee [Dataset]. https://datarade.ai/data-products/success-ai-us-premium-b2b-emails-phone-numbers-dataset-success-ai
    Available download formats
    .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Oct 25, 2024
    Area covered
    United States
    Description

    Success.ai offers a comprehensive, enterprise-ready B2B leads data solution, ideal for businesses seeking access to over 150 million verified employee profiles and 170 million work emails. Our data empowers organizations across industries to target key decision-makers, optimize recruitment, and fuel B2B marketing efforts. Whether you're looking for UK B2B data, B2B marketing data, or global B2B contact data, Success.ai provides the insights you need with pinpoint accuracy.

    Tailored for B2B Sales, Marketing, Recruitment and more: Our B2B contact data and B2B email data solutions are designed to enhance your lead generation, sales, and recruitment efforts. Build hyper-targeted lists based on job title, industry, seniority, and geographic location. Whether you’re reaching mid-level professionals or C-suite executives, Success.ai delivers the data you need to connect with the right people.

    API Features:

    • Real-Time Updates: Our APIs deliver real-time updates, ensuring that the contact data your business relies on is always current and accurate.
    • High Volume Handling: Designed to support up to 860k API calls per day, our system is built for scalability and responsiveness, catering to enterprises of all sizes.
    • Flexible Integration: Easily integrate with CRM systems, marketing automation tools, and other enterprise applications to streamline your workflows and enhance productivity.

    Key Categories Served:

    • B2B sales leads – Identify decision-makers in key industries
    • B2B marketing data – Target professionals for your marketing campaigns
    • Recruitment data – Source top talent efficiently and reduce hiring times
    • CRM enrichment – Update and enhance your CRM with verified, updated data
    • Global reach – Coverage across 195 countries, including the United States, United Kingdom, Germany, India, Singapore, and more

    Global Coverage with Real-Time Accuracy: Success.ai’s dataset spans a wide range of industries such as technology, finance, healthcare, and manufacturing. With continuous real-time updates, your team can rely on the most accurate data available:

    • 150M+ Employee Profiles: Access professional profiles worldwide with insights including full name, job title, seniority, and industry.
    • 170M Verified Work Emails: Reach decision-makers directly with verified work emails, available across industries and geographies, including Singapore and UK B2B data.
    • GDPR-Compliant: Our data is fully compliant with GDPR and other global privacy regulations, ensuring safe and legal use of B2B marketing data.

    Key Data Points for Every Employee Profile: Every profile in Success.ai’s database includes over 20 critical data points, providing the information needed to power B2B sales and marketing campaigns: Full Name, Job Title, Company, Work Email, Location, Phone Number, LinkedIn Profile, Experience, Education, Technographic Data, Languages, Certifications, Industry, Publications & Awards.

    Use Cases Across Industries: Success.ai’s B2B data solution is incredibly versatile and can support various enterprise use cases, including:

    • B2B Marketing Campaigns: Reach high-value professionals in industries such as technology, finance, and healthcare.
    • Enterprise Sales Outreach: Build targeted B2B contact lists to improve sales efforts and increase conversions.
    • Talent Acquisition: Accelerate hiring by sourcing top talent with accurate and updated employee data, filtered by job title, industry, and location.
    • Market Research: Gain insights into employment trends and company profiles to enrich market research.
    • CRM Data Enrichment: Ensure your CRM stays accurate by integrating updated B2B contact data.
    • Event Targeting: Create lists for webinars, conferences, and product launches by targeting professionals in key industries.

    Use Cases for Success.ai's Contact Data:

    • Targeted B2B Marketing: Create precise campaigns by targeting key professionals in industries like tech and finance.
    • Sales Outreach: Build focused sales lists of decision-makers and C-suite executives for faster deal cycles.
    • Recruiting Top Talent: Easily find and hire qualified professionals with updated employee profiles.
    • CRM Enrichment: Keep your CRM current with verified, accurate employee data.
    • Event Targeting: Create attendee lists for events by targeting relevant professionals in key sectors.
    • Market Research: Gain insights into employment trends and company profiles for better business decisions.
    • Executive Search: Source senior executives and leaders for headhunting and recruitment.
    • Partnership Building: Find the right companies and key people to develop strategic partnerships.

    Why Choose Success.ai’s Employee Data? Success.ai is the top choice for enterprises looking for comprehensive and affordable B2B data solutions. Here’s why: Unmatched Accuracy: Our AI-powered validation process ensures 99% accuracy across all data points, resulting in higher engagement and fewer bounces. Global Scale: With 150M+ employee profiles and 170M veri...

Cite
Jean (2023). UK Baby Names 👶 (1996-2021) [Dataset]. https://www.kaggle.com/datasets/johnsmith44/uk-baby-names-1996-2021

UK Baby Names 👶 (1996-2021)

Official data of baby names in the UK for 1996-2021

Available download formats
zip (983352 bytes)
Dataset updated
Aug 4, 2023
Authors
Jean
Area covered
United Kingdom
Description

Introduction.

Baby name statistics are compiled from first names recorded when live births are registered in England and Wales as part of civil registration, a legal requirement. The statistics are based only on live births which occurred in the calendar year, as there is no public register of stillbirths. Babies born in England and Wales to women whose usual residence is outside England and Wales are included in the statistics for England and Wales as a whole, but excluded from any sub-division of England and Wales. The statistics are based on the exact spelling of the name given on the birth certificate. Grouping names with similar pronunciation would change the rankings. Exact names are given so users can group if they wish.
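
Because the statistics keep exact spellings, any grouping of variants is left to the user. A minimal sketch of one way to do that follows, assuming the user supplies their own mapping from variant spelling to a chosen group label. The names and counts are invented for illustration and are not taken from the dataset.

```python
from collections import Counter


def group_counts(counts, variant_to_group):
    """Sum counts of spelling variants under a user-chosen group label.

    Names that are absent from the mapping stay in a group of their own.
    """
    grouped = Counter()
    for name, count in counts.items():
        grouped[variant_to_group.get(name, name)] += count
    return dict(grouped)


# Invented counts and an invented mapping, for illustration only.
example_counts = {"Sophie": 10, "Sofie": 3, "Oliver": 12}
example_mapping = {"Sofie": "Sophie"}

print(group_counts(example_counts, example_mapping))
# {'Sophie': 13, 'Oliver': 12}
```

As the introduction notes, grouping in this way changes the rankings, so grouped and ungrouped ranks should not be compared directly.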

The dataset contains records of around 16k boy names and 22k girl names.

Notes and definitions.

Baby name statistics do not include births to women usually resident in England or Wales who give birth abroad. They do include births to women whose usual residence is outside England and Wales where the birth occurred in England or Wales. Births where the name of the baby is not stated are excluded from all the ranks. Births where the usual residence of the mother was not in England and Wales are excluded from the regional ranks and from the separate England and Wales ranks. Names with a count of 2 or less in total within England and Wales have been redacted using S40 of the Freedom of Information Act in order to protect the confidentiality of individuals. This is consistent with the disclosure control methodology used for our birth statistics.

Source of data: The ONS.

License: Open Government Licence
