3 datasets found
  1. Ethnicity coding

    • zenodo.org
    Updated Mar 18, 2025
    Paola Galdi; Luna De Ferrari (2025). Ethnicity coding [Dataset]. http://doi.org/10.5281/zenodo.15044385
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Paola Galdi; Luna De Ferrari
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This Zenodo entry details the methodology for extracting and reconciling ethnicity data from the Clinical Practice Research Datalink (CPRD), incorporating both General Practitioner (GP) and Hospital Episode Statistics (HES) sources. The approach aims to resolve discrepancies between these sources and provide a standardized single ethnicity value per patient, categorized into 6 and 12 levels according to NHS coding guidelines.

    Materials and Methods:

    Ethnicity data from the CPRD are recorded in multiple formats. This study harmonizes these data to achieve consistent ethnicity classification across patient records, following a hierarchical reconciliation protocol that prioritizes hospital data over GP records.

    Ethnicity Levels: Ethnicity data are processed to conform to two levels of granularity:

    1. Six high-level categories: White, Black, Asian, Mixed, Other, Unknown
    2. Twelve detailed categories: Bangladeshi, Black African, Black Caribbean, Black Other, Chinese, Indian, Mixed, Other Asian, Other, Pakistani, Unknown, White

    Source Data Mapping:

    • CPRD Medcodes: Directly mapped to 490 SNOMED codes
    • SNOMED to NHS Codes: SNOMED codes are linked to 26 NHS ethnicity codes
    • NHS to HES Codes: These NHS codes further map into 12 HES hospital ethnicities, which then consolidate into the 6 broad categories mentioned above
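
The final step of this chain, rolling the 12 detailed categories up into the 6 broad ones, can be sketched as a lookup table. The entry does not list the exact assignments, so the table below is an assumption for illustration only; in particular, placing "Chinese" under "Asian" is a guess, since some classification schemes group it under "Other". The names `DETAILED_TO_BROAD` and `to_broad` are hypothetical.

```python
# Assumed rollup from the 12 detailed categories to the 6 broad ones.
# The assignments are illustrative, not taken from the dataset itself.
DETAILED_TO_BROAD = {
    "Bangladeshi": "Asian",
    "Black African": "Black",
    "Black Caribbean": "Black",
    "Black Other": "Black",
    "Chinese": "Asian",  # assumption: some schemes place this under "Other"
    "Indian": "Asian",
    "Mixed": "Mixed",
    "Other Asian": "Asian",
    "Other": "Other",
    "Pakistani": "Asian",
    "Unknown": "Unknown",
    "White": "White",
}


def to_broad(detailed: str) -> str:
    """Return the broad category; unrecognized labels fall back to "Unknown" (an assumption)."""
    return DETAILED_TO_BROAD.get(detailed, "Unknown")
```

Keeping the rollup in a single table makes it easy to audit that every detailed label lands in exactly one broad category.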

    Algorithm (AIM-CISC):

    • Hospital Data Priority: Ethnicity records from hospital sources override those from GP records unless the hospital data is classified as "Unknown", null, or empty.
    • Conflict Resolution Within GP Data:
      • The frequency of recorded ethnicities determines the selection. The most frequently recorded ethnicity prevails.
      • If frequencies are tied, the most recent record is used.
      • In cases where records are equally recent, the first alphabetically listed ethnicity is selected.
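
The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `GPRecord`, `pick_gp_ethnicity` and `reconcile` are hypothetical, "most recent" is read as the latest record date for each tied ethnicity, and returning "Unknown" when neither source has a usable value is an assumption. The entry does not say how "Unknown" values inside GP data are handled, so the sketch counts them like any other value.

```python
from collections import Counter
from datetime import date
from typing import NamedTuple, Optional, Sequence


class GPRecord(NamedTuple):
    ethnicity: str
    recorded_on: date


def pick_gp_ethnicity(records: Sequence[GPRecord]) -> Optional[str]:
    """Resolve conflicts within GP data: frequency, then recency, then alphabetical order."""
    if not records:
        return None
    counts = Counter(r.ethnicity for r in records)
    latest = {}
    for r in records:
        if r.ethnicity not in latest or r.recorded_on > latest[r.ethnicity]:
            latest[r.ethnicity] = r.recorded_on
    # Smallest key wins: highest count, then latest date, then first alphabetically.
    return min(counts, key=lambda e: (-counts[e], -latest[e].toordinal(), e))


def reconcile(hospital: Optional[str], gp_records: Sequence[GPRecord]) -> str:
    """Hospital value overrides GP data unless it is "Unknown", null, or empty."""
    if hospital is not None and hospital.strip() not in ("", "Unknown"):
        return hospital
    # Assumption: fall back to "Unknown" when GP data is also empty.
    return pick_gp_ethnicity(gp_records) or "Unknown"
```

For example, a patient with hospital value "Unknown" and GP records of "Indian" (twice) and "Pakistani" (once) would resolve to "Indian" under this sketch.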

    Unique Patient Identifiers: Each patient is represented once in hospital data, ensuring a single source of truth for hospital-based ethnicities. This simplifies reconciliation with GP data when discrepancies arise.

    Source Documentation and References:

    Notes on mapping:

    Instances were noted where multiple Medcodes map back to a single SNOMED code, highlighting the importance of careful data cross-referencing. For example, two different Medcodes represent the New Zealand European ethnicity, which both map back to the identical SNOMED code.
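
A quick way to surface such many-to-one cases is to invert the Medcode-to-SNOMED lookup and keep only the SNOMED codes reached by more than one Medcode. The sketch below assumes the lookup is available as a plain dictionary; the function name `shared_snomed_codes` and the identifiers in the example are placeholders, not real CPRD or SNOMED codes.

```python
from collections import defaultdict
from typing import Dict, List


def shared_snomed_codes(medcode_to_snomed: Dict[str, str]) -> Dict[str, List[str]]:
    """Return each SNOMED code that more than one Medcode maps to, with those Medcodes."""
    by_snomed = defaultdict(list)
    for medcode, snomed in medcode_to_snomed.items():
        by_snomed[snomed].append(medcode)
    return {s: sorted(m) for s, m in by_snomed.items() if len(m) > 1}


# Placeholder identifiers for illustration only.
example = {"MED_A": "SCT_1", "MED_B": "SCT_1", "MED_C": "SCT_2"}
duplicates = shared_snomed_codes(example)
```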

  2. Live births to Welsh residents by ethnic group and health board providing the service

    • statswales.gov.wales
    Updated Jul 2024
    (2024). Live births to Welsh residents by ethnic group and health board providing the service [Dataset]. https://statswales.gov.wales/Catalogue/Health-and-Social-Care/NHS-Primary-and-Community-Activity/Maternity/LivebirthstoWelshresidents-by-ethnicgroup-healthboardprovidingtheservice
    Area covered
    Wales
    Description

    Full details of every data item available on both the Maternity Indicators dataset and National Community Child Health Database are available through the NWIS Data Dictionary: http://www.datadictionary.wales.nhs.uk/#!WordDocuments/datasetstructure20.htm

    From 1st April 2019, health service provision for residents of Bridgend local authority moved from Abertawe Bro Morgannwg to Cwm Taf. For more information see the joint statement from Cwm Taf and Abertawe Bro Morgannwg University Health Boards (see weblinks). The health board names have changed: Cwm Taf University Health Board became Cwm Taf Morgannwg University Health Board, and Abertawe Bro Morgannwg University Health Board became Swansea Bay University Health Board. Data for Abertawe Bro Morgannwg and Cwm Taf are available for previous years in this table by selecting the tick boxes in the Area drop-down box.

  3. Born in Bradford

    • redivis.com
    Updated Sep 16, 2016
    Stanford Center for Population Health Sciences (2016). Born in Bradford [Dataset]. http://doi.org/10.57761/yexf-qd19
    Available download formats: csv, parquet, sas, spss, stata, application/jsonl, arrow, avro
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford Center for Population Health Sciences
    Area covered
    Bradford
    Description

    Abstract

    The Born in Bradford study is tracking the health and wellbeing of over 13,500 children born at Bradford Royal Infirmary between March 2007 and December 2010, and their parents.

    Documentation

    Born in Bradford is a prospective pregnancy and birth cohort established to examine how genetic, nutritional, environmental, behavioral and social factors affect health and development during childhood, and subsequently adult life, in a deprived multi-ethnic population. It was developed in close consultation with local communities, clinicians and policy makers, with commitment from the outset to undertake research that would both inform interventions to improve health in the city and generate robust science relevant to similar communities in the UK and across the world. Between 2007 and 2011, information on a wide range of characteristics was collected from 12,453 women (and 3,356 partners) who experienced 13,778 pregnancies and delivered 13,818 live births.

    Source

    Notes

    Data Presentation: Born in Bradford Data

    Born in Bradford Data Dictionary

    Born in Bradford has a number of unique strengths: a) Composition. Half of all the families recruited are living in the UK’s most deprived wards, and 45% are of Pakistani origin. Half of Pakistani-origin mothers and fathers were born outside the UK and over half are related to their partner. This combination enhances the opportunity to study the interplay of deprivation, ethnicity, migration and cultural characteristics and their relationship to social, economic and health outcomes, research relevant to many communities across the world.

    b) Rich characterization. Detailed information has been collected from parents about demographic, economic, lifestyle, cultural, medical and health factors. Pregnancy oral glucose tolerance tests (OGTT) have been completed in 85% of the cohort. In combination with repeat fetal ultrasound data and subsequent follow-up growth and adiposity measurements (repeat skinfolds, weight and height from birth to current age), these will enable BiB uniquely to explore ethnic differences in body composition trajectories through infancy and childhood.

    c) Genetic and biomarker data. Maternal, neonatal and follow-up child blood samples have provided biomarker measures of adiposity and immunity, together with stored samples, for which funding has been secured, to assess targeted NMR metabolites in maternal pregnancy fasting samples, cord blood and infant samples taken at 12-24 months. Genome-wide data are available for 9000+ mothers and 8000+ children, and funding has been secured for DNA methylation of 1000 mother-child pairs. Our BiB biobank contains 200,000 stored samples.

    d) System-wide coverage. The study has successfully linked primary and secondary care, radiology, laboratory and local authority data. This successful data linkage to routine health and education data will allow life-time follow up of clinical outcomes for BiB children and their parents, and educational attainment for children.

    e) Community involvement. Close links with members of the public and particularly with cohort members allow the co-production of research in terms of the identification of research questions, monitoring the demands research makes on participants and discussion of the implementation of findings. The study has strong community roots and city-wide support.

    Full details of the cohort and related publications can be found on the website

    Patient characteristics

    Children born in the city of Bradford
    Claims years: 2007-2011
    12,453 women with 13,776 pregnancies and 3,448 of their partners
    Cord blood samples have been obtained and stored, with DNA extraction on 10,000 mother-offspring pairs.
    Sex: Adults: 12,453 women, 3,448 males

    Application

    If you are interested in working with these data, the application packet, with examples, can be found here: Born in Bradford Application Packet
