PRIO is hosting a copy of this dataset with permission from Global Data Lab. Please see their webpage for more information about this data.
The report is released by the Ministry of Justice and produced in accordance with arrangements approved by the UK Statistics Authority.
For further information about the Justice Data Lab, please refer to the following guidance:
http://www.justice.gov.uk/justice-data-lab
One request is being published this quarter: The Chrysalis Programme (2012-2017).
The Chrysalis Programme is an integrated personal leadership and effectiveness development programme, working with individuals while they are in prison. This is the first JDL evaluation for Chrysalis, looking at programme participants between 2012 and 2017.
The overall results show that those who took part in the Chrysalis Programme had a lower offending frequency compared to a matched comparison group. More people would be needed to determine the effect on the rate of reoffending and the time to first proven reoffence.
The Justice Data Lab team have brought reoffending data for the first quarter of 2021 into the service. An organisation can now submit information on the individuals it was working with up to the end of March 2021, in addition to the years 2002 to 2020.
The bulletin is produced and handled by the Ministry’s analytical professionals and production staff. Pre-release access of up to 24 hours is granted to the following persons: Minister of State, Lord Chancellor and Secretary of State for Justice, Special Advisers, Permanent Secretary, Head of News, 1 Director General, 4 press officers, 2 policy officials, and 6 analytical officials. Relevant Special Advisers and Private Office staff of Ministers and senior officials may have access to pre-release figures to inform briefing and handling arrangements.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Open access repository currently comprising dozens of datasets covering India's 500,000 villages, 8000 towns, and 4000 legislative constituencies, using a set of common geographic identifiers that span 25 years.
Used for assembly, checkout, and lab-scale testing of prototype hardware.
The Justice Data Lab has been launched as a pilot for one year from April 2013. During this year, a small team from Analytical Services within the Ministry of Justice will support organisations that provide offender services by allowing them easy access to aggregate re-offending data, specific to the group of people they have worked with. This will support organisations in understanding their effectiveness at reducing re-offending.
The service model involves organisations sending the Justice Data Lab team details of the offenders they have worked with along with information about the specific intervention they have delivered. The Data Lab team then matches these offenders to MoJ’s central datasets and returns the re-offending rate of this particular cohort, alongside that of a control group of offenders with very similar characteristics in order to better identify the impact of the organisation’s work.
There are two publication types:
A summary of the findings of the Justice Data Lab pilot to date (2 April to 30 November 2013).
Tailored reports about the re-offending outcomes of services or interventions delivered by each of the organisations who have requested information through the Justice Data Lab pilot. Each report is an Official Statistic and will show the results of the re-offending analysis for the particular service or intervention delivered by the organisation who delivered it.
To date, the Justice Data Lab has received 65 requests for re-offending information and has produced 30 reports, 23 of which were published last month. A further 6 are now complete and ready for publication, bringing the total of completed reports to 36. To date, there have been 11 requests that could not be processed as the minimum criteria for analyses through the Data Lab had not been met. The remaining requests are currently in progress and will be published in future monthly releases of these statistics.
Of the 6 reports being published this month:
Two reports are national analyses of the NOMS Co-Financing Organisation (NOMS CFO) project, following the regional analyses that were published last month. This programme helps offenders access mainstream services with the aim of gaining skills and moving them into employment. The initiative is funded in partnership with the European Social Fund and is delivered regionally through a number of different suppliers; these include a number of probation trusts and private companies such as Serco, A4e, and Pertemps People Development Group. Both reports cover individuals who started the programme in 2010: one covers those who started the programme in custody, and the other covers those who started it in the community.
For more information about the NOMS CFO project, please see http://co-financing.org/about_main.php
There were four additional inconclusive results which looked at programmes delivered by A4e, HMP Downview, Foundation and Prince’s Trust. Reasons for an inconclusive result include: the sample of individuals provided by the organisation was too small to detect a statistically significant change in behaviour, or the service or programme genuinely does not affect re-offending behaviour. However, it is very difficult to differentiate between these reasons in the analysis, so the organisations are recommended to submit larger samples of data when they become available. Detailed discussion of results and interpretation is available in the individual reports.
The bulletin is produced and handled by the Ministry’s analytical professionals and production staff. Pre-release access of up to 24 hours is granted to the following persons: Ministry of Justice Secretary of State, Parliamentary Under Secretary of State, Permanent Secretary, Policy Advisers for reducing re-offending, Policy Advisors for the Transforming Rehabilitation Programme, and relevant Press Officers and Special Advisers.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dashboard is part of SDGs Today. Please see sdgstoday.org
Extreme poverty poses a major challenge to the livelihood of current and future generations everywhere and threatens Agenda 2030’s promise of leaving no one behind. The World Poverty Clock developed by the World Data Lab provides real-time poverty estimates through 2030 for nearly all countries. The World Poverty Clock uses publicly available data on income distributions, production factors, and household consumption provided by various international organizations, including the World Bank and the International Monetary Fund (IMF). These organizations compile data provided to them by the local governments, and when this information is not available, the World Poverty Clock uses specific models to estimate poverty in these countries. The models include how individual incomes might change over time using IMF growth forecasts for the medium-term complemented by long-term “shared socio-economic pathways” developed by the International Institute for Applied Systems Analysis (IIASA) and similar analysis developed by the OECD. The World Poverty Clock dataset was updated in February 2021, taking into consideration the COVID-19 pandemic effects on the economy.
This dataset contains data from the World Development Indicators on Poverty and Shared Prosperity presenting indicators that measure progress toward the World Bank Group’s twin goals of ending extreme poverty by 2030 and promoting shared prosperity in every country in a sustainable manner.
The Innovation Lab for Applied Wheat Genomics aims to develop heat tolerant, high yielding, and farmer-accepted varieties for South Asia, while simultaneously increasing the research for development capacity of the global wheat improvement system through application of cutting-edge genomics and high-throughput phenotyping in applied wheat improvement.
An effective policy response to the economic impacts of the COVID-19 pandemic requires an enormous range of data to inform the design and response of programs. Public health measures require data on the spread of the disease, beliefs in the population, and capacity of the health system. Relief efforts depend on an understanding of hardships being faced by various segments of the population. Food policy requires measurement of agricultural production and hunger. In such a rapidly evolving pandemic, these data must be collected at a high frequency. Given the unexpected nature of the shock and urgency with which a response was required, Indian policymakers needed to formulate policies affecting India’s 1.4 billion people, without the detailed evidence required to construct effective programs. To help overcome this evidence gap, the World Bank, IDinsight, and the Development Data Lab sought to produce rigorous and responsive data for policymakers across six states in India: Jharkhand, Rajasthan, Uttar Pradesh, Andhra Pradesh, Bihar, and Madhya Pradesh.
Andhra Pradesh, Bihar, Jharkhand, Madhya Pradesh, Rajasthan, and Uttar Pradesh
Household
Sample survey data [ssd]
This dataset includes observations covering six states (Andhra Pradesh, Bihar, Jharkhand, Madhya Pradesh, Rajasthan, Uttar Pradesh) and three survey rounds. The survey did not have a single, unified frame from which to sample phone numbers. The final sample was assembled from several different sample frames, and the choice of sample frames varied across states and survey rounds.
These frames come from four prior IDinsight projects and from an impact evaluation of the National Rural Livelihoods project conducted by the Ministry of Rural Development. Each of these surveys sought to represent distinct populations, and employed idiosyncratic sample designs and weighting schemes.
A detailed note covering key features of each sample frame is available for download.
Computer Assisted Telephone Interview [cati]
The survey questionnaires covered the following subjects:
Agriculture: COVID-19-related changes in price realisation, acreage decisions, input expenditure, access to credit, access to fertilisers, etc.
Income and consumption: Changes in wage rates, employment duration, consumption expenditure, prices of essential commodities, status of food security etc.
Migration: Rates of in-migration, migrant income and employment status, return migration plans etc.
Access to relief: Access to in-kind, cash and workfare relief, quantities of relief received, and constraints on the access to relief.
Health: Access to health facilities and rates of foregone healthcare, knowledge of COVID-19 related symptoms and protective behaviours.
While a number of indicators were consistent across all three rounds, questions were added and removed as necessary to account for seasonal changes (i.e. in the agricultural cycle).
Round 1: ~55% Round 2: ~46% Round 3: ~55%
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Dataset Description
This dataset contains all the tests used for the low-cost sensor development during the iSCAPE project. The dataset is divided into a series of tests, each of them described in a yaml file with the test name. Each csv file contains time series data of each experiment, and the yaml files contain the lists of devices used in each test. The tests are described in the comment of the yaml file, and are meant to be self-explanatory. The conditions of the test and the purpose vary, and their reports are also included.
Sensors
The sensors used are referred to herein as Citizen Kits or Smart Citizen Kits, and the Living Lab Station or Smart Citizen Station. These are a set of modular hardware components that feature a selection of low cost sensors for environmental monitoring listed below. The Smart Citizen Station is meant to expand the capabilities of the Smart Citizen Kit, aiming to measure pollutants with more advanced sensors. The hardware is licensed under CERN Open Hardware License V1.2 and is fully described in the HardwareX Open Access publication: https://doi.org/10.1016/j.ohx.2019.e00070. The sensor documentation can be found at https://docs.smartcitizen.me and with this DOI at Zenodo: https://doi.org/10.5281/zenodo.2555029.
The list below details the sensors for the Citizen Kits and the [CHANNELS] they correspond to in the csv files.
Air temperature (ºC): Sensirion SHT-31 [TEMP]
Relative Humidity (%rh): Sensirion SHT-31 [HUM]
Noise level (dBA): Invensense ICS-434342 [NOISE_A]
Ambient light (lux): Rohm BH1721FVC [LIGHT]
Barometric pressure (kPa): NXP MPL3115A26 [PRESS]
Particulate Matter PM 1 / 2.5 / 10 (µg/m3): Plantower PMS5003 [EXT_PM_1, EXT_PM_25, EXT_PM_10]
The list below details the full set of sensors, including those of the Living Lab Station, and the [CHANNELS] they correspond to in the csv files.
Air Temperature (ºC) Sensirion SHT-31 [TEMP]
Relative Humidity (% REL) Sensirion SHT-31 [HUM]
Noise Level (dBA) Invensense ICS-434342 [NOISE_A]
Ambient Light (Lux) Rohm BH1721FVC [LIGHT]
Barometric pressure and AMSL (Pa and Meters) NXP MPL3115A26 [PRESS]
Carbon Monoxide (µg/m3, periodic baseline calibration required) SGX MICS-4514 [NA]
Nitrogen Dioxide (µg/m3, periodic baseline calibration required) SGX MICS-4514 [NA]
Carbon Monoxide (ppm) Alphasense CO-B4 [GB_1W, GB_1A]
Nitrogen Dioxide (ppb) Alphasense NO2-B43F [GB_2W, GB_2A]
Ozone (ppb) Alphasense OX-B431 [GB_3W, GB_3A]
Gases Board Temperature (ºC) Sensirion SHT-31 [GB_TEMP] or [EXT_TEMP]
Gases Board Rel. Humidity (% REL) Sensirion SHT-31 [GB_HUM] or [EXT_HUM]
PM 1 (µg/m3) Plantower PMS5003 [EXT_PM_1] or [EXT_PM_A_1], [EXT_PM_B_1] for each PM sensor in the case of the Living Lab Station
PM 2.5 (µg/m3) Plantower PMS5003 [EXT_PM_25] or [EXT_PM_A_25], [EXT_PM_B_25] for each PM sensor in the case of the Living Lab Station
PM 10 (µg/m3) Plantower PMS5003 [EXT_PM_10] or [EXT_PM_A_10], [EXT_PM_B_10] for each PM sensor in the case of the Living Lab Station
PN between 0.3um<0.5um particle size (#/l) Plantower PMS5003 [EXT_PN_03] or [EXT_PN_A_03], [EXT_PN_B_03] for each PM sensor in the case of the Living Lab Station
PN between 0.5um<1um particle size (#/l) Plantower PMS5003 [EXT_PN_05] or [EXT_PN_A_05], [EXT_PN_B_05] for each PM sensor in the case of the Living Lab Station
PN between 1um<2.5um particle size (#/l) Plantower PMS5003 [EXT_PN_1] or [EXT_PN_A_1], [EXT_PN_B_1] for each PM sensor in the case of the Living Lab Station
PN between 2.5um<5um particle size (#/l) Plantower PMS5003 [EXT_PN_25] or [EXT_PN_A_25], [EXT_PN_B_25] for each PM sensor in the case of the Living Lab Station
PN between 5um<10um particle size (#/l) Plantower PMS5003 [EXT_PN_5] or [EXT_PN_A_5], [EXT_PN_B_5] for each PM sensor in the case of the Living Lab Station
PN >10um particle size (#/l) Plantower PMS5003 [EXT_PN_10] or [EXT_PN_A_10], [EXT_PN_B_10] for each PM sensor in the case of the Living Lab Station
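As a rough illustration of the channel naming in the lists above, the following Python sketch builds the expected particulate matter column names for a Kit or a Station and checks them against a csv header. The function names are hypothetical, and it is an assumption that the csv column names match the bracketed channel names exactly.

```python
# Hypothetical helpers; only the channel names come from the sensor lists above.
PM_SIZES = ("1", "25", "10")  # PM 1, PM 2.5, PM 10


def pm_channels(device_type):
    """Return the expected PM channel names for a KIT or a STATION."""
    if device_type == "KIT":
        # A Kit has a single PM sensor: EXT_PM_1, EXT_PM_25, EXT_PM_10
        return [f"EXT_PM_{size}" for size in PM_SIZES]
    if device_type == "STATION":
        # A Station has two PM sensors, labelled A and B
        return [f"EXT_PM_{slot}_{size}" for slot in ("A", "B") for size in PM_SIZES]
    raise ValueError(f"unknown device type: {device_type!r}")


def select_channels(header, wanted):
    """Split the wanted channels into those present in and missing from a csv header."""
    present = [name for name in wanted if name in header]
    missing = [name for name in wanted if name not in header]
    return present, missing


if __name__ == "__main__":
    header = ["TIME", "TEMP", "HUM", "EXT_PM_1", "EXT_PM_25"]  # made-up header
    print(select_channels(header, pm_channels("KIT")))
```

Checking for missing channels before analysis is useful here because not every device in a test carries every sensor.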
How to find the data
Each yaml file contains the description of a test. Each test is comprised of recordings of several devices in the same location and during the same period. Each yaml file is comprised of the following fields:
author: who has been in charge of performing the test (internal reference - not relevant)
comment: describing in general terms what was done in the test, and with what purpose
commit: the firmware commit (in the case of Smart Citizen devices) with which the test was performed, for development purposes only
devices: a descriptor containing different fields for traceability (below)
id: the test name
project: the project within which the test was performed, in this case it is always iscape
report: if there is any report analysing the test
type_test: indoor, outdoor test or other.
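The field list above can be turned into a simple completeness check. This is a minimal sketch, assuming the yaml file has already been loaded into a Python dict (for example with PyYAML's `yaml.safe_load`, which is a third-party dependency not used here); the example descriptor and its values are made up for illustration.

```python
# Field names are taken from the list above; everything else is hypothetical.
EXPECTED_FIELDS = (
    "author", "comment", "commit", "devices",
    "id", "project", "report", "type_test",
)


def missing_fields(test):
    """Return the expected top-level fields that are absent from a test descriptor."""
    return [field for field in EXPECTED_FIELDS if field not in test]


# Made-up descriptor standing in for the content of one yaml file
example_test = {
    "author": "internal",
    "comment": "Hypothetical indoor co-location test",
    "commit": "0000000",  # placeholder, not a real firmware commit
    "devices": {},
    "id": "EXAMPLE_TEST",
    "project": "iscape",
    "report": None,
    "type_test": "indoor",
}

if __name__ == "__main__":
    print(missing_fields(example_test))
```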
Description of devices entry
For each device that was used in the test, two generic types are used:
low cost sensors (type: STATION or KIT)
high end sensors (type: REFERENCE)
For low cost Smart Citizen sensors, the fields are:
alphasense: electrochemical sensors device ids, by pollutant (for manufacturer calibration) and slots in which they were placed
device_id: device id in Smartcitizen API
fileNameInfo: not used
fileNameProc: (only if source = csv is specified) processed file name, e.g. 2019-03_EXT_UCD_URBAN_BACKGROUND_API_CITY_COUNCIL_REF.csv
fileNameRaw: (only if source = csv is used) raw file name
frequency: original recording frequency
location: for timezone correction only, not accurate
max_date: last recording date
min_date: first recording date
name: self-explanatory
pm_sensor: if there was a pm sensor connected (all of them are PMS5003 if no sensor is specified)
source: api or csv
type: STATION (KIT + Alphasense + PM board with two PMS5003) or KIT
version: smartcitizen hardware version
For high end sensors, the fields are:
channels: which channels the device was recording, for internal conversion
  names: which are the columns in the csv file
  pollutants: which pollutants they respectively refer to
  units: the units of these pollutants
equipment: the brand of the analyser
fileNameProc: same as above
fileNameRaw: same as above
index: format in which the time index is given, for parsing purposes
  format: (example '%Y-%m-%d %H:%M:%S')
  frequency: frequency at which the device was recorded
  name: column name
location: same as above
name: name of the device
type: REFERENCE (always for these devices)
source: csv
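The device entries described above could be handled along these lines. The sketch separates low cost devices from reference devices by their type field and parses a reference timestamp with the index format. It assumes that format is nested under index in the loaded yaml, and the device keys and values shown are hypothetical.

```python
from datetime import datetime


def split_devices(devices):
    """Separate low cost devices (KIT, STATION) from high end ones (REFERENCE)."""
    low_cost, reference = {}, {}
    for key, device in devices.items():
        if device["type"] == "REFERENCE":
            reference[key] = device
        elif device["type"] in ("KIT", "STATION"):
            low_cost[key] = device
        else:
            raise ValueError(f"unexpected device type for {key}: {device['type']!r}")
    return low_cost, reference


def parse_reference_time(device, raw_timestamp):
    """Parse a timestamp string using the device's index format (assumed nesting)."""
    return datetime.strptime(raw_timestamp, device["index"]["format"])


# Made-up devices entry for illustration only
example_devices = {
    "EXAMPLE_KIT": {"type": "KIT", "source": "api"},
    "EXAMPLE_REF": {
        "type": "REFERENCE",
        "source": "csv",
        "index": {"format": "%Y-%m-%d %H:%M:%S", "name": "Time"},
    },
}

if __name__ == "__main__":
    low_cost, reference = split_devices(example_devices)
    print(sorted(low_cost), sorted(reference))
    print(parse_reference_time(reference["EXAMPLE_REF"], "2019-03-01 12:30:00"))
```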
iSCAPE Dataset Reference Numbers:
The datasets here presented are related to the following iSCAPE dataset reference numbers:
DS_TS_054
DS_TS_062
DS_TS_063
DS_TS_065
DS_TS_067
DS_TS_068
DS_TS_069
DS_TS_070
DS_TS_071
DS_TS_072
DS_TS_073
DS_TS_074
DS_TS_075
DS_TS_076
DS_TS_077
DS_TS_078
DS_TS_079
DS_TS_080
DS_TS_081
DS_TS_084
DS_TS_088
DS_TS_089
DS_TS_090
DS_TS_092
https://www.datainsightsmarket.com/privacy-policy
The Laboratory Data Management and Analysis Software market is experiencing robust growth, driven by increasing adoption of LIMS (Laboratory Information Management Systems) and ELN (Electronic Lab Notebook) solutions across various research and development sectors. The market's expansion is fueled by the need for efficient data management, enhanced collaboration, regulatory compliance, and improved data analysis capabilities within laboratories. The rising volume of experimental data generated necessitates sophisticated software to streamline workflows, reduce errors, and accelerate research timelines. Pharmaceutical and biotechnology companies are major adopters, followed by academic institutions and contract research organizations. The trend towards cloud-based solutions and integration with other laboratory instruments is further driving market growth. Competitive pressures among software vendors are leading to innovative features, improved user interfaces, and cost-effective pricing models. While initial investment costs can be a barrier to entry for some smaller labs, the long-term benefits in terms of efficiency gains and reduced operational costs outweigh these considerations. The market is expected to maintain a strong growth trajectory, with continuous technological advancements and increasing demand from diverse sectors contributing to its expansion. The market's segmentation reveals a significant presence of established players like Agilent Technologies and Thermo Fisher Scientific, alongside emerging companies offering specialized solutions and catering to niche market segments. Strategic partnerships and acquisitions are common occurrences as larger players aim to expand their product portfolios and market reach. Geographic variations in market growth are influenced by factors such as technological infrastructure, regulatory landscapes, and research funding. 
North America and Europe currently dominate the market, but Asia-Pacific is anticipated to demonstrate substantial growth in the coming years, driven by increased investment in research and development across the region. Sustained innovation, alongside the ongoing need for efficient data management and advanced analytics within laboratories, will continue to shape the future trajectory of this dynamic market. We project continued strong growth, albeit at a moderating pace, as the market matures.
An effective policy response to the economic impacts of the COVID-19 pandemic requires an enormous range of data to inform the design and response of programs. Public health measures require data on the spread of the disease, beliefs in the population, and capacity of the health system. Relief efforts depend on an understanding of hardships being faced by various segments of the population. Food policy requires measurement of agricultural production and hunger. In such a rapidly evolving pandemic, these data must be collected at a high frequency. Given the unexpected nature of the shock and urgency with which a response was required, Indian policymakers needed to formulate policies affecting India's 1.4 billion people, without the detailed evidence required to construct effective programs. To help overcome this evidence gap, researchers from the World Bank, in collaboration with IDinsight, the Development Data Lab, and Johns Hopkins University, sought to produce rigorous and responsive data for policymakers across six states in India: Jharkhand, Rajasthan, Uttar Pradesh, Andhra Pradesh, Bihar, and Madhya Pradesh.
Jharkhand, Rajasthan, Uttar Pradesh, Andhra Pradesh, Bihar, and Madhya Pradesh
Households
Sample survey data [ssd]
The samples for these surveys were drawn from surveys and impact evaluations previously conducted by the World Bank, the Ministry of Rural Development (India), and IDinsight. A detailed note on the sampling frames is available for download.
Details will be made available after all rounds of data collection and analysis are complete.
Computer Assisted Telephone Interview [cati]
Approximately 55%
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This toolkit is for anyone intending to upload or share any datasets as open data on the Open Development Mekong’s (ODM) datahub. It is meant to be used to help guide users of ODM’s datahub through a process of assessing the risks and benefits involved with making a specific dataset available as open data. It was developed by Open Data Lab Jakarta of the World Wide Web Foundation, based on unique research conducted in each of Cambodia, Lao PDR, Myanmar, Thailand and Vietnam to understand the factors that help or hinder data sharing in each country, and how ODM’s network of organizations in the region are managing these risks already. In addition, a literature review was undertaken to form the theoretical basis of the toolkit. The framework of the toolkit consists of four steps: Identification of the knowledge assets; Identification of benefits for each stakeholder; Identification and quantification of risk variables; and Dealing with risks. It is comprised of the toolkit (in .xls) and documentation to assist the user (.pdf). This toolkit is currently available in English, and will be made available in Khmer, Laotian, Burmese, Thai, and Vietnamese shortly. The toolkit has been tested by multiple stakeholders to improve the usability of the product. However as circumstances change within the Mekong Region we welcome feedback from users of this toolkit to continue to improve its usability.
https://www.archivemarketresearch.com/privacy-policy
The global lab data management software market size was valued at USD 0.41 Billion in 2020 and is projected to reach USD 1.35 Billion by 2028, growing at a CAGR of 16.3% from 2021 to 2028. The major drivers of this growth are the increasing adoption of laboratory information management systems (LIMS) by pharmaceutical and biotechnology companies, the growing need for data management and analysis in life science research, and the increasing need for compliance with regulatory requirements. The major restraints of the lab data management software market include the high cost of implementation and maintenance of LIMS, the lack of skilled professionals to manage and use LIMS, and the lack of interoperability between different LIMS systems. The major opportunities in the lab data management software market include the increasing adoption of cloud-based LIMS, the growing demand for LIMS in emerging markets, and the increasing development of new technologies for LIMS. This comprehensive report provides an in-depth analysis of the global Lab Data Management Software market, offering valuable insights into current market dynamics, growth opportunities, and future trends.
https://dataintelo.com/privacy-and-policy
As of 2023, the global lab data management software market size is estimated to be around USD 2.5 billion, with a projected growth to reach approximately USD 6.2 billion by 2032, reflecting a compound annual growth rate (CAGR) of 10.5%. This robust growth is driven by the rising demand for efficient and accurate data management in laboratory environments, coupled with advancements in technology and increasing investments in healthcare infrastructure.
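The relationship between the start value, end value, and CAGR quoted above can be checked with a short calculation. This is only a sketch of the standard compound growth formula; treating 2023 to 2032 as nine compounding years is an assumption, since the report does not state its convention.

```python
def project(start_value, rate, years):
    """Compound a start value forward at a constant annual growth rate."""
    return start_value * (1 + rate) ** years


def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start value and an end value."""
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    years = 2032 - 2023  # assumed number of compounding periods
    # USD 2.5 billion growing at 10.5% per year
    print(round(project(2.5, 0.105, years), 2))
    # Rate implied by growth from USD 2.5 billion to USD 6.2 billion
    print(round(cagr(2.5, 6.2, years) * 100, 1))
```

Under that assumption the projection comes out at about USD 6.1 billion, close to the quoted figure of approximately USD 6.2 billion.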
A significant growth factor in the lab data management software market is the rapid advancement in healthcare technologies. Laboratories are increasingly relying on sophisticated software solutions to manage large volumes of data generated from various diagnostic and research activities. Innovations in artificial intelligence and machine learning have further enhanced the capabilities of these software systems, enabling them to offer more precise and faster data analysis. Additionally, the need for compliance with stringent regulatory requirements and the growing emphasis on patient safety and data integrity are compelling laboratories to adopt advanced data management solutions.
Another critical factor driving market growth is the increasing prevalence of chronic diseases and the consequent rise in diagnostic procedures. As the global population ages, the demand for diagnostic tests and clinical research is surging, necessitating efficient data management systems. Lab data management software helps streamline operations, reduce errors, and improve the overall efficiency of laboratories, thereby supporting the growing demand for diagnostic services. Furthermore, the ongoing trend of personalized medicine is also boosting the need for sophisticated data management solutions to handle complex and voluminous data.
The growing investments in research and development by pharmaceutical and biotechnology companies are also contributing to the expansion of the lab data management software market. These companies are continuously engaged in drug discovery and clinical trials, generating a vast amount of data that needs to be managed effectively. Lab data management software provides a centralized platform for storing, retrieving, and analyzing research data, which is crucial for accelerating drug development processes. Additionally, the integration of these software systems with other laboratory instruments and technologies enhances data accuracy and reliability, further driving their adoption.
In the context of laboratory environments, the implementation of a Labor Management System can significantly enhance operational efficiency and productivity. These systems are designed to optimize workforce allocation, streamline scheduling, and improve overall resource management within laboratories. By automating routine administrative tasks and providing real-time insights into workforce performance, a Labor Management System can help laboratories reduce operational costs and improve service delivery. This is particularly beneficial in high-demand settings where timely and accurate data processing is crucial. As laboratories continue to evolve with technological advancements, integrating a Labor Management System becomes essential for maintaining competitive advantage and ensuring optimal use of human resources.
From a regional perspective, North America is expected to dominate the lab data management software market during the forecast period, owing to the presence of a well-established healthcare infrastructure, high adoption of advanced technologies, and substantial investment in healthcare research. However, significant growth is also anticipated in the Asia Pacific region, driven by the rapid expansion of the healthcare sector, increasing government initiatives to improve healthcare facilities, and the growing focus on research and development activities in countries like China and India.
The lab data management software market can be segmented into two primary components: software and services. The software segment encompasses various types of laboratory information management systems (LIMS), electronic lab notebooks (ELNs), and other data management tools. These software solutions are designed to facilitate the efficient handling, storage, and analysis of laboratory data, ensuring accuracy, compliance, and streamlined workflows. The demand for advanced software solutions is growing, driven by the need for enhanced data management capabil
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Objective: The German Health Data Lab will provide access to German statutory health insurance claims data ranging from 2009 to the present for research purposes. Because data formats within the German Health Data Lab have evolved over time, the data need to be standardized into a Common Data Model to facilitate collaborative health research and to spare researchers from adapting to multiple data formats. For this purpose, we chose to transform the data into the Observational Medical Outcomes Partnership (OMOP) Common Data Model.
Methods: We developed an Extract, Transform, and Load (ETL) pipeline for two distinct German Health Data Lab data formats: Format 1 (2009-2016) and Format 3 (2019 onwards). Because Format 1 and Format 2 (2017-2018) share an identical structure, the ETL pipeline for Format 1 can be applied to Format 2 as well. Our ETL process, supported by Observational Health Data Sciences and Informatics (OHDSI) tools, includes specification development, SQL skeleton creation, and concept mapping. We detail the process characteristics and present a quality assessment that includes field coverage and concept mapping accuracy using example data.
Results: For Format 1, we achieved a field coverage of 92.7%. The Data Quality Dashboard showed 100.0% conformance and 80.6% completeness, although plausibility checks were disabled. The mapping coverage for the Condition domain was low at 18.3% due to invalid codes and missing mappings in the provided example data. For Format 3, the field coverage was 86.2%, with the Data Quality Dashboard reporting 99.3% conformance and 75.9% completeness. The Procedure domain had very low mapping coverage (2.2%) due to the use of mocked data and unmapped local concepts. In the Condition domain, 99.8% of unique codes were mapped. The absence of real data limits a comprehensive assessment of quality.
Conclusion: The ETL process effectively transforms the data with high field coverage and conformance. It simplifies data utilization for German Health Data Lab users and enhances the use of OHDSI analysis tools. This initiative represents a significant step towards facilitating cross-border research in Europe by providing publicly available, standardized ETL processes (https://github.com/FraunhoferMEVIS/ETLfromHDLtoOMOP) and evaluations of their performance.
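As a rough illustration of the mapping coverage figures quoted in the abstract, the sketch below computes the share of unique source codes that received a target concept. It is not taken from the linked repository: the function name `mapping_coverage`, the input layout, and the example codes and concept IDs are all assumptions made for illustration, and it follows the OMOP convention that a concept ID of 0 marks an unmapped code.

```python
def mapping_coverage(concept_id_by_code):
    """Return the percentage of unique source codes with a non-zero concept ID.

    `concept_id_by_code` is a hypothetical dict: source code -> mapped concept ID,
    where 0 means "no mapping found" (OMOP convention for unmapped codes).
    """
    if not concept_id_by_code:
        return 0.0  # no codes at all: report zero coverage instead of dividing by zero
    mapped = sum(1 for concept_id in concept_id_by_code.values() if concept_id != 0)
    return 100.0 * mapped / len(concept_id_by_code)


# Illustrative values only; codes and concept IDs are placeholders, not verified mappings.
example = {"I10": 1001, "E11.9": 1002, "XYZ": 0, "J45": 1003}
print(f"{mapping_coverage(example):.1f}%")  # 3 of 4 unique codes mapped -> 75.0%
```

Counting unique codes rather than records matches the abstract's wording ("99.8% of unique codes mapped"); a record-weighted variant would instead count each occurrence of a code.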
This document includes data from International Development Innovation Network (IDIN) program monitoring and evaluation surveys from 2014-2017. IDIN was a program led by the Massachusetts Institute of Technology’s D-Lab, implemented by a global consortium of academic, institutional, and innovation center partners, and supported by USAID’s Higher Education Solutions Network in the U.S. Global Development Lab. Together with IDIN Network members and partners the D-Lab team worked to support innovators and entrepreneurs around the globe to design, develop, and disseminate technologies to improve the lives of people living in poverty. The program consisted of five components: design workshops and summits, innovation project funding, local innovation centers, research, and MIT student engagement.