Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Electronic health records (EHRs) are a rich source of information for medical research and public health monitoring. Information systems based on EHR data could also assist in patient care and hospital management. However, much of the data in EHRs is in the form of unstructured text, which is difficult to process for analysis. Natural language processing (NLP), a form of artificial intelligence, has the potential to enable automatic extraction of information from EHRs and several NLP tools adapted to the style of clinical writing have been developed for English and other major languages. In contrast, the development of NLP tools for less widely spoken languages such as Swedish has lagged behind. A major bottleneck in the development of NLP tools is the restricted access to EHRs due to legitimate patient privacy concerns. To overcome this issue we have generated a citizen science platform for collecting artificial Swedish EHRs with the help of Swedish physicians and medical students. These artificial EHRs describe imagined but plausible emergency care patients in a style that closely resembles EHRs used in emergency departments in Sweden. In the pilot phase, we collected a first batch of 50 artificial EHRs, which has passed review by an experienced Swedish emergency care physician. We make this dataset publicly available as OpenChart-SE corpus (version 1) under an open-source license for the NLP research community. The project is now open for general participation and Swedish physicians and medical students are invited to submit EHRs on the project website (https://github.com/Aitslab/openchart-se), where additional batches of quality-controlled EHRs will be released periodically.
Dataset content
OpenChart-SE, version 1 corpus (txt files and dataset.csv)
The OpenChart-SE corpus, version 1, contains 50 artificial EHRs (note that the numbering starts with 5, as 1-4 were test cases that were not suitable for publication). The EHRs are available in two formats: structured as a .csv file, and as separate text files for annotation. Note that flaws in the data were deliberately not cleaned up, so that the corpus simulates what could be encountered when working with data from different EHR systems. All charts were checked for medical validity by a resident in Emergency Medicine at a Swedish hospital before publication.
Codebook.xlsx
The codebook contains information about each variable used. It is in XLSForm format, which can be reused in several different applications for data collection.
suppl_data_1_openchart-se_form.pdf
OpenChart-SE mock emergency care EHR form.
suppl_data_3_openchart-se_dataexploration.ipynb
This Jupyter notebook contains the code and results from the analysis of the OpenChart-SE corpus.
More details about the project and information on the upcoming preprint accompanying the dataset can be found on the project website (https://github.com/Aitslab/openchart-se).
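Because the corpus deliberately keeps flaws in the data, a first pass over dataset.csv might simply count empty cells per column before any annotation work. The following is a minimal sketch, not the project's own tooling: the column names and the two sample rows are hypothetical placeholders (the real variables are defined in Codebook.xlsx), and the comma delimiter is an assumption.

```python
import csv
import io

# Hypothetical two-row sample; real column names are defined in Codebook.xlsx.
SAMPLE_CSV = """chart_id,age,chief_complaint
5,67,feber
6,,yrsel
"""


def load_charts(stream):
    """Read chart rows as dictionaries without cleaning any values."""
    return list(csv.DictReader(stream))


def count_empty_cells(rows):
    """Count empty or missing cells per column so uncleaned flaws are visible."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            if column is None:
                continue  # extra cells beyond the header row
            if value is None or not value.strip():
                counts[column] = counts.get(column, 0) + 1
    return counts


rows = load_charts(io.StringIO(SAMPLE_CSV))
print(len(rows), count_empty_cells(rows))
```

For the real file, the same functions could be called with `open("dataset.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is also an assumption and may need adjusting.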
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
The Open Database of Healthcare Facilities (ODHF) is a collection of open data containing the names, types, and locations of health facilities across Canada. It is released under the Open Government Licence - Canada. The ODHF compiles open, publicly available, and directly provided data on health facilities across Canada. Data sources include regional health authorities; provincial, territorial, and municipal governments; and public health and professional healthcare bodies. The database aims to provide enhanced access to a harmonized listing of health facilities across Canada by making them available as open data. It is a component of the Linkable Open Data Environment (LODE).
https://www.gnu.org/licenses/gpl-3.0-standalone.html
Health economic models are crucial for health technology assessment (HTA) to evaluate the value of medical interventions. Open source models (OSMs), where source code and calculations are publicly accessible, enhance transparency, efficiency, credibility, and reproducibility. This study systematically reviews databases to map the landscape of available OSMs in health economics.
The SWAN Public Use Datasets provide access to longitudinal data describing the physical, biological, psychological, and social changes that occur during the menopausal transition. Data collected from 3,302 SWAN participants from Baseline through the 10th Annual Follow-Up visit are currently available to the public. Registered users are able to download datasets in a variety of formats, search variables and view recent publications.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The dataset contains multi-modal data from over 70,000 open access and de-identified case reports, including metadata, clinical cases, image captions, and more than 130,000 images. Images and clinical cases belong to different medical specialties, such as oncology, cardiology, surgery, and pathology. The structure of the dataset makes it easy to map images to their corresponding article metadata, clinical case, captions, and image labels. Details of the data structure can be found in the file data_dictionary.csv.
More than 90,000 patients and 280,000 medical doctors and researchers were involved in the creation of the articles included in this dataset. The citation data of each article can be found in the metadata.parquet file.
Refer to the examples showcased in this GitHub repository to understand how to optimize the use of this dataset.
The license of the dataset as a whole is CC BY-NC-SA. However, its individual contents may have less restrictive license types (CC BY, CC BY-NC, CC0). For instance, among the image files, 66K are CC BY, 32K are CC BY-NC-SA, 32K are CC BY-NC, and 20 are CC0.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Digital Response to COVID-19 gives you access to a large resource database including open source software, websites, and platforms that are useful for public administrations, businesses, and citizens dealing with the ongoing crisis.
The listed solutions and resources cover a wide range of areas, from public health, education, e-events, and volunteering to collaboration opportunities, among others.
The collection is constantly evolving and being updated with newly available digital solutions to tackle the COVID-19 crisis.
MONAHRQ® is a desktop software tool that enables organizations—such as state and local data organizations, regional reporting collaborations, hospitals and hospital systems, nursing homes and nursing home organizations, and health plans—to quickly and easily generate a health care reporting website. Effective September 27, 2017, technical support and software updates will no longer be available. Version 7, build 5, will be the final update. Existing software and supporting materials will remain available on this site. In addition, the open source project will remain active with software and materials available through GitHub: https://github.com/AHRQ/MONAHRQ-Open-Source
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Make Open Data is an open source initiative to facilitate the transformation of public data by centralizing logic. This table provides the number of health professionals (all sexes and ages) by specialty and department for the year 2022.
Data catalogue: Catalogue link Make Open Data
source_url: https://data.ameli.fr/api/explore/v2.1/catalog/datasets/demography-effective-and-densites/exports/csv
source_reference: https://www.data.gouv.fr/en/datasets/professionals-de-sante-liberaux-effectif-et-densite-par-tranche-dage-sexe-et-territoire-departement-region/#/resources
This is a collaborative project under construction: Repo Make Open Data link
The Synthetic Patient Data in OMOP Dataset is a synthetic database based on the Centers for Medicare and Medicaid Services (CMS) Medicare Claims Synthetic Public Use Files (SynPUF). It is synthetic data containing 2008-2010 Medicare insurance claims for development and demonstration purposes. It has been converted from its original CSV form to the Observational Medical Outcomes Partnership (OMOP) common data model by the open source community, as released on GitHub. Please refer to the CMS Linkable 2008–2010 Medicare Data Entrepreneurs’ Synthetic Public Use File (DE-SynPUF) User Manual for details regarding how DE-SynPUF was created. This public dataset is hosted in Google BigQuery and is included in BigQuery's free tier of 1 TB of processing per month. This means that each user receives 1 TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
https://github.com/nytimes/covid-19-data/blob/master/LICENSE
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
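Because the files hold cumulative counts, daily new cases have to be derived by differencing consecutive rows for each location. The sketch below illustrates that step under stated assumptions: it assumes the state-level file has the columns date, state, fips, cases, and deaths, and the sample rows are invented for illustration rather than taken from the repository.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows in the assumed state-level layout (not real counts).
SAMPLE = """date,state,fips,cases,deaths
2020-01-21,Washington,53,1,0
2020-01-22,Washington,53,1,0
2020-01-23,Washington,53,3,0
2020-01-23,Illinois,17,1,0
2020-01-24,Illinois,17,2,0
"""


def daily_new_cases(csv_text):
    """Turn cumulative case counts into per-day increments for each state."""
    rows = sorted(
        csv.DictReader(io.StringIO(csv_text)),
        key=lambda r: (r["state"], r["date"]),  # ISO dates sort chronologically
    )
    previous = defaultdict(int)  # last cumulative count seen per state
    result = []
    for row in rows:
        state = row["state"]
        cumulative = int(row["cases"])
        result.append((row["date"], state, cumulative - previous[state]))
        previous[state] = cumulative
    return result


for date, state, new_cases in daily_new_cases(SAMPLE):
    print(date, state, new_cases)
```

An increment computed this way could be negative if a cumulative count is revised downward, so downstream analysis should not assume non-negative values.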
http://www.opendefinition.org/licenses/cc-by-sa
This dataset contains agency and open-source events affecting public-health-related programmes, including COVID-19, Ebola, and vaccination campaigns, as published in the Attacks on Health Care News Brief. Events are categorized by country. Please get in touch if you are interested in curated datasets: info@insecurityinsight.org.
https://www.datainsightsmarket.com/privacy-policy
The open-source data acquisition (DAQ) instrument market, currently valued at $545 million in 2025, is projected to experience robust growth, driven by a Compound Annual Growth Rate (CAGR) of 5.5% from 2025 to 2033. This expansion is fueled by several key factors. The increasing adoption of open-source hardware and software in research, industrial automation, and consumer electronics reduces development costs and fosters innovation. Furthermore, the rising demand for customized and flexible data acquisition solutions across diverse sectors, such as medical, industrial, automotive, and agriculture, is significantly boosting market growth. The availability of versatile platforms like Arduino and readily accessible online communities supporting open-source DAQ instruments contributes to broader accessibility and faster adoption. The General Purpose Collection Instrument segment currently holds a larger market share compared to Special Purpose instruments, due to its wider applicability across various applications. However, the Special Purpose segment is expected to witness faster growth in the coming years, driven by niche applications requiring highly specialized functionalities. Geographic expansion, particularly in rapidly developing economies of Asia Pacific and other emerging markets, further contributes to the market's growth trajectory. However, challenges persist. Competition from established players offering proprietary DAQ solutions necessitates continuous innovation and community engagement to maintain open-source relevance and attract users. Ensuring consistent quality, reliability, and support for open-source hardware and software remains crucial for sustaining market trust and driving widespread adoption. Furthermore, the need for specialized skills and expertise to effectively utilize and customize open-source DAQ instruments may pose a barrier to entry for some users. 
Addressing these challenges through enhanced documentation, user-friendly interfaces, and robust community support will be essential to ensure the continued growth and success of this dynamic market segment.
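The growth figures above can be turned into a year-by-year projection with the standard compound-growth formula, value = base × (1 + rate)^years. This is only a sketch of the arithmetic: it assumes annual compounding from a 2025 base of $545 million over eight periods to 2033, which the description implies but does not state explicitly.

```python
def project_value(base_value, annual_rate, years):
    """Compound a base value at a fixed annual growth rate."""
    return base_value * (1.0 + annual_rate) ** years


BASE_2025_MUSD = 545.0  # market size in USD millions, as quoted above
CAGR = 0.055            # 5.5% compound annual growth rate, as quoted above

# Assumption: the rate compounds once per year starting from 2025.
for year in range(2025, 2034):
    projected = project_value(BASE_2025_MUSD, CAGR, year - 2025)
    print(year, round(projected, 1))
```

Under these assumptions the 2033 figure comes out at roughly $836 million.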
https://www.datainsightsmarket.com/privacy-policy
The Electronic Health Records (EHR) software market is experiencing robust growth, driven by increasing government mandates for electronic health data, rising healthcare expenditure, and the expanding adoption of cloud-based solutions for enhanced accessibility and scalability. The market's expansion is fueled by a significant shift towards value-based care, demanding efficient data management and analysis for improved patient outcomes and reduced healthcare costs. This necessitates the implementation of sophisticated EHR systems capable of handling large volumes of patient data, integrating with other healthcare IT systems, and providing advanced analytics capabilities. Key segments within the market include open-source and non-open-source software, catering to diverse needs and budget constraints among healthcare providers. Hospitals and large clinical settings are major adopters, but smaller practices are also increasingly embracing EHR solutions, driven by streamlined workflows and improved patient engagement features. The market is highly competitive, with a mix of established players like Epic Systems, Cerner, and Allscripts Healthcare Solutions alongside smaller, specialized vendors offering niche solutions. Continued innovation in areas such as artificial intelligence (AI) for predictive analytics, telehealth integration, and interoperability enhancements will further shape market growth in the coming years. Regional variations exist, with North America maintaining a significant market share due to early adoption and high technological advancements, but other regions, particularly Asia-Pacific, are experiencing rapid growth. The competitive landscape is dynamic, with vendors focusing on strategic partnerships, acquisitions, and continuous product development to maintain a competitive edge. 
Challenges remain, however, including the high initial investment costs associated with EHR implementation, the complexity of data migration and integration, and the ongoing need for robust cybersecurity measures to protect sensitive patient information. Addressing these challenges will be crucial for sustaining market growth and ensuring the widespread adoption of EHR systems, leading to a more efficient and patient-centered healthcare system. The projected Compound Annual Growth Rate (CAGR) suggests a consistently expanding market, promising significant opportunities for both established players and new entrants. However, market success will hinge on the ability to deliver user-friendly, interoperable, and secure EHR solutions that meet the evolving needs of the healthcare industry.
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Retrospectively collected medical data has the opportunity to improve patient care through knowledge discovery and algorithm development. Broad reuse of medical data is desirable for the greatest public good, but data sharing must be done in a manner which protects patient privacy. Here we present Medical Information Mart for Intensive Care (MIMIC)-IV, a large deidentified dataset of patients admitted to the emergency department or an intensive care unit at the Beth Israel Deaconess Medical Center in Boston, MA. MIMIC-IV contains data for over 65,000 patients admitted to an ICU and over 200,000 patients admitted to the emergency department. MIMIC-IV incorporates contemporary data and adopts a modular approach to data organization, highlighting data provenance and facilitating both individual and combined use of disparate data sources. MIMIC-IV is intended to carry on the success of MIMIC-III and support a broad set of applications within healthcare.
https://borealisdata.ca/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.5683/SP2/TYRRMV
Design document for an open-source, open-standard web-service that supports extraction and translation of Change Tracking (CT) data from the enterprise-level database engine that supports an electronic medical record (or similar health information system), and makes it available for consumption by a secure Learning Record Store (LRS).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This item contains data sets from Weber et al., Journal of Cell Science, 2024 (https://doi.org/10.1242/jcs.262166).
The details for citation are provided in the README file.
Please cite this item as:
Florian Weber, Sofiia Iskrak, Franziska Ragaller, Jan Schlegel, Birgit Plochberger, Erdinc Sezgin, Luca A. Andronico
DOI: 10.17044/scilifelab.26233184
It contains microscopy images and Excel sheets of the data.
Abstract: Environment-sensitive probes are frequently used in spectral/multi-channel microscopy to study alterations in cell homeostasis. However, the few open-source packages available for processing of spectral images are limited in scope. Here, we present VISION, a stand-alone software based on Python for spectral analysis with improved applicability. In addition to classical intensity-based analysis, our software can batch-process multidimensional images with an advanced single-cell segmentation capability and apply user-defined mathematical operations on spectra to calculate biophysical and metabolic parameters of single cells. VISION allows for 3D and temporal mapping of properties such as membrane fluidity and mitochondrial potential. We demonstrate the broad applicability of VISION by applying it to study the effect of various drugs on cellular biophysical properties; the correlation between membrane fluidity and mitochondrial potential; protein distribution in cell-cell contacts; and properties of nanodomains in cell-derived vesicles. Together with the code, we provide a graphical user interface for facile adoption.
Data usage
Researchers are welcome to use the data contained in the dataset for any projects. Please cite this item upon use or when published. We encourage reuse using the same CC BY 4.0 License.
Data Content
lsm or czi files for confocal and spectral images
tif files for super-resolution images
Excel files for graphs
Software to open files
.xlsx - Microsoft Excel
.tif, .lsm, .czi - Fiji (https://imagej.net/software/fiji/)
https://physionet.org/about/duas/medical-ai-foundations/
Medical AI Research Foundations is a repository of open-source medical foundation models. With this collection of non-diagnostic models, APIs, and resources such as code and data, researchers and developers can accelerate their medical AI research. This addresses a clear unmet need: there is currently no central resource that developers and researchers can leverage to build medical AI, which has slowed down both research and translation efforts. Our goal is to democratize access to foundational medical AI models and help researchers and medical AI developers rapidly build new solutions. To this end, we open-sourced the REMEDIS codebase, and we are currently hosting REMEDIS models for chest x-ray and pathology. We expect to add more models and resources for training medical foundation models, such as datasets and benchmarks, in the future. We also welcome contributions from the medical AI research community.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The database contains information on COVID-19 cases and deaths, their trends over time, and how they compare between locations or regions that have similar case loads. It also contains SARS-CoV-2 Variant Reports, which are real-time surveillance reports that track the prevalence of these variants or any arbitrary combination of mutations. The catalogue includes a searchable COVID-19 research library of COVID-19 and SARS-CoV-2 publications, datasets, clinical trials, protocols, and more, updated daily. The data are downloadable via APIs.
We conducted a time-motion study observing healthcare workers (HCWs) completing data management activities, including monitoring and evaluation (M&E) and manual data linkage of individual-level app data to electronic medical records (EMRS). This study served as a baseline for an open-source app to mirror EMRS and reduce HCW workload while improving care in the Nurse-led Community-based Antiretroviral therapy Program (NCAP) in Lilongwe, Malawi.
The workload of manual data entry for integration between mobile health applications and eHealth infrastructure
Corresponding author (Caryl Feldacker): cfeld@uw.edu
Data will be made available at Dryad upon acceptance at this link: https://doi.org/10.5061/dryad.k0p2ngfdz
Data dictionary