MIMIC-IV ICD-10 contains 122,279 discharge summaries (free-text medical documents) annotated with ICD-10 diagnosis and procedure codes. It covers patients admitted to the Beth Israel Deaconess Medical Center emergency department or ICU between 2008 and 2019. All codes with fewer than ten examples have been removed, and the train-val-test split was created using multi-label stratified sampling. The dataset is described further in "Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study", and the code to use the dataset is found here.
The dataset is intended for medical code prediction and was created using MIMIC-IV v2.2 and MIMIC-IV-NOTE v2.2. Using the two source datasets requires a license obtained from PhysioNet; this can take a couple of days.
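The rare-code filter mentioned above (dropping codes with fewer than ten examples) can be sketched as follows. This is a minimal illustration, not the dataset authors' code: the function name `drop_rare_codes`, the list-of-lists input, and the toy codes are all assumptions, and the sketch keeps documents even if all of their codes are dropped.

```python
from collections import Counter

def drop_rare_codes(doc_codes, min_examples=10):
    """Remove codes that are assigned to fewer than `min_examples` documents.

    `doc_codes` is a list with one list of code strings per document
    (a hypothetical input layout chosen for this sketch).
    """
    # Count each code once per document, even if it is listed twice.
    counts = Counter(code for codes in doc_codes for code in set(codes))
    keep = {code for code, n in counts.items() if n >= min_examples}
    return [[code for code in codes if code in keep] for codes in doc_codes]

# Toy data: "I10" appears 13 times, "E11.9" 10 times, "Z99.89" only 3 times.
docs = [["I10", "E11.9"]] * 10 + [["I10", "Z99.89"]] * 3
filtered = drop_rare_codes(docs)
print(filtered[0])
print(filtered[-1])
```

In the toy data only the code with three examples falls below the threshold, so it is the only one removed.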
generative-technologies/synth-ehr-icd10-alpaca-format dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset contains the International Classification of Diseases, Clinical Modification, 10th Edition (ICD-10-CM) 2020 files that contain information on the new diagnosis coding system, ICD-10-CM, that is a replacement for ICD-9-CM, Volumes 1 and 2. These 2020 ICD-10-CM codes are to be used for services provided from October 1, 2019 through September 30, 2020.
This dataset contains the International Classification of Diseases, Clinical Modification, 10th Edition (ICD-10-CM) 2023 files that contain information on the new diagnosis coding system, ICD-10-CM, that is a replacement for ICD-9-CM, Volumes 1 and 2. These 2023 ICD-10-CM codes are to be used for discharges occurring from October 1, 2022 through September 30, 2023 and for patient encounters occurring from October 1, 2022 through September 30, 2023.
This dataset contains the International Classification of Diseases, Clinical Modification, 10th Edition (ICD-10-CM) 2016 files that contain information on the new diagnosis coding system, ICD-10-CM, that is a replacement for ICD-9-CM, Volumes 1 and 2. These 2016 ICD-10-CM codes are to be used for services provided from October 1, 2015 through September 30, 2016.
chemouda/ICD10 dataset hosted on Hugging Face and contributed by the HF Datasets community
There are 2 datasets of high-risk patient populations; one from calendar year 2014 (N1 = 937,407), for which we used International Classification of Disease Version 9 (ICD9) codes to identify comorbid conditions, and a second, more recent population selected from June 2017 to June 2018 (N2 = 979,607) for use with the newer International Classification of Disease Version 10 (ICD10) codes. DOI: 10.1109/JBHI.2019.2948734
The International Classification of Diseases, 10th Revision, is a medical classification list by the WHO for coding various diseases and conditions.
The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, is used by clinicians to diagnose mental disorders.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
JSON file with abstracts from Lilacs and Ibecs and the ICD10 codes (ICD10-CM and ICD10-PCS; CIE10 in Spanish) associated with them.
Please cite us:
Miranda-Escalada, A., Gonzalez-Agirre, A., Armengol-Estapé, J., Krallinger, M.: Overview of automatic clinical coding: annotations, guidelines, and solutions for non-English clinical cases at CodiEsp track of eHealth CLEF 2020. In: CLEF (Working Notes) (2020)
@inproceedings{miranda2020overview, title={Overview of automatic clinical coding: annotations, guidelines, and solutions for non-english clinical cases at codiesp track of CLEF eHealth 2020}, author={Miranda-Escalada, Antonio and Gonzalez-Agirre, Aitor and Armengol-Estap{\'e}, Jordi and Krallinger, Martin}, booktitle={Working Notes of Conference and Labs of the Evaluation (CLEF) Forum. CEUR Workshop Proceedings}, year={2020} }
The Lilacs and Ibecs databases provide MeSH terms describing some of their documents. Using the UMLS Metathesaurus, those MeSH terms have been translated into ICD10 codes (ICD10-CM and ICD10-PCS). Every abstract has at least one ICD10 code.
In addition, MeSH codes given by the databases (Lilacs and Ibecs) have a "word" describing them. These "words" have been used to add further ICD10 codes. We have done strict string matching to find whether those "words" were a descriptor of any ICD10 code (in the Spanish version, CIE10).
The format of the JSON file is the following:
{'articles': [{'title': 'title', 'pmid': 'pmid', 'abstractText': 'abstract (in Spanish)', 'Mesh': [{'Code': 'MeSHCode', 'Word': 'reference', 'CIE': [CIE10_1, CIE10_2, ...]}, ...] }, ...] }
Additionally, the compressed file includes a folder with all the abstracts extracted in individual UTF-8 encoded text files and a tab-separated file with 4 fields:
pmid label cie10-code word
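The JSON layout described above can be read with a short sketch like the following. The key names (`articles`, `pmid`, `Mesh`, `CIE`, and so on) come from the format description; the function name `codes_by_pmid` and the sample record, including its placeholder code values, are hypothetical and only illustrate the shape of the data.

```python
import json

def codes_by_pmid(corpus):
    """Collect the set of CIE10 (ICD10) codes attached to each article.

    `corpus` is the parsed JSON object with a top-level 'articles' list.
    """
    result = {}
    for article in corpus["articles"]:
        codes = result.setdefault(article["pmid"], set())
        # Each MeSH entry may carry zero or more mapped CIE10 codes.
        for mesh in article.get("Mesh", []):
            codes.update(mesh.get("CIE", []))
    return result

# Hypothetical record following the documented layout (placeholder values).
sample = json.loads("""
{"articles": [
  {"title": "title", "pmid": "1", "abstractText": "resumen",
   "Mesh": [
     {"Code": "MeSHCode1", "Word": "reference1", "CIE": ["code_a", "code_b"]},
     {"Code": "MeSHCode2", "Word": "reference2", "CIE": ["code_b"]}
   ]}
]}
""")
mapping = codes_by_pmid(sample)
print(sorted(mapping["1"]))
```

For the real file, `json.load` on the opened file would replace the inline `json.loads` call.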
Summary statistics:
number of abstracts: 355 840
number of abstracts with at least one ICD10 code: 176 294
Percentage of MeSH codes mapped to ICD10: 10.6% (there were 2 526 772 MeSH codes and 266 949 mapped to ICD10)
average number of MeSH codes per article: 7.1
average number of ICD10 codes per article: 2.5
number of ICD10 codes that have an associated MeSH code in UMLS: 3293
number of ICD10 codes that have an associated MeSH code in UMLS and appear in this dataset: 3082
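The ratios behind these summary statistics can be recomputed from the raw counts listed above. This is only an arithmetic sketch; the variable names and the `pct` helper are chosen for illustration.

```python
# Raw counts copied from the summary statistics.
n_abstracts = 355_840
n_with_icd10 = 176_294
n_mesh = 2_526_772
n_mesh_mapped = 266_949
n_icd10_in_umls = 3_293
n_icd10_in_dataset = 3_082

def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

mesh_mapped_pct = pct(n_mesh_mapped, n_mesh)                 # MeSH codes mapped to ICD10
abstracts_coded_pct = pct(n_with_icd10, n_abstracts)         # abstracts with an ICD10 code
icd10_coverage_pct = pct(n_icd10_in_dataset, n_icd10_in_umls)  # mappable ICD10 codes present
mesh_per_article = round(n_mesh / n_abstracts, 1)            # average MeSH codes per article

print(mesh_mapped_pct, abstracts_coded_pct, icd10_coverage_pct, mesh_per_article)
```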
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The MeSDiCon consists of a list or gazetteer of candidate names of diseases and symptoms mentioned in Spanish clinical texts. Thus MeSDiCon serves as a lexical resource or dictionary for automatic detection of disease/symptom mentions, as well as indexing or classification of medical texts with such concept types. Terms in MeSDiCon were mapped to MESH terminology.
In this subset, we have mapped MESH codes to ICD10-CM and ICD10-PCS through the UMLS Metathesaurus. As a result, this resource contains disease and symptom terms from Spanish clinical texts mapped to both MESH and ICD10.
File structure: TSV. Data is separated by tabs (\t). Every row of the file has the following fields: terminology identifier translatedTerm termCount documentCount ICD10CM-code ICD10PCS-code
In case one MESH term is mapped to more than one ICD10 code, they are separated by commas.
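A minimal sketch of reading this TSV, splitting the comma-separated ICD10 codes into lists, might look like this. It assumes seven columns in the order listed above and no header row, neither of which is confirmed by the description; the function name `read_mesdicon` and the sample row with its placeholder values are hypothetical.

```python
import csv
import io

# Assumed column order, taken from the field list in the description.
FIELDS = ["terminology", "identifier", "translatedTerm", "termCount",
          "documentCount", "ICD10CM-code", "ICD10PCS-code"]

def read_mesdicon(handle):
    """Parse tab-separated rows and split multi-code cells on commas."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for values in reader:
        row = dict(zip(FIELDS, values))
        for key in ("ICD10CM-code", "ICD10PCS-code"):
            cell = row.get(key, "")
            # An empty cell becomes an empty list rather than [""].
            row[key] = [code.strip() for code in cell.split(",") if code.strip()]
        rows.append(row)
    return rows

# Placeholder row: two ICD10-CM codes, no ICD10-PCS code.
sample = "MESH\tid_1\ttermino\t12\t7\tcode_a,code_b\t\n"
rows = read_mesdicon(io.StringIO(sample))
print(rows[0]["ICD10CM-code"], rows[0]["ICD10PCS-code"])
```

For the real file, pass a handle opened with `newline=""` and `encoding="utf-8"` instead of the `io.StringIO` sample.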
This dataset contains the International Classification of Diseases, Clinical Modification, 10th Edition (ICD-10-CM) Age Restriction Database, which holds information on the age-restricted diagnosis codes of the ICD-10-CM diagnosis coding system.
The table ICD9 CCS Neuro-Neurosurgery is part of the dataset ICD9 and ICD10 Neuro-Neurosurgery Codes CCS, available at https://redivis.com/datasets/6v7y-b8rx0vh7z. It contains 3948 rows across 8 variables.
The table ICD10 Descriptions is part of the dataset Hospitalarios Secretaria de Salud, 2008-2023, available at https://redivis.com/datasets/bq13-2xzrxkrw2. It contains 14498 rows across 78 variables.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Dataset containing 8,386 non-technical summaries (NTS) of animal experiments recently carried out in Germany (as of September 19, 2018), originally available online in the AnimalTestInfo database (http://animaltestinfo.de). Each NTS contains a title, the uses (goals) of the experiments, possible harms caused to the animals, and comments about replacement, reduction and refinement (in the scope of the 3R principles). All documents are in German. The dataset includes the ICD-10 codes manually assigned by experts to the NTS. However, some NTSs have no ICD-10 codes assigned to them, as the codes were not applicable to the uses described in the NTS. All codes are chapters or groups from the ICD-10 German Modification 2016 version (https://www.dimdi.de/static/de/klassifikationen/icd/icd-10-gm/kode-suche/htmlgm2016/). Finally, the dataset is split into training and development sets, which are meant to be used in the CLEF eHealth 2019 Task 1, Multilingual Information Extraction (https://sites.google.com/view/clefehealth2019/task-1-multilingual-information-extraction-icd10-coding).
https://www.futuremarketinsights.com/privacy-policy
The global ICD-10 market was valued at US$ 18.78 billion in 2022, and it is expected to grow at a CAGR of 10.0% over the forecast period. By 2032, the global market is expected to be worth US$ 18.78 billion. The growing requirement for a uniform language in medical documentation to streamline hospital billing operations is driving market expansion.
Attributes | Details |
---|---|
ICD-10 Market CAGR | 10% |
ICD-10 Market Size 2022 | US$ 18.78 billion |
ICD-10 Market Size 2032 | US$ 18.78 billion |
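For reference, compound annual growth can be worked through as below. This is a sketch under stated assumptions, not the publisher's method: it assumes the forecast period runs from 2022 to 2032 (ten annual compounding steps) and takes the 2022 size and the 10% CAGR from the table; the function name `project` is hypothetical.

```python
def project(value, cagr, years):
    """Compound `value` at annual rate `cagr` for `years` years."""
    return value * (1 + cagr) ** years

size_2022 = 18.78   # US$ billion, from the table
cagr = 0.10         # 10% per year, from the table
years = 10          # assumption: forecast period is 2022 to 2032

projected_2032 = project(size_2022, cagr, years)
print(round(projected_2032, 2))
```

Under those assumptions, ten years of 10% growth multiplies the starting value by about 2.59, which would give roughly US$ 48.7 billion for 2032.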
International Statistical Classification of Diseases and Related Health Problems (ICD-10). 10th rev. Geneva