MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Supplementary data for the paper "Quality Assurance of a German COVID-19 Question Answering Systems using Component-based Microbenchmarking" at the 15th ACM International WSDM Conference (WSDM 2022).
Abstract: Question Answering (QA) has become an often used method to retrieve data as part of chatbots and other natural-language user interfaces. In particular, QA systems of official institutions have high expectations regarding the answers computed by the system, as the provided information might be critical. In this demonstration, we use the official COVID-19 QA system that was developed together with the German Federal government to provide German citizens access to data regarding incident values, number of deaths, etc. To ensure high quality, a component-based approach was used that enables exchanging data between QA components using RDF and validating the functionality of the QA system using SPARQL. Here, we will demonstrate how our solution enables developers of QA systems to use a descriptive approach to validate the quality of their implementation before the system's deployment and also within a live environment.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This table provides a consolidated view of the statistics of phone calls received by the Contact Center "16000" related to Coronavirus 2019 (COVID-19), highlighting the following indicators:
- Number of Incoming Calls
- Number of Answered Calls
- Average Speed of Answer
- Average Handle Time
- Average On-Hold Time
- Efficiency Rate
Dataset Card for "COVID-QA-question-answering-biencoder-data-75_25"
More Information needed
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘COVID-19 in Italy’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/sudalairajkumar/covid19-in-italy on 28 January 2022.
--- Dataset description provided by original source is as follows ---
Coronaviruses are a large family of viruses which may cause illness in animals or humans. In humans, several coronaviruses are known to cause respiratory infections ranging from the common cold to more severe diseases such as Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS). The most recently discovered coronavirus causes coronavirus disease COVID-19 - WHO
People can catch COVID-19 from others who have the virus. The disease has been spreading rapidly around the world, and Italy is one of the most affected countries.
On March 8, 2020 - Italy’s prime minister announced a sweeping coronavirus quarantine early Sunday, restricting the movements of about a quarter of the country’s population in a bid to limit contagions at the epicenter of Europe’s outbreak. - TIME
This dataset is from https://github.com/pcm-dpc/COVID-19
collected by Sito del Dipartimento della Protezione Civile - Emergenza Coronavirus: la risposta nazionale
This dataset has two files
- covid19_italy_province.csv: Province level data of COVID-19 cases
- covid_italy_region.csv: Region level data of COVID-19 cases
Data is collected by Sito del Dipartimento della Protezione Civile - Emergenza Coronavirus: la risposta nazionale and is uploaded into this github repo.
Dashboard on the data can be seen here. Picture courtesy is from the dashboard.
Insights on:
- Spread to various regions over time
- Try to predict the spread of COVID-19 ahead of time to take preventive measures
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan MHLW: COVID-19: PCR: Confirmed: QA data was reported at 24,164.000 Person in 08 May 2023. This records an increase from the previous number of 24,147.000 Person for 07 May 2023. Japan MHLW: COVID-19: PCR: Confirmed: QA data is updated daily, averaging 4,322.000 Person from Mar 2020 (Median) to 08 May 2023, with 1143 observations. The data reached an all-time high of 24,164.000 Person in 08 May 2023 and a record low of 16.000 Person in 22 Mar 2020. Japan MHLW: COVID-19: PCR: Confirmed: QA data remains active status in CEIC and is reported by Ministry of Health, Labour and Welfare. The data is categorized under High Frequency Database’s Disease Outbreaks – Table JP.D001: Ministry of Health, Labour and Welfare: Coronavirus Disease 2019 (COVID-2019).
Dataset Card for "COVID-QA-Chunk-64-testset-biencoder-data-65_25_10-v2"
More Information needed
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan MHLW: COVID-19: PCR: Confirmed: QA: Under Diagnosing data was reported at 0.000 Person in 08 May 2020. This stayed constant from the previous number of 0.000 Person for 07 May 2020. Japan MHLW: COVID-19: PCR: Confirmed: QA: Under Diagnosing data is updated daily, averaging 0.000 Person from Mar 2020 (Median) to 08 May 2020, with 48 observations. The data reached an all-time high of 0.000 Person in 08 May 2020 and a record low of 0.000 Person in 08 May 2020. Japan MHLW: COVID-19: PCR: Confirmed: QA: Under Diagnosing data remains active status in CEIC and is reported by Ministry of Health, Labour and Welfare. The data is categorized under High Frequency Database’s Disease Outbreaks – Table JP.D001: Ministry of Health, Labour and Welfare: Coronavirus Disease 2019 (COVID-2019).
https://www.usa.gov/government-works
Note: Reporting of new COVID-19 Case Surveillance data will be discontinued July 1, 2024, to align with the process of removing SARS-CoV-2 infections (COVID-19 cases) from the list of nationally notifiable diseases. Although these data will continue to be publicly available, the dataset will no longer be updated.
Authorizations to collect certain public health data expired at the end of the U.S. public health emergency declaration on May 11, 2023. The following jurisdictions discontinued COVID-19 case notifications to CDC: Iowa (11/8/21), Kansas (5/12/23), Kentucky (1/1/24), Louisiana (10/31/23), New Hampshire (5/23/23), and Oklahoma (5/2/23). Please note that these jurisdictions will not routinely send new case data after the dates indicated. As of 7/13/23, case notifications from Oregon will only include pediatric cases resulting in death.
This case surveillance public use dataset has 12 elements for all COVID-19 cases shared with CDC and includes demographics, any exposure history, disease severity indicators and outcomes, presence of any underlying medical conditions and risk behaviors, and no geographic data.
The COVID-19 case surveillance database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification (Interim-20-ID-02). The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and reported voluntarily to CDC.
For more information:
NNDSS Supports the COVID-19 Response | CDC.
The deidentified data in the “COVID-19 Case Surveillance Public Use Data” include demographic characteristics, any exposure history, disease severity indicators and outcomes, clinical data, laboratory diagnostic test results, and presence of any underlying medical conditions and risk behaviors. All data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.
COVID-19 case reports have been routinely submitted using nationally standardized case reporting forms. On April 5, 2020, CSTE released an Interim Position Statement with national surveillance case definitions for COVID-19 included. Current versions of these case definitions are available here: https://ndc.services.cdc.gov/case-definitions/coronavirus-disease-2019-2021/.
All cases reported on or after were requested to be shared by public health departments to CDC using the standardized case definitions for laboratory-confirmed or probable cases. On May 5, 2020, the standardized case reporting form was revised. Case reporting using this new form is ongoing among U.S. states and territories.
To learn more about the limitations in using case surveillance data, visit FAQ: COVID-19 Data and Surveillance.
CDC’s Case Surveillance Section routinely performs data quality assurance procedures (i.e., ongoing corrections and logic checks to address data errors). To date, the following data cleaning steps have been implemented:
To prevent release of data that could be used to identify people, data cells are suppressed for low frequency (<5) records and indirect identifiers (e.g., date of first positive specimen). Suppression includes rare combinations of demographic characteristics (sex, age group, race/ethnicity). Suppressed values are re-coded to the NA answer option; records with data suppression are never removed.
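The suppression rule described above can be illustrated with a minimal sketch. It is not CDC's actual procedure: the field names (`sex`, `age_group`, `race_ethnicity`), the decision to re-code all three fields together, and the function name are assumptions made for illustration. Only the threshold of 5 and the "NA" re-coding come from the description.

```python
from collections import Counter

SUPPRESSION_THRESHOLD = 5  # combinations seen fewer than 5 times are suppressed
# Hypothetical field names; the real dataset's column names may differ.
DEMOGRAPHIC_KEYS = ("sex", "age_group", "race_ethnicity")


def suppress_rare_combinations(records, keys=DEMOGRAPHIC_KEYS,
                               threshold=SUPPRESSION_THRESHOLD):
    """Return copies of the records with rare demographic combinations set to "NA".

    Records are never removed; only the identifying fields are re-coded.
    """
    # Count how often each combination of demographic values occurs.
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    result = []
    for r in records:
        out = dict(r)  # copy so the input is left untouched
        if counts[tuple(r[k] for k in keys)] < threshold:
            for k in keys:
                out[k] = "NA"
        result.append(out)
    return result
```

The output always has the same number of rows as the input, which mirrors the statement that records with data suppression are never removed.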
For questions, please contact Ask SRRG (eocevent394@cdc.gov).
COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths by state and by county. These
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan MHLW: COVID-19: PCR: Confirmed: QA: IS: Midst of Diagnosis of Symp data was reported at 0.000 Person in 08 May 2020. This stayed constant from the previous number of 0.000 Person for 07 May 2020. Japan MHLW: COVID-19: PCR: Confirmed: QA: IS: Midst of Diagnosis of Symp data is updated daily, averaging 0.000 Person from Mar 2020 (Median) to 08 May 2020, with 42 observations. The data reached an all-time high of 0.000 Person in 08 May 2020 and a record low of 0.000 Person in 08 May 2020. Japan MHLW: COVID-19: PCR: Confirmed: QA: IS: Midst of Diagnosis of Symp data remains active status in CEIC and is reported by Ministry of Health, Labour and Welfare. The data is categorized under High Frequency Database’s Disease Outbreaks – Table JP.D001: Ministry of Health, Labour and Welfare: Coronavirus Disease 2019 (COVID-2019).
UPDATED (3616 COVID-19 Chest X-ray)
A team of researchers from Qatar University, Doha, Qatar, and the University of Dhaka, Bangladesh along with their collaborators from Pakistan and Malaysia in collaboration with medical doctors have created a database of chest X-ray images for COVID-19 positive cases along with Normal and Viral Pneumonia images. This COVID-19, normal, and other lung infection dataset is released in stages. In the first release, we have released 219 COVID-19, 1341 normal, and 1345 viral pneumonia chest X-ray (CXR) images. In the first update, we have increased the COVID-19 class to 1200 CXR images. In the 2nd update, we have increased the database to 3616 COVID-19 positive cases along with 10,192 Normal, 6012 Lung Opacity (Non-COVID lung infection), and 1345 Viral Pneumonia images. We will continue to update this database as soon as we have new x-ray images for COVID-19 pneumonia patients.
- M.E.H. Chowdhury, T. Rahman, A. Khandakar, R. Mazhar, M.A. Kadir, Z.B. Mahbub, K.R. Islam, M.S. Khan, A. Iqbal, N. Al-Emadi, M.B.I. Reaz, M. T. Islam, “Can AI help in screening Viral and COVID-19 pneumonia?” IEEE Access, Vol. 8, 2020, pp. 132665 - 132676.
- Rahman, T., Khandakar, A., Qiblawey, Y., Tahir, A., Kiranyaz, S., Kashem, S.B.A., Islam, M.T., Maadeed, S.A., Zughaier, S.M., Khan, M.S. and Chowdhury, M.E., 2020. Exploring the Effect of Image Enhancement Techniques on COVID-19 Detection using Chest X-ray Images.
To view images please check image folders and references of each image are provided in the metadata.xlsx.
Research Team members and their affiliation
- Muhammad E. H. Chowdhury, PhD (mchowdhury@qu.edu.qa), Department of Electrical Engineering, Qatar University, Doha-2713, Qatar
- Tawsifur Rahman (tawsifurrahman.1426@gmail.com), Department of Biomedical Physics & Technology, University of Dhaka, Dhaka-1000, Bangladesh
- Amith Khandakar (amitk@qu.edu.qa), Department of Electrical Engineering, Qatar University, Doha-2713, Qatar
- Rashid Mazhar, MD, Thoracic Surgery, Hamad General Hospital, Doha-3050, Qatar
- Muhammad Abdul Kadir, PhD, Department of Biomedical Physics & Technology, University of Dhaka, Dhaka-1000, Bangladesh
- Zaid Bin Mahbub, PhD, Department of Mathematics and Physics, North South University, Dhaka-1229, Bangladesh
- Khandakar R. Islam, MD, Department of Orthodontics, Bangabandhu Sheikh Mujib Medical University, Dhaka-1000, Bangladesh
- Muhammad Salman Khan, PhD, Department of Electrical Engineering (JC), University of Engineering and Technology, Peshawar-25120, Pakistan
- Prof. Atif Iqbal, PhD, Department of Electrical Engineering, Qatar University, Doha-2713, Qatar
- Nasser Al-Emadi, PhD, Department of Electrical Engineering, Qatar University, Doha-2713, Qatar
- Prof. Mamun Bin Ibne Reaz, PhD, Department of Electrical, Electronic & Systems Engineering, Universiti Kebangsaan Malaysia, Bangi, Selangor 43600, Malaysia
Contribution - We have developed the database of COVID-19 x-ray images from the Italian Society of Medical and Interventional Radiology (SIRM) COVID-19 DATABASE [1], the Novel Corona Virus 2019 Dataset developed by Joseph Paul Cohen, Paul Morrison, and Lan Dao on GitHub [2], and images extracted from 43 different publications. References of each image are provided in the metadata. Normal and Viral pneumonia images were adopted from the Chest X-Ray Images (pneumonia) database [3].
Image Formats - All the images are in Portable Network Graphics (PNG) file format, and the resolution is 299*299 pixels.
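A quick way to confirm the stated format is to read the width and height from each file's PNG header. The sketch below uses only the standard library and relies on the PNG layout (an 8-byte signature followed by the IHDR chunk). The function names are illustrative and not part of the dataset's tooling.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def png_dimensions(data: bytes) -> tuple:
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Bytes 8-11 hold the chunk length, bytes 12-15 the chunk type.
    if data[12:16] != b"IHDR":
        raise ValueError("missing IHDR chunk")
    # Width and height are big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def has_expected_size(data: bytes, expected=(299, 299)) -> bool:
    """Check the image against the 299*299 resolution stated for this database."""
    return png_dimensions(data) == expected
```

To check a file on disk, pass the first 24 bytes, for example `has_expected_size(open(path, "rb").read(24))`, where `path` is whichever image file you want to inspect.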
Objective - Researchers can use this database to produce useful and impactful scholarly work on COVID-19, which can help in tackling this pandemic.
Citation - Please cite these papers if you are using it for any scientific purpose:
- M.E.H. Chowdhury, T. Rahman, A. Khandakar, R. Mazhar, M.A. Kadir, Z.B. Mahbub, K.R. Islam, M.S. Khan, A. Iqbal, N. Al-Emadi, M.B.I. Reaz, M. T. Islam, “Can AI help in screening Viral and COVID-19 pneumonia?” IEEE Access, Vol. 8, 2020, pp. 132665 - 132676.
- Rahman, T., Khandakar, A., Qiblawey, Y., Tahir, A., Kiranyaz, S., Kashem, S.B.A., Islam, M.T., Maadeed, S.A., Zughaier, S.M., Khan, M.S. and Chowdhury, M.E., 2020. Exploring the Effect of Image Enhancement Techniques on COVID-19 Detection using Chest X-ray Images.
Acknowledgments
Thanks to the Italian Society of Medical and Interventional Radiology (SIRM) for publicly providing the COVID-19 Chest X-Ray dataset [3] and to the Valencia Region Image Bank (BIMCV) for the padchest dataset [1]. We would also like to thank J. P. Cohen for taking the initiative to gather images from articles and online resources [5]. Finally, thanks to the Chest X-Ray Images (pneumonia) database in Kaggle and the Radiological Society of North America (RSNA) Kaggle database for making a wonderful X-ray database for normal, lung opacity, viral, and bacterial pneumonia images [8-9]. Also, a big thanks to our collaborators!
DATA ACCESS AND USE: Academic/Non-Commercial Use
References:
[1] https://bimcv.cipf.es/bimcv-projects/bimcv-covid19/#1590858128006-9e640421-6711
[2] https://github.com/ml-workgroup/covid-19-image-repository/tree/master/png
[3] https://sirm.org/category/senza-categoria/covid-19/
[4] https://eurorad.org
[5] https://github.com/ieee8023/covid-chestxray-dataset
[6] https://figshare.com/articles/COVID-19_Chest_X-Ray_Image_Repository/12580328
[7] https://github.com/armiro/COVID-CXNet
[8] https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/data
[9] https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
The COVID-19 pandemic is a global healthcare emergency. Prediction models for COVID-19 imaging are rapidly being developed to support medical decision making in imaging. However, inadequate availability of a diverse annotated dataset has limited the performance and generalizability of existing models.
Purpose
The Radiological Society of North America (RSNA) assembled the RSNA International COVID-19 Open Radiology Database (RICORD), a collection of COVID-related imaging datasets and expert annotations to support research and education. The RICORD datasets are made freely available to the research community and will be incorporated in the Medical Imaging and Data Resource Center (MIDRC), a multi-institutional research data repository funded by the National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health.
Materials and Methods
MIDRC-RICORD dataset 1a was created through a collaboration between the RSNA and the Society of Thoracic Radiology (STR). Pixel-level volumetric segmentation with clinical annotations by thoracic radiology subspecialists was performed for all COVID positive thoracic computed tomography (CT) imaging studies in a labeling schema coordinated with other international consensus panels and COVID data annotation efforts.
Results
MIDRC-RICORD dataset 1a consists of 120 thoracic computed tomography (CT) scans from four international sites annotated with detailed segmentation and diagnostic labels.
Patient Selection: Patients at least 18 years of age who received a positive diagnosis for COVID-19.
Data Abstract
1. 120 Chest CT examinations (axial series only, any protocol).
2. Annotations comprised of
3. Supporting clinical variables: MRN*, Age, Study Date*, Exam Description, Sex, Study UID*, Image Count, Modality, Testing Result, Specimen Source (* pseudonymous values).
How to use the JSON annotations
More information about how the JSON annotations are organized can be found on https://docs.md.ai/data/json/. Steps 2 & 3 in this example code demonstrate how to load the JSON into a Dataframe. The JSON file can be downloaded via the data access table below; it is not available via MD.ai. This Jupyter Notebook may also be helpful.
Code for converting CT scan segmentation labels for lung opacities from MD.ai JSON to DICOM-SEG : https://github.com/QIICR/dcmqi/blob/add-mdai-converter/util/mdai2dcm.py
Research Benefits
As this is a public dataset, RICORD is available for non-commercial use (and further enrichment) by the research and education communities which may include development of educational resources for COVID-19, use of RICORD to create AI systems for diagnosis and quantification, benchmarking performance for existing solutions, exploration of distributed/federated learning, further annotation or data augmentation efforts, and evaluation of the examinations for disease entities beyond COVID-19 pneumonia. Deliberate consideration of the detailed annotation schema, demographics, and other included meta-data will be critical when generating cohorts with RICORD, particularly as more public COVID-19 imaging datasets are made available via complementary and parallel efforts. It is important to emphasize that there are limitations to the clinical “ground truth” as the SARS-CoV-2 RT-PCR tests have widely documented limitations and are subject to both false-negative and false-positive results which impact the distribution of the included imaging data, and may have led to an unknown epidemiologic distortion of patients based on the inclusion criteria. These limitations notwithstanding, RICORD has achieved the stated objectives for data complexity, heterogeneity, and high-quality expert annotations as a comprehensive COVID-19 thoracic imaging data resource.
On March 10, 2023, the Johns Hopkins Coronavirus Resource Center ceased its collecting and reporting of global COVID-19 data. For updated cases, deaths, and vaccine data please visit: U.S. Centers for Disease Control and Prevention (CDC). For more information, visit the Johns Hopkins Coronavirus Resource Center.
Trends represent the day-to-day rate of new cases with a focus on the most recent 10 to 14 days. Includes Puerto Rico, Guam, Northern Marianas, and U.S. Virgin Islands. Daily new case counts are volatile for many reasons and sometimes the trends reflect that volatility. Thus, we decided to include longer-term summaries here.
County Trends as of 9 Mar 2023:
- 0 (-0) in Emergent
- 1135 (+51) in Spreading
- 1664 (-63) in Epidemic
- 230 (+10) in Controlled
- 110 (+2) in End Stage
Notes: Many states now only report once per week, and FL only once every two weeks. On 3/7/2022 we adjusted the formula for active cases to reflect the Omicron Variant, which is documented to cause lower rates of serious and severe illness. To produce these trends we analyze daily updates from the Johns Hopkins University Coronavirus COVID-19 Global Cases Dashboard, though we expect to be one day behind the dashboard’s live feeds to allow for quality assurance of the data.
For more information about COVID-19 trends, see our country level trends story map and the full methodology.
Data Source: Johns Hopkins University CSSE US Cases by County dashboard and USAFacts for Utah County level Data.
Feature layer generated from running the Join Features solution that is the basis for daily updates for the U.S. County COVID-19 Trends Story Map.
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A. SUMMARY This dataset contains COVID-19 positive confirmed cases aggregated by several different geographic areas and by day. COVID-19 cases are mapped to the residence of the individual and shown on the date the positive test was collected. In addition, 2016-2020 American Community Survey (ACS) population estimates are included to calculate the cumulative rate per 10,000 residents.
Dataset covers cases going back to 3/2/2020 when testing began. Data for recently reported cases may not be immediately available, and data will change as new information becomes available. Data are updated daily.
Geographic areas summarized are: 1. Analysis Neighborhoods 2. Census Tracts 3. Census Zip Code Tabulation Areas
B. HOW THE DATASET IS CREATED Addresses from the COVID-19 case data are geocoded by the San Francisco Department of Public Health (SFDPH). Those addresses are spatially joined to the geographic areas. Counts are generated based on the number of address points that match each geographic area for a given date.
The 2016-2020 American Community Survey (ACS) population estimates provided by the Census are used to create a cumulative rate which is equal to ([cumulative count up to that date] / [acs_population]) * 10000, representing the number of total cases per 10,000 residents (as of the specified date).
COVID-19 case data undergo quality assurance and other data verification processes and are continually updated to maximize completeness and accuracy of information. This means data may change for previous days as information is updated.
C. UPDATE PROCESS Geographic analysis is scripted by SFDPH staff and synced to this dataset daily at 05:00 Pacific Time.
D. HOW TO USE THIS DATASET San Francisco population estimates for geographic regions can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).
This dataset can be used to track the spread of COVID-19 throughout the city, in a variety of geographic areas. Note that the new cases column in the data represents the number of new cases confirmed in a certain area on the specified day, while the cumulative cases column is the cumulative total of cases in a certain area as of the specified date.
Privacy rules in effect
To protect privacy, certain rules are in effect:
1. Any area with a cumulative case count less than 10 is dropped for all days the cumulative count was less than 10. These will be null values.
2. Once an area has a cumulative case count of 10 or greater, that area will have a new row of case data every day following.
3. Cases are dropped altogether for areas where acs_population < 1000.
4. Deaths data are not included in this dataset for privacy reasons. The low COVID-19 death rate in San Francisco, along with other publicly available information on deaths, means that deaths data by geography and day is too granular and potentially risky. Read more in our privacy guidelines.
Rate suppression in effect where counts lower than 20
Rates are not calculated unless the cumulative case count is greater than or equal to 20. Rates are generally unstable at small numbers, so we avoid calculating them directly. We advise you to apply the same approach as this is best practice in epidemiology.
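The rate formula, the privacy rules, and the rate suppression threshold can be combined in a short sketch. This is an illustration of the rules as described here, not SFDPH's actual script. The function names and the output keys (`area`, `cumulative_cases`, `rate_per_10k`) are hypothetical. Only `acs_population` and the thresholds 10, 1000, and 20 come from the description.

```python
MIN_CUMULATIVE_CASES = 10  # rows with fewer cumulative cases are dropped
MIN_POPULATION = 1000      # areas with a smaller acs_population are dropped
MIN_CASES_FOR_RATE = 20    # rates are not calculated below this count


def cumulative_rate(cumulative_cases, acs_population):
    """Cumulative cases per 10,000 residents."""
    return cumulative_cases / acs_population * 10000


def publishable_row(area, cumulative_cases, acs_population):
    """Apply the privacy and rate-suppression rules to one area on one date.

    Returns None when the row would be dropped; otherwise a dict whose
    rate is None when the count is too small for a stable rate.
    """
    if acs_population < MIN_POPULATION:
        return None
    if cumulative_cases < MIN_CUMULATIVE_CASES:
        return None
    rate = None
    if cumulative_cases >= MIN_CASES_FOR_RATE:
        rate = cumulative_rate(cumulative_cases, acs_population)
    return {
        "area": area,
        "cumulative_cases": cumulative_cases,
        "rate_per_10k": rate,
    }
```

For example, an area with 50 cumulative cases and an ACS population of 5,000 gets a rate of about 100 per 10,000 residents, while an area with 15 cases keeps its count but has no rate.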
A note on Census ZIP Code Tabulation Areas (ZCTAs)
ZIP Code Tabulation Areas are special boundaries created by the U.S. Census based on ZIP Codes developed by the USPS. They are not, however, the same thing. ZCTAs are areal representations of routes. Read how the Census develops ZCTAs on their website.
Rows included for Citywide case counts
Rows are included for the Citywide case counts and incidence rate every day. These Citywide rows can be used for comparisons. Citywide will capture all cases regardless of address quality. While some cases cannot be mapped to sub-areas like Census Tracts, ongoing data quality efforts result in improved mapping on a rolling basis.
Related dataset
See the dataset of the most recent cumulative counts for all geographic areas here: https://data.sfgov.org/COVID-19/COVID-19-Cases-and-Deaths-Summarized-by-Geography/tpyr-dvnc
E. CHANGE LOG
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan MHLW: COVID-19: PCR: Confirmed: QA: IS: Light to Medium Symptoms data was reported at 155.000 Person in 27 Sep 2022. This records an increase from the previous number of 152.000 Person for 26 Sep 2022. Japan MHLW: COVID-19: PCR: Confirmed: QA: IS: Light to Medium Symptoms data is updated daily, averaging 141.000 Person from Mar 2020 (Median) to 27 Sep 2022, with 920 observations. The data reached an all-time high of 3,298.000 Person in 03 Feb 2022 and a record low of 4.000 Person in 25 Mar 2020. Japan MHLW: COVID-19: PCR: Confirmed: QA: IS: Light to Medium Symptoms data remains active status in CEIC and is reported by Ministry of Health, Labour and Welfare. The data is categorized under High Frequency Database’s Disease Outbreaks – Table JP.D001: Ministry of Health, Labour and Welfare: Coronavirus Disease 2019 (COVID-2019).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The COVID corpus in Romanian collects Romanian documents about the COVID pandemic and about the COVID disease and its diverse aspects, including medical, sociological and logistical ones. It was intended to foster the development of an intelligent chatbot, answering Romanian questions about COVID, in the Enrich4All project. It contains two parts: one that can be released publicly (Enrich4All-RoCOVID-Open) and one that contains copyrighted material (Enrich4All-RoCOVID-Restricted).
The archive contains a readme file and the publicly released documents: 82 (.txt, UTF-8) files together with their associated metadata files (.xml).
Example of a metadata file:
When using our dataset, please cite the following paper:
Ion, Radu, Andrei-Marius Avram, Vasile Păiş, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, and Valentin Badea. "An Open-Domain QA System for e-Governance." arXiv preprint arXiv:2206.08046 (2022).
Person of contact: Radu Ion, radu@racai.ro
Dataset Card for "COVID-QA-Chunk-64-sentence-transformer-biencoder-data-65_25_10"
More Information needed
This dataset has been retired as of February 17, 2023. This dataset will be kept for historical purposes, but will no longer be updated. Similar data are available on the state’s open data portal: https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state.
A. DATASET DESCRIPTION This dataset contains COVID-19 positive confirmed cases aggregated by several different geographic areas and by day. COVID-19 cases are mapped to the residence of the individual and shown on the date the positive test was collected. In addition, 2019 American Community Survey (ACS) 5-year population estimates are included to calculate the cumulative rate per 10,000 residents.
Dataset covers cases going back to March 18th, 2020 when the first person in Marin County tested positive for COVID-19. Data for recently reported cases may not be immediately available, and data will change as new information becomes available. Data are updated daily.
COVID-19 case data undergo quality assurance and other data verification processes and are continually updated to maximize completeness and accuracy of information. This means data may change for previous days as information is updated.
Geographic areas summarized are: 1. City, Town, or Community Area 2. Census Tracts 3. Census ZIP Code Tabulation Areas (ZCTAs)
B. HOW THE DATASET IS CREATED Addresses from the COVID-19 case data are geocoded by Marin County HHS. Those addresses are spatially joined to the geographic areas. Counts are generated based on the number of address points that match each geographic area for a given date.
The 2019 ACS estimates for population provided by the Census are used to create a cumulative rate which is equal to ([cumulative count up to that date] / [acs_population]) * 10000, representing the number of total cases per 10,000 residents (as of the specified date).
C. UPDATE PROCESS Geographic analysis is scripted by Marin HHS staff and synced to this dataset each day.
D. HOW TO USE THIS DATASET This dataset can be used to track the spread of COVID-19 throughout Marin County in a variety of geographic areas. Note that the new cases column in the data represents the number of new cases confirmed in a certain area on the specified day, while the cumulative cases column is the cumulative total of cases in a certain area as of the specified date.
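The relationship between the new cases column and the cumulative cases column can be sketched as a running sum. This is a minimal sketch with hypothetical function names, and it assumes a complete daily series for one area starting from the first case. Because early rows with fewer than 10 cumulative cases are not published, the first published cumulative value cannot be split back into daily counts.

```python
from itertools import accumulate


def cumulative_cases(new_cases):
    """Running total of daily new cases, one value per day."""
    return list(accumulate(new_cases))


def new_cases_from_cumulative(cumulative):
    """Recover daily new cases as day-over-day differences.

    Assumes the series starts at the first case, so the value before
    the first day is zero.
    """
    previous = [0] + list(cumulative)
    return [cur - prev for prev, cur in zip(previous, cumulative)]
```

For instance, daily counts of 2, 0, 3, and 1 give cumulative totals of 2, 2, 5, and 6, and differencing those totals returns the original daily counts.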
Privacy rules in effect
To protect privacy, certain rules are in effect:
1. Any area with a cumulative case count less than 10 is dropped for all days the cumulative count was less than 10. These will be null values. For example, if a zip code did not have 10 cumulative cases until June 1, 2020, that location will not be included in the dataset until June 1.
2. Once an area has a cumulative case count of 10 or greater, that area will have a new row of case data every day following.
3. Cases are dropped altogether for areas where acs_population < 1000. Some adjacent geographic areas may be combined until the ACS population exceeds 1,000 to still provide information for these regions.
Note: 14-day case rate or 30-day case rate where the counts are lower than 20 may be unstable. We advise caution in interpreting rates at these small numbers.
A note on Census ZIP Code Tabulation Areas (ZCTAs)
ZIP Code Tabulation Areas are special boundaries created by the U.S. Census based on ZIP Codes developed by the USPS. They are not, however, the same thing. ZCTAs are areal representations of routes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan MHLW: COVID-19: PCR: Confirmed: QA: IS: Using Ventilator or in ICU data was reported at 0.000 Person in 08 May 2023. This stayed constant from the previous number of 0.000 Person for 07 May 2023. Japan MHLW: COVID-19: PCR: Confirmed: QA: IS: Using Ventilator or in ICU data is updated daily, averaging 0.000 Person from Feb 2020 (Median) to 08 May 2023, with 1180 observations. The data reached an all-time high of 0.000 Person in 08 May 2023 and a record low of 0.000 Person in 08 May 2023. Japan MHLW: COVID-19: PCR: Confirmed: QA: IS: Using Ventilator or in ICU data remains active status in CEIC and is reported by Ministry of Health, Labour and Welfare. The data is categorized under High Frequency Database’s Disease Outbreaks – Table JP.D001: Ministry of Health, Labour and Welfare: Coronavirus Disease 2019 (COVID-2019).
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
Colorado COVID-19 Positive Cases and Rates of Infection by County of Identification
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan MHLW: COVID-19: PCR: Confirmed: QA: Death data was reported at 8.000 Person in 08 May 2023. This stayed constant from the previous number of 8.000 Person for 07 May 2023. Japan MHLW: COVID-19: PCR: Confirmed: QA: Death data is updated daily, averaging 8.000 Person from Jun 2020 (Median) to 08 May 2023, with 1048 observations. The data reached an all-time high of 8.000 Person in 08 May 2023 and a record low of 1.000 Person in 28 Jan 2021. Japan MHLW: COVID-19: PCR: Confirmed: QA: Death data remains active status in CEIC and is reported by Ministry of Health, Labour and Welfare. The data is categorized under High Frequency Database’s Disease Outbreaks – Table JP.D001: Ministry of Health, Labour and Welfare: Coronavirus Disease 2019 (COVID-2019).