The complexity of COVID-19 and variations in control measures and containment efforts across countries have made the COVID-19 pandemic difficult to predict and model. We attempted to predict the scale of the latter half of the pandemic from real data, using the ratio between the early and latter halves observed in countries where the pandemic is largely over. We collected daily pandemic data from China, South Korea, and Switzerland and derived the ratio of pandemic data before and after the apex day of COVID-19. From these ratios we built multiple regression models for the relationship between the periods before and after the apex day. We then tested the models on first-wave data from 14 countries in Europe and the US, and again on data from these countries covering the entire pandemic up to March 30, 2021. The results indicate that the actual numbers of first-wave cases in these countries mostly fall within the ranges predicted by linear regression, except for Spain and Russia. Similarly, the actual deaths in these countries mostly fall within the predicted ranges. Using the accumulated data up to the apex day and the total accumulated data up to March 30, 2021, the case numbers in these countries fall within the predicted ranges, except for Brazil. The actual numbers of deaths in all the countries are at or below the predicted values. In conclusion, a linear regression model built with real data from countries or regions where the pandemic occurred early can predict the pandemic scale in countries where it occurs later. Such a prediction with a high degree of accuracy provides valuable information for governments and the public.
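The before-apex versus after-apex regression described above can be illustrated with a minimal sketch. It assumes a single predictor (the cumulative count up to the apex day) and ordinary least squares; the function names and the numbers are hypothetical placeholders, not the authors' code or data.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def predict_after_apex(a, b, before_apex_total):
    """Predict the post-apex total from the pre-apex total."""
    return a + b * before_apex_total


# Hypothetical placeholder totals for three reference regions (NOT real data).
before = [10_000.0, 20_000.0, 40_000.0]  # cumulative count up to the apex day
after = [12_000.0, 22_000.0, 42_000.0]   # cumulative count after the apex day

a, b = fit_line(before, after)
estimate = predict_after_apex(a, b, 30_000.0)
```

A real application would also report a prediction range around `estimate`, since the abstract compares observed counts against predicted ranges rather than point values.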
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States Excess Death excl COVID: Predicted: Single Excess Est: Massachusetts data was reported at 0.000 (unit: number) on 16 Sep 2023, unchanged from 0.000 on 09 Sep 2023. The series is updated weekly, with a median of 0.000 from Jan 2017 to 16 Sep 2023 over 350 observations. It reached an all-time high of 209.000 on 13 Jan 2018 and a record low of 0.000 on 16 Sep 2023. The series remains active in CEIC and is reported by the Centers for Disease Control and Prevention. It is categorized under Global Database’s United States – Table US.G012: Number of Excess Deaths: by States: All Causes excluding COVID-19: Predicted (Discontinued).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was created by Umut Toygar Göz
Released under Attribution 4.0 International (CC BY 4.0)
https://creativecommons.org/publicdomain/zero/1.0/
Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. Most people infected with COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people, and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness. During the entire course of the pandemic, one of the main problems that healthcare providers have faced is the shortage of medical resources and a proper plan to efficiently distribute them. In these tough times, being able to predict what kind of resource an individual might require at the time of being tested positive or even before that will be of immense help to the authorities as they would be able to procure and arrange for the resources necessary to save the life of that patient.
The main goal of this project is to build a machine learning model that, given a COVID-19 patient's current symptoms, status, and medical history, predicts whether the patient is at high risk.
The dataset was provided by the Mexican government. It contains a large amount of anonymized patient information, including pre-existing conditions. The raw dataset consists of 21 unique features and 1,048,576 unique patients. In the Boolean features, 1 means "yes" and 2 means "no". Values of 97 and 99 indicate missing data.
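The coding convention above (1 for yes, 2 for no, 97 and 99 for missing) can be decoded before modeling. The following is a minimal sketch; the column names in the example record are hypothetical, and raising on any other code is a design choice of this sketch rather than something the dataset documentation specifies.

```python
from typing import Dict, Optional

MISSING_CODES = {97, 99}  # codes the description marks as missing data


def decode_boolean(code: int) -> Optional[bool]:
    """Map the dataset's Boolean coding to True/False, or None when missing."""
    if code == 1:
        return True
    if code == 2:
        return False
    if code in MISSING_CODES:
        return None
    # Assumption: any other value is unexpected and should be surfaced.
    raise ValueError(f"unexpected Boolean code: {code}")


def decode_record(record: Dict[str, int]) -> Dict[str, Optional[bool]]:
    """Decode every Boolean-coded field of one patient record."""
    return {name: decode_boolean(code) for name, code in record.items()}


# Hypothetical column names, for illustration only.
example = decode_record({"DIABETES": 1, "OBESITY": 2, "PREGNANT": 97})
```

Keeping missing values as `None` instead of folding them into "no" lets a later step decide whether to impute or drop them.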
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States Excess Death excl COVID: Predicted: Single Estimate: Wyoming data was reported at 0.000 (unit: number) on 16 Sep 2023, unchanged from 0.000 on 09 Sep 2023. The series is updated weekly, with a median of 2.000 from Jan 2017 to 16 Sep 2023 over 350 observations. It reached an all-time high of 51.000 on 04 Jan 2020 and a record low of 0.000 on 16 Sep 2023. The series remains active in CEIC and is reported by the Centers for Disease Control and Prevention. It is categorized under Global Database’s United States – Table US.G012: Number of Excess Deaths: by States: All Causes excluding COVID-19: Predicted (Discontinued).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States Excess Death excl COVID: Predicted: Single Estimate: Maine data was reported at 0.000 (unit: number) on 16 Sep 2023, unchanged from 0.000 on 09 Sep 2023. The series is updated weekly, with a median of 0.000 from Jan 2017 to 16 Sep 2023 over 350 observations. It reached an all-time high of 54.000 on 06 Nov 2021 and a record low of 0.000 on 16 Sep 2023. The series remains active in CEIC and is reported by the Centers for Disease Control and Prevention. It is categorized under Global Database’s United States – Table US.G012: Number of Excess Deaths: by States: All Causes excluding COVID-19: Predicted (Discontinued).
Is there a decision tree for COVID-19 possible with these datasets?
Validation demonstrates limited clinical utility of the interpretable mortality prediction model for patients with COVID-19
https://github.com/HAIRLAB/Pre_Surv_COVID_19/blob/master/response/EDA.ipynb
The sudden increase of COVID-19 cases is putting a high pressure on health-care services worldwide. At the current stage, fast, accurate and early clinical assessment of the disease severity is vital. To support decision making and logistical planning in healthcare systems, this study leverages a database of blood samples from 485 infected patients in the region of Wuhan, China to identify crucial predictive biomarkers of disease mortality. For this purpose, machine learning tools selected three biomarkers that predict the mortality of individual patients with more than 90% accuracy: lactic dehydrogenase (LDH), lymphocyte and high-sensitivity C-reactive protein (hs-CRP). In particular, relatively high levels of LDH alone seem to play a crucial role in distinguishing the vast majority of cases that require immediate medical attention. This finding is consistent with current medical knowledge that high LDH levels are associated with tissue breakdown occurring in various diseases, including pulmonary disorders such as pneumonia. Overall, this paper suggests a simple and operable decision rule to quickly predict patients at the highest risk, allowing them to be prioritised and potentially reducing the mortality rate.
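To make the idea of a simple decision rule on one biomarker concrete, here is a minimal sketch of a single-feature decision stump: it searches for the threshold that best separates outcomes. This is a generic illustration, not the model from the paper, and the LDH-like values and outcomes are synthetic placeholders rather than clinical data or validated cut-offs.

```python
def best_threshold(values, died):
    """Find the threshold t maximizing accuracy of the rule 'value > t => death'.

    Returns (threshold, accuracy). Ties keep the lowest candidate threshold.
    """
    best_t, best_acc = None, -1.0
    for t in sorted(set(values)):
        correct = sum((v > t) == d for v, d in zip(values, died))
        acc = correct / len(values)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Synthetic placeholder measurements and outcomes (NOT clinical data).
ldh = [180.0, 210.0, 250.0, 300.0, 420.0, 510.0, 640.0, 800.0]
died = [False, False, False, False, True, True, True, True]

threshold, accuracy = best_threshold(ldh, died)
```

A full decision tree applies the same threshold search recursively on each resulting subgroup and over several biomarkers. As the validation title above suggests, a rule fitted on one cohort still needs external validation before any clinical use.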
This dataset is a per-state amalgamation of demographic, public health and other relevant predictors for COVID-19.
Used positive, death and totalTestResults from the API for, respectively, Infected, Deaths and Tested in this dataset.
Please read the documentation of the API for more context on those columns
Density is people per meter squared https://worldpopulationreview.com/states/
https://worldpopulationreview.com/states/gdp-by-state/
https://worldpopulationreview.com/states/per-capita-income-by-state/
https://en.wikipedia.org/wiki/List_of_U.S._states_by_Gini_coefficient
Rates from Feb 2020 and are percentage of labor force
https://www.bls.gov/web/laus/laumstrk.htm
Ratio is Male / Female
https://www.kff.org/other/state-indicator/distribution-by-gender/
https://worldpopulationreview.com/states/smoking-rates-by-state/
Death rate per 100,000 people
https://www.cdc.gov/nchs/pressroom/sosmap/flu_pneumonia_mortality/flu_pneumonia.htm
Death rate per 100,000 people
https://www.cdc.gov/nchs/pressroom/sosmap/lung_disease_mortality/lung_disease.htm
https://www.kff.org/other/state-indicator/total-active-physicians/
https://www.kff.org/other/state-indicator/total-hospitals
Includes spending for all health care services and products by state of residence. Hospital spending is included and reflects the total net revenue. Costs such as insurance, administration, research, and construction expenses are not included.
https://www.kff.org/other/state-indicator/avg-annual-growth-per-capita/
Pollution: Average exposure of the general public to particulate matter of 2.5 microns or less (PM2.5) measured in micrograms per cubic meter (3-year estimate)
https://www.americashealthrankings.org/explore/annual/measure/air/state/ALL
For each state, number of medium and large airports https://en.wikipedia.org/wiki/List_of_the_busiest_airports_in_the_United_States
Note that FL was incorrect in the table, but is corrected in the Hottest States paragraph
https://worldpopulationreview.com/states/average-temperatures-by-state/
District of Columbia temperature computed as the average of Maryland and Virginia
Urbanization as a percentage of the population https://www.icip.iastate.edu/tables/population/urban-pct-states
https://www.kff.org/other/state-indicator/distribution-by-age/
Schools that haven't closed are marked NaN https://www.edweek.org/ew/section/multimedia/map-coronavirus-and-school-closures.html
Note that some of the datasets above did not contain data for the District of Columbia; the missing values were found via Google searches and entered manually.
This dataset was created by suresh dv
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Estimates of excess deaths can provide information about the burden of mortality potentially related to the COVID-19 pandemic, including deaths that are directly or indirectly attributed to COVID-19. Excess deaths are typically defined as the difference between the observed numbers of deaths in specific time periods and expected numbers of deaths in the same time periods. This visualization provides weekly estimates of excess deaths by the jurisdiction in which the death occurred. Weekly counts of deaths are compared with historical trends to determine whether the number of deaths is significantly higher than expected.
Counts of deaths from all causes of death, including COVID-19, are presented. As some deaths due to COVID-19 may be assigned to other causes of deaths (for example, if COVID-19 was not diagnosed or not mentioned on the death certificate), tracking all-cause mortality can provide information about whether an excess number of deaths is observed, even when COVID-19 mortality may be undercounted. Additionally, deaths from all causes excluding COVID-19 were also estimated. Comparing these two sets of estimates (excess deaths with and without COVID-19) can provide insight about how many excess deaths are identified as due to COVID-19, and how many excess deaths are reported as due to other causes of death. These deaths could represent misclassified COVID-19 deaths, or potentially could be indirectly related to the COVID-19 pandemic (e.g., deaths from other causes occurring in the context of health care shortages or overburdened health care systems).
Estimates of excess deaths can be calculated in a variety of ways, and will vary depending on the methodology and assumptions about how many deaths are expected to occur. Estimates of excess deaths presented in this webpage were calculated using Farrington surveillance algorithms (1).
A range of values for the number of excess deaths was calculated as the difference between the observed count and one of two thresholds (either the average expected count or the upper bound of the 95% prediction interval), by week and jurisdiction.
Provisional death counts are weighted to account for incomplete data. However, data for the most recent week(s) are still likely to be incomplete. Weights are based on completeness of provisional data in prior years, but the timeliness of data may have changed in 2020 relative to prior years, so the resulting weighted estimates may be too high in some jurisdictions and too low in others. As more information about the accuracy of the weighted estimates is obtained, further refinements to the weights may be made, which will impact the estimates. Any changes to the methods or weighting algorithm will be noted in the Technical Notes when they occur. More detail about the methods, weighting, data, and limitations can be found in the Technical Notes.
This visualization includes several different estimates:
Number of excess deaths: A range of estimates for the number of excess deaths was calculated as the difference between the observed count and one of two thresholds (either the average expected count or the upper bound threshold), by week and jurisdiction. Negative values, where the observed count fell below the threshold, were set to zero.
Percent excess: The percent excess was defined as the number of excess deaths divided by the threshold.
Total number of excess deaths: The total number of excess deaths in each jurisdiction was calculated by summing the excess deaths in each week, from February 1, 2020 to present.
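The three estimates defined above can be written out as a short sketch. The function names and the weekly numbers are hypothetical placeholders. Multiplying by 100 to express the ratio as a percentage is an assumption of this sketch, and the thresholds themselves (which the source derives from Farrington surveillance algorithms) are taken as given inputs here.

```python
def weekly_excess(observed, threshold):
    """Excess deaths for one week; negative differences are set to zero."""
    return max(0.0, observed - threshold)


def percent_excess(observed, threshold):
    """Excess deaths divided by the threshold, expressed as a percentage."""
    return 100.0 * weekly_excess(observed, threshold) / threshold


def total_excess(observed_counts, thresholds):
    """Sum of weekly excess deaths over a period for one jurisdiction."""
    return sum(weekly_excess(o, t) for o, t in zip(observed_counts, thresholds))


# Hypothetical placeholder weekly counts (NOT CDC data).
observed = [1050.0, 980.0, 1200.0]
thresholds = [1000.0, 1000.0, 1000.0]

total = total_excess(observed, thresholds)
first_week_pct = percent_excess(observed[0], thresholds[0])
```

Because each week is floored at zero before summing, weeks with fewer deaths than expected do not offset weeks with excess deaths in the total.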
Similarly, the total number of excess deaths for the US overall was computed as a sum of jurisdiction-specific numbers of excess deaths (with negative values set to zero), and not directly estimated using the Farrington surveillance algorithms.
Select a dashboard from the menu, then click on “Update Dashboard” to navigate through the different graphics.
The first dashboard shows the weekly predicted counts of deaths from all causes, and the threshold for the expected number of deaths. Select a jurisdiction from the drop-down menu to show data for that jurisdiction.
The second dashboard shows the weekly predicted counts of deaths from all causes and the weekly count of deaths from all causes excluding COVID-19. Select a jurisdiction from the drop-down menu to show data for that jurisdiction.
The th
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States Excess Death excl COVID: Predicted: Total Estimate: Florida data was reported at 20,737.000 (unit: number) on 16 Sep 2023, unchanged from 20,737.000 on 09 Sep 2023. The series is updated weekly, with a median of 20,737.000 from Jan 2017 to 16 Sep 2023 over 350 observations. It reached an all-time high of 20,737.000 on 16 Sep 2023 and a record low of 20,737.000 on 16 Sep 2023. The series remains active in CEIC and is reported by the Centers for Disease Control and Prevention. It is categorized under Global Database’s United States – Table US.G012: Number of Excess Deaths: by States: All Causes excluding COVID-19: Predicted (Discontinued).
Objective: We developed and validated a prediction model based on individuals' risk profiles to predict the severity of lung involvement and death in patients hospitalized with coronavirus disease 2019 (COVID-19) infection.
Methods: In this retrospective study, we studied hospitalized COVID-19 patients with data on chest CT scans performed during hospital stay (February 2020-April 2021) in a training dataset (TD) (n = 2,251) and an external validation dataset (eVD) (n = 993). We used the most relevant demographical, clinical, and laboratory variables (n = 25) as potential predictors of COVID-19-related outcomes. The primary and secondary endpoints were the severity of lung involvement quantified as mild (≤25%), moderate (26–50%), severe (>50%), and in-hospital death, respectively. We applied random forest (RF) classifier, a machine learning technique, and multivariable logistic regression analysis to study our objectives.
Results: In the TD and the eVD, respectively, the mean [standard deviation (SD)] age was 57.9 (18.0) and 52.4 (17.6) years; patients with severe lung involvement [n (%): 185 (8.2) and 116 (11.7)] were significantly older [mean (SD) age: 64.2 (16.9), and 56.2 (18.9)] than the other two groups (mild and moderate). The mortality rate was higher in patients with severe (64.9 and 38.8%) compared to moderate (5.5 and 12.4%) and mild (2.3 and 7.1%) lung involvement. The RF analysis showed age, C reactive protein (CRP) levels, and duration of hospitalizations as the three most important predictors of lung involvement severity at the time of the first CT examination. Multivariable logistic regression analysis showed a significant strong association between the extent of the severity of lung involvement (continuous variable) and death; adjusted odds ratio (OR): 9.3; 95% CI: 7.1–12.1 in the TD and 2.6 (1.8–3.5) in the eVD.
Conclusion: In hospitalized patients with COVID-19, the severity of lung involvement is a strong predictor of death. Age, CRP levels, and duration of hospitalizations are the most important predictors of severe lung involvement. A simple prediction model based on available clinical and imaging data provides a validated tool that predicts the severity of lung involvement and death probability among hospitalized patients with COVID-19.
Introduction: The prognosis of Coronavirus disease 2019 (Covid-19) patients with vascular risk factors and certain comorbidities is worse. The impact of chronic neurological disorders (CND) on prognosis is unclear. We evaluated whether the presence of CND in Covid-19 patients is a predictor of higher in-hospital mortality. As secondary endpoints, we analyzed the association between CND, Covid-19 severity, and laboratory abnormalities during admission.
Methods: Retrospective cohort study that included all consecutive hospitalized patients with confirmed Covid-19 disease from March 8th to April 11th, 2020. The study setting was Hospital Clínico, a tertiary academic hospital in Valladolid. CND was defined as those neurological conditions causing permanent disability. We assessed demography, clinical variables, Covid-19 severity, laboratory parameters and outcome. The primary endpoint was in-hospital all-cause mortality, evaluated by multivariate Cox regression and log-rank test. We analyzed the association between CND, Covid-19 severity and laboratory abnormalities.
Results: We included 576 patients, 43.3% female, with a mean age of 67.2 years. CND were present in 105 (18.3%) patients. Patients with CND were older, more disabled, had more vascular risk factors and comorbidities and fewer clinical symptoms of Covid-19. They presented 1.43 days earlier to the emergency department. The need for ventilation support was similar. The presence of CND was an independent predictor of death (HR 2.129, 95% CI: 1.382–3.280) but not of more severe Covid-19 disease (OR: 1.75, 95% CI: 0.970–3.158). The frequency of laboratory abnormalities was similar, except for procalcitonin and INR.
Conclusions: The presence of CND is an independent predictor of mortality in hospitalized Covid-19 patients. This was explained neither by a worse immune response to Covid-19 nor by differences in the level of care received by patients with CND.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Daily updates of Covid-19 Global Excess Deaths from the Economist's GitHub repository: https://github.com/TheEconomist/covid-19-the-economist-global-excess-deaths-model
Interpreting estimates
Estimating excess deaths for every country every day since the pandemic began is a complex and difficult task. Rather than being overly confident in a single number, limited data means that we can often only give a very wide range of plausible values. Focusing on central estimates in such cases would be misleading: unless ranges are very narrow, the 95% range should be reported when possible. The ranges assume that the conditions for bootstrap confidence intervals are met. Please see our tracker page and methodology for more information.
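For readers unfamiliar with bootstrap ranges, the sketch below shows a generic percentile bootstrap interval for a mean. It is not The Economist's pipeline, which bootstraps a gradient-boosting model; the function name, parameters, and sample values are hypothetical.

```python
import random
from statistics import mean


def bootstrap_interval(sample, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `sample`.

    Resamples with replacement, computes the mean of each resample, and
    returns the (lower, upper) percentiles that bracket `level` of them.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(sample)
    means = sorted(mean(rng.choices(sample, k=n)) for _ in range(n_resamples))
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical placeholder estimates (NOT model output).
estimates = [120.0, 95.0, 130.0, 110.0, 160.0, 90.0, 105.0, 140.0]
low, high = bootstrap_interval(estimates)
```

Reporting `(low, high)` alongside the central estimate follows the advice above: when the range is wide, the range carries more information than the point value.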
New variants
The Omicron variant, first detected in southern Africa in November 2021, appears to have characteristics that are different to earlier versions of sars-cov-2. Where this variant is now dominant, this change makes estimates uncertain beyond the ranges indicated. Other new variants may do the same. As more data is incorporated from places where new variants are dominant, predictions improve.
Non-reporting countries
Turkmenistan and the Democratic People's Republic of Korea have not reported any covid-19 figures since the start of the pandemic. They also have not published all-cause mortality data. Exports of estimates for the Democratic People's Republic of Korea have been temporarily disabled as it now issues contradictory data: reporting a significant outbreak through its state media, but zero confirmed covid-19 cases/deaths to the WHO.
Acknowledgements
A special thanks to all our sources and to those who have made the data to create these estimates available. We list all our sources in our methodology. Within script 1, the source for each variable is also given as the data is loaded, with the exception of our sources for excess deaths data, which we detail on our free-to-read excess deaths tracker as well as on GitHub. The gradient booster implementation used to fit the models is aGTBoost.
Calculating excess deaths for the entire world over multiple years is both complex and imprecise. We welcome any suggestions on how to improve the model, be it data, algorithm, or logic. If you have one, please open an issue.
The Economist would also like to acknowledge the many people who have helped us refine the model so far, be it through discussions, facilitating data access, or offering coding assistance. A special thanks to Ariel Karlinsky, Philip Schellekens, Oliver Watson, Lukas Appelhans, Berent Å. S. Lunde, Gideon Wakefield, Johannes Hunger, Carol D'Souza, Yun Wei, Mehran Hosseini, Samantha Dolan, Mollie Van Gordon, Rahul Arora, Austin Teda Atmaja, Dirk Eddelbuettel and Tom Wenseleers.
All coding and data collection to construct these models (and make them update dynamically) was done by Sondre Ulvund Solstad. Should you have any questions about them after reading the methodology, please open an issue or contact him at sondresolstad@economist.com.
Suggested citation The Economist and Solstad, S. (corresponding author), 2021. The pandemic’s true death toll. [online] The Economist. Available at: https://www.economist.com/graphic-detail/coronavirus-excess-deaths-estimates [Accessed ---]. First published in the article "Counting the dead", The Economist, issue 20, 2021.
During the first two years of the Covid-19 pandemic, death tolls differed from one country to another. In previous research on 39 countries, we found that some population characteristics were either negatively (birth rate/mortality rate, fertility rate) or positively (cancer score, Alzheimer disease score, percentage of people above 65 years old, levels of alcohol intake) correlated with Covid-19 mortality. We also found that low levels of climate factors (average annual temperature, average hours of sunshine, average annual UV index) were positively correlated with Covid-19 death numbers. In the present study, we developed an anti-Covid Capacity index that takes all of the above parameters into account. A polynomial analysis of each country's anti-Covid Capacity against its geographic latitude generated a bell-shaped curve with a high coefficient of determination (R² = 0.78), with lower anti-Covid Capacity values recorded in countries at low and high latitudes. Conversely, plotting Covid-19 death numbers against geographic latitude generated an inverted bell-shaped curve, with higher death numbers at low and high latitudes. A simple linear regression showed that Covid-19 death numbers were significantly (p = 2.40 × 10⁻⁹) and negatively correlated with the anti-Covid Capacity index values. Our data demonstrate that negative pre-pandemic human conditions and low annual temperature and UV index scores in many countries were the key factors behind high Covid-19 mortality. These factors can be expressed as a simple index of a country's anti-Covid capacity that can predict the death-associated severity of Covid-19 according to the country's geographic latitude.
https://creativecommons.org/publicdomain/zero/1.0/
Dataset Description: Infected and Death Cases of Covid-19 in Bangladesh
This dataset contains detailed information on Covid-19 cases in Bangladesh, focusing on the number of new cases and deaths reported. The data spans from September 27, 2020, to November 19, 2021. The dataset is structured with three primary columns:
Date: The date when the data was recorded, formatted as YYYY-MM-DD.
New Cases: The number of new Covid-19 cases reported on the corresponding date.
Deaths: The number of deaths attributed to Covid-19 on the corresponding date.
Key Features:
Time Range: Covers over a year of data, capturing various waves of the pandemic.
Granularity: Daily records, providing detailed insights into the daily progression of the pandemic.
Size: The dataset is compact, with a file size of 7.91 KB, making it easy to handle and analyze.
Cite this paper
@InProceedings{10.1007/978-981-19-2445-3_38, author="Rahman, Ashifur and Hossain, Md. Akbar and Moon, Mohasina Jannat", editor="Hossain, Sazzad and Hossain, Md. Shahadat and Kaiser, M. Shamim and Majumder, Satya Prasad and Ray, Kanad", title="An LSTM-Based Forecast Of COVID-19 For Bangladesh", booktitle="Proceedings of International Conference on Fourth Industrial Revolution and Beyond 2021 ", year="2022", publisher="Springer Nature Singapore", address="Singapore", pages="551--561", abstract="Preoperative events can be predicted using deep learning-based forecasting techniques. It can help to improve future decision-making. Deep learning has traditionally been used to identify and evaluate adverse risks in a variety of major applications. Numerous prediction approaches are commonly applied to deal with forecasting challenges. The number of infected people, as well as the mortality rate of COVID-19, is increasing every day. Many countries, including India, Brazil, and the United States, were severely affected; however, since the very first case was identified, the transmission rate has decreased dramatically after a set time period. Bangladesh, on the other hand, was unable to keep the rate of infection low. In this situation, several methods have been developed to forecast the number of affected, time to recover, and the number of deaths. This research illustrates the ability of DL models to forecast the number of affected and dead people as a result of COVID-19, which is now regarded as a possible threat to humanity. As part of this study, we developed an LSTM based method to predict the next 100 days of death and newly identified COVID-19 cases in Bangladesh. To do this experiment we collect data on death and newly detected COVID-19 cases through Bangladesh's national COVID-19 help desk website. After collecting data we processed it to make a dataset for training our LSTM model. After completing the training, we predict our model with the test dataset. 
The result of our model is very robust on the basis of the training and testing dataset. Finally, we forecast the subsequent 100 days of deaths and newly infected COVID-19 cases in Bangladesh.", isbn="978-981-19-2445-3" }
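The three-column layout described for this dataset (Date, New Cases, Deaths) can be read with the standard library alone. This is a minimal sketch: the exact header spellings are assumed from the description, and the sample rows are hypothetical placeholders, not records from the dataset.

```python
import csv
import io
from datetime import date

# Hypothetical sample in the described layout (NOT real records).
SAMPLE = """Date,New Cases,Deaths
2020-09-27,100,5
2020-09-28,120,7
"""


def load_rows(text):
    """Parse CSV text into dicts with typed date, new_cases, and deaths."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "date": date.fromisoformat(row["Date"]),  # YYYY-MM-DD per the description
            "new_cases": int(row["New Cases"]),
            "deaths": int(row["Deaths"]),
        }
        for row in reader
    ]


rows = load_rows(SAMPLE)
total_deaths = sum(r["deaths"] for r in rows)
```

For a file on disk, the same function can be fed the result of reading the file as text. A forecasting model such as the LSTM in the cited paper would then consume the ordered `new_cases` and `deaths` series.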
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset contains observed values and 4-week forecasts of new and total weekly COVID-19 deaths at the national and state level until March 9, 2023. Forecasting teams predict numbers of deaths using different types of data (e.g., COVID-19 data, demographic data, mobility data), methods, and estimates of the impacts of interventions (e.g., social distancing, use of face coverings).
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset merges three distinct data sources to explore the relationship between COVID-19 death rates, vaccination efforts, and public sentiment on Twitter from December 25, 2020 to March 29, 2022. It includes 2,000 cleaned rows with 16 variables, created by combining global health statistics and social media sentiment data.
COVID-19 Deaths Data (scraped from Worldometer - COVID-19 Deaths via BeautifulSoup):
Date: Date of record
daily_increase_percent: % change in deaths from previous day
Season: Derived from date (Winter, Spring, Summer, Fall)
Tweet Sentiment Data: COVID Vaccine Tweets Dataset
Date: Tweet timestamp
text_sentiment: Sentiment label (positive, neutral, negative) from NLTK’s SentimentIntensityAnalyzer
user_verified: Whether the user is verified
user_since_days: Age of the Twitter account (in days)
country: Cleaned user location
Vaccination Data: Vaccination Dataset
Date: Date of record
total_vaccinations_per_hundred: Doses per 100 people
daily_vaccinations: Daily dose count
vaccine_group: Grouped vaccine type (e.g., mRNA, Viral Vector)
country: Country name
Date and country
Season, user_since_days, vaccine_group
This dataset was used in a final data science project to:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background
New York City quickly became an epicenter of the COVID-19 pandemic. Due to a sudden and massive increase in patients during the COVID-19 pandemic, healthcare providers incurred an exponential increase in workload, which strained staff and limited resources. As this is a new infection, predictors of morbidity and mortality are not well characterized.
Methods
We developed a prediction model to predict patients at risk for mortality using only laboratory, vital and demographic information readily available in the electronic health record on more than 3000 hospital admissions with COVID-19. A variable importance algorithm was used for interpretability and understanding of performance and predictors.
Findings
We built a model with 84-97% accuracy to identify predictors and patients with high risk of mortality, and developed an automated artificial intelligence (AI) notification tool that does not require manual calculation by the busy clinician. Oximetry, respirations, blood urea nitrogen, lymphocyte percent, calcium, troponin and neutrophil percentage were important features, and key ranges were identified that contributed to a 50% increase in patients’ mortality prediction score. With an increasing negative predictive value (NPV) starting at 0.90 after the second day of admission, we are able to identify likely survivors more confidently. This study serves as a use case of a model with visualizations to aid clinicians with a better understanding of the model and predictors of mortality. Additionally, an example of the operationalization of the model via an AI notification tool is illustrated.
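Negative predictive value, mentioned in the findings above, is the share of predicted survivors who actually survive. A minimal sketch of the standard formula follows; the counts are hypothetical and are not taken from the study.

```python
def negative_predictive_value(true_negatives, false_negatives):
    """NPV = TN / (TN + FN): fraction of negative predictions that are correct."""
    total = true_negatives + false_negatives
    if total == 0:
        raise ValueError("no negative predictions to evaluate")
    return true_negatives / total


# Hypothetical counts: 90 predicted survivors survived, 10 did not.
npv = negative_predictive_value(true_negatives=90, false_negatives=10)
```

A high NPV means a "low risk" flag is rarely wrong, which is why the authors describe it as supporting more confident identification of likely survivors.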
https://www.usa.gov/government-works/
This data represents excess deaths in an area (more deaths than expected). Although causes are recorded, one can see that excess deaths increased dramatically in 2020.
Original Source: https://catalog.data.gov/dataset/excess-deaths-associated-with-covid-19-35b8c
Using the Excess Death counts, make predictions, per day, as to what the hospital case load, and death load, for covid cases will be. Can use any source (local, federal, international) as the benchmark.