The following report outlines the workflow used to optimize your Find Hot Spots result:

Initial Data Assessment
- There were 2933 valid input features.
- There were 3108 valid input aggregation areas.
- There were 66 outlier locations; these will not be used to compute the optimal fixed distance band.

Incident Aggregation
- Analysis was based on the number of points in each polygon cell.
- Analysis was performed on all aggregation areas.
- The aggregation process resulted in 3108 weighted areas.

Incident Count Properties
- Min: 0.0000
- Max: 0.0015
- Mean: 0.0001
- Std. Dev.: 0.0001

Scale of Analysis
- The optimal fixed distance band was based on the average distance to 30 nearest neighbors: 150682.0000 Meters.

Hot Spot Analysis
- There are 865 output features statistically significant based on a FDR correction for multiple testing and spatial dependence.

Output
- Red output features represent hot spots where high incident counts cluster.
- Blue output features represent cold spots where low incident counts cluster.
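The report describes a hot spot analysis over a fixed distance band. As a rough illustration of the idea, the sketch below computes the textbook Getis-Ord Gi* z-score with binary weights inside a fixed band. It is not the ArcGIS implementation: the function name `gi_star` and its parameters are hypothetical, and the FDR correction mentioned in the report is not included.

```python
import math

def gi_star(points, values, band):
    """Return the Getis-Ord Gi* z-score for every location.

    points: list of (x, y) coordinates in projected units (e.g. meters)
    values: incident count (or rate) at each location
    band:   fixed distance band; neighbors within it get weight 1
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation; max() guards against tiny negative rounding.
    s = math.sqrt(max(sum(v * v for v in values) / n - mean * mean, 0.0))
    scores = []
    for xi, yi in points:
        # Binary weights; the focal location itself is included (the "*" in Gi*).
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= band else 0.0
             for xj, yj in points]
        sw = sum(w)
        sw2 = sum(wj * wj for wj in w)
        num = sum(wj * v for wj, v in zip(w, values)) - mean * sw
        den = s * math.sqrt((n * sw2 - sw * sw) / (n - 1))
        # A zero denominator means no variation to test; report 0 in that case.
        scores.append(num / den if den > 0 else 0.0)
    return scores
```

Large positive scores point to clusters of high values (hot spots) and large negative scores to clusters of low values (cold spots). Judging significance would still require a multiple-testing correction such as the FDR step the report refers to.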
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Geostatistics analyzes and predicts the values associated with spatial or spatial-temporal phenomena. It incorporates the spatial (and in some cases temporal) coordinates of the data within the analyses. It is a practical means of describing spatial patterns and interpolating values for locations where samples were not taken (and measures the uncertainty of those values, which is critical to informed decision making). This archive contains results of geostatistical analysis of COVID-19 case counts for all available US counties. Test results were obtained with ArcGIS Pro (ESRI). Sources are state health departments, which are scraped and aggregated by the Johns Hopkins Coronavirus Resource Center and then pre-processed by MappingSupport.com.
This update of the Zenodo dataset (version 6) consists of three compressed archives containing geostatistical analyses of SARS-CoV-2 testing data. This dataset utilizes many of the geostatistical techniques used in previous versions of this Zenodo archive, but has been significantly expanded to include analyses of up-to-date U.S. COVID-19 case data (from March 24th to September 8th, 2020):
Archive #1: “1.Geostat. Space-Time analysis of SARS-CoV-2 in the US (Mar24-Sept6).zip” – results of a geostatistical analysis of COVID-19 cases incorporating spatially-weighted hotspots that are conserved over one-week timespans. Results are reported starting from when U.S. COVID-19 case data first became available (March 24th, 2020) for 25 consecutive 1-week intervals (March 24th through to September 6th, 2020). Hotspots, where found, are reported in each individual state, rather than the entire continental United States.
Archive #2: "2.Geostat. Spatial analysis of SARS-CoV-2 in the US (Mar24-Sept8).zip" – the results from geostatistical spatial analyses only of corrected COVID-19 case data for the continental United States, spanning the period from March 24th through September 8th, 2020. The geostatistical techniques utilized in this archive include ‘Hot Spot’ analysis and ‘Cluster and Outlier’ analysis.
Archive #3: "3.Kriging and Densification of SARS-CoV-2 in LA and MA.zip" – this dataset provides preliminary kriging and densification analysis of COVID-19 case data for certain dates within the U.S. states of Louisiana and Massachusetts.
These archives consist of map files (as both static images and as animations) and data files (including text files which contain the underlying data of said map files [where applicable]) which were generated when performing the following Geostatistical analyses: Hot Spot analysis (Getis-Ord Gi*) [‘Archive #1’: consecutive weeklong Space-Time Hot Spot analysis; ‘Archive #2’: daily Hot Spot Analysis], Cluster and Outlier analysis (Anselin Local Moran's I) [‘Archive #2’], Spatial Autocorrelation (Global Moran's I) [‘Archive #2’], and point-to-point comparisons with Kriging and Densification analysis [‘Archive #3’].
The Word document provided ("Description-of-Archive.Updated-Geostatistical-Analysis-of-SARS-CoV-2 (version 6).docx") details the contents of each file and folder within these three archives and gives general interpretations of these results.
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
How to Read the Map
This map allows you to visualize trends over time in cases, recoveries, deaths and testing at the regional health unit level. The map shows the relative state of the COVID-19 outbreak in each region. Colour (red to green) shows the time since a new case was last reported.
7 Day Hot Spots
The map highlights regions with an active outbreak with a "glowing ball". The size of the ball reflects the average number of new cases in the past 7 days as a rate per 100K population.
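The ball size is described as the average number of new cases in the past 7 days, expressed as a rate per 100K population. A minimal sketch of that calculation follows; the function name and inputs are hypothetical, and the map's own implementation may differ.

```python
def seven_day_rate_per_100k(daily_new_cases, population):
    """Average new cases over the last 7 days, per 100,000 residents.

    daily_new_cases: list of daily new-case counts, oldest first
    population:      resident population of the region
    """
    if population <= 0:
        raise ValueError("population must be positive")
    window = daily_new_cases[-7:]  # fewer than 7 days are averaged as-is
    if not window:
        return 0.0
    avg_new = sum(window) / len(window)
    return avg_new * 100_000 / population
```

Normalizing by population keeps a large health unit from looking worse than a small one simply because it has more residents.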
Important Information
Not all data is reported for all regional health units. Data sources are consulted every 24 hours; however, not all organizations report on a daily basis. As this data is cumulative, values carry forward if updates are not provided. Values can go down when reported errors are corrected.
Data Sources
The source of the data for each regional health unit is listed in the "SourceURL" field.
Looking for the raw data? You can find it here.
In summer 2020, SARS-CoV-2 was detected on mink farms in Utah. An interagency One Health response was initiated to assess the extent of the outbreak and included sampling animals from or near affected mink farms and testing them for SARS-CoV-2 and non-SARS coronaviruses. Among the 365 animals sampled, including domestic cats, mink, rodents, raccoons, and skunks, 261 (72%) of the animals harbored at least one coronavirus at the time. Among the samples which could be further characterized, 126 alphacoronaviruses and 88 betacoronaviruses (including 74 detections of SARS-CoV-2) were identified. Moreover, at least 10% (n=27) of the coronavirus-positive animals were found to be co-infected with more than one coronavirus. Our findings indicate an unexpectedly high prevalence of coronavirus among the domestic and wild animals tested on mink farms and raise the possibility that commercial animal husbandry operations could be potential hot spots for future trans-species viral spillover and the emergence of new pandemic coronaviruses.

Figure 1. Phylogenetic relationships of the identified coronaviruses from mink and other animals from mink farms in Utah. The four genera of coronaviruses are highlighted in different colors. AlphaCoV, alphacoronavirus; BetaCoV, betacoronavirus; DeltaCoV, deltacoronavirus; and GammaCoV, gammacoronavirus. Type species for the currently recognized subgenera are annotated according to the nomenclature scheme used in this manuscript with the addition of the ICTV subgenus. Additional viruses, including the closest GenBank entry as identified by the BLAST tool, were included to help delineate relationship. Red circles are viruses identified in this study. Panel A. Full phylogenetic tree (A full-size image is included in Supplementary Figure 1). Red arrows designate the group of nearly identical Utah mink coronavirus strains collapsed into the colored triangle in Panel B.

Table 1. Coronavirus distribution among species tested. The species are listed by their common names; Total, the total number of animals of each species tested; Negative, number of each species with no coronavirus detected among the tissues tested; Positive, number of animals positive for coronavirus in at least one tissue; % Pos, percentage of coronavirus positives in each species.

Table 2. Detailed tissue panel tested for SARS-CoV-2. The distribution of SARS-CoV-2 RNA detection in the first 96 animals is listed. Tissue, tissue or tissue pools received; Total, total number tested in each category; Negative, number of N1 RT-PCR negatives; Positives, number of N1 RT-PCR positives; % Pos, percentage of tissues positive for coronavirus.

Table 3. Summary of coronaviruses identified. The distribution of coronaviruses detected and characterized according to their host is listed. Species, common name of animal species tested; AlphaCoV, number of alphacoronaviruses identified; BetaCoV, number of betacoronaviruses identified; Sequenced, number of viruses identified by sequencing; Unchar, number of coronavirus-positive samples not further characterized.

Table 4. SARS-CoV-2 coinfections identified in Utah mammals. The individual animals that are both SARS-CoV-2 positive and infected with a second coronavirus are listed. Animal ID, unique animal identification number; Common name, common name of animal; Scientific name, scientific name of animal; Sex, F, female, M, male, Unk, unknown; Age, A, adult, J, juvenile, Unk, unknown; SARS-CoV-2, Neg, N1 RT-PCR negative, Pos, N1 RT-PCR positive; Second strain, genus and common name of the coronavirus; Pan-CoV RT-PCR Equivocal, sample is PCR positive but not further characterized.

Supplementary Figure 1. Phylogenetic relationships of the identified coronaviruses from mink farms in Utah. The four genera of coronaviruses are highlighted in different colors. AlphaCoV, alphacoronavirus; BetaCoV, betacoronavirus; DeltaCoV, deltacoronavirus; and GammaCoV, gammacoronavirus. Type species for the currently recognized subgenera are annotated according to the nomenclature scheme used in this manuscript with the addition of the ICTV subgenus. Additional viruses, including the closest GenBank entry as identified by the BLAST tool, were included to help delineate relationship. Red circles are viruses identified in this study.

Supplementary Table 1. List of animals and tissues sampled and RT-PCR test results. Animal ID, unique identifier for each animal; Specimen ID, unique identifier for each tissue; Common name, common name of the animal species; Scientific name, scientific name of the animal species; Sex, F-female, M-male, UNK-unknown; Age, J-juvenile, A-adult, UNK-unknown; Tissue, organ or organ pools tested; Tissue study, X denotes the animals and tissues used in the tissue distribution sub-study; N1 PCR, Ct values from the CDC N1 assay; Pan-CoV PCR, Neg, negative, Pos, positive, Equiv, equivocal; * wild mink.

Supplementary Table 2. Summary of coronavirus test results. Animal ID, unique identifier for each animal; Common name, common name of the animal species; Scientific name, scientific name of the animal species; Sex, F-female, M-male, UNK-unknown; Age, J-juvenile, A-adult, UNK-unknown; CoV, Neg-negative, Pos-positive on either one or both RT-PCR tests; SARS-CoV-2, animals positive in the CDC N1 test; AlphaCoV, the tissues positive for alphacoronavirus for each animal are listed; BetaCoV, the tissues positive for betacoronavirus for each animal are listed; C-colon, C/R-colon/rectum pool, H-heart, L-lung, L/S-liver/spleen pool, S int-small intestine; Co-infections, Y-yes; PCR only, Y-yes; Virus identified by sequencing, brief name of virus identified.
Background: The coronavirus disease 2019 (COVID-19) pandemic is disrupting routine medical care, including for people who have cancer or are undergoing cancer screening. In this study, breast cancer management during the COVID-19 pandemic (BCMP) is reviewed, and the research trends of BCMP are evaluated by quantitative and qualitative evaluation.

Methods: Published studies relating to BCMP from 1 January 2020 to 1 April 2022 were retrieved from the Web of Science (WoS) database. Bibliometric indicators consisted of publications, research hotspots, keywords, authors, journals, institutions, nations, and h-index.

Results: A total of 182 articles investigating BCMP were retrieved. The United States of America and the University of Rome Tor Vergata were the nation and the institution with the most publications on BCMP. The three journals that published the most BCMP studies were Breast Cancer Research and Treatment, Breast, and In Vivo. Buonomo OC was the most prolific author in this field, publishing nine articles (9/182, 4.94%). The co-keywords analysis of BCMP suggests that the top hotspots and trends in research are screening, surgery, rehabilitation, emotion, diagnosis, treatment, and vaccine management of breast cancer during the pandemic. The hotspot words were divided into six clusters, namely, screening for breast cancer patients in the pandemic, breast cancer surgery in the pandemic, recovery of breast cancer patients in the pandemic, emotional effect of the outbreak on breast cancer patients, diagnosis and treatment of breast cancer patients in the pandemic, and vaccination management for breast cancer patients during a pandemic.

Conclusion: BCMP has received attention from scholars in many nations over the last 3 years. This study revealed significant contributions to BCMP research by nations, institutions, scholars, and journals. The stratified clustering study provided the current status and future trends of BCMP to help physicians with the diagnosis and treatment of breast cancer through the pandemic, and to provide a reference for in-depth clinical studies on BCMP.
Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.
April 9, 2020
April 20, 2020
April 29, 2020
September 1st, 2020
February 12, 2021
new_deaths column.
February 16, 2021
The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.
The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.
The AP is updating this dataset hourly at 45 minutes past the hour.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
Use AP's queries to filter the data or to join it to other datasets we've made available to help cover the coronavirus pandemic.
Filter cases by state here
Rank states by their status as current hotspots. Calculates the 7-day rolling average of new cases per capita in each state: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=481e82a4-1b2f-41c2-9ea1-d91aa4b3b1ac
Find recent hotspots within your state by running a query to calculate the 7-day rolling average of new cases per capita in each county: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=b566f1db-3231-40fe-8099-311909b7b687&showTemplatePreview=true
Join county-level case data to an earlier dataset released by AP on local hospital capacity here. To find out more about the hospital capacity dataset, see the full details.
Pull the 100 counties with the highest per-capita confirmed cases here
Rank all the counties by the highest per-capita rate of new cases in the past 7 days here. Be aware that because this ranks per-capita caseloads, very small counties may rise to the very top, so take into account raw caseload figures as well.
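The small-county caveat can be illustrated with a short sketch. It is not one of AP's queries: the function name, the input layout, and the `min_cases` threshold are hypothetical, and the figures in the usage note are made up.

```python
def rank_by_rate(counties, min_cases=0):
    """Rank counties by 7-day new cases per 100,000 people, highest first.

    counties:  dict mapping name -> (new_cases_last_7_days, population)
    min_cases: hypothetical threshold; counties below it are left out
    Returns a list of (name, rate_per_100k, new_cases) tuples.
    """
    ranked = []
    for name, (new_cases, population) in counties.items():
        if population <= 0 or new_cases < min_cases:
            continue
        ranked.append((name, new_cases * 100_000 / population, new_cases))
    ranked.sort(key=lambda row: row[1], reverse=True)
    return ranked
```

With made-up inputs such as `{"Small": (5, 1_000), "Large": (2_000, 1_000_000)}`, the small county ranks first on rate alone even though it has only 5 cases. Returning the raw count next to the rate, or setting `min_cases`, makes that visible.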
The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins: https://datawrapper.dwcdn.net/nRyaf/15/
<iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>
Johns Hopkins timeseries data
- Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
- Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here.
This data should be credited to the Johns Hopkins University COVID-19 tracking project.
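Because the timeseries holds cumulative snapshots, daily new counts have to be derived by differencing, and a downward revision by a source shows up as a negative change. The sketch below illustrates this under assumptions: the function name is hypothetical and it is not part of the AP or Johns Hopkins tooling.

```python
def daily_new_from_cumulative(cumulative):
    """Convert cumulative counts into day-over-day changes.

    cumulative: list of cumulative counts, one per day, oldest first
    Returns (deltas, revised_days), where revised_days lists the indexes
    in `cumulative` at which the count went down.
    """
    deltas = []
    revised_days = []
    for day in range(1, len(cumulative)):
        change = cumulative[day] - cumulative[day - 1]
        if change < 0:
            # A drop in a cumulative series signals a historical correction.
            revised_days.append(day)
        deltas.append(change)
    return deltas, revised_days
```

Flagging the revised days rather than silently clipping negatives to zero leaves the decision about how to handle corrections with the analyst.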
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of the ongoing coronavirus disease 2019 (COVID-19) pandemic. Understanding the influence of mutations in the SARS-CoV-2 gene on clinical outcomes is critical for treatment and prevention. Here, we analyzed all high-coverage complete SARS-CoV-2 sequences from GISAID database from January 1, 2020, to January 1, 2021, to mine the mutation hotspots associated with clinical outcome and developed a model to predict the clinical outcome in different epidemic strains. Exploring the cause of mutation based on RNA-dependent RNA polymerase (RdRp) and RNA-editing enzyme, mutation was more likely to occur in severe and mild cases than in asymptomatic cases, especially A > G, C > T, and G > A mutations. The mutations associated with asymptomatic outcome were mainly in open reading frame 1ab (ORF1ab) and N genes; especially R6997P and V30L mutations occurred together and were correlated with asymptomatic outcome with high prevalence. D614G, Q57H, and S194L mutations were correlated with mild and severe outcome with high prevalence. Interestingly, the single-nucleotide variant (SNV) frequency was higher with high percentage of nt14408 mutation in RdRp in severe cases. The expression of ADAR and APOBEC was associated with clinical outcome. The model has shown that the asymptomatic percentage has increased over time, while there is high symptomatic percentage in Alpha, Beta, and Gamma. These findings suggest that mutation in the SARS-CoV-2 genome may have a direct association with clinical outcomes and pandemic. Our result and model are helpful to predict the prevalence of epidemic strains and to further study the mechanism of mutation causing severe disease.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
In response to the impacts of COVID-19, Drive-In WiFi Hotspots provide free temporary, emergency internet access for Washingtonians who do not have broadband service to their homes.
Access is available to all residents with specific emphasis on remote learning for students. Additionally, this service can be used for job searches, telehealth, telework, unemployment filing, and census participation.
The locations listed on this map represent new Drive-In WiFi Hotspot sites located at Washington State University Extension locations, as well as new and existing Washington State Library Drive-In WiFi Hotspots.
Launching primarily as parking lot hotspots in response to the COVID-19 pandemic, the free community Wi-Fi is accessible regardless of how users arrive at the locations. Some sites also offer indoor public access during business hours. Everyone using the sites – outside or inside – must practice social distancing and hygiene precautions, including staying in your vehicle or at least six feet from other users and wearing a mask if necessary.
Each hotspot will have its own security protocol. Some will be open and others will have Children’s Internet Protection Act (CIPA) safe security installed.
Broadband equity is not just a rural challenge. The Drive-In WiFi Hotspot project addresses underserved and economically disadvantaged communities in urban and suburban areas as well.
More information can be found at: https://www.commerce.wa.gov/building-infrastructure/washington-state-drive-in-wifi-hotspots-location-finder/
Epidemiologic COVID-19 data for the São Paulo State capital and hotspot cities for disease introduction and spread on April 18th (see Fig 5).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The unique count is the unique number of groups flagged, and the total count is the total number of groups flagged (a group may be flagged more than once) across a 15-week semester. Using time window $k$, $L_{S,k}$ and $L_{H,k}$ are the likelihoods of a hotspot group for sections and housing, respectively. The relative risk of the two groups $RR_k$ (from Eq 8) is also shown with a 95% confidence interval (CI).
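Eq 8 is not reproduced in this description, so the study's exact definition of the relative risk is not known here. As a generic illustration only, the sketch below computes a conventional relative risk between two groups with a 95% Wald interval on the log scale. The function name and the count-based inputs are assumptions and may not match the likelihood-based quantity used in the study.

```python
import math

def relative_risk_ci(events_a, total_a, events_b, total_b, z=1.96):
    """Relative risk of group A versus group B with a log-scale Wald interval.

    All four counts must be positive. z=1.96 corresponds to a 95% interval.
    Returns (rr, lower, upper).
    """
    risk_a = events_a / total_a
    risk_b = events_b / total_b
    rr = risk_a / risk_b
    # Standard error of ln(RR) for two independent binomial proportions.
    se = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper
```

An interval that includes 1 means the data do not clearly separate the two groups at the chosen confidence level.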
This feature contains several data layers: layer 1 depicts the up-to-date COVID-19 cases for Nigeria by state; layer 2 shows the population density of Nigeria by Local Government Area, superimposed on layer 1 for easy comparison; and layer 3 is a map of the statistically significant population hot spots and cold spots in Nigeria. Together these datasets make up this COVID-19 dashboard for monitoring cases in Nigeria. Data sources include NCDC, WHO, and Africa Geoportal. The COVID-19 data is updated at least once per day, following the NCDC update timeline. This layer is created and maintained by Dr. Nkeki F. N. and his team (Eugene A. Atakpiri and Akinde N. Kolawole) to support NCDC in the fight against the spread of COVID-19 in Nigeria. This layer is open to the public and free to share.
Contact Info: Phone: +23408063131159; Email: nkekifndidi@gmail.com
By Kristen Honey, Chief Data Scientist and COVID-19 Diagnostics Informatics Lead, COVID-19 Testing and Diagnostics Working Group (TDWG); Joshua Prasad, Director of Health Equity Innovation, Office of the Chief Data Officer (OCDO); Jack Bastian, Data Engineer, HHS Protect, Office of the Chief Data Officer (OCDO)
As of March 10, 2023, the state with the highest number of COVID-19 cases was California. Almost 104 million cases have been reported across the United States, with the states of California, Texas, and Florida reporting the highest numbers.

From an epidemic to a pandemic
The World Health Organization declared the COVID-19 outbreak a pandemic on March 11, 2020. The term pandemic refers to multiple outbreaks of an infectious illness threatening multiple parts of the world at the same time. When the transmission is this widespread, it can no longer be traced back to the country where it originated. The number of COVID-19 cases worldwide has now reached over 669 million.

The symptoms and those who are most at risk
Most people who contract the virus will suffer only mild symptoms, such as a cough, a cold, or a high temperature. However, in more severe cases, the infection can cause breathing difficulties and even pneumonia. Those at higher risk include older persons and people with pre-existing medical conditions, including diabetes, heart disease, and lung disease. People aged 85 years and older have accounted for around 27 percent of all COVID-19 deaths in the United States, although this age group makes up just two percent of the U.S. population.
By Dan Winchester [source]
This dataset contains the total number of confirmed COVID-19 cases in each English Upper Tier Local Authority over the past eight days. Aggregated from Public Health England data, this dataset provides unprecedented insight into how quickly the virus has been able to spread in local communities throughout England. Despite testing limitations, understanding these localized patterns of infection can help inform important public health decisions by local authorities and healthcare workers alike.
It is essential to bear in mind that this data is likely an underestimation of true infection rates due to limited testing -- it is critical not to underestimate the risk the virus poses on a local scale! Use this dataset at your own discretion with caution and care; consider supplementing it with other health and socio-economic metrics for a holistic picture of regional trends over time.
This dataset provides the GSS code and name of each English Upper Tier Local Authority, along with the total number of recorded COVID-19 cases in that authority on January 5th 2023 (TotalCases_2023-01-05).
- Comparing the total cases in each local authority to population density of the region, to identify areas with higher incidence of virus
- Tracking changes in total cases over a period of time to monitor trend shifts and detect possible outbreak hotspots
- Establishing correlations between the spread of COVID-19 and other non-coronavirus related health issues, such as mental health or cardiovascular risk factors
If you use this dataset in your research, please credit the original authors. Data Source
License: Dataset copyright by authors
You are free to:
- Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt: remix, transform, and build upon the material for any purpose, even commercially.
You must:
- Give appropriate credit: provide a link to the license, and indicate if changes were made.
- ShareAlike: distribute your contributions under the same license as the original.
- Keep intact: all notices that refer to this license, including copyright notices.

File: utla_by_day.csv

| Column name | Description |
|:--------------------------|:------------------------------------------------------------------------------------------------------|
| GSS_CD | Government Statistical Service code for the local authority. (String) |
| GSS_NM | Name of the local authority. (String) |
| TotalCases_2023-01-05 | Total number of confirmed COVID-19 cases in the local authority on the 5th of January 2023. (Integer) |
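The column layout above can be read with Python's standard `csv` module. The sketch below is illustrative only: the function name `top_authorities` is hypothetical, and the sample rows are made up to show the shape of the file rather than taken from the dataset.

```python
import csv
import io

def top_authorities(csv_file, column="TotalCases_2023-01-05", limit=5):
    """Return (name, total_cases) pairs sorted by total cases, highest first.

    csv_file: an open text file (or file-like object) with the columns
              GSS_CD, GSS_NM and the total-cases column named in `column`
    """
    rows = []
    for row in csv.DictReader(csv_file):
        rows.append((row["GSS_NM"], int(row[column])))
    rows.sort(key=lambda pair: pair[1], reverse=True)
    return rows[:limit]

# Made-up rows for illustration only; real use would open utla_by_day.csv.
sample = io.StringIO(
    "GSS_CD,GSS_NM,TotalCases_2023-01-05\n"
    "X0000001,Example A,120\n"
    "X0000002,Example B,450\n"
)
print(top_authorities(sample))
```

For the real file, the same function could be called inside `with open("utla_by_day.csv", newline="") as f:`. As the description notes, raw totals favor populous authorities, so pairing them with population figures gives a fairer comparison.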
If you use this dataset in your research, please credit Dan Winchester.
To help Fauquier County students and citizens access the resources they may need while we navigate the coronavirus pandemic, Fauquier County is making hotspots available at several locations across the County. Please feel free to use this map to locate the one nearest you.
As of January 1, 2025, Rome (Lazio) was the Italian province that had registered the highest number of coronavirus (COVID-19) cases in the country. Milan (Lombardy) came second in this ranking, while Naples (Campania) and Turin (Piedmont) followed. These four areas are also the four most populated provinces in Italy. The region of Lombardy was the hardest hit by the spread of the virus, recording almost one sixth of all coronavirus cases in the country. The provinces of Milan and Brescia accounted for a large part of this figure. For a global overview, visit Statista's webpage exclusively dedicated to coronavirus, its development, and its impact.
https://www.dataflix.com/data360/license/
The Dataflix COVID dataset is a centralized repository of up-to-date and curated data focused on key tracking metrics and U.S. census data. The dataset is publicly readable and accessible on Google BigQuery, ready for analysis, analytics and machine learning initiatives. The dataset is built on data sourced from trusted sources like CSSE at Johns Hopkins University and government agencies, covering a wide range of metrics including confirmed cases, new cases, % population, mortality rate and deaths, aggregated at various geographic levels including city, county, state and country. New data is published on a daily basis. Our objective is to make structured COVID data available for organizations and individuals to help in the fight against COVID-19. For example, health authorities will be able to build reports and dashboards to efficiently deploy vital resources like hospital beds and ventilators as they track the spread of the disease. Epidemiologists can use the dataset to complement their existing models and datasets, and generate better forecasts of hotspots and trends.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the Zenodo archive for the manuscript "Likely community transmission of COVID-19 infections between neighboring, persistent hotspots in Ontario, Canada" (Mucaki EJ, Shirley BC and Rogan PK. F1000Research 2021, 10:1312, DOI: 10.12688/f1000research.75891.1). This study aimed to produce community-level geo-spatial mapping of patterns and clusters of symptoms, and of confirmed COVID-19 cases, in near real-time in order to support decision-making. This was accomplished by area-to-area geostatistical analysis, space-time integration, and spatial interpolation of COVID-19 positive individuals. This archive contains data and image files from this study that were too numerous to include in the manuscript. It also provides all program files pertaining to the Geostatistical Epidemiology Toolbox (a geostatistical analysis software package to be used in ArcGIS), as well as all other scripts described in the manuscript and other software developed (cluster, outlier, streak identification and pairing).
We also provide a guide that gives a general description of the contents of the four sections in this archive (Documentation_for_Sections_of_Zenodo_Archive.docx). If you intend to use the data provided in Section 3, we strongly advise you to review this document, as it describes in detail the output of all geostatistical analyses performed in this study.
Data Files:
Section 1. "Section_1.Tables_S1_S7.Figures_S1_S11.zip"
This section contains all additional tables and figures described in the manuscript "Likely community transmission of COVID-19 infections between neighboring, persistent hotspots in Ontario, Canada". Additional tables S1 to S7 are presented in an Excel document. These 7 tables provide summary statistics of various geostatistical tests described in the study (“Section 1 – Tables S1-S4”) and list all identified single and paired high-case cluster streaks (“Section 1 – Tables S5-S7”). This section also contains 11 additional figures referred to in the manuscript (“Section 1 – Figures S1-S11”), provided both individually and within a Word document which describes them.
Section 2. "Section_2.Localized_Hotspot_Lists.zip"
All localized hotspots (identified through kriging analysis) were catalogued for each municipality evaluated (Hamilton, Kitchener/Waterloo, London, Ottawa, Toronto, Windsor/Essex). These files indicate the FSA in which the hotspot was identified, the date on which it was identified (utilizing 3-day case data at the postal code level), the number of cases which occurred within the FSA over these 3 dates, the range of cases interpolated by kriging analysis (between 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40, 40-50, >50), and whether or not the FSA was deemed a hotspot by Gi* relative to the rest of Ontario on any of the three dates evaluated. Please see Section 4 for map images of these localized hotspots.
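As a minimal sketch of how an interpolated case value could be assigned to one of the ranges listed above, the following Python helper looks up the matching label. The function name and the boundary handling (each range includes its lower edge and excludes its upper edge, so a value of exactly 50 falls in the last range) are assumptions for illustration; the archive does not state how boundary values were treated.

```python
from bisect import bisect_right

# Lower edges and labels of the interpolated-case ranges listed in Section 2.
BIN_EDGES = [5, 10, 15, 20, 25, 30, 35, 40, 50]
BIN_LABELS = ["5-10", "10-15", "15-20", "20-25", "25-30",
              "30-35", "35-40", "40-50", ">50"]


def kriging_range_label(value):
    """Return the range label for an interpolated case value.

    Returns None for values below 5, which fall outside every listed range.
    Assumption: ranges are lower-inclusive and upper-exclusive.
    """
    idx = bisect_right(BIN_EDGES, value) - 1
    return BIN_LABELS[idx] if idx >= 0 else None


if __name__ == "__main__":
    for v in (3.2, 12.3, 45, 73.5):
        print(v, kriging_range_label(v))
```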
Section 3. "Section_3.All-Data_Files.Kriging_GiStar_Local_and_GlobalMorans.2020_2021"
Section 3 – All output files from the geostatistical tests performed in this study are provided in this section. This includes the output from Ontario-wide FSA-level Gi* and Cluster and Outlier analyses, and PC-level Cluster and Outlier, Spatial Autocorrelation, and kriging analysis of 6 municipal regions. It also includes kriging analysis of 7 other municipal regions adjacent to Toronto (Ajax, Brampton, Markham, Mississauga, Pickering, Richmond Hill and Vaughan). This section also provides data files from our analyses of stratified case data (by age, gender, and at-risk condition). All coordinates presented in these data files are given in “PCS_Lambert_Conformal_Conic” format. Case values between 1-5 were masked (appear as “NA”).
Section 4. "Section_4.All_Map_Images_of_Geostat_Analyses.zip"
Sets of image files which map the results of our geostatistical analyses onto a map of Ontario or within the municipalities evaluated (Hamilton, Kitchener/Waterloo, London, Ottawa, Toronto, Windsor/Essex) are provided. This includes: Kriging analysis (PC-level), Local Moran's I cluster and outlier analysis (FSA and PC-level), normal and space-time Gi* analysis, and all images for all analyses performed on stratified data (by age, gender and at-risk condition). Kriging contour maps are also included for 7 other municipal regions adjacent to Toronto (Ajax, Brampton, Markham, Mississauga, Pickering, Richmond Hill and Vaughan).
Software:
This Zenodo archive also provides all program files pertaining to the Geostatistical Epidemiology Toolbox (Geostatistical analysis software package to be used in ArcGIS), as well as all other scripts described in this manuscript. This geostatistical toolbox was developed by CytoGnomix Inc., London ON, Canada and is distributed freely under the terms of the GNU General Public License v3.0. It can be easily modified to accommodate other Canadian provinces and, with some additional effort, other countries.
This distribution of the Geostatistical Epidemiology Toolbox does not include postal code (PC) boundary files (which are required for some of the tools included in the toolbox). The PC boundary shapefiles used to test the toolbox were obtained from DMTI (https://www.dmtispatial.com/canmap/) through the Scholar's Geoportal at the University of Western Ontario (http://geo2.scholarsportal.info/). The distribution of these files (through sharing, sale, donation, transfer, or exchange) is strictly prohibited. However, any equivalent PC boundary shape file should suffice, provided it contains polygon boundaries representing postal code regions (see guide for more details).
Software File 1. "Software.GeostatisticalEpidemiologyToolbox.zip"
The Geostatistical Epidemiology Toolbox is a set of custom Python-based geoprocessing tools which function like any built-in tool in the ArcGIS system. This toolbox implements data preprocessing, geostatistical analysis and post-processing software developed to evaluate the distribution and progression of COVID-19 cases in Canada. The toolbox was developed to allow external users without programming knowledge to utilize the software scripts which generated our analyses, and it is intended for evaluating Canadian datasets. While the toolbox was developed for evaluating the distribution of COVID-19, it could be utilized for other purposes.
The toolbox was developed to evaluate statistically significant distributions of COVID-19 case data at Canadian Forward Sortation Area (FSA) and Postal Code-level in the province of Ontario utilizing geostatistical tools available through the ArcGIS system. These tools include: 1) standard Gi* analysis (finds areas where cases are significantly spatially clustered), 2) space-time based Gi* analysis (finds areas where cases are both spatially and temporally clustered), 3) cluster and outlier analysis (determines if a high-case region is a regional outlier or part of a case cluster), 4) spatial autocorrelation (determines whether the cases in a region are clustered overall) and 5) Empirical Bayesian Kriging analysis (creates contour maps which define the interpolation of COVID-19 cases in measured and unmeasured areas). Post-processing tools are included that import all of the preceding results into the ArcGIS system and automatically generate PNG images.
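To illustrate what the standard Gi* analysis measures, here is a small pure-Python sketch of the Getis-Ord Gi* z-score. It is not the toolbox's code (the toolbox calls the ArcGIS tools); the function name, the list-of-lists weights matrix, and the toy data in the usage note are assumptions for illustration only.

```python
import math


def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for every area.

    values  : list of case counts, one per area.
    weights : n x n list of lists; weights[i][j] is the spatial weight
              between areas i and j. The diagonal is included because
              Gi* treats each area as its own neighbor.
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation of the case counts.
    variance = max(sum(v * v for v in values) / n - mean * mean, 0.0)
    s = math.sqrt(variance)
    z_scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt(max(n * sum_w2 - sum_w ** 2, 0.0) / (n - 1))
        # Degenerate cases (no variation, or every area is a neighbor).
        z_scores.append(numerator / denominator if denominator else 0.0)
    return z_scores


if __name__ == "__main__":
    # Toy example: six areas, the last three with high counts.
    counts = [1, 1, 1, 10, 12, 11]
    low, high = [1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]
    w = [low, low, low, high, high, high]
    print(getis_ord_gi_star(counts, w))
```

Large positive z-scores mark areas surrounded by high counts (hot spots) and large negative z-scores mark areas surrounded by low counts (cold spots).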
This archive also includes a guide ("UserManual_GeostatisticalEpidemiologyToolbox_CytoGnomix.pdf") which describes in detail how to set up the toolbox, how to format input case data, and how to use each tool (describing both the relevant input parameters and the structure of the resultant output files).
Software File 2: “Software.Additional_Programs_for_Cluster_Outlier_Streak_Idendification_and_Pairing.zip"
In the manuscript associated with this archive, Perl scripts were utilized to evaluate postal code-level Cluster and Outlier analysis to identify significantly, highly clustered postal codes over consecutive periods (i.e., high-case cluster “streaks”). The identified streaks are then paired to those in close proximity, based on the neighbors of each postal code from PC centroid data ("paired streaks"). Multinomial logistic regression models were then derived in the R programming language to measure the correlation between the number of cases reported in each paired streak, the interval of time separating each streak, and the physical distance between the two postal codes. Here, we provide the 3 Perl scripts and the R markdown file which perform these tasks:
“Ontario_City_Closest_Postal_Code_Identification.pl”
Using an input file with postal code coordinates (by centroid), this program identifies the nearest neighbors to all postal codes for a given municipal region (the name of this region is entered on the command line). Postal code centroids were calculated in ArcGIS using the “Calculate Geometry” function against DMTI postal code boundary files (not provided). Input from other sources could be used, however, as long as the input includes a list of coordinates with a unique label associated with a particular municipality.
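The nearest-neighbor step can be sketched in Python as follows. This is not a port of the Perl script: the function name, the dictionary input, and the neighbor count `k` are hypothetical, and the labels in the usage example are placeholders rather than real postal codes.

```python
import math


def nearest_postal_codes(centroids, k=5):
    """Map each postal code to its k nearest neighbors by centroid distance.

    centroids : dict of label -> (x, y) in a projected coordinate system,
                so that straight-line distance is meaningful.
    k         : hypothetical neighbor count; the setting used by the
                original script is not stated in this description.
    """
    neighbors = {}
    for code, (x, y) in centroids.items():
        distances = [
            (math.hypot(x - ox, y - oy), other)
            for other, (ox, oy) in centroids.items()
            if other != code
        ]
        distances.sort()  # closest first; ties broken by label
        neighbors[code] = [other for _, other in distances[:k]]
    return neighbors


if __name__ == "__main__":
    demo = {"A": (0, 0), "B": (1, 0), "C": (5, 0), "D": (0, 2)}
    print(nearest_postal_codes(demo, k=2))
```

The brute-force comparison is quadratic in the number of centroids, which is usually acceptable for a single municipality; a spatial index would be the natural replacement for larger inputs.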
The output of this program (for the same municipal region being evaluated) is required for the following two Perl scripts:
“Local_Morans_Analysis.Recurrent_Clustered_PC_Identifier.pl”
This program uses the output of postal code-level Cluster and Outlier analysis for a municipality (these files are available in a second Zenodo archive: doi.org/10.5281/zenodo.5585812) and the output from “Ontario_City_Closest_Postal_Code_Identification.pl” (for the same municipal region) as input to identify high-case clustered postal codes that occur consecutively over a course of several dates (referred to as high-case cluster “streaks”). The script allows for a single day in which the PC was either not clustered or did not meet the minimum case count threshold of ≥ 6 cases within the 3-day sliding window (i.e. if
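A rough Python sketch of the streak idea follows. The description of the exact rule is cut off above, so the details here are assumptions: the function name and parameters are hypothetical, each date is reduced to a single True/False flag (clustered and meeting the case threshold), and any isolated single-day gap is bridged.

```python
def find_streaks(flags, max_gap=1, min_length=2):
    """Return (start, end) index pairs of streaks in a date-ordered flag list.

    flags      : list of booleans, True when the postal code was a high-case
                 cluster that met the case threshold on that date.
    max_gap    : longest run of False days tolerated inside a streak
                 (1 mirrors the single-day allowance mentioned above).
    min_length : hypothetical minimum number of days spanned by a streak.
    """
    streaks = []
    start = last_hit = None
    for i, hit in enumerate(flags):
        if not hit:
            continue
        if start is None:
            start = last_hit = i
        elif i - last_hit - 1 <= max_gap:
            last_hit = i  # extend the streak across the gap
        else:
            if last_hit - start + 1 >= min_length:
                streaks.append((start, last_hit))
            start = last_hit = i  # begin a new streak
    if start is not None and last_hit - start + 1 >= min_length:
        streaks.append((start, last_hit))
    return streaks


if __name__ == "__main__":
    T, F = True, False
    print(find_streaks([T, T, F, T, F, F, T, T, T, F, T]))
```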
Contains the following information:
COVID cases, case prevalence over different time spans, current COVID hotspots, and number of tests for the ABQ metro area at zip code level.
Social vulnerability factors for the ABQ metro area at zip code level.
COVID deaths at the small area level.
The location of testing sites (updated regularly as new sites and information are found).
The spread of COVID, testing, deaths, and PPE supply information by nursing homes (updated regularly).
The locations of summer meal sites.
This dashboard runs in this app: https://nmcdc.maps.arcgis.com/apps/MapSeries/index.html?appid=1ff0aa71c0ae427cbb5753d08ae19eab
This dashboard runs the following maps:
Social Vulnerability Index, Albuquerque Metro Area, Census Tracts & Zip Codes, 2018 - https://nmcdc.maps.arcgis.com/home/item.html?id=850e8f2e7c394fb99041b94f813cb5fa
COVID-19 Testing Locations - New Mexico - https://nmcdc.maps.arcgis.com/home/item.html?id=aace827af8fa4d2d9037ce5c7fb0e880
COVID Deaths, NM Small Areas - CABQ - https://nmcdc.maps.arcgis.com/home/item.html?id=a56dab27204b4573a7f8d1663bc95844
COVID-19 TESTING & CASES by TIME PERIODS, ZIP CODES - v1 - https://nmcdc.maps.arcgis.com/home/item.html?id=14e05ddda38d40cb9746750072d00c80
Summer Meal Sites - CABQ - https://nmcdc.maps.arcgis.com/home/item.html?id=5fb8f3e689df4f03ab8be107d04fcd30
Nursing Homes, COVID-19 Cases and Deaths, New Mexico and USA - https://nmcdc.maps.arcgis.com/home/item.html?id=8e74a05a32324aa3bcc07e2b1545d446
The following report outlines the workflow used to optimize your Find Hot Spots result:
Initial Data Assessment
There were 2933 valid input features.
There were 3108 valid input aggregation areas.
There were 66 outlier locations; these will not be used to compute the optimal fixed distance band.
Incident Aggregation
Analysis was based on the number of points in each polygon cell.
Analysis was performed on all aggregation areas.
The aggregation process resulted in 3108 weighted areas.
Incident Count Properties: Min 0.0000, Max 0.0015, Mean 0.0001, Std. Dev. 0.0001
Scale of Analysis
The optimal fixed distance band was based on the average distance to 30 nearest neighbors: 150682.0000 Meters.
Hot Spot Analysis
There are 865 output features statistically significant based on a FDR correction for multiple testing and spatial dependence.
Output
Red output features represent hot spots where high incident counts cluster.
Blue output features represent cold spots where low incident counts cluster.
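For readers unfamiliar with FDR correction, the sketch below shows the standard Benjamini-Hochberg procedure in Python. This is an assumption about the general technique, not a reproduction of the report's method: the report does not describe the exact procedure, the function name and the `alpha` default are hypothetical, and the spatial-dependence adjustment it mentions is not modeled here.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where a p-value survives FDR control.

    Standard Benjamini-Hochberg step-up procedure: sort the p-values,
    find the largest rank k with p_(k) <= (k / m) * alpha, and call the
    k smallest p-values significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


if __name__ == "__main__":
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(benjamini_hochberg(p))
```

Compared with testing every feature at a fixed threshold, this lowers the cutoff as the number of tests grows, which is why fewer features are reported as significant after correction.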