58 datasets found

Data for: COVID-19 Dataset: Worldwide Spread Log Including Countries First...
data.mendeley.com
Updated Jul 20, 2020
+ more versions
Cite
Hasmot Ali (2020). Data for: COVID-19 Dataset: Worldwide Spread Log Including Countries First Case And First Death [Dataset]. http://doi.org/10.17632/vw427wzzkk.5
Explore at:
Unique identifier
https://doi.org/10.17632/vw427wzzkk.5
Dataset updated
Jul 20, 2020
Authors
Hasmot Ali
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Contains informative data related to the COVID-19 pandemic, specifically the First Case and First Death information for every country. The datasets focus on two major fields. The first is First Case, which consists of the Date of First Case(s), the Number of Confirmed Case(s) on the First Day, the Age of the Patient(s) of the First Case, and the Last Visited Country. The second is First Death, which consists of the Date of First Death and the Age of the Patient who died first, given for every country with its corresponding continent. The datasets also contain a binary matrix of the spread chain among different countries and regions.

*This is not a country but a ship. The name of the cruise ship was not given by the government.
"N+": the age is not specified but greater than N
“No Trace”: some data was not found
“Unspecified”: not available from the authority
“N/A”: in the “Last Visited Country(s) of Confirmed Case(s)” column, “N/A” indicates that the confirmed case(s) in those countries had no recent travel history; in the “Age of First Death(s)” column, “N/A” indicates that those countries had no death case as of May 16, 2020.
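The legend above implies that an age cell can hold an exact number, a lower bound such as "60+", or one of three missing-value markers. A minimal sketch of how a reader might normalise such cells is shown below; the function name `parse_age` and the returned status labels are illustrative assumptions and are not part of the dataset.

```python
import re
from typing import Optional, Tuple

# Missing-value markers taken from the legend, compared case-insensitively.
MISSING_MARKERS = {"no trace", "unspecified", "n/a"}


def parse_age(raw: str) -> Tuple[Optional[int], str]:
    """Return (age, status) for one age cell.

    status is "exact", "lower_bound" (for "N+"), one of the missing
    markers in lower case, or "unparsed" for anything else.
    """
    # Strip whitespace and any straight or curly quotation marks.
    text = raw.strip().strip('"“”').strip()
    if text.lower() in MISSING_MARKERS:
        return None, text.lower()
    match = re.fullmatch(r"(\d+)(\+?)", text)
    if match is None:
        return None, "unparsed"
    age = int(match.group(1))
    status = "lower_bound" if match.group(2) else "exact"
    return age, status
```

Keeping the status next to the number lets later analysis decide whether lower bounds should be treated as exact ages or excluded.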
Supporting dataset for the bachelor thesis: Simulating the Spread of...
figshare.com
data.4tu.nl
mp4
Updated May 31, 2023
Cite
Marko Boon; Nikki Steenbakkers; Bert Zwart (2023). Supporting dataset for the bachelor thesis: Simulating the Spread of COVID-19 in the Netherlands [Dataset]. http://doi.org/10.4121/13536614.v1
Explore at:
Available download formats: mp4
Unique identifier
https://doi.org/10.4121/13536614.v1
Dataset updated
May 31, 2023
Dataset provided by
4TU.ResearchData
Authors
Marko Boon; Nikki Steenbakkers; Bert Zwart
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Area covered
Netherlands
Description
These files are videos generated by a stochastic simulation that was created by Nikki Steenbakkers under the supervision of Marko Boon and Bert Zwart (all affiliated with Eindhoven University of Technology) for her bachelor final project "Simulating the Spread of COVID-19 in the Netherlands". The report can be found in the TU/e repository of bachelor project reports: https://research.tue.nl/en/studentTheses/simulating-the-spread-of-covid-19-in-the-netherlands
The report contains more information about the project and the simulation. It explicitly refers to these files.
Data from: A Data set for Information Spreading over the News
live.european-language-grid.eu
txt
Updated Nov 28, 2021
+ more versions
Cite
(2021). A Data set for Information Spreading over the News [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7719
Explore at:
Available download formats: txt
Dataset updated
Nov 28, 2021
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract:
Analyzing the spread of information related to a specific event in the news has many potential applications. Consequently, various systems have been developed to facilitate the analysis of information spreading, such as detection of disease propagation and identification of the spreading of fake news through social media. There are several open challenges in the process of discerning information propagation, among them the lack of resources for training and evaluation. This paper describes the process of compiling a corpus from the EventRegistry global media monitoring system. We focus on information spreading in three domains: sports (i.e. the FIFA World Cup), natural disasters (i.e. earthquakes), and climate change (i.e. global warming). This corpus is a valuable addition to the currently available datasets to examine the spreading of information about various kinds of events.

Introduction:
Domain-specific gaps in information spreading are ubiquitous and may exist due to economic conditions, political factors, or linguistic, geographical, time-zone, cultural, and other barriers. These factors potentially contribute to obstructing the flow of local as well as international news. We believe that there is a lack of research studies that examine, identify, and uncover the reasons for barriers in information spreading. Additionally, there is limited availability of datasets containing news text and metadata including time, place, source, and other relevant information. When a piece of information starts spreading, it implicitly raises questions such as:

How far does the information in the form of news reach out to the public?
Does the content of news remain the same or change to a certain extent?
Do cultural values impact the information, especially when the same news is translated into other languages?

Statistics about datasets:
--------------------------------------------------------------------------------------------------------------------------------------
#   Domain             Event Type       Articles Per Language                      Total Articles
1   Sports             FIFA World Cup   983-en, 762-sp, 711-de, 10-sl, 216-pt      2679
2   Natural Disaster   Earthquake       941-en, 999-sp, 937-de, 19-sl, 251-pt      3194
3   Climate Changes    Global Warming   996-en, 298-sp, 545-de, 8-sl, 97-pt        1945
--------------------------------------------------------------------------------------------------------------------------------------
A Twitter Dataset of 70+ million tweets related to COVID-19
zenodo.org
csv, tsv, zip
Updated Apr 17, 2023
Cite
Juan M. Banda; Ramya Tekumalla; Gerardo Chowell (2023). A Twitter Dataset of 70+ million tweets related to COVID-19 [Dataset]. http://doi.org/10.5281/zenodo.3732460
Explore at:
Available download formats: csv, tsv, zip
Unique identifier
https://doi.org/10.5281/zenodo.3732460
Dataset updated
Apr 17, 2023
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Juan M. Banda; Ramya Tekumalla; Gerardo Chowell
Description
Due to the relevance of the COVID-19 global pandemic, we are releasing our dataset of tweets acquired from the Twitter Stream related to COVID-19 chatter. The first 9 weeks of data (from January 1st, 2020 to March 11th, 2020) contain very low tweet counts because we filtered them out of other data we were collecting for other research purposes; however, one can see the dramatic increase as awareness of the virus spread. Dedicated data gathering ran from March 11th to March 29th and yielded over 4 million tweets a day.

The data collected from the stream captures all languages, but the most prevalent are English, Spanish, and French. We release all tweets and retweets in the full_dataset.tsv file (70,569,368 unique tweets), and a cleaned version with no retweets in the full_dataset-clean.tsv file (13,535,912 unique tweets). There are several practical reasons for us to keep the retweets; tracing important tweets and their dissemination is one of them. For NLP tasks we provide the top 1000 frequent terms in frequent_terms.csv, the top 1000 bigrams in frequent_bigrams.csv, and the top 1000 trigrams in frequent_trigrams.csv. Some general statistics per day are included for both datasets in the statistics-full_dataset.tsv and statistics-full_dataset-clean.tsv files.
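For readers unfamiliar with the term lists mentioned above, the sketch below shows one common way to count the most frequent n-grams in a collection of texts. It is an illustration only: the authors' actual tokenisation and counting procedure is not described here, and the whitespace tokeniser and the function name `top_ngrams` are assumptions.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_ngrams(texts: Iterable[str], n: int, k: int) -> List[Tuple[Tuple[str, ...], int]]:
    """Return the k most frequent n-grams across all texts."""
    counts: Counter = Counter()
    for text in texts:
        # Assumption: lower-case the text and split on whitespace.
        tokens = text.lower().split()
        # zip over n shifted views of the token list yields the n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)
```

Calling it with n=1, n=2, and n=3 gives term, bigram, and trigram rankings respectively.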

More details can be found (and will be updated faster) at: https://github.com/thepanacealab/covid19_twitter

As always, the tweets distributed here are only tweet identifiers (with date and time added) due to the terms and conditions of Twitter on re-distributing Twitter data. They need to be hydrated to be used.
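Because each row carries only an identifier plus a date and time, simple per-day counts can be computed before any hydration. The sketch below assumes a tab-separated file with a header row and columns named `tweet_id`, `date`, and `time`; these column names are assumptions and should be checked against the actual files.

```python
import csv
import io
from collections import Counter
from typing import Dict, TextIO


def tweets_per_day(handle: TextIO, date_column: str = "date") -> Dict[str, int]:
    """Count rows per calendar day in a tab-separated file of tweet identifiers."""
    reader = csv.DictReader(handle, delimiter="\t")
    return dict(Counter(row[date_column] for row in reader))


# Tiny in-memory stand-in for a real file; the values are made up.
sample = io.StringIO(
    "tweet_id\tdate\ttime\n"
    "1001\t2020-03-11\t00:00:01\n"
    "1002\t2020-03-11\t00:00:05\n"
    "1003\t2020-03-12\t09:30:00\n"
)
daily_counts = tweets_per_day(sample)
```

For the real files, which hold tens of millions of rows, passing an open file handle keeps memory use low because rows are read one at a time.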
Statistically downscaled climate indices from CMIP6 global climate models...
open.canada.ca
data.urbandatacentre.ca
+3more
html, netcdf
Updated Jan 28, 2025
+ more versions
Cite
Environment and Climate Change Canada (2025). Statistically downscaled climate indices from CMIP6 global climate models (CanDCS-U6 & CanDCS-M6) [Dataset]. https://open.canada.ca/data/dataset/764720d5-8c0a-4e1e-93fc-d9e3eb0ab6b3
Explore at:
Available download formats: html, netcdf
Dataset updated
Jan 28, 2025
Dataset provided by
Environment And Climate Change Canada: https://www.canada.ca/en/environment-climate-change.html
License
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Time period covered
Jan 1, 1951 - Dec 31, 2100
Description
Environment and Climate Change Canada’s (ECCC) Climate Research Division (CRD) and the Pacific Climate Impacts Consortium (PCIC) previously produced statistically downscaled climate scenarios based on simulations from climate models that participated in the Coupled Model Intercomparison Project phase 5 (CMIP5) in 2015. ECCC and PCIC have now updated the CMIP5-based downscaled scenarios with two new sets of downscaled scenarios based on the next generation of climate projections from the Coupled Model Intercomparison Project phase 6 (CMIP6). The scenarios are named Canadian Downscaled Climate Scenarios–Univariate method from CMIP6 (CanDCS-U6) and Canadian Downscaled Climate Scenarios–Multivariate method from CMIP6 (CanDCS-M6). CMIP6 climate projections are based on both updated global climate models and new emissions scenarios called “Shared Socioeconomic Pathways” (SSPs). Statistically downscaled datasets have been produced from 26 CMIP6 global climate models (GCMs) under three different emission scenarios (i.e., SSP1-2.6, SSP2-4.5, and SSP5-8.5), with PCIC later adding SSP3-7.0 to the CanDCS-M6 dataset. The CanDCS-U6 was downscaled using the Bias Correction/Constructed Analogues with Quantile mapping version 2 (BCCAQv2) procedure, and the CanDCS-M6 was downscaled using the N-dimensional Multivariate Bias Correction (MBCn) method. The CanDCS-U6 dataset was produced using the same downscaling target data (NRCANmet) as the CMIP5-based downscaled scenarios, while the CanDCS-M6 dataset implements a new target dataset (ANUSPLIN and PNWNAmet blended dataset). Statistically downscaled individual model output and ensembles are available for download. Downscaled climate indices are available across Canada at 10km grid spatial resolution for the 1950-2014 historical period and for the 2015-2100 period following each of the three emission scenarios. A total of 31 climate indices have been calculated using the CanDCS-U6 and CanDCS-M6 datasets. 
The climate indices include 27 Climdex indices established by the Expert Team on Climate Change Detection and Indices (ETCCDI) and 4 additional indices that are slightly modified from the Climdex indices. These indices are calculated from daily precipitation and temperature values from the downscaled simulations and are available at annual or monthly temporal resolution, depending on the index. Monthly indices are also available in seasonal and annual versions. Note: projected future changes by statistically downscaled products are not necessarily more credible than those by the underlying climate model outputs. In many cases, especially for absolute threshold-based indices, projections based on downscaled data have a smaller spread because of the removal of model biases. However, this is not the case for all indices. Downscaling from GCM resolution to the fine resolution needed for impacts assessment increases the level of spatial detail and temporal variability to better match observations. Since these adjustments are GCM dependent, the resulting indices could have a wider spread when computed from downscaled data as compared to those directly computed from GCM output. In the latter case, it is not the downscaling procedure that makes future projection more uncertain; rather, it is indicative of higher variability associated with finer spatial scale. Individual model datasets and all related derived products are subject to the terms of use (https://pcmdi.llnl.gov/CMIP6/TermsOfUse/TermsOfUse6-1.html) of the source organization.
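To make the idea of a Climdex index concrete, the sketch below computes one of the ETCCDI indices, frost days (FD), defined as the annual count of days on which the daily minimum temperature is below 0 °C. It works on a plain list of (date, temperature) pairs with made-up inputs; it is not the procedure used to produce the gridded CanDCS products, and the function name is an assumption.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def frost_days(daily_tmin: Iterable[Tuple[date, float]]) -> Dict[int, int]:
    """Annual count of days with daily minimum temperature below 0 degrees Celsius."""
    counts: Dict[int, int] = defaultdict(int)
    for day, tmin_c in daily_tmin:
        # Adding 0 for non-frost days keeps years without frost in the result.
        counts[day.year] += int(tmin_c < 0.0)
    return dict(counts)
```

The same pattern, a threshold test on daily values aggregated per year or month, underlies the other absolute threshold-based indices mentioned in the note.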
INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET
data.niaid.nih.gov
Updated Jul 19, 2024
Cite
Nafiz Sadman (2024). INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4047647
Explore at:
Dataset updated
Jul 19, 2024
Dataset provided by
Kishor Datta Gupta
Nafiz Sadman
Nishat Anjum
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States, Bangladesh
Description
Introduction

There are several works based on Natural Language Processing on newspaper reports. Rameshbhai et al. [ 1 ] mined opinions from headlines using Stanford NLP and SVM and compared several algorithms on a small and a large dataset. Rubin et al., in their paper [ 2 ], created a mechanism to differentiate fake news from real ones by building a set of characteristics of news according to their types. The purpose was to contribute to the low resource data available for training machine learning algorithms. Doumit et al. in [ 3 ] implemented LDA, a topic modeling approach, to study bias present in online news media.

However, not much NLP research has been invested in studying COVID-19. Most applications include classification of chest X-rays and CT scans to detect the presence of pneumonia in the lungs [ 4 ], a consequence of the virus. Other research areas include studying the genome sequence of the virus [ 5 ][ 6 ][ 7 ] and replicating its structure to fight the virus and find a vaccine. This research is crucial in battling the pandemic. The few NLP-based research publications include sentiment classification of online tweets by Samuel et al. [ 8 ] to understand the fear persisting in people due to the virus. Similar work has been done using the LSTM network to classify sentiments from online discussion forums by Jelodar et al. [ 9 ]. To the best of our knowledge, the NKK dataset is the first study on a comparatively larger dataset of newspaper reports on COVID-19, which contributes to awareness of the virus.

2 Data-set Introduction

2.1 Data Collection

We accumulated 1000 online newspaper reports from the United States of America (USA) on COVID-19. The newspapers include The Washington Post (USA) and StarTribune (USA). We named this collection “Covid-News-USA-NNK”. We also accumulated 50 online newspaper reports from Bangladesh on the issue and named the collection “Covid-News-BD-NNK”. The newspapers include The Daily Star (BD) and Prothom Alo (BD). All these newspapers are among the top providers and the most read in their respective countries. The collection was done manually by 10 human data-collectors of age group 23- with university degrees. This approach was preferred over automation to ensure that the news was highly relevant to the subject. The newspaper websites had dynamic content with advertisements in no particular order, so there was a high chance that online scrapers would collect inaccurate news reports. One of the challenges while collecting the data was the subscription requirement: each newspaper required $1 per subscription. Some criteria provided as guidelines to the human data-collectors were as follows:

The headline must have one or more words directly or indirectly related to COVID-19.

The content of each news must have 5 or more keywords directly or indirectly related to COVID-19.

The genre of the news can be anything as long as it is relevant to the topic. Political, social, economical genres are to be more prioritized.

Avoid taking duplicate reports.

Maintain a time frame for the above mentioned newspapers.

To collect these data, we used a Google Form for USA and BD. Two human editors went through each entry to check for spam or troll entries.
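The collection itself was done by hand. Purely as an illustration, the headline, body-keyword, and duplicate criteria listed above could be checked automatically along the following lines. The keyword set, the field names `headline` and `body`, the choice to count keyword occurrences rather than distinct keywords, and the duplicate test (identical headline) are all assumptions made for this sketch.

```python
import re
from typing import Dict, Iterable, List, Set

# Hypothetical keyword list; the collectors' actual list is not given.
COVID_KEYWORDS = {
    "covid", "coronavirus", "pandemic", "quarantine",
    "lockdown", "virus", "outbreak",
}


def count_keyword_hits(text: str, keywords: Set[str]) -> int:
    """Count how many tokens of the text appear in the keyword set."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sum(1 for token in tokens if token in keywords)


def select_reports(
    reports: Iterable[Dict[str, str]],
    keywords: Set[str] = COVID_KEYWORDS,
    min_body_hits: int = 5,
) -> List[Dict[str, str]]:
    """Keep reports whose headline has at least one keyword hit and whose
    body has at least min_body_hits, skipping repeated headlines."""
    seen_headlines: Set[str] = set()
    kept: List[Dict[str, str]] = []
    for report in reports:
        key = report["headline"].strip().lower()
        if key in seen_headlines:
            continue  # criterion: avoid duplicate reports
        if count_keyword_hits(report["headline"], keywords) < 1:
            continue  # criterion: headline must mention the topic
        if count_keyword_hits(report["body"], keywords) < min_body_hits:
            continue  # criterion: body needs 5 or more keywords
        seen_headlines.add(key)
        kept.append(report)
    return kept
```

A check like this cannot judge indirect relevance or genre, which is one reason manual collection may still be preferable.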

2.2 Data Pre-processing and Statistics

Some pre-processing steps performed on the newspaper report dataset are as follows:

Remove hyperlinks.

Remove non-English alphanumeric characters.

Remove stop words.

Lemmatize text.

While more pre-processing could have been applied, we tried to keep the data as unchanged as possible, since changing sentence structures could result in the loss of valuable information. While this was done with the help of a script, we also assigned the same human collectors to cross-check for any presence of the above-mentioned criteria.
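The four pre-processing steps can be sketched as a single function. This is not the authors' script: the stop-word list is a tiny illustrative sample, "non-English alphanumeric characters" is interpreted here as anything other than ASCII letters, digits, and whitespace, lower-casing is added so that stop words match, and lemmatization is left as a pluggable function because it needs an external library.

```python
import re
from typing import Callable, List

# Tiny illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "are", "for"}


def preprocess(text: str, lemmatize: Callable[[str], str] = lambda word: word) -> List[str]:
    """Apply the four listed steps and return the remaining tokens."""
    # 1. Remove hyperlinks.
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    # 2. Remove characters other than ASCII letters, digits, and whitespace.
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    # 3. Remove stop words (after lower-casing, an added assumption).
    tokens = [token for token in text.lower().split() if token not in STOP_WORDS]
    # 4. Lemmatize; the default leaves tokens unchanged.
    return [lemmatize(token) for token in tokens]
```

A lemmatizer from an NLP library can be passed in as the second argument without changing the rest of the function.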

The primary data statistics of the two datasets are shown in Tables 1 and 2.

Table 1: Covid-News-USA-NNK data statistics

No of words per headline: 7 to 20
No of words per body content: 150 to 2100

Table 2: Covid-News-BD-NNK data statistics

No of words per headline: 10 to 20
No of words per body content: 100 to 1500

2.3 Dataset Repository

We used GitHub as our primary data repository under the account name NKK^1. There, we created two repositories, USA-NKK^2 and BD-NNK^3. The dataset is available in both CSV and JSON formats. We regularly update the CSV files and regenerate the JSON using a Python script. We also provide a Python script file for essential operations. We welcome all outside collaboration to enrich the dataset.
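Regenerating JSON from a CSV file can be done with the Python standard library alone. The sketch below is a generic illustration, not the script shipped in the repositories; the function name and the assumption that the CSV has a header row are ours.

```python
import csv
import io
import json
from typing import TextIO


def csv_to_json(csv_handle: TextIO) -> str:
    """Read a CSV file with a header row and return its rows as a JSON array."""
    rows = list(csv.DictReader(csv_handle))
    # ensure_ascii=False keeps non-ASCII text readable in the output.
    return json.dumps(rows, ensure_ascii=False, indent=2)
```

Each CSV row becomes one JSON object keyed by the column headers, so the two formats stay in step whenever the CSV is updated and the function is rerun.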

3 Literature Review

Natural Language Processing (NLP) deals with text (also known as categorical) data in computer science, utilizing numerous diverse methods like one-hot encoding, word embedding, etc., that transform text to machine language, which can be fed to multiple machine learning and deep learning algorithms.
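As a small illustration of one of the encodings named above, one-hot encoding maps each word of a vocabulary to a vector that is zero everywhere except at that word's index. The helper names below are assumptions made for this sketch.

```python
from typing import Dict, List


def build_vocabulary(tokens: List[str]) -> Dict[str, int]:
    """Assign each distinct word an index, in alphabetical order."""
    return {word: index for index, word in enumerate(sorted(set(tokens)))}


def one_hot(word: str, vocabulary: Dict[str, int]) -> List[int]:
    """Return a vector with a single 1 at the word's vocabulary index."""
    vector = [0] * len(vocabulary)
    vector[vocabulary[word]] = 1
    return vector
```

The vector length equals the vocabulary size, which is why dense word embeddings are usually preferred for large vocabularies.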

Some well-known applications of NLP include fraud detection on online media sites [ 10 ], authorship attribution in fallback authentication systems [ 11 ], intelligent conversational agents or chatbots [ 12 ], and machine translation as used by Google Translate [ 13 ]. While these are all downstream tasks, several exciting developments have been made in algorithms designed solely for Natural Language Processing tasks. The two most trending ones are BERT [ 14 ], which uses a bidirectional Transformer encoder architecture and can perform classification tasks and word prediction with near-perfect results, and the GPT-3 models released by OpenAI [ 15 ], which can generate almost human-like text. However, these are all pre-trained models, since training them carries a huge computation cost. Information Extraction is a generalized concept of retrieving information from a dataset. Information extraction from an image could be retrieving vital feature spaces or targeted portions of an image; information extraction from speech could be retrieving information about names, places, etc. [ 16 ]. Information extraction in texts could be identifying named entities and locations or essential data. Topic modeling is a sub-task of NLP and also a process of information extraction. It clusters words and phrases of the same context together into groups. Topic modeling is an unsupervised learning method that gives us a brief idea about a set of texts. One commonly used topic modeling method is Latent Dirichlet Allocation, or LDA [ 17 ].

Keyword extraction is a process of information extraction and a sub-task of NLP that extracts essential words and phrases from a text. TextRank [ 18 ] is an efficient keyword extraction technique that uses a graph to calculate the weight of each word and picks the words with the most weight.
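A minimal sketch of graph-based keyword ranking in the spirit of TextRank is given below. It links words that occur within a small window of each other and then runs an unweighted PageRank-style iteration over that graph. The window size, damping factor, iteration count, and function name are illustrative assumptions, and the sketch omits steps such as part-of-speech filtering that a full implementation would normally include.

```python
from collections import defaultdict
from typing import Dict, List, Set


def textrank_keywords(
    tokens: List[str],
    window: int = 2,
    damping: float = 0.85,
    iterations: int = 50,
    top_k: int = 3,
) -> List[str]:
    """Rank words by a PageRank-style score on a co-occurrence graph."""
    # Build an undirected graph: words within `window` positions are linked.
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbours[word].add(other)
                neighbours[other].add(word)
    if not neighbours:
        return []
    # Iterate the score update: each word receives a share of the score
    # of every neighbour, divided by that neighbour's number of links.
    scores = {word: 1.0 for word in neighbours}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbours[n]) for n in neighbours[word])
            for word in neighbours
        }
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda word: (-scores[word], word))
    return ranked[:top_k]
```

Words that co-occur with many different words accumulate the highest scores, which is what makes them candidate keywords.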

Word clouds are a great visualization technique to understand the overall ’talk of the topic’. The clustered words give us a quick understanding of the content.

4 Our experiments and Result analysis

We used the wordcloud library^4 to create the word clouds. Figures 1 and 3 present the word clouds of the Covid-News-USA-NNK dataset by month from February to May. From Figures 1, 2, and 3, we can note the following:

In February, both newspapers talked about China and the source of the outbreak.

StarTribune emphasized Minnesota as the state of most concern. In April, it seemed to be even more concerned.

Both newspapers talked about the virus impacting the economy, i.e., banks, elections, administrations, and markets.

The Washington Post discussed global issues more than StarTribune.

In February, StarTribune mentioned the first precautionary measure, wearing masks, and the uncontrollable spread of the virus throughout the nation.

While both newspapers mentioned the outbreak in China in February, the spread in the United States is highlighted more from March through May, displaying the critical impact caused by the virus.

We used a script to extract all numbers related to certain keywords like ’Deaths’, ’Infected’, ’Died’, ’Infections’, ’Quarantined’, ’Lock-down’, ’Diagnosed’, etc. from the news reports and compiled a series of case numbers for both newspapers. Figure 4 shows the statistics of this series. From this extraction technique, we can observe that April was the peak month for COVID cases, as the count rose gradually from February. Both newspapers clearly show that the rise in COVID cases from February to March was slower than the rise from March to April. This is an important indicator of possible recklessness in preparations to battle the virus. However, the steep fall from April to May also shows the positive response against the attack.

We used Vader Sentiment Analysis to extract the sentiment of the headlines and the body. On average, the sentiments were from -0.5 to -0.9. The Vader Sentiment scale ranges from -1 (highly negative) to 1 (highly positive). There were some cases where the sentiment scores of the headline and body contradicted each other, i.e., the sentiment of the headline was negative but the sentiment of the body was slightly positive. Overall, sentiment analysis can help us sort the most concerning (most negative) news from the positive ones, from which we can learn more about the indicators related to COVID-19 and the serious impact caused by it. Moreover, sentiment analysis can also provide us information about how a state or country is reacting to the pandemic.

We used the PageRank algorithm to extract keywords from headlines as well as the body content. PageRank efficiently highlights important relevant keywords in the text. Some frequently occurring important keywords extracted from both datasets are: ’China’, ’Government’, ’Masks’, ’Economy’, ’Crisis’, ’Theft’, ’Stock market’, ’Jobs’, ’Election’, ’Missteps’, ’Health’, ’Response’. Keywords extraction acts as a filter allowing quick searches for indicators in case of locating situations of the economy,
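The number-extraction step described in this section could be approximated with a regular expression that looks for a number followed, within a couple of words, by one of the keywords. The authors' script is not shown, so the pattern, the two-word gap, and the function name below are assumptions made for illustration.

```python
import re
from typing import Dict, List, Sequence

# Lower-case forms of some keywords named in the text.
KEYWORDS = ["deaths", "infected", "died", "infections", "quarantined", "diagnosed"]


def extract_counts(text: str, keywords: Sequence[str] = KEYWORDS) -> Dict[str, List[int]]:
    """Map each keyword to the numbers that appear shortly before it."""
    found: Dict[str, List[int]] = {}
    for keyword in keywords:
        # A number (commas allowed), then up to two words, then the keyword.
        pattern = re.compile(
            r"(\d[\d,]*)\s+(?:\w+\s+){0,2}" + re.escape(keyword) + r"\b",
            re.IGNORECASE,
        )
        values = [int(m.group(1).replace(",", "")) for m in pattern.finditer(text)]
        if values:
            found[keyword] = values
    return found
```

Numbers written as words ("twelve deaths") or placed after the keyword are missed by this pattern, so the result is best treated as a rough signal.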
Fire statistics data tables
gov.uk
s3.amazonaws.com
Updated Jul 10, 2025
+ more versions
Cite
Ministry of Housing, Communities and Local Government (2025). Fire statistics data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire-statistics-data-tables
Explore at:
Dataset updated
Jul 10, 2025
Dataset provided by
GOV.UK
Authors
Ministry of Housing, Communities and Local Government
Description

On 1 April 2025 responsibility for fire and rescue transferred from the Home Office to the Ministry of Housing, Communities and Local Government.

This information covers fires, false alarms and other incidents attended by fire crews, and the statistics include the numbers of incidents, fires, fatalities and casualties as well as information on response times to fires. The Ministry of Housing, Communities and Local Government (MHCLG) also collect information on the workforce, fire prevention work, health and safety and firefighter pensions. All data tables on fire statistics are below.

MHCLG has responsibility for fire services in England. The vast majority of data tables produced by the Ministry of Housing, Communities and Local Government are for England, but some tables (0101, 0103, 0201, 0501, 1401) are for Great Britain split by nation. In the past the Department for Communities and Local Government (who previously had responsibility for fire services in England) produced data tables for Great Britain and at times the UK. Similar information for devolved administrations is available at Scotland: Fire and Rescue Statistics (https://www.firescotland.gov.uk/about/statistics/), Wales: Community safety (https://statswales.gov.wales/Catalogue/Community-Safety-and-Social-Inclusion/Community-Safety) and Northern Ireland: Fire and Rescue Statistics (https://www.nifrs.org/home/about-us/publications/).

If you use assistive technology (for example, a screen reader) and need a version of any of these documents in a more accessible format, please email alternativeformats@communities.gov.uk. Please tell us what format you need. It will help us if you say what assistive technology you use.

Related content

Fire statistics guidance
Fire statistics incident level datasets

Incidents attended

FIRE0101: Incidents attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 153 KB): https://assets.publishing.service.gov.uk/media/686d2aa22557debd867cbe14/FIRE0101.xlsx
Previous FIRE0101 tables

FIRE0102: Incidents attended by fire and rescue services in England, by incident type and fire and rescue authority (MS Excel Spreadsheet, 2.19 MB): https://assets.publishing.service.gov.uk/media/686d2ab52557debd867cbe15/FIRE0102.xlsx
Previous FIRE0102 tables

FIRE0103: Fires attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 201 KB): https://assets.publishing.service.gov.uk/media/686d2aca10d550c668de3c69/FIRE0103.xlsx
Previous FIRE0103 tables

FIRE0104: Fire false alarms by reason for false alarm, England (MS Excel Spreadsheet, 492 KB): https://assets.publishing.service.gov.uk/media/686d2ad92557debd867cbe16/FIRE0104.xlsx
Previous FIRE0104 tables

Dwelling fires attended

FIRE0201: Dwelling fires attended by fire and rescue services by motive, population and nation (MS Excel Spreadsheet): https://assets.publishing.service.gov.uk/media/686d2af42cfe301b5fb6789f/FIRE0201.xlsx
Full dataset for dengue forecasting in Brazil for Infodengue-Mosqlimate...
zenodo.org
data.niaid.nih.gov
zip
Updated Sep 12, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Flávio Codeço Coelho; Claudia Torres Codeço; Iasmim Almeida; Luiz Max Carvalho; Eduardo Correa Araújo; Leonardo Bastos; Luã Bida Vacaro; Raquel Martins Lana (2024). Full dataset for dengue forecasting in Brazil for Infodengue-Mosqlimate sprint 2024 [Dataset]. http://doi.org/10.5281/zenodo.13328231
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.13328231
Dataset updated
Sep 12, 2024
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Flávio Codeço Coelho; Claudia Torres Codeço; Iasmim Almeida; Luiz Max Carvalho; Eduardo Correa Araújo; Leonardo Bastos; Luã Bida Vacaro; Raquel Martins Lana
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Brazil
Description
The year 2024 has seen an exceptional number of reported dengue fever cases in various parts of the world. In Brazil, the disease has spread to areas in the south and at altitudes where epidemics were not previously recorded, and the incidence rate has far exceeded that of previous years. The objective of this dataset is to promote, in a standardized way, the training of predictive models with the aim of developing forecast models for dengue in Brazil.
A Twitter Dataset of 100+ million tweets related to COVID-19
zenodo.org
application/gzip, csv +1
Updated Apr 17, 2023
Cite
Juan M. Banda; Ramya Tekumalla; Guanyu Wang; Jingyuan Yu; Tuo Liu; Yuning Ding; Gerardo Chowell (2023). A Twitter Dataset of 100+ million tweets related to COVID-19 [Dataset]. http://doi.org/10.5281/zenodo.3735274
Explore at:
Available download formats: application/gzip, tsv, csv
Unique identifier
https://doi.org/10.5281/zenodo.3735274
Dataset updated
Apr 17, 2023
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Juan M. Banda; Ramya Tekumalla; Guanyu Wang; Jingyuan Yu; Tuo Liu; Yuning Ding; Gerardo Chowell
Description
Due to the relevance of the COVID-19 global pandemic, we are releasing our dataset of tweets acquired from the Twitter Stream related to COVID-19 chatter. The first 9 weeks of data (from January 1st, 2020 to March 11th, 2020) contain very low tweet counts because we filtered them out of other data we were collecting for other research purposes; however, one can see the dramatic increase as awareness of the virus spread. Dedicated data gathering ran from March 11th to March 30th and yielded over 4 million tweets a day. We have added additional data provided by our new collaborators from January 27th to February 27th, to provide extra longitudinal coverage.

The data collected from the stream captures all languages, but the most prevalent are English, Spanish, and French. We release all tweets and retweets in the full_dataset.tsv file (101,400,452 unique tweets), and a cleaned version with no retweets in the full_dataset-clean.tsv file (20,244,746 unique tweets). There are several practical reasons for us to keep the retweets; tracing important tweets and their dissemination is one of them. For NLP tasks we provide the top 1000 frequent terms in frequent_terms.csv, the top 1000 bigrams in frequent_bigrams.csv, and the top 1000 trigrams in frequent_trigrams.csv. Some general statistics per day are included for both datasets in the statistics-full_dataset.tsv and statistics-full_dataset-clean.tsv files.

More details can be found (and will be updated faster) at: https://github.com/thepanacealab/covid19_twitter

As always, the tweets distributed here are only tweet identifiers (with date and time added) due to the terms and conditions of Twitter on re-distributing Twitter data. They need to be hydrated to be used.
COVID19-Dataset-with-100-World-Countries
kaggle.com
Updated Mar 1, 2021
Cite
Sami Belkacem (2021). COVID19-Dataset-with-100-World-Countries [Dataset]. https://www.kaggle.com/sambelkacem/covid19-algeria-and-world-dataset/discussion
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Mar 1, 2021
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Sami Belkacem
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
World
Description
COVID19-Algeria-and-World-Dataset

A coronavirus dataset with 104 countries constructed from different reliable sources, where each row represents a country, and the columns represent geographic, climate, healthcare, economic, and demographic factors that may contribute to accelerating or slowing the spread of COVID-19. The assumptions for the different factors are as follows:

Geography: some continents/areas may be more affected by the disease

Climate: cold temperatures may promote the spread of the virus

Healthcare: lack of hospital beds/doctors may lead to more human losses

Economy: weak economies (GDP) have fewer means to fight the disease

Demography: older populations may be at higher risk of the disease

The last column represents the number of daily tests performed and the total number of cases and deaths reported each day.

Data description

https://raw.githubusercontent.com/SamBelkacem/COVID19-Algeria-and-World-Dataset/master/Images/Data%20description.png

Countries in the dataset by geographic coordinates

https://raw.githubusercontent.com/SamBelkacem/COVID19-Algeria-and-World-Dataset/master/Images/Countries%20by%20geographic%20coordinates.png

Europe: 33 countries

Asia: 28 countries

Africa: 21 countries

North America: 11 countries

South America: 8 countries

Oceania: 3 countries
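The per-continent counts above can be checked against the stated total of 104 countries with a few lines of Python. This is only a sanity check on the numbers listed here; the dictionary is typed in by hand and is not read from the dataset itself.

```python
# Counts copied by hand from the list above (not loaded from the dataset).
continent_counts = {
    "Europe": 33,
    "Asia": 28,
    "Africa": 21,
    "North America": 11,
    "South America": 8,
    "Oceania": 3,
}

total = sum(continent_counts.values())
print(f"Total countries: {total}")

# Share of the dataset's countries that falls in each continent.
for name, count in continent_counts.items():
    print(f"{name}: {count} countries ({100 * count / total:.1f}%)")
```

The sum comes to 104, which agrees with the number of countries given in the description.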

Statistical description of the data

https://raw.githubusercontent.com/SamBelkacem/COVID19-Algeria-and-World-Dataset/master/Images/Statistical%20description%20of%20the%20data.png

Data distribution

https://raw.githubusercontent.com/SamBelkacem/COVID19-Algeria-and-World-Dataset/master/Images/Data%20distribution.png

Download

The dataset is available in an encoded CSV form on GitHub.

Python code

The Python Jupyter Notebook to read and visualize the data is available on nbviewer.

Data update

The dataset is updated every month with the latest numbers of COVID-19 cases, deaths, and tests. The last update was on March 01, 2021.

Data construction

The dataset is constructed from different reliable sources, where each row represents a country, and the columns represent geographic, climate, healthcare, economic, and demographic factors that may contribute to accelerating or slowing the spread of the coronavirus. Note that we selected only the main factors for which we found data; other factors could also be used. All data were retrieved from the reliable Our World in Data website, except for data on:

Continents: www.kaggle.com/statchaitya/country-to-continent

Geographic-coordinates: www.kaggle.com/eidanch/counties-geographic-coordinates

Temperatures: www.kaggle.com/berkeleyearth/climate-change-earth-surface-temperature-data

Share of the population over 65 years old: https://data.worldbank.org/indicator/SP.POP.65UP.TO.ZS

GDP/Capita: https://data.worldbank.org/indicator/NY.GDP.PCAP.CD

Citation

If you want to use the dataset, please cite the following arXiv paper, which provides more details about the data construction.

@article{belkacem_covid-19_2020,
  title = {COVID-19 data analysis and forecasting: Algeria and the world},
  shorttitle = {COVID-19 data analysis and forecasting},
  journal = {arXiv preprint arXiv:2007.09755},
  author = {Belkacem, Sami},
  year = {2020}
}

Contact

If you have any questions or suggestions, please contact me at this email address: s.belkacem@usthb.dz
Iowa Economic Indicators
data.iowa.gov
mydata.iowa.gov
Updated Jun 5, 2025
Iowa Department of Revenue, Research and Analysis Division (2025). Iowa Economic Indicators [Dataset]. https://data.iowa.gov/Economic-Statistics/Iowa-Economic-Indicators/qd3t-kfqg
Explore at:
Available download formats: json, xml, csv, application/rssxml, application/rdfxml, tsv
Dataset updated
Jun 5, 2025
Dataset authored and provided by
Iowa Department of Revenue, Research and Analysis Division
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Iowa
Description
This dataset provides economic indicators used to monitor Iowa's economy and forecast the future direction of economic activity in Iowa.
FIRE0203: previous data tables
gov.uk
Updated Sep 6, 2018
Home Office (2018). FIRE0203: previous data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire0203-previous-data-tables
Explore at:
Dataset updated
Sep 6, 2018
Dataset provided by
GOV.UK (http://gov.uk/)
Authors
Home Office
Description
FIRE0203: Dwelling fires by spread of fire and motive (19 September 2024)

FIRE0203: Dwelling fires by spread of fire and motive (21 September 2023) (MS Excel Spreadsheet, 87.7 KB): https://assets.publishing.service.gov.uk/media/66e2eacd3f1299ce5d5c3d90/fire-statistics-data-tables-fire0203-210923.xlsx

FIRE0203: Dwelling fires by spread of fire and motive (29 September 2022) (MS Excel Spreadsheet, 83.8 KB): https://assets.publishing.service.gov.uk/media/650ac4d4fbd7bc000dcb51d1/fire-statistics-data-tables-fire0203-290922.xlsx

FIRE0203: Dwelling fires by spread of fire and motive (30 September 2021) (MS Excel Spreadsheet, 89.3 KB): https://assets.publishing.service.gov.uk/media/63316357e90e0711d7fbfb7b/fire-statistics-data-tables-fire0203-300921.xlsx

FIRE0203: Dwelling fires by spread of fire and motive (1 October 2020) (MS Excel Spreadsheet, 70.2 KB): https://assets.publishing.service.gov.uk/media/615191b28fa8f561101f390e/fire-statistics-data-tables-fire0203-011020.xlsx

FIRE0203: Dwelling fires by spread of fire and motive (12 September 2019) (MS Excel Spreadsheet, 78.8 KB): https://assets.publishing.service.gov.uk/media/5f71c632d3bf7f47a36d96cb/fire-statistics-data-tables-fire0203-120919.xlsx

FIRE0203: Dwelling fires by spread of fire and motive (6 September 2018) (MS Excel Spreadsheet, 340 KB): https://assets.publishing.service.gov.uk/media/5d7277d140f0b609283d9f74/fire-statistics-data-tables-fire0203-060918.xlsx

FIRE0203: Dwelling fires by spread of fire and motive (12 October 2017) (MS Excel Spreadsheet, 58.5 KB): https://assets.publishing.service.gov.uk/media/5b8d0cc5e5274a0bdab54b22/fire-statistics-data-tables-fire0203.xlsx

Related content

Fire statistics data tables
Fire statistics guidance
Fire statistics
Model Output Statistics for SAN DIEGO (LINDBERGH FIELD) (72290)
data.europa.eu
Updated Apr 25, 2018
(2018). Model Output Statistics for SAN DIEGO (LINDBERGH FIELD) (72290) [Dataset]. https://data.europa.eu/data/datasets/de-dwd-mosmix-72290
Explore at:
Dataset updated
Apr 25, 2018
Description
DWD’s fully automatic MOSMIX product optimizes and interprets the forecast calculations of the NWP models ICON (DWD) and IFS (ECMWF), combines these and calculates statistically optimized weather forecasts in terms of point forecasts (PFCs). Thus, statistically corrected, updated forecasts for the next ten days are calculated for about 5400 locations around the world. Most forecasting locations are spread over Germany and Europe. MOSMIX forecasts (PFCs) include nearly all common meteorological parameters measured by weather stations. For further information please refer to: [in German: https://www.dwd.de/DE/leistungen/met_verfahren_mosmix/met_verfahren_mosmix.html ] [in English: https://www.dwd.de/EN/ourservices/met_application_mosmix/met_application_mosmix.html ]
FIRE0304: previous data tables
gov.uk
Updated Sep 6, 2018
Home Office (2018). FIRE0304: previous data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire0304-previous-data-tables
Explore at:
Dataset updated
Sep 6, 2018
Dataset provided by
GOV.UK (http://gov.uk/)
Authors
Home Office
Description
FIRE0304: Other buildings fire by spread of fire and motive (19 September 2024)

FIRE0304: Other buildings fire by spread of fire and motive (21 September 2023) (MS Excel Spreadsheet, 121 KB): https://assets.publishing.service.gov.uk/media/66e3e6630d913026165c3df6/fire-statistics-data-tables-fire0304-210923.xlsx

FIRE0304: Other buildings fire by spread of fire and motive (29 September 2022) (MS Excel Spreadsheet, 115 KB): https://assets.publishing.service.gov.uk/media/650ac64c27d43b001491c2b0/fire-statistics-data-tables-fire0304-290922.xlsx

FIRE0304: Other buildings fire by spread of fire and motive (30 September 2021) (MS Excel Spreadsheet, 118 KB): https://assets.publishing.service.gov.uk/media/63316bef8fa8f51d2a863128/fire-statistics-data-tables-fire0304-300921.xlsx

FIRE0304: Other buildings fire by spread of fire and motive (1 October 2020) (MS Excel Spreadsheet, 168 KB): https://assets.publishing.service.gov.uk/media/615195dfd3bf7f718c758109/fire-statistics-data-tables-fire0304-011020.xlsx

FIRE0304: Other buildings fire by spread of fire and motive (12 September 2019) (MS Excel Spreadsheet, 112 KB): https://assets.publishing.service.gov.uk/media/5f71c7438fa8f5188aa288fc/fire-statistics-data-tables-fire0304-120919.xlsx

FIRE0304: Other buildings fire by spread of fire and motive (6 September 2018) (MS Excel Spreadsheet, 837 KB): https://assets.publishing.service.gov.uk/media/5d7279bce5274a09860c1376/fire-statistics-data-tables-fire0304-060918.xlsx

FIRE0304: Other buildings fire by spread of fire and motive (12 October 2017) (MS Excel Spreadsheet, 60 KB): https://assets.publishing.service.gov.uk/media/5b8d1052ed915d1ec02ff23d/fire-statistics-data-tables-fire0304.xlsx

Related content

Fire statistics data tables
Fire statistics guidance
Fire statistics
Data (i.e., evidence) about evidence based medicine
figshare.com
search.datacite.org
png
Updated May 30, 2023
Jorge H Ramirez (2023). Data (i.e., evidence) about evidence based medicine [Dataset]. http://doi.org/10.6084/m9.figshare.1093997.v24
Explore at:
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.1093997.v24
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Jorge H Ramirez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Update (December 7, 2014): Evidence-based medicine (EBM) is not working for many reasons, for example:

1. Incorrect in its foundations (paradox): hierarchical levels of evidence are supported by opinions (i.e., the lowest strength of evidence according to EBM) instead of real data collected from different types of study designs (i.e., evidence). http://dx.doi.org/10.6084/m9.figshare.1122534

2. The effect of criminal practices by pharmaceutical companies is only possible because of the complicity of others: healthcare systems, professional associations, governmental and academic institutions. Pharmaceutical companies also corrupt at the personal level: politicians and political parties are on their payroll, and medical professionals are seduced by different types of gifts in exchange for prescriptions (i.e., bribery), which very likely results in patients not receiving the proper treatment for their disease. Many times there is no such disease: healthy persons not needing pharmacological treatments of any kind are constantly misdiagnosed and treated with unnecessary drugs. Some medical professionals are converted into a K.O.L., who is only a puppet appearing on stage to spread lies to their peers; a person supposedly trained to improve the well-being of others now deceives on behalf of pharmaceutical companies. Probably the saddest thing is that many honest doctors are being misled by these lies, created by the rules of pharmaceutical marketing instead of scientific, medical, and ethical principles. The interpretation of EBM in this context was not anticipated by its creators.

“The main reason we take so many drugs is that drug companies don’t sell drugs, they sell lies about drugs.” (Peter C. Gøtzsche)

“doctors and their organisations should recognise that it is unethical to receive money that has been earned in part through crimes that have harmed those people whose interests doctors are expected to take care of. Many crimes would be impossible to carry out if doctors weren’t willing to participate in them.” (Peter C Gøtzsche, The BMJ, 2012, Big pharma often commits corporate crime, and this must be stopped.)

Pending (Colombia): Health Promoter Entities (in Spanish: EPS, Empresas Promotoras de Salud).

Misinterpretations

New technologies or concepts are difficult to understand in the beginning, no matter how simple they are; we need to get used to new tools aimed at improving our professional practice. Probably the best explanation is in these videos (credits to Antonio Villafaina for sharing these videos with me).

English: https://www.youtube.com/watch?v=pQHX-SjgQvQ&w=420&h=315
Spanish: https://www.youtube.com/watch?v=DApozQBrlhU&w=420&h=315

Hypothesis: hierarchical levels of evidence based medicine are wrong

Dear Editor,

I have data to support the hypothesis described in the title of this letter. Before rejecting the null hypothesis I would like to ask the following open question: Could you support with data that hierarchical levels of evidence based medicine are correct? (1,2)

Additional explanation to this question:
- Only respond to this question attaching publicly available raw data.
- Be aware that more than a question this is a challenge: I have data (i.e., evidence) which is contrary to classic (i.e., McMaster) or current (i.e., Oxford) hierarchical levels of evidence based medicine. An important part of this data (but not all) is publicly available.

References

Ramirez, Jorge H (2014): The EBM challenge. figshare. http://dx.doi.org/10.6084/m9.figshare.1135873

The EBM Challenge Day 1: No Answers.

Competing interests: I endorse the principles of open data in human biomedical research.

Read this letter on The BMJ, August 13, 2014: http://www.bmj.com/content/348/bmj.g3725/rr/762595
Re: Greenhalgh T, et al. Evidence based medicine: a movement in crisis? BMJ 2014; 348: g3725.

Fileset contents

Raw data: Excel archive: Raw data, interactive figures, and PubMed search terms. Google Spreadsheet is also available (URL below the article description).

Figure 1. Unadjusted (Fig 1A) and adjusted (Fig 1B) PubMed publication trends (01/01/1992 to 30/06/2014).
Figure 2. Adjusted PubMed publication trends (07/01/2008 to 29/06/2014).
Figure 3. Google search trends: Jan 2004 to Jun 2014 / 1-week periods.
Figure 4. PubMed publication trends (1962-2013): systematic reviews and meta-analysis, clinical trials, and observational studies.
Figure 5. Ramirez, Jorge H (2014): Infographics: Unpublished US phase 3 clinical trials (2002-2014) completed before Jan 2011 = 50.8%. figshare. http://dx.doi.org/10.6084/m9.figshare.1121675
Raw data: "13377 studies found for: Completed | Interventional Studies | Phase 3 | received from 01/01/2002 to 01/01/2014 | Worldwide". This database complies with the terms and conditions of ClinicalTrials.gov: http://clinicaltrials.gov/ct2/about-site/terms-conditions

Supplementary Figures (S1-S6). PubMed publication delay in the indexation processes does not explain the descending trends in the scientific output of evidence-based medicine.

Acknowledgments

I would like to acknowledge the following persons for providing valuable concepts in data visualization and infographics:

Maria Fernanda Ramírez. Professor of graphic design. Universidad del Valle. Cali, Colombia.

Lorena Franco. Graphic design student. Universidad del Valle. Cali, Colombia.

Related articles by this author (Jorge H. Ramírez)

Ramirez JH. Lack of transparency in clinical trials: a call for action. Colomb Med (Cali) 2013;44(4):243-6. URL: http://www.ncbi.nlm.nih.gov/pubmed/24892242

Ramirez JH. Re: Evidence based medicine is broken (17 June 2014). http://www.bmj.com/node/759181

Ramirez JH. Re: Global rules for global health: why we need an independent, impartial WHO (19 June 2014). http://www.bmj.com/node/759151

Ramirez JH. PubMed publication trends (1992 to 2014): evidence based medicine and clinical practice guidelines (04 July 2014). http://www.bmj.com/content/348/bmj.g3725/rr/759895

Recommended articles

Greenhalgh Trisha, Howick Jeremy, Maskrey Neal. Evidence based medicine: a movement in crisis? BMJ 2014;348:g3725

Spence Des. Evidence based medicine is broken. BMJ 2014;348:g22

Schünemann Holger J, Oxman Andrew D, Brozek Jan, Glasziou Paul, Jaeschke Roman, Vist Gunn E et al. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ 2008;336:1106

Lau Joseph, Ioannidis John P A, Terrin Norma, Schmid Christopher H, Olkin Ingram. The case of the misleading funnel plot. BMJ 2006;333:597

Moynihan R, Henry D, Moons KGM (2014) Using Evidence to Combat Overdiagnosis and Overtreatment: Evaluating Treatments, Tests, and Disease Definitions in the Time of Too Much. PLoS Med 11(7): e1001655. doi:10.1371/journal.pmed.1001655

Katz D. A-holistic view of evidence based medicine. http://thehealthcareblog.com/blog/2014/05/02/a-holistic-view-of-evidence-based-medicine/
Model Output Statistics for KIELCE-SUKOW (12570)
data.europa.eu
Updated Apr 25, 2018
(2018). Model Output Statistics for KIELCE-SUKOW (12570) [Dataset]. https://data.europa.eu/data/datasets/de-dwd-mosmix-12570
Explore at:
Dataset updated
Apr 25, 2018
Description
DWD’s fully automatic MOSMIX product optimizes and interprets the forecast calculations of the NWP models ICON (DWD) and IFS (ECMWF), combines these and calculates statistically optimized weather forecasts in terms of point forecasts (PFCs). Thus, statistically corrected, updated forecasts for the next ten days are calculated for about 5400 locations around the world. Most forecasting locations are spread over Germany and Europe. MOSMIX forecasts (PFCs) include nearly all common meteorological parameters measured by weather stations. For further information please refer to: [in German: https://www.dwd.de/DE/leistungen/met_verfahren_mosmix/met_verfahren_mosmix.html ] [in English: https://www.dwd.de/EN/ourservices/met_application_mosmix/met_application_mosmix.html ]
‘Cricket Dataset’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Cricket Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-cricket-dataset-39db/d978d471/?iid=009-591&v=presentation
Explore at:
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Cricket Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/notkrishna/cricket-statistics-for-all-formats on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Context

Cricket is a bat-and-ball game played between two teams of eleven players on a field at the centre of which is a 22-yard (20-metre) pitch with a wicket at each end, each comprising two bails balanced on three stumps. The game proceeds when a player on the fielding team, called the bowler, "bowls" (propels) the ball from one end of the pitch towards the wicket at the other end. The batting side's players score runs by striking the bowled ball with a bat and running between the wickets, while the fielding side tries to prevent this by keeping the ball within the field and getting it to either wicket, and also tries to dismiss each batter (so they are "out"). Means of dismissal include being bowled, when the ball hits the stumps and dislodges the bails, and by the fielding side either catching a hit ball before it touches the ground, or hitting a wicket with the ball before a batter can cross the crease line in front of the wicket to complete a run. When ten batters have been dismissed, the innings ends and the teams swap roles. The game is adjudicated by two umpires, aided by a third umpire and match referee in international matches.

Forms of cricket range from Twenty20, with each team batting for a single innings of 20 overs and the game generally lasting three hours, to Test matches played over five days. Traditionally cricketers play in all-white kit, but in limited overs cricket they wear club or team colours. In addition to the basic kit, some players wear protective gear to prevent injury caused by the ball, which is a hard, solid spheroid made of compressed leather with a slightly raised sewn seam enclosing a cork core layered with tightly wound string.

The earliest reference to cricket is in South East England in the mid-16th century. It spread globally with the expansion of the British Empire, with the first international matches in the second half of the 19th century. The game's governing body is the International Cricket Council (ICC), which has over 100 members, twelve of which are full members who play Test matches. The game's rules, the Laws of Cricket, are maintained by Marylebone Cricket Club (MCC) in London. The sport is followed primarily in South Asia, Australasia, the United Kingdom, southern Africa and the West Indies. Women's cricket, which is organised and played separately, has also achieved international standard. The most successful side playing international cricket is Australia, which has won seven One Day International trophies, including five World Cups, more than any other country and has been the top-rated Test side more than any other country.

Content

Cricket, like any sport, is full of important data and stats. The game is generally played in three different formats: One Day (50 overs for each team to score and bowl), Test (no limit on overs, but played for a maximum of 5 days with each team having two innings to score), and the newest format, Twenty20 (each team has 20 overs to score).

The dataset contains 9 files (3 for each format). Each group of three files contains the best stats for batsmen, bowlers, and series/tournaments.

Source https://www.espncricinfo.com/

Play with it as you like.

--- Original source retains full ownership of the source dataset ---
Summary of descriptive statistics.
plos.figshare.com
figshare.com
xls
Updated Jun 2, 2023
Yohanes E. Riyanto; Jianlin Zhang (2023). Summary of descriptive statistics. [Dataset]. http://doi.org/10.1371/journal.pone.0232037.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0232037.t001
Dataset updated
Jun 2, 2023
Dataset provided by
PLOS ONE
Authors
Yohanes E. Riyanto; Jianlin Zhang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Summary of descriptive statistics.
Demographics
health.google.com
Updated Oct 7, 2021
(2021). Demographics [Dataset]. https://health.google.com/covid-19/open-data/raw-data
Explore at:
Dataset updated
Oct 7, 2021
Variables measured
key, population, population_male, rural_population, urban_population, population_female, population_density, clustered_population, population_age_00_09, population_age_10_19, and 11 more
Description
Various population statistics, including structured demographics data.
Overseas Buddhist Monk Visit to Taiwan Preaching Statistics
data.gov.tw
csv
Updated Jul 28, 2023
Ministry of Culture (2023). Overseas Buddhist Monk Visit to Taiwan Preaching Statistics [Dataset]. https://data.gov.tw/en/datasets/7621
Explore at:
Available download formats: csv
Dataset updated
Jul 28, 2023
Dataset authored and provided by
Ministry of Culture
License
https://data.gov.tw/license
Area covered
Taiwan
Description
This dataset mainly provides statistics on the number of overseas Tibetan monks who have come to Taiwan to spread the Dharma at the Mencius Culture Center.
