58 datasets found
  1. Data for: COVID-19 Dataset: Worldwide Spread Log Including Countries First...

    • data.mendeley.com
    Updated Jul 20, 2020
    Cite
    Hasmot Ali (2020). Data for: COVID-19 Dataset: Worldwide Spread Log Including Countries First Case And First Death [Dataset]. http://doi.org/10.17632/vw427wzzkk.5
    Dataset updated
    Jul 20, 2020
    Authors
    Hasmot Ali
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Contains data related to the COVID-19 pandemic, specifically First Case and First Death information for every country. The datasets focus on two major fields. The first is First Case, which consists of the Date of First Case(s), the Number of Confirmed Case(s) on the First Day, the Age of the Patient(s) of the First Case, and the Last Visited Country. The second is First Death, which consists of the Date of First Death and the Age of the Patient who died first, for every country, with the corresponding continent. The datasets also contain a binary matrix of the spread chain among different countries and regions.

    *This is not a country; it is a ship. The name of the cruise ship was not given by the government.
    “N+”: the age is not specified but is greater than N
    “No Trace”: some data was not found
    “Unspecified”: not available from the authority
    “N/A”: in the “Last Visited Country(s) of Confirmed Case(s)” column, “N/A” indicates that the confirmed case(s) in those countries have no recent travel history; in the “Age of First Death(s)” column, “N/A” indicates that those countries had no death cases as of May 16, 2020.
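
    The notation above could be normalized when loading the age columns. The following is a minimal sketch, not part of the dataset: the function name `parse_age` and the returned `(age, status)` layout are assumptions made for illustration; only the marker strings come from the notes.

```python
import re
from typing import Optional, Tuple

# Marker strings taken from the dataset notes; everything else is illustrative.
MISSING_MARKERS = {"No Trace", "Unspecified", "N/A"}


def parse_age(value: str) -> Tuple[Optional[int], str]:
    """Return (age, status) for one raw cell of an age column (hypothetical helper)."""
    # Drop surrounding whitespace and any straight or curly quotes.
    text = value.strip().strip('"“”').strip()
    if text in MISSING_MARKERS:
        return None, text  # keep the marker so the reason for the gap is not lost
    lower_bound = re.fullmatch(r"(\d+)\+", text)
    if lower_bound:
        # "N+": the age is not specified but is greater than N.
        return int(lower_bound.group(1)), "greater_than"
    if re.fullmatch(r"\d+", text):
        return int(text), "exact"
    return None, "unparsed"


for raw in ["60+", "45", "No Trace", "N/A"]:
    print(raw, "->", parse_age(raw))
```

    Keeping the marker as a status value, rather than collapsing everything to a single missing value, preserves the distinction the notes draw between data that was not found and data the authority did not release.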

  2. Supporting dataset for the bachelor thesis: Simulating the Spread of...

    • figshare.com
    • data.4tu.nl
    mp4
    Updated May 31, 2023
    Cite
    Marko Boon; Nikki Steenbakkers; Bert Zwart (2023). Supporting dataset for the bachelor thesis: Simulating the Spread of COVID-19 in the Netherlands [Dataset]. http://doi.org/10.4121/13536614.v1
    Available download formats: mp4
    Dataset updated
    May 31, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Marko Boon; Nikki Steenbakkers; Bert Zwart
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Area covered
    Netherlands
    Description

    These files are videos generated by a stochastic simulation that was created by Nikki Steenbakkers under the supervision of Marko Boon and Bert Zwart (all affiliated with Eindhoven University of Technology) for her bachelor final project "Simulating the Spread of COVID-19 in the Netherlands". The report can be found in the TU/e repository of bachelor project reports:
    https://research.tue.nl/en/studentTheses/simulating-the-spread-of-covid-19-in-the-netherlands
    The report contains more information about the project and the simulation. It explicitly refers to these files.

  3. Data from: A Data set for Information Spreading over the News

    • live.european-language-grid.eu
    txt
    Updated Nov 28, 2021
    Cite
    (2021). A Data set for Information Spreading over the News [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7719
    Available download formats: txt
    Dataset updated
    Nov 28, 2021
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract:

    Analyzing the spread of information related to a specific event in the news has many potential applications. Consequently, various systems have been developed to facilitate the analysis of information spreading, such as the detection of disease propagation and the identification of the spreading of fake news through social media. There are several open challenges in the process of discerning information propagation, among them the lack of resources for training and evaluation. This paper describes the process of compiling a corpus from the EventRegistry global media monitoring system. We focus on information spreading in three domains: sports (i.e. the FIFA World Cup), natural disasters (i.e. earthquakes), and climate change (i.e. global warming). This corpus is a valuable addition to the currently available datasets to examine the spreading of information about various kinds of events.

    Introduction:

    Domain-specific gaps in information spreading are ubiquitous and may exist due to economic conditions, political factors, or linguistic, geographical, time-zone, cultural, and other barriers. These factors potentially contribute to obstructing the flow of local as well as international news. We believe that there is a lack of research studies that examine, identify, and uncover the reasons for barriers in information spreading. Additionally, there is limited availability of datasets containing news text and metadata including time, place, source, and other relevant information. When a piece of information starts spreading, it implicitly raises questions such as:

    How far does the information in the form of news reach out to the public?

    Does the content of the news remain the same or change to a certain extent?

    Do cultural values impact the information, especially when the same news gets translated into other languages?

    Statistics about datasets:

    ------------------------------------------------------------------------------------------------------
    #   Domain             Event Type       Articles Per Language                    Total Articles
    1   Sports             FIFA World Cup   983-en, 762-sp, 711-de, 10-sl, 216-pt    2679
    2   Natural Disaster   Earthquake       941-en, 999-sp, 937-de, 19-sl, 251-pt    3194
    3   Climate Changes    Global Warming   996-en, 298-sp, 545-de, 8-sl, 97-pt      1945
    ------------------------------------------------------------------------------------------------------
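
    The "Articles Per Language" cells pack several counts into one string. A minimal sketch of splitting such a cell into per-language counts is shown here; the helper name `parse_language_counts` is hypothetical, and the `count-language` cell layout is assumed from the rows of the statistics table.

```python
def parse_language_counts(cell: str) -> dict:
    """Turn a cell such as '983-en, 762-sp' into {'en': 983, 'sp': 762}."""
    counts = {}
    for item in cell.split(","):
        # Each item is assumed to look like '<count>-<language code>'.
        number, language = item.strip().split("-", 1)
        counts[language] = int(number)
    return counts


sports = parse_language_counts("983-en, 762-sp, 711-de, 10-sl, 216-pt")
print(sports)
```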

  4. A Twitter Dataset of 70+ million tweets related to COVID-19

    • zenodo.org
    csv, tsv, zip
    Updated Apr 17, 2023
    Cite
    Juan M. Banda; Ramya Tekumalla; Gerardo Chowell (2023). A Twitter Dataset of 70+ million tweets related to COVID-19 [Dataset]. http://doi.org/10.5281/zenodo.3732460
    Available download formats: csv, tsv, zip
    Dataset updated
    Apr 17, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Juan M. Banda; Ramya Tekumalla; Gerardo Chowell
    Description

    Due to the relevance of the COVID-19 global pandemic, we are releasing our dataset of tweets acquired from the Twitter Stream related to COVID-19 chatter. The first 9 weeks of data (from January 1st, 2020 to March 11th, 2020) contain very low tweet counts because they were filtered from other data we were collecting for other research purposes; however, one can see the dramatic increase as awareness of the virus spread. Dedicated data gathering ran from March 11th to March 29th and yielded over 4 million tweets a day.

    The data collected from the stream captures all languages, but the most prevalent are English, Spanish, and French. We release all tweets and retweets in the full_dataset.tsv file (70,569,368 unique tweets), and a cleaned version with no retweets in the full_dataset-clean.tsv file (13,535,912 unique tweets). There are several practical reasons for us to keep the retweets; tracing important tweets and their dissemination is one of them. For NLP tasks we provide the top 1000 frequent terms in frequent_terms.csv, the top 1000 bigrams in frequent_bigrams.csv, and the top 1000 trigrams in frequent_trigrams.csv. Some general statistics per day are included for both datasets in the statistics-full_dataset.tsv and statistics-full_dataset-clean.tsv files.

    More details can be found (and will be updated faster) at: https://github.com/thepanacealab/covid19_twitter

    As always, the tweets distributed here are only tweet identifiers (with date and time added) due to the terms and conditions of Twitter for re-distributing Twitter data. They need to be hydrated to be used.
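
    Because only identifiers are distributed, a typical first step is to stream the IDs out of the TSV file and group them into batches for whichever hydration tool is used. The sketch below covers only that batching step and makes several assumptions: the column names `tweet_id`, `date`, and `time` and the batch size are illustrative, not confirmed by the dataset description, and the hydration call itself is left out.

```python
import csv
import io
from typing import Iterable, Iterator, List


def read_tweet_ids(tsv_lines: Iterable[str], id_column: str = "tweet_id") -> Iterator[str]:
    """Yield tweet identifiers one at a time from tab-separated lines with a header row."""
    reader = csv.DictReader(tsv_lines, delimiter="\t")
    for row in reader:
        yield row[id_column]


def batched(ids: Iterable[str], size: int = 100) -> Iterator[List[str]]:
    """Group identifiers into lists of at most `size` items."""
    batch: List[str] = []
    for tweet_id in ids:
        batch.append(tweet_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


# Tiny in-memory stand-in for a file such as full_dataset-clean.tsv (made-up IDs).
sample = io.StringIO(
    "tweet_id\tdate\ttime\n"
    "111\t2020-03-11\t00:00:01\n"
    "222\t2020-03-11\t00:00:02\n"
    "333\t2020-03-11\t00:00:03\n"
)
batches = list(batched(read_tweet_ids(sample), size=2))
print(batches)
```

    Reading through a generator keeps memory use flat, which matters for a file with tens of millions of rows; with a real file, an object returned by `open(path, newline="")` could be passed in place of `sample`.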

  5. Statistically downscaled climate indices from CMIP6 global climate models...

    • open.canada.ca
    • data.urbandatacentre.ca
    • +3more
    html, netcdf
    Updated Jan 28, 2025
    Cite
    Environment and Climate Change Canada (2025). Statistically downscaled climate indices from CMIP6 global climate models (CanDCS-U6 & CanDCS-M6) [Dataset]. https://open.canada.ca/data/dataset/764720d5-8c0a-4e1e-93fc-d9e3eb0ab6b3
    Available download formats: html, netcdf
    Dataset updated
    Jan 28, 2025
    Dataset provided by
    Environment And Climate Change Canada: https://www.canada.ca/en/environment-climate-change.html
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Time period covered
    Jan 1, 1951 - Dec 31, 2100
    Description

    Environment and Climate Change Canada’s (ECCC) Climate Research Division (CRD) and the Pacific Climate Impacts Consortium (PCIC) previously produced statistically downscaled climate scenarios based on simulations from climate models that participated in the Coupled Model Intercomparison Project phase 5 (CMIP5) in 2015. ECCC and PCIC have now updated the CMIP5-based downscaled scenarios with two new sets of downscaled scenarios based on the next generation of climate projections from the Coupled Model Intercomparison Project phase 6 (CMIP6). The scenarios are named Canadian Downscaled Climate Scenarios–Univariate method from CMIP6 (CanDCS-U6) and Canadian Downscaled Climate Scenarios–Multivariate method from CMIP6 (CanDCS-M6). CMIP6 climate projections are based on both updated global climate models and new emissions scenarios called “Shared Socioeconomic Pathways” (SSPs). Statistically downscaled datasets have been produced from 26 CMIP6 global climate models (GCMs) under three different emission scenarios (i.e., SSP1-2.6, SSP2-4.5, and SSP5-8.5), with PCIC later adding SSP3-7.0 to the CanDCS-M6 dataset. The CanDCS-U6 was downscaled using the Bias Correction/Constructed Analogues with Quantile mapping version 2 (BCCAQv2) procedure, and the CanDCS-M6 was downscaled using the N-dimensional Multivariate Bias Correction (MBCn) method. The CanDCS-U6 dataset was produced using the same downscaling target data (NRCANmet) as the CMIP5-based downscaled scenarios, while the CanDCS-M6 dataset implements a new target dataset (ANUSPLIN and PNWNAmet blended dataset). Statistically downscaled individual model output and ensembles are available for download. Downscaled climate indices are available across Canada at 10km grid spatial resolution for the 1950-2014 historical period and for the 2015-2100 period following each of the three emission scenarios. A total of 31 climate indices have been calculated using the CanDCS-U6 and CanDCS-M6 datasets. 
The climate indices include 27 Climdex indices established by the Expert Team on Climate Change Detection and Indices (ETCCDI) and 4 additional indices that are slightly modified from the Climdex indices. These indices are calculated from daily precipitation and temperature values from the downscaled simulations and are available at annual or monthly temporal resolution, depending on the index. Monthly indices are also available in seasonal and annual versions. Note: projected future changes by statistically downscaled products are not necessarily more credible than those by the underlying climate model outputs. In many cases, especially for absolute threshold-based indices, projections based on downscaled data have a smaller spread because of the removal of model biases. However, this is not the case for all indices. Downscaling from GCM resolution to the fine resolution needed for impacts assessment increases the level of spatial detail and temporal variability to better match observations. Since these adjustments are GCM dependent, the resulting indices could have a wider spread when computed from downscaled data as compared to those directly computed from GCM output. In the latter case, it is not the downscaling procedure that makes future projection more uncertain; rather, it is indicative of higher variability associated with finer spatial scale. Individual model datasets and all related derived products are subject to the terms of use (https://pcmdi.llnl.gov/CMIP6/TermsOfUse/TermsOfUse6-1.html) of the source organization.

  6. INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET

    • data.niaid.nih.gov
    Updated Jul 19, 2024
    Cite
    Nafiz Sadman (2024). INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4047647
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Kishor Datta Gupta
    Nafiz Sadman
    Nishat Anjum
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States, Bangladesh
    Description

    Introduction

    There are several works based on Natural Language Processing on newspaper reports. Rameshbhai et al. [1], mining opinions from headlines using Stanford NLP and SVM, compared several algorithms on a small and a large dataset. Rubin et al. [2] created a mechanism to differentiate fake news from real news by building a set of characteristics of news according to their types. The purpose was to contribute to the low-resource data available for training machine learning algorithms. Doumit et al. [3] implemented LDA, a topic modeling approach, to study bias present in online news media.

    However, not much NLP research has been invested in studying COVID-19. Most applications involve classification of chest X-rays and CT scans to detect the presence of pneumonia in the lungs [4], a consequence of the virus. Other research areas include studying the genome sequence of the virus [5][6][7] and replicating its structure to fight the virus and find a vaccine. This research is crucial in battling the pandemic. The few NLP-based research publications include sentiment classification of online tweets by Samuel et al. [8] to understand the fear persisting in people due to the virus. Similar work has been done using an LSTM network to classify sentiments from online discussion forums by Jelodar et al. [9]. To the best of our knowledge, the NKK dataset is the first study on a comparatively larger dataset of newspaper reports on COVID-19, which contributed to awareness of the virus.

    2 Data-set Introduction

    2.1 Data Collection

    We accumulated 1000 online newspaper reports from the United States of America (USA) on COVID-19. The newspapers include The Washington Post (USA) and StarTribune (USA). We named this dataset “Covid-News-USA-NNK”. We also accumulated 50 online newspaper reports from Bangladesh on the issue and named the dataset “Covid-News-BD-NNK”. The newspapers include The Daily Star (BD) and Prothom Alo (BD). All these newspapers are among the top providers and the most read in their respective countries. The collection was done manually by 10 human data collectors of age group 23- with university degrees. This approach was preferred over automation to ensure the news was highly relevant to the subject. The newspaper websites had dynamic content with advertisements in no particular order, so there was a high chance that online scrapers would collect inaccurate news reports. One of the challenges in collecting the data was the subscription requirement: each newspaper required $1 per subscription. Some criteria provided as guidelines to the human data collectors for collecting the news reports were as follows:

    The headline must have one or more words directly or indirectly related to COVID-19.

    The content of each news must have 5 or more keywords directly or indirectly related to COVID-19.

    The genre of the news can be anything as long as it is relevant to the topic. Political, social, and economic genres are to be prioritized.

    Avoid taking duplicate reports.

    Maintain a time frame for the above mentioned newspapers.

    To collect these data we used a Google Form for USA and BD. Two human editors went through each entry to check for spam or troll entries.

    2.2 Data Pre-processing and Statistics

    Some pre-processing steps performed on the newspaper report dataset are as follows:

    Remove hyperlinks.

    Remove non-English alphanumeric characters.

    Remove stop words.

    Lemmatize text.

    While more pre-processing could have been applied, we tried to keep the data as unchanged as possible, since changing sentence structures could result in the loss of valuable information. While this was done with the help of a script, we also assigned the same human collectors to cross-check for the presence of the above-mentioned criteria.
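
    The four pre-processing steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the stop-word list is a tiny made-up sample, `naive_lemma` is a crude placeholder standing in for a real lemmatizer, and "non-English alphanumeric characters" is read here as "anything other than English letters, digits, and whitespace".

```python
import re

# Tiny illustrative stop-word list; a real pipeline would use a much fuller one.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "is", "are", "to", "for"}


def naive_lemma(token: str) -> str:
    """Placeholder for lemmatization: only strips a simple plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith(("ss", "us", "is")):
        return token[:-1]
    return token


def preprocess(text: str) -> list:
    """Apply the four listed steps in order and return the remaining tokens."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # 1. remove hyperlinks
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)            # 2. drop other characters
    tokens = text.lower().split()
    tokens = [t for t in tokens if t not in STOP_WORDS]    # 3. remove stop words
    return [naive_lemma(t) for t in tokens]                # 4. lemmatize (placeholder)


print(preprocess("The masks are in stores: https://example.com/x"))
```

    Hyperlinks are removed before the character filter on purpose: stripping punctuation first would break URLs into fragments that the link pattern no longer matches.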

    The primary data statistics of the two datasets are shown in Tables 1 and 2.

    Table 1: Covid-News-USA-NNK data statistics

    No. of words per headline: 7 to 20
    No. of words per body content: 150 to 2100

    Table 2: Covid-News-BD-NNK data statistics

    No. of words per headline: 10 to 20
    No. of words per body content: 100 to 1500

    2.3 Dataset Repository

    We used GitHub as our primary data repository under the account name NKK^1. Here, we created two repositories, USA-NKK^2 and BD-NNK^3. The dataset is available in both CSV and JSON formats. We regularly update the CSV files and regenerate the JSON using a Python script. We also provide a Python script file for essential operations. We welcome all outside collaboration to enrich the dataset.

    3 Literature Review

    Natural Language Processing (NLP) deals with text (also known as categorical) data in computer science, utilizing numerous diverse methods like one-hot encoding, word embedding, etc., that transform text to machine language, which can be fed to multiple machine learning and deep learning algorithms.

    Some well-known applications of NLP include fraud detection on online media sites [10], authorship attribution in fallback authentication systems [11], intelligent conversational agents or chatbots [12], and the machine translation used by Google Translate [13]. While these are all downstream tasks, several exciting developments have been made in algorithms designed solely for Natural Language Processing tasks. The two most prominent ones are BERT [14], which uses a bidirectional encoder architecture built on the transformer model and can perform near-perfect classification and next-word prediction tasks, and the GPT-3 models released by OpenAI [15], which can generate almost human-like text. However, these are all pre-trained models, since training them carries a huge computation cost. Information extraction is a generalized concept of retrieving information from a dataset. Information extraction from an image could be retrieving vital feature spaces or targeted portions of an image; information extraction from speech could be retrieving information about names, places, etc. [16]. Information extraction from text could be identifying named entities and locations or essential data. Topic modeling is a sub-task of NLP and also a process of information extraction. It clusters words and phrases of the same context together into groups. Topic modeling is an unsupervised learning method that gives us a brief idea about a set of texts. One commonly used topic modeling method is Latent Dirichlet Allocation, or LDA [17].

    Keyword extraction is a process of information extraction and sub-task of NLP to extract essential words and phrases from a text. TextRank [ 18 ] is an efficient keyword extraction technique that uses graphs to calculate the weight of each word and pick the words with more weight to it.
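
    The graph-based weighting that TextRank performs can be sketched in a few lines. This is a simplified illustration rather than the implementation used for the dataset: words become nodes, words that co-occur within a small window are linked, and a PageRank-style iteration scores each node. The function name and the default values for `window`, `damping`, and `iterations` are assumptions chosen for the example.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=30, top_k=3):
    """Rank words by a PageRank-style score over a co-occurrence graph."""
    # Build an undirected graph: link each word to the next `window` words.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    if not neighbors:
        return []

    # Iterate the score update; each node shares its score equally among its links.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        new_scores = {}
        for word in neighbors:
            rank = sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            new_scores[word] = (1 - damping) + damping * rank
        scores = new_scores

    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


tokens = ["virus", "spread", "economy", "virus", "market", "virus", "election"]
print(textrank_keywords(tokens, window=1, top_k=1))
```

    In practice the token list would first be filtered (for example to nouns and adjectives, after stop-word removal) so that the graph contains only candidate keywords.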

    Word clouds are a useful visualization technique for understanding the overall ‘talk of the topic’. The clustered words give us a quick understanding of the content.

    4 Our experiments and Result analysis

    We used the wordcloud library^4 to create the word clouds. Figures 1 and 3 present the word clouds of the Covid-News-USA-NNK dataset by month from February to May. From Figures 1, 2, and 3, we can point out a few observations:

    In February, both newspapers discussed China and the source of the outbreak.

    StarTribune emphasized Minnesota as the most concerned state. In April, it seemed to have been more concerned.

    Both newspapers discussed the virus impacting the economy, i.e., banks, elections, administrations, and markets.

    The Washington Post discussed global issues more than StarTribune.

    In February, StarTribune mentioned the first precautionary measure, wearing masks, and the uncontrollable spread of the virus throughout the nation.

    While both newspapers mentioned the outbreak in China in February, the spread in the United States is highlighted more heavily from March through May, displaying the critical impact caused by the virus.

    We used a script to extract all numbers related to certain keywords such as ‘Deaths’, ‘Infected’, ‘Died’, ‘Infections’, ‘Quarantined’, ‘Lock-down’, and ‘Diagnosed’ from the news reports and created a case-count series for both newspapers. Figure 4 shows the statistics of this series. From this extraction technique, we can observe that April was the peak month for COVID cases, as the count rose gradually from February. Both newspapers clearly show that the rise in COVID cases from February to March was slower than the rise from March to April. This is an important indicator of possible recklessness in preparations to battle the virus. However, the steep fall from April to May also shows the positive response against the attack. We used VADER Sentiment Analysis to extract the sentiment of the headlines and the body. On average, the sentiments ranged from -0.5 to -0.9. The VADER sentiment scale ranges from -1 (highly negative) to 1 (highly positive). There were some cases where the sentiment scores of the headline and body contradicted each other, i.e., the sentiment of the headline was negative but the sentiment of the body was slightly positive. Overall, sentiment analysis can help us sort the most concerning (most negative) news from the positive ones, from which we can learn more about the indicators related to COVID-19 and the serious impact caused by it. Moreover, sentiment analysis can also provide information about how a state or country is reacting to the pandemic. We used the PageRank algorithm to extract keywords from the headlines as well as the body content. PageRank efficiently highlights important relevant keywords in the text. Some frequently occurring important keywords extracted from both datasets are: ‘China’, ‘Government’, ‘Masks’, ‘Economy’, ‘Crisis’, ‘Theft’, ‘Stock market’, ‘Jobs’, ‘Election’, ‘Missteps’, ‘Health’, ‘Response’. Keyword extraction acts as a filter allowing quick searches for indicators in case of locating situations of the economy,

  7. Fire statistics data tables

    • gov.uk
    • s3.amazonaws.com
    Updated Jul 10, 2025
    Cite
    Ministry of Housing, Communities and Local Government (2025). Fire statistics data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire-statistics-data-tables
    Dataset updated
    Jul 10, 2025
    Dataset provided by
    GOV.UK
    Authors
    Ministry of Housing, Communities and Local Government
    Description

    On 1 April 2025 responsibility for fire and rescue transferred from the Home Office to the Ministry of Housing, Communities and Local Government.

    This information covers fires, false alarms and other incidents attended by fire crews, and the statistics include the numbers of incidents, fires, fatalities and casualties as well as information on response times to fires. The Ministry of Housing, Communities and Local Government (MHCLG) also collects information on the workforce, fire prevention work, health and safety and firefighter pensions. All data tables on fire statistics are below.

    MHCLG has responsibility for fire services in England. The vast majority of data tables produced by the Ministry of Housing, Communities and Local Government are for England, but some tables (0101, 0103, 0201, 0501, 1401) are for Great Britain split by nation. In the past the Department for Communities and Local Government (who previously had responsibility for fire services in England) produced data tables for Great Britain and at times the UK. Similar information for devolved administrations is available at Scotland: Fire and Rescue Statistics (https://www.firescotland.gov.uk/about/statistics/), Wales: Community safety (https://statswales.gov.wales/Catalogue/Community-Safety-and-Social-Inclusion/Community-Safety) and Northern Ireland: Fire and Rescue Statistics (https://www.nifrs.org/home/about-us/publications/).

    If you use assistive technology (for example, a screen reader) and need a version of any of these documents in a more accessible format, please email alternativeformats@communities.gov.uk. Please tell us what format you need. It will help us if you say what assistive technology you use.

    Related content

    Fire statistics guidance
    Fire statistics incident level datasets

    Incidents attended

    FIRE0101: Incidents attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 153 KB)
    https://assets.publishing.service.gov.uk/media/686d2aa22557debd867cbe14/FIRE0101.xlsx
    Previous FIRE0101 tables

    FIRE0102: Incidents attended by fire and rescue services in England, by incident type and fire and rescue authority (MS Excel Spreadsheet, 2.19 MB)
    https://assets.publishing.service.gov.uk/media/686d2ab52557debd867cbe15/FIRE0102.xlsx
    Previous FIRE0102 tables

    FIRE0103: Fires attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 201 KB)
    https://assets.publishing.service.gov.uk/media/686d2aca10d550c668de3c69/FIRE0103.xlsx
    Previous FIRE0103 tables

    FIRE0104: Fire false alarms by reason for false alarm, England (MS Excel Spreadsheet, 492 KB)
    https://assets.publishing.service.gov.uk/media/686d2ad92557debd867cbe16/FIRE0104.xlsx
    Previous FIRE0104 tables

    Dwelling fires attended

    FIRE0201: Dwelling fires attended by fire and rescue services by motive, population and nation (MS Excel Spreadsheet, ...
    https://assets.publishing.service.gov.uk/media/686d2af42cfe301b5fb6789f/FIRE0201.xlsx

  8. Full dataset for dengue forecasting in Brazil for Infodengue-Mosqlimate...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Sep 12, 2024
    Cite
    Flávio Codeço Coelho; Claudia Torres Codeço; Iasmim Almeida; Luiz Max Carvalho; Eduardo Correa Araújo; Leonardo Bastos; Luã Bida Vacaro; Raquel Martins Lana (2024). Full dataset for dengue forecasting in Brazil for Infodengue-Mosqlimate sprint 2024 [Dataset]. http://doi.org/10.5281/zenodo.13328231
    Available download formats: zip
    Dataset updated
    Sep 12, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Flávio Codeço Coelho; Claudia Torres Codeço; Iasmim Almeida; Luiz Max Carvalho; Eduardo Correa Araújo; Leonardo Bastos; Luã Bida Vacaro; Raquel Martins Lana
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Brazil
    Description

    The year 2024 has seen an exceptional number of reported dengue fever cases in various parts of the world. In Brazil, the disease has spread to areas in the south and at altitudes where epidemics were not previously recorded, and the incidence rate has far exceeded that of previous years. The objective of this dataset is to promote, in a standardized way, the training of predictive models with the aim of developing forecast models for dengue in Brazil.

  9. A Twitter Dataset of 100+ million tweets related to COVID-19

    • zenodo.org
    application/gzip, csv +1
    Updated Apr 17, 2023
    Cite
    Juan M. Banda; Ramya Tekumalla; Guanyu Wang; Jingyuan Yu; Tuo Liu; Yuning Ding; Gerardo Chowell (2023). A Twitter Dataset of 100+ million tweets related to COVID-19 [Dataset]. http://doi.org/10.5281/zenodo.3735274
    Available download formats: application/gzip, tsv, csv
    Dataset updated
    Apr 17, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Juan M. Banda; Ramya Tekumalla; Guanyu Wang; Jingyuan Yu; Tuo Liu; Yuning Ding; Gerardo Chowell
    Description

    Due to the relevance of the COVID-19 global pandemic, we are releasing our dataset of tweets acquired from the Twitter Stream related to COVID-19 chatter. The first 9 weeks of data (from January 1st, 2020 to March 11th, 2020) contain very low tweet counts as we filtered other data we were collecting for other research purposes, however, one can see the dramatic increase as the awareness for the virus spread. Dedicated data gathering started from March 11th to March 30th which yielded over 4 million tweets a day. We have added additional data provided by our new collaborators from January 27th to February 27th, to provide extra longitudinal coverage.

    The data collected from the stream captures all languages, but the higher prevalence are: English, Spanish, and French. We release all tweets and retweets on the full_dataset.tsv file (101,400,452 unique tweets), and a cleaned version with no retweets on the full_dataset-clean.tsv file (20,244,746 unique tweets). There are several practical reasons for us to leave the retweets, tracing important tweets and their dissemination is one of them. For NLP tasks we provide the top 1000 frequent terms in frequent_terms.csv, the top 1000 bigrams in frequent_bigrams.csv, and the top 1000 trigrams in frequent_trigrams.csv. Some general statistics per day are included for both datasets in the statistics-full_dataset.tsv and statistics-full_dataset-clean.tsv files.

    More details can be found (and will be updated faster) at: https://github.com/thepanacealab/covid19_twitter

    As always, the tweets distributed here are only tweet identifiers (with date and time added), because Twitter's terms and conditions restrict the re-distribution of Twitter data. They need to be hydrated before they can be used.
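Since the files hold only tweet identifiers, a typical first step before hydration is to read the IDs and group them into fixed-size batches. The following is a minimal sketch of that step only, not the authors' tooling: the column name `tweet_id`, the helper `iter_id_batches`, and the batch size are assumptions made for illustration, and the sample rows are made up. It does not call any hydration service.

```python
import csv
import io
from typing import Iterable, Iterator, List


def iter_id_batches(rows: Iterable[dict], id_column: str = "tweet_id",
                    batch_size: int = 100) -> Iterator[List[str]]:
    """Yield lists of at most batch_size tweet IDs from parsed TSV rows.

    id_column is an assumed column name; adjust it to the actual header.
    """
    batch: List[str] = []
    for row in rows:
        batch.append(row[id_column])
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly shorter, batch
        yield batch


# Tiny in-memory stand-in for full_dataset.tsv (header and rows are made up).
sample_tsv = (
    "tweet_id\tdate\ttime\n"
    "1\t2020-03-11\t00:00:01\n"
    "2\t2020-03-11\t00:00:02\n"
    "3\t2020-03-11\t00:00:03\n"
)
reader = csv.DictReader(io.StringIO(sample_tsv), delimiter="\t")
batches = list(iter_id_batches(reader, batch_size=2))
print(batches)  # [['1', '2'], ['3']]
```

Each batch could then be handed to whichever hydration tool is in use. Reading row by row keeps memory use flat, which matters for a file with roughly 100 million identifiers.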

  10. COVID19-Dataset-with-100-World-Countries

    • kaggle.com
    Updated Mar 1, 2021
    Cite
    Sami Belkacem (2021). COVID19-Dataset-with-100-World-Countries [Dataset]. https://www.kaggle.com/sambelkacem/covid19-algeria-and-world-dataset/discussion
    Dataset updated
    Mar 1, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sami Belkacem
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    World
    Description

    COVID19-Algeria-and-World-Dataset

    A coronavirus dataset with 104 countries constructed from different reliable sources, where each row represents a country, and the columns represent geographic, climate, healthcare, economic, and demographic factors that may accelerate or slow the spread of COVID-19. The assumptions for the different factors are as follows:

    • Geography: some continents/areas may be more affected by the disease
    • Climate: cold temperatures may promote the spread of the virus
    • Healthcare: lack of hospital beds/doctors may lead to more human losses
    • Economy: weak economies (GDP) have fewer means to fight the disease
    • Demography: older populations may be at higher risk of the disease

    The last column represents the number of daily tests performed and the total number of cases and deaths reported each day.

    Data description

    https://raw.githubusercontent.com/SamBelkacem/COVID19-Algeria-and-World-Dataset/master/Images/Data%20description.png

    Countries in the dataset by geographic coordinates

    https://raw.githubusercontent.com/SamBelkacem/COVID19-Algeria-and-World-Dataset/master/Images/Countries%20by%20geographic%20coordinates.png

    • Europe: 33 countries
    • Asia: 28 countries
    • Africa: 21 countries
    • North America: 11 countries
    • South America: 8 countries
    • Oceania: 3 countries

    Statistical description of the data

    https://raw.githubusercontent.com/SamBelkacem/COVID19-Algeria-and-World-Dataset/master/Images/Statistical%20description%20of%20the%20data.png

    Data distribution

    https://raw.githubusercontent.com/SamBelkacem/COVID19-Algeria-and-World-Dataset/master/Images/Data%20distribution.png

    Download

    The dataset is available in an encoded CSV form on GitHub.

    Python code

    The Python Jupyter Notebook to read and visualize the data is available on nbviewer.

    Data update

    The dataset is updated every month with the latest numbers of COVID-19 cases, deaths, and tests. The last update was on March 01, 2021.

    Data construction

    The dataset is constructed from different reliable sources, where each row represents a country, and the columns represent geographic, climate, healthcare, economic, and demographic factors that may accelerate or slow the spread of the coronavirus. Note that we selected only the main factors for which we found data; other factors could also be used. All data were retrieved from the reliable Our World in Data website, except for data on:

    Citation

    If you want to use the dataset, please cite the following arXiv paper, which provides more details about the data construction.

    @article{belkacem_covid-19_2020,
      title = {COVID-19 data analysis and forecasting: Algeria and the world},
      shorttitle = {COVID-19 data analysis and forecasting},
      journal = {arXiv preprint arXiv:2007.09755},
      author = {Belkacem, Sami},
      year = {2020}
    }
    

    Contact

    If you have any questions or suggestions, please contact me at this email address: s.belkacem@usthb.dz

  11. Iowa Economic Indicators

    • data.iowa.gov
    • mydata.iowa.gov
    • +1 more
    application/rdfxml +5
    Updated Jun 5, 2025
    + more versions
    Cite
    Iowa Department of Revenue, Research and Analysis Division (2025). Iowa Economic Indicators [Dataset]. https://data.iowa.gov/Economic-Statistics/Iowa-Economic-Indicators/qd3t-kfqg
    Available download formats: json, xml, csv, application/rssxml, application/rdfxml, tsv
    Dataset updated
    Jun 5, 2025
    Dataset authored and provided by
    Iowa Department of Revenue, Research and Analysis Division
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Iowa
    Description

    This dataset provides economic indicators used to monitor Iowa's economy and forecast future direction of economic activity in Iowa.

  12. FIRE0203: previous data tables

    • gov.uk
    Updated Sep 6, 2018
    + more versions
    Cite
    Home Office (2018). FIRE0203: previous data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire0203-previous-data-tables
    Dataset updated
    Sep 6, 2018
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Home Office
    Description

    FIRE0203: Dwelling fires by spread of fire and motive (19 September 2024)

    FIRE0203: Dwelling fires by spread of fire and motive (21 September 2023) (MS Excel Spreadsheet, 87.7 KB): https://assets.publishing.service.gov.uk/media/66e2eacd3f1299ce5d5c3d90/fire-statistics-data-tables-fire0203-210923.xlsx

    FIRE0203: Dwelling fires by spread of fire and motive (29 September 2022) (MS Excel Spreadsheet, 83.8 KB): https://assets.publishing.service.gov.uk/media/650ac4d4fbd7bc000dcb51d1/fire-statistics-data-tables-fire0203-290922.xlsx

    FIRE0203: Dwelling fires by spread of fire and motive (30 September 2021) (MS Excel Spreadsheet, 89.3 KB): https://assets.publishing.service.gov.uk/media/63316357e90e0711d7fbfb7b/fire-statistics-data-tables-fire0203-300921.xlsx

    FIRE0203: Dwelling fires by spread of fire and motive (1 October 2020) (MS Excel Spreadsheet, 70.2 KB): https://assets.publishing.service.gov.uk/media/615191b28fa8f561101f390e/fire-statistics-data-tables-fire0203-011020.xlsx

    FIRE0203: Dwelling fires by spread of fire and motive (12 September 2019) (MS Excel Spreadsheet, 78.8 KB): https://assets.publishing.service.gov.uk/media/5f71c632d3bf7f47a36d96cb/fire-statistics-data-tables-fire0203-120919.xlsx

    FIRE0203: Dwelling fires by spread of fire and motive (6 September 2018) (MS Excel Spreadsheet, 340 KB): https://assets.publishing.service.gov.uk/media/5d7277d140f0b609283d9f74/fire-statistics-data-tables-fire0203-060918.xlsx

    FIRE0203: Dwelling fires by spread of fire and motive (12 October 2017) (MS Excel Spreadsheet, 58.5 KB): https://assets.publishing.service.gov.uk/media/5b8d0cc5e5274a0bdab54b22/fire-statistics-data-tables-fire0203.xlsx

    Related content

    Fire statistics data tables
    Fire statistics guidance
    Fire statistics

  13. Model Output Statistics for SAN DIEGO (LINDBERGH FIELD) (72290)

    • data.europa.eu
    Updated Apr 25, 2018
    Cite
    (2018). Model Output Statistics for SAN DIEGO (LINDBERGH FIELD) (72290) [Dataset]. https://data.europa.eu/data/datasets/de-dwd-mosmix-72290
    Dataset updated
    Apr 25, 2018
    Description

    DWD’s fully automatic MOSMIX product optimizes and interprets the forecast calculations of the NWP models ICON (DWD) and IFS (ECMWF), combines these and calculates statistically optimized weather forecasts in terms of point forecasts (PFCs). Thus, statistically corrected, updated forecasts for the next ten days are calculated for about 5400 locations around the world. Most forecasting locations are spread over Germany and Europe. MOSMIX forecasts (PFCs) include nearly all common meteorological parameters measured by weather stations. For further information please refer to: [in German: https://www.dwd.de/DE/leistungen/met_verfahren_mosmix/met_verfahren_mosmix.html ] [in English: https://www.dwd.de/EN/ourservices/met_application_mosmix/met_application_mosmix.html ]

  14. FIRE0304: previous data tables

    • gov.uk
    Updated Sep 6, 2018
    + more versions
    Cite
    Home Office (2018). FIRE0304: previous data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire0304-previous-data-tables
    Dataset updated
    Sep 6, 2018
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Home Office
    Description

    FIRE0304: Other buildings fire by spread of fire and motive (19 September 2024)

    FIRE0304: Other buildings fire by spread of fire and motive (21 September 2023) (MS Excel Spreadsheet, 121 KB): https://assets.publishing.service.gov.uk/media/66e3e6630d913026165c3df6/fire-statistics-data-tables-fire0304-210923.xlsx

    FIRE0304: Other buildings fire by spread of fire and motive (29 September 2022) (MS Excel Spreadsheet, 115 KB): https://assets.publishing.service.gov.uk/media/650ac64c27d43b001491c2b0/fire-statistics-data-tables-fire0304-290922.xlsx

    FIRE0304: Other buildings fire by spread of fire and motive (30 September 2021) (MS Excel Spreadsheet, 118 KB): https://assets.publishing.service.gov.uk/media/63316bef8fa8f51d2a863128/fire-statistics-data-tables-fire0304-300921.xlsx

    FIRE0304: Other buildings fire by spread of fire and motive (1 October 2020) (MS Excel Spreadsheet, 168 KB): https://assets.publishing.service.gov.uk/media/615195dfd3bf7f718c758109/fire-statistics-data-tables-fire0304-011020.xlsx

    FIRE0304: Other buildings fire by spread of fire and motive (12 September 2019) (MS Excel Spreadsheet, 112 KB): https://assets.publishing.service.gov.uk/media/5f71c7438fa8f5188aa288fc/fire-statistics-data-tables-fire0304-120919.xlsx

    FIRE0304: Other buildings fire by spread of fire and motive (6 September 2018) (MS Excel Spreadsheet, 837 KB): https://assets.publishing.service.gov.uk/media/5d7279bce5274a09860c1376/fire-statistics-data-tables-fire0304-060918.xlsx

    FIRE0304: Other buildings fire by spread of fire and motive (12 October 2017) (MS Excel Spreadsheet, 60 KB): https://assets.publishing.service.gov.uk/media/5b8d1052ed915d1ec02ff23d/fire-statistics-data-tables-fire0304.xlsx

    Related content

    Fire statistics data tables
    Fire statistics guidance
    Fire statistics

  15. Data (i.e., evidence) about evidence based medicine

    • figshare.com
    • search.datacite.org
    png
    Updated May 30, 2023
    Cite
    Jorge H Ramirez (2023). Data (i.e., evidence) about evidence based medicine [Dataset]. http://doi.org/10.6084/m9.figshare.1093997.v24
    Available download formats: png
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jorge H Ramirez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Update (December 7, 2014): Evidence-based medicine (EBM) is not working for many reasons, for example:

    1. Incorrect foundations (paradox): hierarchical levels of evidence are supported by opinions (i.e., the lowest strength of evidence according to EBM) instead of real data collected from different types of study designs (i.e., evidence). http://dx.doi.org/10.6084/m9.figshare.1122534

    2. The effect of criminal practices by pharmaceutical companies is only possible because of the complicity of others: healthcare systems, professional associations, and governmental and academic institutions. Pharmaceutical companies also corrupt at the personal level: politicians and political parties are on their payroll, and medical professionals are seduced by different types of gifts in exchange for prescriptions (i.e., bribery), which very likely results in patients not receiving the proper treatment for their disease. Many times there is no such thing: healthy persons not needing pharmacological treatment of any kind are constantly misdiagnosed and treated with unnecessary drugs. Some medical professionals are converted into K.O.L.s, puppets appearing on stage to spread lies to their peers; a person supposedly trained to improve the well-being of others now deceives on behalf of pharmaceutical companies. Probably the saddest thing is that many honest doctors are being misled by these lies, created by the rules of pharmaceutical marketing instead of scientific, medical, and ethical principles. The interpretation of EBM in this context was not anticipated by its creators.

    “The main reason we take so many drugs is that drug companies don’t sell drugs, they sell lies about drugs.” (Peter C. Gøtzsche)

    “doctors and their organisations should recognise that it is unethical to receive money that has been earned in part through crimes that have harmed those people whose interests doctors are expected to take care of. Many crimes would be impossible to carry out if doctors weren’t willing to participate in them.” (Peter C Gøtzsche, The BMJ, 2012, Big pharma often commits corporate crime, and this must be stopped.)

    Pending (Colombia): Health Promoter Entities (in Spanish: EPS, Empresas Promotoras de Salud).

    3. Misinterpretations: New technologies or concepts are difficult to understand in the beginning, no matter how simple they are; we need to get used to new tools aimed at improving our professional practice. Probably the best explanation is in these videos (credits to Antonio Villafaina for sharing these videos with me).
    English: https://www.youtube.com/watch?v=pQHX-SjgQvQ&w=420&h=315
    Spanish: https://www.youtube.com/watch?v=DApozQBrlhU&w=420&h=315

    Hypothesis: hierarchical levels of evidence based medicine are wrong

    Dear Editor, I have data to support the hypothesis described in the title of this letter. Before rejecting the null hypothesis I would like to ask the following open question: Could you support with data that hierarchical levels of evidence based medicine are correct? (1,2)

    Additional explanation to this question:
    • Only respond to this question attaching publicly available raw data.
    • Be aware that more than a question this is a challenge: I have data (i.e., evidence) which is contrary to classic (i.e., McMaster) or current (i.e., Oxford) hierarchical levels of evidence based medicine. An important part of this data (but not all) is publicly available.

    References
    1. Ramirez, Jorge H (2014): The EBM challenge. figshare. http://dx.doi.org/10.6084/m9.figshare.1135873
    2. The EBM Challenge Day 1: No Answers.

    Competing interests: I endorse the principles of open data in human biomedical research.

    Read this letter on The BMJ (August 13, 2014): http://www.bmj.com/content/348/bmj.g3725/rr/762595
    Re: Greenhalgh T, et al. Evidence based medicine: a movement in crisis? BMJ 2014; 348: g3725.

    Fileset contents
    Raw data: Excel archive: Raw data, interactive figures, and PubMed search terms. Google Spreadsheet is also available (URL below the article description).
    Figure 1. Unadjusted (Fig 1A) and adjusted (Fig 1B) PubMed publication trends (01/01/1992 to 30/06/2014).
    Figure 2. Adjusted PubMed publication trends (07/01/2008 to 29/06/2014).
    Figure 3. Google search trends: Jan 2004 to Jun 2014 / 1-week periods.
    Figure 4. PubMed publication trends (1962-2013): systematic reviews and meta-analysis, clinical trials, and observational studies.
    Figure 5. Ramirez, Jorge H (2014): Infographics: Unpublished US phase 3 clinical trials (2002-2014) completed before Jan 2011 = 50.8%. figshare. http://dx.doi.org/10.6084/m9.figshare.1121675 Raw data: "13377 studies found for: Completed | Interventional Studies | Phase 3 | received from 01/01/2002 to 01/01/2014 | Worldwide". This database complies with the terms and conditions of ClinicalTrials.gov: http://clinicaltrials.gov/ct2/about-site/terms-conditions
    Supplementary Figures (S1-S6). PubMed publication delay in the indexation processes does not explain the descending trends in the scientific output of evidence-based medicine.

    Acknowledgments
    I would like to acknowledge the following persons for providing valuable concepts in data visualization and infographics:
    1. Maria Fernanda Ramírez. Professor of graphic design. Universidad del Valle. Cali, Colombia.
    2. Lorena Franco. Graphic design student. Universidad del Valle. Cali, Colombia.

    Related articles by this author (Jorge H. Ramírez)
    1. Ramirez JH. Lack of transparency in clinical trials: a call for action. Colomb Med (Cali) 2013;44(4):243-6. URL: http://www.ncbi.nlm.nih.gov/pubmed/24892242
    2. Ramirez JH. Re: Evidence based medicine is broken (17 June 2014). http://www.bmj.com/node/759181
    3. Ramirez JH. Re: Global rules for global health: why we need an independent, impartial WHO (19 June 2014). http://www.bmj.com/node/759151
    4. Ramirez JH. PubMed publication trends (1992 to 2014): evidence based medicine and clinical practice guidelines (04 July 2014). http://www.bmj.com/content/348/bmj.g3725/rr/759895

    Recommended articles
    1. Greenhalgh Trisha, Howick Jeremy, Maskrey Neal. Evidence based medicine: a movement in crisis? BMJ 2014; 348:g3725
    2. Spence Des. Evidence based medicine is broken. BMJ 2014; 348:g22
    3. Schünemann Holger J, Oxman Andrew D, Brozek Jan, Glasziou Paul, Jaeschke Roman, Vist Gunn E et al. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ 2008; 336:1106
    4. Lau Joseph, Ioannidis John P A, Terrin Norma, Schmid Christopher H, Olkin Ingram. The case of the misleading funnel plot. BMJ 2006; 333:597
    5. Moynihan R, Henry D, Moons KGM (2014) Using Evidence to Combat Overdiagnosis and Overtreatment: Evaluating Treatments, Tests, and Disease Definitions in the Time of Too Much. PLoS Med 11(7): e1001655. doi:10.1371/journal.pmed.1001655
    6. Katz D. A-holistic view of evidence based medicine. http://thehealthcareblog.com/blog/2014/05/02/a-holistic-view-of-evidence-based-medicine/

  16. Model Output Statistics for KIELCE-SUKOW (12570)

    • data.europa.eu
    Updated Apr 25, 2018
    Cite
    (2018). Model Output Statistics for KIELCE-SUKOW (12570) [Dataset]. https://data.europa.eu/data/datasets/de-dwd-mosmix-12570
    Dataset updated
    Apr 25, 2018
    Description

    DWD’s fully automatic MOSMIX product optimizes and interprets the forecast calculations of the NWP models ICON (DWD) and IFS (ECMWF), combines these and calculates statistically optimized weather forecasts in terms of point forecasts (PFCs). Thus, statistically corrected, updated forecasts for the next ten days are calculated for about 5400 locations around the world. Most forecasting locations are spread over Germany and Europe. MOSMIX forecasts (PFCs) include nearly all common meteorological parameters measured by weather stations. For further information please refer to: [in German: https://www.dwd.de/DE/leistungen/met_verfahren_mosmix/met_verfahren_mosmix.html ] [in English: https://www.dwd.de/EN/ourservices/met_application_mosmix/met_application_mosmix.html ]

  17. ‘Cricket Dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Cricket Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-cricket-dataset-39db/d978d471/?iid=009-591&v=presentation
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Cricket Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/notkrishna/cricket-statistics-for-all-formats on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    Cricket is a bat-and-ball game played between two teams of eleven players on a field at the centre of which is a 22-yard (20-metre) pitch with a wicket at each end, each comprising two bails balanced on three stumps. The game proceeds when a player on the fielding team, called the bowler, "bowls" (propels) the ball from one end of the pitch towards the wicket at the other end. The batting side's players score runs by striking the bowled ball with a bat and running between the wickets, while the fielding side tries to prevent this by keeping the ball within the field and getting it to either wicket, and also tries to dismiss each batter (so they are "out"). Means of dismissal include being bowled, when the ball hits the stumps and dislodges the bails, and by the fielding side either catching a hit ball before it touches the ground, or hitting a wicket with the ball before a batter can cross the crease line in front of the wicket to complete a run. When ten batters have been dismissed, the innings ends and the teams swap roles. The game is adjudicated by two umpires, aided by a third umpire and match referee in international matches.

    Forms of cricket range from Twenty20, with each team batting for a single innings of 20 overs and the game generally lasting three hours, to Test matches played over five days. Traditionally cricketers play in all-white kit, but in limited overs cricket they wear club or team colours. In addition to the basic kit, some players wear protective gear to prevent injury caused by the ball, which is a hard, solid spheroid made of compressed leather with a slightly raised sewn seam enclosing a cork core layered with tightly wound string.

    The earliest reference to cricket is in South East England in the mid-16th century. It spread globally with the expansion of the British Empire, with the first international matches in the second half of the 19th century. The game's governing body is the International Cricket Council (ICC), which has over 100 members, twelve of which are full members who play Test matches. The game's rules, the Laws of Cricket, are maintained by Marylebone Cricket Club (MCC) in London. The sport is followed primarily in South Asia, Australasia, the United Kingdom, southern Africa and the West Indies. Women's cricket, which is organised and played separately, has also achieved international standard. The most successful side playing international cricket is Australia, which has won seven One Day International trophies, including five World Cups, more than any other country and has been the top-rated Test side more than any other country.

    Content

    Cricket, like any sport, is full of important data and stats. The game is generally played in three different formats: one-day (50 overs for each team to score and bowl), Test (no limit on overs, but played for at most 5 days, with each team having two innings to score), and the newest format, Twenty20 (each team has 20 overs to score).

    The dataset contains 9 files (3 for each format). Each group of three files contains the best stats for batsmen, bowlers, and series/tournaments.

    Source: https://www.espncricinfo.com/

    Play with it as you like.

    --- Original source retains full ownership of the source dataset ---

  18. Summary of descriptive statistics.

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 2, 2023
    + more versions
    Cite
    Yohanes E. Riyanto; Jianlin Zhang (2023). Summary of descriptive statistics. [Dataset]. http://doi.org/10.1371/journal.pone.0232037.t001
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Yohanes E. Riyanto; Jianlin Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary of descriptive statistics.

  19. Demographics

    • health.google.com
    Updated Oct 7, 2021
    + more versions
    Cite
    (2021). Demographics [Dataset]. https://health.google.com/covid-19/open-data/raw-data
    Dataset updated
    Oct 7, 2021
    Variables measured
    key, population, population_male, rural_population, urban_population, population_female, population_density, clustered_population, population_age_00_09, population_age_10_19, and 11 more
    Description

    Various population statistics, including structured demographics data.

  20. Overseas Buddhist Monk Visit to Taiwan Preaching Statistics

    • data.gov.tw
    csv
    Updated Jul 28, 2023
    Cite
    Ministry of Culture (2023). Overseas Buddhist Monk Visit to Taiwan Preaching Statistics [Dataset]. https://data.gov.tw/en/datasets/7621
    Available download formats: csv
    Dataset updated
    Jul 28, 2023
    Dataset authored and provided by
    Ministry of Culture
    License

    https://data.gov.tw/license

    Area covered
    Taiwan
    Description

    This dataset mainly provides statistics on the number of overseas Tibetan monks who have come to Taiwan to spread the Dharma at the Mencius Culture Center.

