Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Population by Country - 2020’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/tanuprabhu/population-by-country-2020 on 28 January 2022.
--- Dataset description provided by original source is as follows ---
I always wanted to access a data set related to the world's population (country-wise), but I could not find a properly documented one. So I created one myself.
I knew I wanted to create a dataset, but I did not know how to do so. So I started to search for the content (population of countries) on the internet. Obviously, Wikipedia was my first search, but for some reason the results were not acceptable, and I think it listed only 190 or so countries. I then surfed the internet for quite some time until I stumbled upon a great website that you have probably heard of: Worldometer. This was exactly the website I was looking for. It had more details than Wikipedia and also more rows, that is, more countries with their populations.
Once I had found the data, my next hard task was to download it. Of course, I could not get the data in raw form, and I did not email them about it. Instead I learned a new skill that is very important for a data scientist. I read somewhere that this technique is what you need to obtain data from websites. Any guesses? Keep reading and you will find out in the next paragraph.
You are right, it's web scraping. I learned it so that I could convert the data into CSV format. I will give you the scraper code that I wrote; I also found a way to convert the pandas data frame directly to a CSV (comma-separated values) file and store it on my computer. Just go through my code and you will know what I'm talking about.
Below is the code that I used to scrape the data from the website (provided in the original description as a screenshot, Capture.PNG).
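The author's own scraper survives only as a screenshot, so it is not reproduced here. As a rough illustration of the idea (turning an HTML table into CSV), here is a minimal sketch that uses only the Python standard library rather than pandas. The class name `TableParser`, the function `table_to_csv`, and the sample markup are all hypothetical and are not taken from the original code.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # row currently being read
        self._cell = None  # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def table_to_csv(html_text):
    """Return all table rows found in html_text as CSV text."""
    parser = TableParser()
    parser.feed(html_text)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up sample markup; a real run would first download the page HTML.
SAMPLE_HTML = """
<table>
  <tr><th>Country</th><th>Population</th></tr>
  <tr><td>Exampleland</td><td>1,000</td></tr>
  <tr><td>Samplestan</td><td>250</td></tr>
</table>
"""

csv_text = table_to_csv(SAMPLE_HTML)
```

Values that contain commas, such as formatted population figures, are quoted by the `csv` writer, so the output stays a valid CSV file.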
Now I couldn't have got the data without Worldometer. So special thanks to the website. It is because of them I was able to get the data.
As far as I know, I don't have any questions to ask. Find your own ways to use the data, and let me know via a kernel if you find something interesting.
--- Original source retains full ownership of the source dataset ---
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The dataset contains data about the numbers of tests, cases, deaths, serious/critical cases, active cases and recovered cases in each country for every day since April 18. It also contains the population of each country, so that the per-capita penetration of the virus can be calculated.
I've removed data from the "Diamond Princess" and "MS Zaandam" since they are not countries.
Additionally, an auxiliary table with the fraction of the general population in different age groups for every country is added (taken from Wikipedia). This is specifically relevant since the COVID-19 death rate is very much age dependent.
The people at www.worldometers.info who collect and maintain this site are doing very important work: https://www.worldometers.info/coronavirus/#countries
Data about the age structure of every country comes from Wikipedia.
It's possible to use this dataset for various purposes and analyses. My goal is to use the additional data about the number of tests performed in each country to estimate the true death and infection rates of COVID-19.
I am learning web scraping in order to build my own datasets, and this is the first one in the learning process. Let's try to build great datasets in the future for better analysis and predictions.
The data was scraped on March 10, 2020, from https://www.worldometers.info/world-population/population-by-country/. The dataset represents the country-wise population count for a specific time period.
Firstly, thanks to the content creators at https://www.worldometers.info, who provide reliable data on the internet. Secondly, thanks to the tutor who taught me how to scrape websites.
Is this dataset valuable? Where can we utilize this dataset in data science?
Based on a comparison of coronavirus deaths in 210 countries relative to their population, Peru had the most losses to COVID-19 up until July 13, 2022. As of the same date, the virus had infected over 557.8 million people worldwide, and the number of deaths had totaled more than 6.3 million. Note, however, that COVID-19 test rates can vary per country. Additionally, big differences show up between countries when comparing the number of deaths with the number of confirmed COVID-19 cases. The source seemingly does not differentiate between "the Wuhan strain" (2019-nCoV) of COVID-19, "the Kent mutation" (B.1.1.7) that appeared in the UK in late 2020, the 2021 Delta variant (B.1.617.2) from India or the Omicron variant (B.1.1.529) from South Africa.
The difficulties of death figures
This table aims to provide a complete picture on the topic, but it very much relies on data that has become more difficult to compare. As the coronavirus pandemic developed across the world, countries already used different methods to count fatalities, and they sometimes changed them during the course of the pandemic. On April 16, for example, the Chinese city of Wuhan added a 50 percent increase in their death figures to account for community deaths. These deaths occurred outside of hospitals and had gone unaccounted for until then. The state of New York did something similar two days before, revising their figures with 3,700 new deaths as they started to include “assumed” coronavirus victims. The United Kingdom started counting deaths in care homes and private households on April 29, adjusting their number with about 5,000 new deaths (a correction that was reversed by the same amount on August 18). This makes an already difficult comparison even more difficult. Belgium, for example, counts suspected coronavirus deaths in their figures, whereas other countries have not done that (yet). This means two things. First, it could have a big impact on both current as well as future figures. On April 16 already, UK health experts stated that if their numbers were corrected for community deaths like in Wuhan, the UK number would change from 205 to “above 300”. This is exactly what happened two weeks later. Second, it is difficult to pinpoint exactly which countries already have “revised” numbers (like Belgium, Wuhan or New York) and which ones do not. One work-around could be to look at (freely accessible) timelines that track the reported daily increase of deaths in certain countries. Several of these are available on our platform, such as for Belgium, Italy and Sweden. A sudden large increase might be an indicator that the domestic sources changed their methodology.
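The work-around of scanning a timeline for sudden large increases can be sketched in a few lines. The rule used below (flag a day whose increase exceeds five times the median increase of the previous seven days) is an assumption chosen for illustration, not a method described by the source, and the series is synthetic.

```python
from statistics import median


def flag_jumps(cumulative, window=7, factor=5.0):
    """Return indices of days whose daily increase exceeds `factor` times
    the median daily increase of the preceding `window` days."""
    daily = [b - a for a, b in zip(cumulative, cumulative[1:])]
    flagged = []
    for i in range(window, len(daily)):
        baseline = median(daily[i - window:i])
        if baseline > 0 and daily[i] > factor * baseline:
            flagged.append(i + 1)  # index into the cumulative series
    return flagged


# Synthetic cumulative death counts, not real data.
series = [10, 20, 30, 40, 50, 60, 70, 80, 90, 390, 400, 410]
jumps = flag_jumps(series)
```

A flagged day is only a hint that the counting method may have changed; a genuine surge in deaths would be flagged in the same way, so each hit still needs to be checked against the national reports.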
Where are these numbers coming from?
The numbers shown here were collected by Johns Hopkins University, a source that manually checks the data with domestic health authorities. For the majority of countries, this is from national authorities. In some cases, like China, the United States, Canada or Australia, city reports or other various state authorities were consulted. In this statistic, these separately reported numbers were put together. For more information or other freely accessible content, please visit our dedicated Facts and Figures page.
UPDATED till 10/04/2020 23:59:59
Worldometer COVID-19 data is available as a CSV file. I am uploading it here so that it can be used in Kaggle kernels and to get insights from the broader data science community.
2019 Novel Coronavirus (2019-nCoV) is a virus (more specifically, a coronavirus) identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the outbreak in Wuhan, China reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating person-to-person spread is occurring. At this time, it’s unclear how easily or sustainably this virus is spreading between people - CDC
Country - List of countries affected by COVID-19
Total Cases - Cumulative number of confirmed cases till date
New Cases - New confirmed cases each day
Total Deaths - Cumulative number of deaths till date
New Deaths - New death cases each day
Total Recovered - Cumulative number of recovered cases till date
Active Cases - Number of currently active cases
Serious, Critical - Number of serious/critical cases till date
Tot Cases/1M pop - Cumulative number of confirmed cases till date per million population
Deaths/1M pop - Cumulative number of deaths till date per million population
Total Tests - Cumulative number of tests till date
Tests/1M pop - Cumulative number of tests till date per million population
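The per-million columns can in principle be recomputed from a raw count and the country's population. The sketch below shows the arithmetic; the function name `per_million` and the numbers are made up for illustration, and the source may round or compute its published values differently.

```python
def per_million(count, population):
    """Scale a raw count to a rate per one million inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 1_000_000 / population


# Made-up numbers for illustration only.
cases_per_million = per_million(count=2_500, population=5_000_000)
deaths_per_million = per_million(count=50, population=5_000_000)
```

The same helper applies to the tests column, since Tests/1M pop is defined in the same way as the case and death rates.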
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Associated with manuscript titled: Fifty Muslim-majority countries have fewer COVID-19 cases and deaths than the 50 richest non-Muslim countries
The objective of this research was to determine the difference in the total number of COVID-19 cases and deaths between Muslim-majority and non-Muslim countries, and investigate reasons for the disparities.
Methods: The 50 Muslim-majority countries had more than 50.0% Muslims with an average of 87.5%. The non-Muslim country sample consisted of 50 countries with the highest GDP while omitting any Muslim-majority countries listed. The non-Muslim countries’ average percentage of Muslims was 4.7%. Data pulled on September 18, 2020 included the percentage of Muslim population per country by World Population Review (15) and GDP per country, population count, and total number of COVID-19 cases and deaths by Worldometers (16). The data set was transferred via an Excel spreadsheet on September 23, 2020 and analyzed. To measure COVID-19’s incidence in the countries, three different Average Treatment Methods (ATE) were used to validate the results.
Results published as a preprint at https://doi.org/10.31235/osf.io/84zq5
(15) Muslim Majority Countries 2020 [Internet]. Walnut (CA): World Population Review. 2020- [Cited 2020 Sept 28]. Available from: http://worldpopulationreview.com/country-rankings/muslim-majority-countries
(16) Worldometers.info. Worldometer. Dover (DE): Worldometer; 2020 [cited 2020 Sept 28]. Available from: http://worldometers.info
https://creativecommons.org/publicdomain/zero/1.0/
COVID-19 statistics from Worldometers. Covers 213 countries/territories. Recorded as of 22nd May 2020, 14:56 IST. The purpose of this data is to understand and analyse the trends of COVID-19, and the extent of its spread.
The new_cases and new_deaths columns pertain to 22/05/2020 only.
All credit goes to Worldometers, and its constituent data gatherers. The official link is here: https://www.worldometers.info/coronavirus/
https://creativecommons.org/publicdomain/zero/1.0/
Credits: information taken from https://www.worldometers.info/world-population/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Covid19 in World Countries-Latest Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/anandhuh/covid19-in-world-countrieslatest-data on 12 November 2021.
--- Dataset description provided by original source is as follows ---
This dataset contains COVID-19 data for the countries of the world as of November 10, 2021.
Link : https://www.worldometers.info/coronavirus/#countries
Link : https://www.kaggle.com/anandhuh/datasets
Upvote if you find it useful 🙏
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset, titled "Global COVID-19 Statistics - Jan 2025," contains the latest COVID-19 statistics collected from the Worldometer website on Jan 09, 2025. The data includes crucial metrics such as the total number of cases, deaths, recoveries, and active cases for countries around the world. The information is extracted from the comprehensive table provided by Worldometer, which is widely regarded as a reliable source for real-time coronavirus statistics.
Source and Collection Date
Source: Worldometer Coronavirus Page
Date of Collection: Jan 09, 2025
https://github.com/disease-sh/API/blob/master/LICENSE
In the past 24 hours, Sweden, Europe had N/A new cases, N/A deaths and 18 recoveries.
As the world fights this invisible enemy, a lot of data-driven students like me want to study it as well as we can. There is an enormous number of data sets available on COVID-19 today, but as a beginner in this field I wanted to find some simpler data. So I came up with this COVID-19 data set, which I scraped from "https://www.worldometers.info/coronavirus". It is my way of learning by doing. This data runs until 5/17/2020. I will keep updating it.
The dataset contains 194 rows and 12 columns which are described below:-
Country: Contains the names of all countries.
Total_Cases: The total number of cases the country has had till 5/17/2020.
Total_Deaths: Total number of deaths in that country till 5/17/2020.
Total_Recovered: Total number of individuals recovered from COVID-19.
Active_Cases: Total active cases in the country on 5/17/2020.
Critical_Cases: Number of patients in critical condition.
Cases/Million_Population: Number of cases per million population of that country.
Deaths/Million_Population: Number of deaths per million population of that country.
Total_Tests: Total number of tests performed till 5/17/2020.
Tests/Million_Population: Number of tests performed per million population.
Population: Population of the country.
Continent: Continent in which the country lies.
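With these columns, one simple sanity check on a scraped file is to compare Active_Cases with Total_Cases minus Total_Deaths minus Total_Recovered. That relationship is an assumption about how the source derives active cases, not something the description states, and the two sample rows below are made up.

```python
import csv
import io

# Made-up two-row sample using the column names listed above.
SAMPLE = """Country,Total_Cases,Total_Deaths,Total_Recovered,Active_Cases
Exampleland,1000,50,700,250
Samplestan,400,10,100,300
"""


def inconsistent_rows(csv_text):
    """Return countries where Active_Cases differs from
    Total_Cases - Total_Deaths - Total_Recovered (assumed relationship)."""
    bad = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        expected = (int(row["Total_Cases"]) - int(row["Total_Deaths"])
                    - int(row["Total_Recovered"]))
        if int(row["Active_Cases"]) != expected:
            bad.append(row["Country"])
    return bad


mismatches = inconsistent_rows(SAMPLE)
```

Rows with missing values (for example "N/A" in a recoveries column) would need to be skipped or cleaned before a check like this can run on real data.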
As of May 2, 2023, the outbreak of the coronavirus disease (COVID-19) had been confirmed in almost every country in the world. The virus had infected over 687 million people worldwide, and the number of deaths had reached almost 6.87 million. The most severely affected countries include the U.S., India, and Brazil.
COVID-19: background information
COVID-19 is a novel coronavirus that had not previously been identified in humans. The first case was detected in the Hubei province of China at the end of December 2019. The virus is highly transmissible and coughing and sneezing are the most common forms of transmission, which is similar to the outbreak of the SARS coronavirus that began in 2002 and was thought to have spread via cough and sneeze droplets expelled into the air by infected persons.
Naming the coronavirus disease
Coronaviruses are a group of viruses that can be transmitted between animals and people, causing illnesses that may range from the common cold to more severe respiratory syndromes. In February 2020, the International Committee on Taxonomy of Viruses and the World Health Organization announced official names for both the virus and the disease it causes: SARS-CoV-2 and COVID-19, respectively. The name of the disease is derived from the words corona, virus, and disease, while the number 19 represents the year that it emerged.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains information related to the population of all the countries listed on the Worldometers website. It includes, among others, fields such as Country, Total Cases, New Cases and TotalDeaths. The dataset was created so that it can be used in any project where this information could help in the fight against COVID-19.
The dataset contains COVID-19 statistics for the top countries currently affected by the virus. The data was scraped from two popular sites maintaining daily updates on the spread of COVID-19 - https://www.worldometers.info/ and https://en.wikipedia.org/wiki/COVID-19_pandemic
There are two kinds of CSV files. The first type contains country-wise daily statistics on the spread of COVID-19. The data for the following countries is available:-
For each of these countries, the dataset contains the following columns:-
The second type of file is the overall statistics which contains statistics for all the countries affected in the world. This dataset contains the following columns:-
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Estimated population data based on the latest United Nations Population Division estimates and http://www.worldometers.info/world-population/population-by-country/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for GOLD RESERVES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Covid-19 cases per country snapshot
13-Apr-2020 at 14:19 CET
Data source: https://www.worldometers.info/coronavirus/
Obtained by web-scraping
Contains header on 1st row.
Columns:
As the population is growing at an alarming rate, we need to analyze the growth in order to take measures.
The data ranges from 1955 to 2020, which allows very detailed and in-depth analysis.
The data is taken from worldometer.info and then edited.
Once predictions for the next couple of decades are made, we will be able to take measures to increase our resources.
https://github.com/disease-sh/API/blob/master/LICENSE
In the past 24 hours, India, Asia had 68 new cases, N/A deaths and N/A recoveries.