https://creativecommons.org/publicdomain/zero/1.0/
This World Marriage Dataset provides a comparable and up-to-date set of data on the marital status of the population by age and sex for 232 countries or regions of the world from 1970 to 2019. There are 271,605 rows and 9 columns in this dataset. Each row represents a specific age group of men or women with a given marital status (divorced, married, or single). The columns include:
- Sr. No.: A serial number to identify each entry.
- Country: The country of focus.
- Age Group: The age range of the surveyed individuals.
- Sex: The gender of the surveyed individuals.
- Marital Status: The marital status of the individuals, categorized as "Divorced", "Married", or "Single".
- Data Process: The method used to collect the data.
- Data Collection (Start Year): The year when data collection began.
- Data Collection (End Year): The year when data collection ended.
- Data Source: The source of the data.
This dataset helps to understand the marital status distribution among different age groups of men and women all over the world from 1970 to 2019.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All cities with a population > 1000 or seats of administrative divisions (ca. 80,000).
Sources and Contributions
- Sources: GeoNames aggregates over a hundred different data sources.
- Ambassadors: GeoNames Ambassadors help in many countries.
- Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
- Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.
Enrichment: add country name
This indicator is defined as the percentage of the population living in an overcrowded household (excluding single-person households). A person is considered as living in an overcrowded household if the household does not have at its disposal a minimum number of rooms equal to:
- one room for the household;
- one room per couple in the household;
- one room for each single person aged 18 or more;
- one room per pair of single people of the same sex between 12 and 17 years of age;
- one room for each single person between 12 and 17 years of age not included in the previous category;
- one room per pair of children under 12 years of age.
The indicator is presented by age group.
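The room-count rule above can be sketched as a small function. This is only an illustration of the definition, not the statistical office's own implementation: the `Household` fields are hypothetical names, and rounding an odd number of children under 12 up to a full room is an assumption, since the definition only speaks of pairs.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Household:
    """Hypothetical household summary; field names are illustrative only."""
    rooms: int                     # rooms at the household's disposal
    couples: int = 0               # couples in the household
    single_adults: int = 0         # single persons aged 18 or more
    single_males_12_17: int = 0    # single males aged 12 to 17
    single_females_12_17: int = 0  # single females aged 12 to 17
    children_under_12: int = 0


def required_rooms(h: Household) -> int:
    """Minimum number of rooms under the rules listed above."""
    # One room per same-sex pair of single 12-17 year olds ...
    same_sex_pairs = h.single_males_12_17 // 2 + h.single_females_12_17 // 2
    # ... and one room for each 12-17 year old left without a pair.
    unpaired = h.single_males_12_17 % 2 + h.single_females_12_17 % 2
    return (
        1                                # one room for the household
        + h.couples
        + h.single_adults
        + same_sex_pairs
        + unpaired
        + ceil(h.children_under_12 / 2)  # assumption: an odd child rounds up
    )


def is_overcrowded(h: Household) -> bool:
    """True if the household has fewer rooms than the rule requires."""
    return h.rooms < required_rooms(h)


# A couple, one boy and one girl aged 12-17, three children under 12, 4 rooms.
example = Household(rooms=4, couples=1, single_males_12_17=1,
                    single_females_12_17=1, children_under_12=3)
print(required_rooms(example), is_overcrowded(example))  # 6 True
```

The indicator itself would then be the share of people living in households for which `is_overcrowded` is true, computed over the relevant population.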
By Correlates of War Project [source]
The World Religion Project (WRP) is an ambitious endeavor to conduct a comprehensive analysis of religious adherence throughout the world from 1945 to 2010. This cutting-edge project offers unparalleled insight into the religious behavior of people in different countries, regions, and continents during this time period. Its datasets provide important information about the numbers and percentages of adherents across a multitude of different religions, religion families, and non-religious affiliations.
The WRP consists of three distinct datasets: the national religion dataset, the regional religion dataset, and the global religion dataset. Each targets a different level of analysis, from individual states to the global system. The national dataset provides the number of adherents by state, as well as the percentage of the population practicing a given faith, in five-year increments, showing how these numbers evolve from nation to nation over time. Similarly, the regional dataset is provided at five-year intervals by region, with one modification: Pacific Ocean states (those with a Country Code Number of 900 or above) have been reclassified into their own Oceania category. Finally, at the global level, all states are aggregated to give a snapshot, at any five-year interval between 1945 and 2010, of the relationships between religions or religious families within one location or transnationally.
This project was developed in three stages: first, forming a religion tree (a systematic classification); second, collecting data according to that classification structure; and last, cleaning the data so that discrepancies could be reconciled and imported where needed, with gaps marked where values were unknown during the collection process. For further details on these datasets and their possible applications, please contact Zeev Maoz (University of California, Davis) and Errol A. Henderson (Pennsylvania State University).
The World Religions Project (WRP) dataset offers a comprehensive look at religious adherence around the world within a single dataset. With this dataset, you can track global religious trends over a period of 65 years and explore how they’ve changed during that time. By exploring the WRP data set, you’ll gain insight into cross-regional and cross-time patterns in religious affiliation around the world.
- Analyzing historical patterns of religious growth and decline across different regions
- Creating visualizations to compare religious adherence in various states, countries, or globally
- Studying the impact of governmental policies on religious participation over time
If you use this dataset in your research, please credit the original authors.
License: Dataset copyright by authors
- You are free to:
  - Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt: remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit: provide a link to the license, and indicate if changes were made.
  - ShareAlike: distribute your contributions under the same license as the original.
  - Keep intact: all notices that refer to this license, including copyright notices.

File: WRP regional data.csv

| Column name | Description |
|:------------|:------------|
| Year | Reference year for data collection. (Integer) |
| Region | World region according to Correlates Of War (COW) Regional Systemizations with one modification (Oceania category for COW country code ... |
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
65 to 74 years Poverty Rate Statistics for 2022. This is part of a larger dataset covering poverty in On Top of the World, Florida by age, education, race, gender, work experience and more.
This dataset contains counts of deaths for California counties based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.
The final data tables include both deaths that occurred in each California county regardless of the place of residence (by occurrence) and deaths to residents of each California county (by residence), whereas the provisional data table only includes deaths that occurred in each county regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.
The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.
Note: In these datasets, a person is defined as up to date if they have received at least one dose of an updated COVID-19 vaccine. The Centers for Disease Control and Prevention (CDC) recommends that certain groups, including adults ages 65 years and older, receive additional doses.
On 6/16/2023, CDPH replaced the booster measures with a new “Up to Date” measure based on CDC’s new recommendations, replacing the primary series, boosted, and bivalent booster metrics. The definition of “primary series complete” has not changed and is based on previous recommendations that CDC has since simplified. A person cannot complete their primary series with a single dose of an updated vaccine. Whereas the booster measures were calculated using the eligible population as the denominator, the new up to date measure uses the total estimated population. Please note that the rates for some groups may change, since the up to date measure is calculated differently than the previous booster and bivalent measures.
This data is from the same source as the Vaccine Progress Dashboard at https://covid19.ca.gov/vaccination-progress-data/ which summarizes vaccination data at the county level by county of residence. Where county of residence was not reported in a vaccination record, the county of provider that vaccinated the resident is included. This applies to less than 1% of vaccination records. The sum of county-level vaccinations does not equal statewide total vaccinations due to out-of-state residents vaccinated in California.
These data do not include doses administered by the following federal agencies who received vaccine allocated directly from CDC: Indian Health Service, Veterans Health Administration, Department of Defense, and the Federal Bureau of Prisons.
Totals for the Vaccine Progress Dashboard and this dataset may not match, as the Dashboard totals doses by Report Date and this dataset totals doses by Administration Date. Dose numbers may also change for a particular Administration Date as data is updated.
Previous updates:
On March 3, 2023, with the release of HPI 3.0 in 2022, the previous equity scores have been updated to reflect more recent community survey information. This change represents an improvement to the way CDPH monitors health equity by using the latest and most accurate community data available. The HPI uses a collection of data sources and indicators to calculate a measure of community conditions ranging from the most to the least healthy based on economic, housing, and environmental measures.
Starting on July 13, 2022, the denominator for calculating vaccine coverage has been changed from age 5+ to all ages to reflect new vaccine eligibility criteria. Previously the denominator was changed from age 16+ to age 12+ on May 18, 2021, then changed from age 12+ to age 5+ on November 10, 2021, to reflect previous changes in vaccine eligibility criteria. The previous datasets based on age 16+ and age 5+ denominators have been uploaded as archived tables.
Starting on May 29, 2021 the methodology for calculating on-hand inventory in the shipped/delivered/on-hand dataset has changed. Please see the accompanying data dictionary for details. In addition, this dataset is now down to the ZIP code level.
Cause of death data based on VA interviews were contributed by fourteen INDEPTH HDSS sites in sub-Saharan Africa and eight sites in Asia. The principles of the Network and its constituent population surveillance sites have been described elsewhere [1]. Each HDSS site is committed to long-term longitudinal surveillance of circumscribed populations, typically each covering around 50,000 to 100,000 people. Households are registered and visited regularly by lay field-workers, with a frequency varying from once per year to several times per year. All vital events are registered at each such visit, and any deaths recorded are followed up with verbal autopsy interviews, usually undertaken by specially trained lay interviewers. A few sites were already operational in the 1990s, but in this dataset 95% of the person-time observed related to the period from 2000 onwards, with 58% from 2007 onwards. Two sites, in Nairobi and Ouagadougou, followed urban populations, while the remainder covered areas that were generally more rural in character, although some included local urban centres. Sites covered entire populations, although the Karonga, Malawi, site only contributed VAs for deaths of people aged 12 years and older. Because the sites were not located or designed in a systematic way to be representative of national or regional populations, it is not meaningful to aggregate results over sites.
All cause of death assignments in this dataset were made using the InterVA-4 model version 4.02 [2]. InterVA-4 uses probabilistic modelling to arrive at likely cause(s) of death for each VA case, the workings of the model being based on a combination of expert medical opinion and relevant available data. InterVA-4 is the only model currently available that processes VA data according to the WHO 2012 standard and categorises causes of death according to ICD-10. Since the VA data reported here were collected before the WHO 2012 standard was formulated, they were all retrospectively transformed into the WHO 2012 and InterVA-4 input format for processing.
The InterVA-4 model was applied to the data from each site, yielding, for each case, up to three possible causes of death or an indeterminate result. Each cause for a case is a single record in the dataset. In a minority of cases, for example where symptoms were vague, contradictory or mutually inconsistent, it was impossible for InterVA-4 to determine a cause of death, and these deaths were attributed as entirely indeterminate. For the remaining cases, one to three likely causes and their likelihoods were assigned by InterVA-4, and if the sum of their likelihoods was less than one, the residual component was then assigned as being indeterminate. This was an important process for capturing uncertainty in cause of death outcome(s) from the model at the individual level, thus avoiding over-interpretation of specific causes. As a consequence there were three sources of unattributed cause of death: deaths registered for which VAs were not successfully completed; VAs completed but where the cause was entirely indeterminate; and residual components of deaths attributed as indeterminate.
In this dataset each case has between one and four records, each with its own cause and likelihood. Cases for which VAs were not successfully completed have a single record with the cause of death recorded as “VA not completed” and a likelihood of one. Thus the overall sum of the likelihoods equals the total number of deaths. Each record also contains a population weighting factor reflecting the ratio of the population fraction for its site, age group, sex and year to the corresponding age group and sex fraction in the standard population (see section on weighting).
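The record structure described above, with up to three model-assigned causes plus an indeterminate residual so that each death's likelihoods sum to one, can be sketched as follows. This is a minimal illustration and not the sites' actual processing code: the function name, the tuple layout, and the "Indeterminate" label are assumptions, and the cause names in the example are placeholders.

```python
def case_records(case_id, causes, va_completed=True):
    """Build the one to four records for a single death.

    `causes` is a list of (cause, likelihood) pairs, at most three, in the
    shape a model such as InterVA-4 might return (hypothetical input format).
    Returns a list of (case_id, cause, likelihood) tuples.
    """
    if not va_completed:
        # No verbal autopsy: a single record with a likelihood of one.
        return [(case_id, "VA not completed", 1.0)]
    if len(causes) > 3:
        raise ValueError("at most three causes per case")
    total = sum(likelihood for _, likelihood in causes)
    if total > 1.0 + 1e-9:
        raise ValueError("likelihoods must not sum to more than one")
    records = [(case_id, cause, likelihood) for cause, likelihood in causes]
    residual = 1.0 - total
    if residual > 1e-9:
        # Unexplained remainder (the whole case if no cause was assigned).
        records.append((case_id, "Indeterminate", residual))
    return records


# Two assigned causes (placeholder names) leave a residual of about 0.15.
for record in case_records("case-1", [("cause A", 0.6), ("cause B", 0.25)]):
    print(record)
```

Because every case contributes likelihoods summing to one, summing the likelihood column over any set of records gives the number of deaths they represent, which is why fractional cause-specific death counts are expected.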
In this context, all of these data are secondary datasets derived from primary data collected separately by each participating site. In all cases the primary data collection was covered by site-level ethical approvals relating to on-going demographic surveillance in those specific locations. No individual identity or household location data are included in this secondary data.
[1] Sankoh O, Byass P. The INDEPTH Network: filling vital gaps in global epidemiology. International Journal of Epidemiology 2012; 41:579-588.
[2] Byass P, Chandramohan D, Clark SJ, D’Ambruoso L, Fottrell E, Graham WJ, et al. Strengthening standardised interpretation of verbal autopsy data: the new InterVA-4 tool. Global Health Action 2012; 5:19281.
Demographic surveillance areas (countries from Africa, Asia and Oceania) of the following HDSSs:
Code Country INDEPTH Centre
BD011 Bangladesh ICDDR-B : Matlab
BD012 Bangladesh ICDDR-B : Bandarban
BD013 Bangladesh ICDDR-B : Chakaria
BD014 Bangladesh ICDDR-B : AMK
BF031 Burkina Faso Nouna
BF041 Burkina Faso Ouagadougou
CI011 Côte d'Ivoire Taabo
ET031 Ethiopia Kilite Awlaelo
GH011 Ghana Navrongo
GH031 Ghana Dodowa
GM011 The Gambia Farafenni
ID011 Indonesia Purworejo
IN011 India Ballabgarh
IN021 India Vadu
KE011 Kenya Kilifi
KE021 Kenya Kisumu
KE031 Kenya Nairobi
MW011 Malawi Karonga
SN011 Senegal IRD : Bandafassi
VN012 Vietnam Hanoi Medical University : Filabavi
ZA011 South Africa Agincourt
ZA031 South Africa Africa Centre
Death Cause
Surveillance population Deceased individuals Cause of death
Verbal autopsy-based cause of death data
The number of rounds per year varies between sites, from one to three.
No sampling, covers total population in demographic surveillance area
Face-to-face [f2f]
The Verbal Autopsy Questionnaires used by the various sites differed, but in most cases they were a derivation from the original WHO Verbal Autopsy questionnaire.
http://www.who.int/healthinfo/statistics/verbalautopsystandards/en/index1.html
One cause of death record was inserted for every death where a verbal autopsy was not conducted. The cause of death assigned in these cases is "XX VA not completed".
This indicator is defined as the percentage of the population living in an overcrowded household. A person is considered as living in an overcrowded household if the household does not have at its disposal a minimum number of rooms equal to:
- one room for the household;
- one room per couple in the household;
- one room for each single person aged 18 or more;
- one room per pair of single people of the same sex between 12 and 17 years of age;
- one room for each single person between 12 and 17 years of age not included in the previous category;
- one room per pair of children under 12 years of age.
The indicator is presented by sex.
https://creativecommons.org/publicdomain/zero/1.0/
The GDELT Project is the largest, most comprehensive, and highest resolution open database of human society ever created. The 2015 data alone records nearly three quarters of a trillion emotional snapshots and more than 1.5 billion location references, while its total archives span more than 215 years, making it one of the largest open-access spatio-temporal datasets in existence and pushing the boundaries of "big data" study of global human society. Its Global Knowledge Graph connects the world's people, organizations, locations, themes, counts, images and emotions into a single holistic network over the entire planet. How can you query, explore, model, visualize, interact, and even forecast this vast archive of human society?
GDELT 2.0 has a wealth of features in the event database which includes events reported in articles published in 65 live translated languages, measurements of 2,300 emotions and themes, high resolution views of the non-Western world, relevant imagery, videos, and social media embeds, quotes, names, amounts, and more.
You may find these code books helpful:
GDELT Global Knowledge Graph Codebook V2.1 (PDF)
GDELT Event Codebook V2.0 (PDF)
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.github_repos.[TABLENAME]. Fork this kernel to get started and learn how to safely manage analyzing large BigQuery datasets.
You may redistribute, rehost, republish, and mirror any of the GDELT datasets in any form. However, any use or redistribution of the data must include a citation to the GDELT Project and a link to the website (https://www.gdeltproject.org/).
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the US English Language Visual Speech Dataset! This dataset is a collection of diverse, single-person unscripted spoken videos supporting research in visual speech recognition, emotion detection, and multimodal communication.
This visual speech dataset contains 1000 videos in US English, each paired with a corresponding high-fidelity audio track. In each video, a participant answers a specific question in an unscripted, spontaneous manner.
Extensive guidelines were followed while recording each video to maintain quality and diversity.
The dataset provides comprehensive metadata for each video recording and participant:
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Using an innovative approach that combines geospatial science, remote sensing technology, and machine learning algorithms, LandScan Global is a global population distribution dataset at 30 arc seconds (roughly 1 km at the equator), representing an ambient (24-hour average) population. The LandScan Global algorithm, an R&D 100 Award Winner, uses spatial data, high-resolution imagery exploitation, and a multi-variable dasymetric modeling approach to disaggregate census counts within an administrative boundary. Since no single population distribution model can account for the differences in spatial data availability, quality, scale, and accuracy as well as the differences in cultural settlement practices, LandScan population distribution models are tailored to match the data conditions and geographical nature of each individual country and region. By modeling an ambient population, LandScan Global captures the full potential activity space of people throughout the course of the day and night rather than just a residential location.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘List of Top Data Breaches (2004 - 2021)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/hishaamarmghan/list-of-top-data-breaches-2004-2021 on 14 February 2022.
--- Dataset description provided by original source is as follows ---
This is a dataset containing all the major data breaches in the world from 2004 to 2021.
As we know, there is a big issue related to the privacy of our data. Many major companies in the world still to this day face this issue every single day. Even with a great team of people working on their security, many still suffer. In order to tackle this situation, it is only right that we must study this issue in great depth and therefore I pulled this data from Wikipedia to conduct data analysis. I would encourage others to take a look at this as well and find as many insights as possible.
This data contains 5 columns:
1. Entity: The name of the company, organization or institute
2. Year: The year in which the data breach took place
3. Records: How many records were compromised (can include information like email, passwords etc.)
4. Organization type: Which sector the organization belongs to
5. Method: Was it hacked? Were the files lost? Was it an inside job?
Here is the source for the dataset: https://en.wikipedia.org/wiki/List_of_data_breaches
Here is the GitHub link for a guide on how it was scraped: https://github.com/hishaamarmghan/Data-Breaches-Scraping-Cleaning
--- Original source retains full ownership of the source dataset ---
Turkey oak (Quercus cerris L.) is one of the ecologically and economically most important deciduous tree species in the Central and Southeast European regions. The species' distribution range covers hundreds of thousands of hectares, from the Apennine and Balkan Peninsulas through the Carpathian Basin to Asia Minor. Turkey oak has long been known to exhibit high levels of genetic and phenotypic variation. Recent predictions of the climate responses of this species suggest a significant extension of its distribution in Europe under climate change. Since Turkey oak is relatively drought-tolerant, it is regarded as a potential alternative to other forest tree species in forestry climate adaptation efforts, not only in its native regions but in Western Europe as well. For this reason, surveying the existing genetic variability, genetic resources and adaptability of this species is of great importance. Next-generation sequencing approaches, such as ddRAD-seq (double digest restriction-site associated DNA sequencing), allow high-resolution genome-wide single nucleotide polymorphisms (SNPs) to be obtained. Based on thousands of SNP markers, the genetic structure of populations and the genetic background of adaptation processes can be studied in far more depth than ever before. In this study, we provide highly variable genome-wide SNP data for Turkey oak for the first time. This dataset comprises the SNP data of 88 individuals from eight populations: two from Bulgaria, one from Kosovo and five from Hungary. The high-resolution genome-wide markers are suitable for inferring genetic diversity, differentiation and population structure, and for investigating selection and local adaptation. The dataset is accessible at: https://doi.org/10.5281/zenodo.7568727
By Nicky Forster [source]
The dataset contains data points such as the cumulative count of people who have received at least one dose of the vaccine, new doses administered on a specific date, cumulative count of doses distributed in the country, percentage of population that has completed the full vaccine series, cumulative count of Pfizer and Moderna vaccine doses administered in each state, seven-day rolling averages for new doses administered and distributed, among others.
It also provides insights into the vaccination status at both national and state levels. The dataset includes information on the percentage of population that has received at least one dose of the vaccine, percentage of population that has completed the full vaccine series, cumulative counts per 100k population for both distributed and administered doses.
Additionally, it presents data specific to each state, including their abbreviation and name. It outlines details such as cumulative counts per 100k population for both distributed and administered doses in each state. Furthermore, it indicates if there were instances where corrections resulted in single-day negative counts.
The dataset is compiled from daily snapshots obtained from CDC's COVID Data Tracker. Please note that there may be reporting delays by healthcare providers up to 72 hours after administering a dose.
This dataset can be used to track vaccination progress over time across different locations within the United States. It is suitable for researchers, policymakers, or anyone interested in analyzing trends in COVID-19 vaccination efforts at both national and state levels.
Familiarize Yourself with the Columns: Take a look at the available columns in this dataset to understand what information is included. These columns provide details such as state abbreviations, state names, dates of data snapshots, cumulative counts of doses distributed and administered, people who have received at least one dose or completed the vaccine series, percentages of population coverage, manufacturer-specific data, and seven-day rolling averages.
Explore Cumulative Counts: The dataset includes cumulative counts that show the total number of doses distributed or administered over time. You can analyze these numbers to track trends in vaccination progress in different states or regions.
Analyze Daily Counts: The dataset also provides daily counts of new vaccine doses distributed and administered on specific dates. By examining these numbers, you can gain insights into vaccination rates on a day-to-day basis.
Study Population Coverage Metrics: Metrics such as pct_population_received_at_least_one_dose and pct_population_series_complete give you an understanding of how much of each state's population has received at least one dose or completed their vaccine series respectively.
Utilize Manufacturer Data: The columns related to Pfizer and Moderna provide information about the number of doses administered for each manufacturer separately. By analyzing this data, you can compare vaccination rates between different vaccines.
Consider Rolling Averages: The seven-day rolling average columns allow you to smooth out fluctuations in daily counts by calculating an average over a week's time window. This can help identify long-term trends more accurately.
Compare States: You can compare vaccination progress between different states by filtering the dataset based on state names or abbreviations. This way, you can observe variations in distribution and administration rates among different regions.
Visualize the Data: Creating charts and graphs will help you visualize the data more effectively. Plotting trends over time or comparing different metrics for various states can provide powerful visual representations of vaccination progress.
Stay Informed: Keep in mind that this dataset is continuously updated as new data becomes available. Make sure to check for any updates or refreshed datasets to obtain the most recent information on COVID-19 vaccine distributions and administrations.
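As a rough sketch of how the cumulative counts, daily counts, and per-100k figures in the steps above relate to one another, the helpers below derive daily changes from a cumulative series, flag days where a correction produced a negative count, and normalise a count by population. The function names and the sample numbers are hypothetical and are not taken from the dataset.

```python
def daily_from_cumulative(cumulative):
    """Day-over-day change in a cumulative series (one value fewer than the input)."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def negative_days(daily):
    """Indices of days where a data correction produced a negative count."""
    return [i for i, value in enumerate(daily) if value < 0]


def per_100k(count, population):
    """Express a count per 100,000 residents."""
    return count * 100_000 / population


# Hypothetical cumulative doses administered on four consecutive snapshots.
cumulative = [100, 150, 140, 200]
daily = daily_from_cumulative(cumulative)
print(daily)                         # [50, -10, 60]
print(negative_days(daily))          # [1]
print(per_100k(cumulative[-1], 1_000_000))  # 20.0
```

Flagging negative days before plotting or averaging is one way to keep a single correction from being misread as a real drop in vaccinations.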
- Vaccination Analysis: This dataset can be used to analyze the progress of COVID-19 vaccinations in the United States. By examining the cumulative counts of doses distributed and administered, as well as the number of people who have received at least one dose or completed the vaccine series, researchers and policymakers can assess how effectively vaccines are being rolled out and monitor...
https://creativecommons.org/publicdomain/zero/1.0/
The data is collected from the OWID (Our World in Data) GitHub repository, which is updated on a daily basis.
This dataset contains only one file vaccinations.csv
, which contains the records of vaccination doses received by people from all the countries.
* location: name of the country (or region within a country).
* iso_code: ISO 3166-1 alpha-3 three-letter country code.
* date: date of the observation.
* total_vaccinations: total number of doses administered. Each dose is counted individually, so this may not equal the total number of people vaccinated, depending on the specific dose regime (e.g. people receive multiple doses). If a person receives one dose of the vaccine, this metric goes up by 1. If they receive a second dose, it goes up by 1 again.
* total_vaccinations_per_hundred: total_vaccinations per 100 people in the total population of the country.
* daily_vaccinations_raw: daily change in the total number of doses administered. It is only calculated for consecutive days. This is a raw measure provided for data checks and transparency, but we strongly recommend that any analysis on daily vaccination rates be conducted using daily_vaccinations instead.
* daily_vaccinations: new doses administered per day (7-day smoothed). For countries that don't report data on a daily basis, we assume that doses changed equally on a daily basis over any periods in which no data was reported. This produces a complete series of daily figures, which is then averaged over a rolling 7-day window.
* daily_vaccinations_per_million: daily_vaccinations per 1,000,000 people in the total population of the country.
* people_vaccinated: total number of people who received at least one vaccine dose. If a person receives the first dose of a 2-dose vaccine, this metric goes up by 1. If they receive the second dose, the metric stays the same.
* people_vaccinated_per_hundred: people_vaccinated per 100 people in the total population of the country.
* people_fully_vaccinated: total number of people who received all doses prescribed by the vaccination protocol. If a person receives the first dose of a 2-dose vaccine, this metric stays the same. If they receive the second dose, the metric goes up by 1.
* people_fully_vaccinated_per_hundred: people_fully_vaccinated per 100 people in the total population of the country.
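The smoothing described for daily_vaccinations can be sketched roughly as follows. This is only an illustration of the idea (fill reporting gaps by assuming an equal daily change, then average over a 7-day window), not OWID's actual code. The function names, the trailing window, and the handling of the first few days (averaging over however many days are available) are assumptions made for this sketch.

```python
from datetime import date, timedelta


def fill_daily_totals(observations):
    """Fill gaps between reported cumulative totals by linear interpolation.

    observations: list of (date, cumulative_doses), sorted by date with no
    repeated dates. Returns one (date, cumulative_doses) pair per calendar day.
    """
    filled = [observations[0]]
    for (d0, v0), (d1, v1) in zip(observations, observations[1:]):
        gap = (d1 - d0).days
        step = (v1 - v0) / gap  # doses assumed to change equally each day
        for k in range(1, gap + 1):
            filled.append((d0 + timedelta(days=k), v0 + step * k))
    return filled


def smoothed_daily_doses(observations, window=7):
    """Daily change of the filled series, averaged over a trailing window.

    Assumption: early days are averaged over the days available so far.
    """
    filled = fill_daily_totals(observations)
    changes = [(d1, v1 - v0) for (_, v0), (d1, v1) in zip(filled, filled[1:])]
    smoothed = []
    for i, (day, _) in enumerate(changes):
        recent = [change for _, change in changes[max(0, i - window + 1): i + 1]]
        smoothed.append((day, sum(recent) / len(recent)))
    return smoothed


# Hypothetical country that reported on 1, 4 and 5 January only.
reports = [(date(2021, 1, 1), 0), (date(2021, 1, 4), 300), (date(2021, 1, 5), 500)]
series = smoothed_daily_doses(reports)
```

With these made-up reports, the three unreported days each receive 100 doses, so the single 200-dose day is spread out rather than appearing as a spike.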
Note: for people_vaccinated and people_fully_vaccinated we are dependent on the necessary data being made available, so we may not be able to make these metrics available for some countries.
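To make the difference between the three counters concrete, here is a small sketch that tallies them from a list of dose events. The function name, the input format (one person identifier per administered dose) and the 2-dose protocol are assumptions for illustration. The real file only contains aggregated figures, not per-person records.

```python
def tally_vaccinations(dose_events, population, doses_required=2):
    """Compute the counters described above from individual dose events.

    dose_events: iterable of person identifiers, one entry per dose given.
    population: total population, used for the per-hundred figures.
    doses_required: doses prescribed by the (assumed) vaccination protocol.
    """
    doses_by_person = {}
    for person in dose_events:
        doses_by_person[person] = doses_by_person.get(person, 0) + 1

    metrics = {
        # every dose counts, so a second dose increments this again
        "total_vaccinations": sum(doses_by_person.values()),
        # each person counts once, however many doses they received
        "people_vaccinated": len(doses_by_person),
        # only people who completed the protocol
        "people_fully_vaccinated": sum(
            1 for n in doses_by_person.values() if n >= doses_required
        ),
    }
    for name in list(metrics):
        metrics[name + "_per_hundred"] = 100 * metrics[name] / population
    return metrics


# Hypothetical example: person "a" has two doses, "b" and "c" one each.
example = tally_vaccinations(["a", "b", "a", "c"], population=200)
```

In this example four doses were given to three people, only one of whom is fully vaccinated, which is why total_vaccinations and people_vaccinated differ.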
This data is collected by Our World in Data, which updates it daily on their GitHub.
Possible uses for this dataset could include: - Sentiment analysis in a variety of forms - Statistical analysis over time.
http://opendatacommons.org/licenses/dbcl/1.0/
The idea is to have a very simple time series dataset for experimenting with easy but effective visualizations on real data. It is amazing how much information a single graph can communicate concisely.
The dataset was downloaded from the National Centers for Environmental Information (NCEI). The data is in the public domain and can be used freely. To generate a similar dataset from another station, start from the Search Tool, select Daily Summaries and the time range of interest, choose to search for Cities, and enter the city you are looking for as the Search Term. Once you have selected a result, you need to add it to the Cart as if placing an order, but there is no charge for ordering data from Climate Data Online, as explained in their FAQs.
Thanks to the National Centers for Environmental Information for collecting meteorological data from many stations all over the world and making it available for free. If you use this dataset or generate a new one from NCEI, you need to cite the origin.
Mostly to see how many different effective visualizations can be generated from a very simple dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
India Census: Population: Age: 18 data was reported at 27,958,147.000 Person in 2011. This records an increase from the previous number of 27,686,902.000 Person for 2001. India Census: Population: Age: 18 data is updated yearly, averaging 27,686,902.000 Person from Mar 1991 (Median) to 2011, with 3 observations. The data reached an all-time high of 27,958,147.000 Person in 2011 and a record low of 23,656,856.000 Person in 1991. India Census: Population: Age: 18 data remains active status in CEIC and is reported by Census of India. The data is categorized under India Premium Database’s Demographic – Table IN.GAD002: Census: Population: by Single Age.