40 datasets found

World Bank: Education Data
kaggle.com
zip
Updated Mar 20, 2019
Cite
World Bank (2019). World Bank: Education Data [Dataset]. https://www.kaggle.com/datasets/theworldbank/world-bank-intl-education
Explore at:
Available download formats: zip(0 bytes)
Dataset updated
Mar 20, 2019
Dataset provided by
World Bank Group (http://www.worldbank.org/)
Authors
World Bank
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

The World Bank is an international financial institution that provides loans to countries of the world for capital projects. The World Bank's stated goal is the reduction of poverty. Source: https://en.wikipedia.org/wiki/World_Bank

Content

This dataset combines key education statistics from a variety of sources to provide a look at global literacy, spending, and access.

For more information, see the World Bank website.

Fork this kernel to get started with this dataset.

Acknowledgements

https://bigquery.cloud.google.com/dataset/bigquery-public-data:world_bank_health_population

http://data.worldbank.org/data-catalog/ed-stats

https://cloud.google.com/bigquery/public-data/world-bank-education

Citation: The World Bank: Education Statistics

Dataset Source: World Bank. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

Banner Photo by @till_indeman from Unsplash.

Inspiration

Of total government spending, what percentage is spent on education?
Death in the United States
kaggle.com
zip
Updated Aug 3, 2017
Cite
Centers for Disease Control and Prevention (2017). Death in the United States [Dataset]. https://www.kaggle.com/datasets/cdc/mortality
Explore at:
Available download formats: zip(766333584 bytes)
Dataset updated
Aug 3, 2017
Dataset authored and provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
United States
Description
Every year the CDC releases the country’s most detailed report on death in the United States under the National Vital Statistics Systems. This mortality dataset is a record of every death in the country for 2005 through 2015, including detailed information about causes of death and the demographic background of the deceased.

It's been said that "statistics are human beings with the tears wiped off." This is especially true with this dataset. Each death record represents somebody's loved one, often connected with a lifetime of memories and sometimes tragically too short.

Putting the sensitive nature of the topic aside, analyzing mortality data is essential to understanding the complex circumstances of death across the country. The US Government uses this data to determine life expectancy and understand how death in the U.S. differs from the rest of the world. Whether you’re looking for macro trends or analyzing unique circumstances, we challenge you to use this dataset to find your own answers to one of life’s great mysteries.

Overview

This dataset is a collection of CSV files each containing one year's worth of data and paired JSON files containing the code mappings, plus an ICD 10 code set. The CSVs were reformatted from their original fixed-width file formats using information extracted from the CDC's PDF manuals using this script. Please note that this process may have introduced errors as the text extracted from the pdf is not a perfect match. If you have any questions or find errors in the preparation process, please leave a note in the forums. We hope to publish additional years of data using this method soon.

A more detailed overview of the data can be found here. You'll find that the fields are consistent within this time window, but some of the data codes change every few years. For example, the 113_cause_recode entry 069 only covers ICD codes (I10,I12) in 2005, but by 2015 it covers (I10,I12,I15). When I post data from years prior to 2005, expect some of the fields themselves to change as well.

All data comes from the CDC’s National Vital Statistics Systems, with the exception of the Icd10Code set, which is sourced from the World Health Organization.
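The pairing of yearly CSVs with code mappings described in this overview could be used roughly as follows. This is only a sketch: the sample rows, the `sex` column and its labels, and the `{field: {code: label}}` layout (a Python dict standing in for a parsed JSON file) are assumptions for illustration, not the dataset's confirmed schema. The two labels for code 069 simply repeat the ICD ranges quoted above, which is why the mapping is looked up per year.

```python
import csv
import io

# Made-up sample standing in for one year's CSV file; real column names may differ.
SAMPLE_CSV = "sex,113_cause_recode\nM,069\nF,069\nM,999\n"

# Hypothetical per-year mappings, shaped like {field: {code: label}}.
# The two 069 labels follow the ICD ranges quoted in the overview text.
CODES_BY_YEAR = {
    2005: {"sex": {"M": "Male", "F": "Female"},
           "113_cause_recode": {"069": "ICD I10,I12"}},
    2015: {"sex": {"M": "Male", "F": "Female"},
           "113_cause_recode": {"069": "ICD I10,I12,I15"}},
}


def decode_rows(csv_text, codes):
    """Replace each coded value with its label; unknown codes pass through unchanged."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {field: codes.get(field, {}).get(value, value) for field, value in row.items()}
        for row in reader
    ]


# The same raw code decodes differently depending on the year's mapping.
rows_2005 = decode_rows(SAMPLE_CSV, CODES_BY_YEAR[2005])
rows_2015 = decode_rows(SAMPLE_CSV, CODES_BY_YEAR[2015])
print(rows_2005[0])
print(rows_2015[0])
```

Keeping one mapping per year avoids silently applying a later year's code definitions to earlier records.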

Project ideas

The CDC's mortality data was the basis of a widely publicized paper, by Anne Case and Nobel prize winner Angus Deaton, arguing that middle-aged whites are dying at elevated rates. One of the criticisms against the paper is that it failed to properly account for the exact ages within the broad bins available through the CDC's WONDER tool. What do these results look like with exact/not-binned age data?

Similarly, how sensitive are the mortality trends being discussed in the news to the choice of bin-widths?
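One way to start on the bin-width question is to count the same exact ages under different bin widths and compare. The ages below are made up, and `bin_ages` is a hypothetical helper, not part of the dataset or of WONDER.

```python
from collections import Counter


def bin_ages(ages, width):
    """Count ages per bin; each key is the lower bound of a bin of the given width."""
    return Counter((age // width) * width for age in ages)


# Made-up exact ages at death, for illustration only.
ages = [44, 45, 49, 50, 54, 55, 59]

five_year = bin_ages(ages, 5)
ten_year = bin_ages(ages, 10)

print(sorted(five_year.items()))
print(sorted(ten_year.items()))
```

With the wider bins, the 44-year-old and the 49-year-old land in the same group, so any shift in the age mix inside a bin is invisible in the binned counts.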

As noted above, the data preparation process could have introduced errors. Can you find any discrepancies compared to the aggregate metrics on WONDER? If so, please let me know in the forums!

WONDER is cited in numerous economics, sociology, and public health research papers. Can you find any papers whose conclusions would be altered if they used the exact data available here rather than binned data from Wonder?

Differences from the first version of the dataset

This version of the dataset was prepared in a completely different manner. This has allowed us to provide a much larger volume of data and ensure that codes are available for every field.

We've replaced the batch of SQL files with a single JSON per year. Kaggle's platform currently offers better support for JSON files, and this keeps the number of files manageable.

A tutorial kernel providing a quick introduction to the new format is available here.

Lastly, I apologize if the transition has interrupted anyone's work! If need be, you can still download v1.
Mass Killings in America, 2006 - present
data.world
csv, zip
Updated Oct 22, 2025
Cite
The Associated Press (2025). Mass Killings in America, 2006 - present [Dataset]. https://data.world/associatedpress/mass-killings-public
Explore at:
Available download formats: zip, csv
Dataset updated
Oct 22, 2025
Authors
The Associated Press
Time period covered
Jan 1, 2006 - Oct 12, 2025
Area covered

Description
THIS DATASET WAS LAST UPDATED AT 2:12 AM EASTERN ON OCT. 22

OVERVIEW

2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.

In all, there were 45 mass killings, defined as when four or more people are killed excluding the perpetrator. Of those, 33 were mass shootings. This summer was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.

A total of 229 people died in mass killings in 2019.

The AP's analysis found that more than 50% of the incidents were family annihilations, which is similar to prior years. Although they are far less common, the 9 public mass shootings during the year were the most deadly type of mass murder, resulting in 73 people's deaths, not including the assailants.

One-third of the offenders died at the scene of the killing or soon after, half from suicides.

About this Dataset

The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.

The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.

This data will be updated periodically and can be used as an ongoing resource to help cover these events.

Using this Dataset

To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:

Mass killings by year

Mass shootings by year

To get these counts just for your state:

Filter killings by state
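The linked queries are not reproduced on this page. As a rough equivalent, the three counts could be computed from an exported CSV along these lines. The rows are made up, and the field names `date`, `state` and `type` are assumptions rather than the database's actual column names.

```python
from collections import Counter

# Made-up rows; the keys "date", "state" and "type" are assumed names.
INCIDENTS = [
    {"date": "2018-02-10", "state": "CA", "type": "Shooting"},
    {"date": "2019-05-01", "state": "TX", "type": "Shooting"},
    {"date": "2019-07-20", "state": "TX", "type": "Other"},
    {"date": "2019-11-30", "state": "OH", "type": "Shooting"},
]


def count_by_year(rows, incident_type=None, state=None):
    """Count incidents per calendar year, optionally filtered by type and state."""
    counts = Counter()
    for row in rows:
        if incident_type is not None and row["type"] != incident_type:
            continue
        if state is not None and row["state"] != state:
            continue
        counts[int(row["date"][:4])] += 1  # year is the first four characters of an ISO date
    return dict(sorted(counts.items()))


killings_by_year = count_by_year(INCIDENTS)
shootings_by_year = count_by_year(INCIDENTS, incident_type="Shooting")
texas_by_year = count_by_year(INCIDENTS, state="TX")
print(killings_by_year, shootings_by_year, texas_by_year)
```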

Definition of "mass murder"

Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.

This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”

Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.
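The core of this definition can be written as a small predicate. The sketch below is a simplification under stated assumptions: the `Death` record and its fields are hypothetical, the 24-hour period is measured from the first killing, and the geographic restriction and the seven-day spree rule for victim counts are left out.

```python
from dataclasses import dataclass


@dataclass
class Death:
    hours_after_first: float  # hours elapsed since the first killing (assumed field)
    is_offender: bool = False
    is_unborn: bool = False


def meets_definition(deaths, intentional=True, min_victims=4, window_hours=24):
    """Return True if at least min_victims qualifying victims died within the window."""
    if not intentional:
        # Negligent homicides are excluded for lack of offender intent.
        return False
    qualifying = [
        d for d in deaths
        if not d.is_offender and not d.is_unborn and d.hours_after_first <= window_hours
    ]
    return len(qualifying) >= min_victims


four_victims = [Death(0), Death(0.5), Death(1), Death(2)]
three_plus_offender = [Death(0), Death(0), Death(1), Death(1, is_offender=True)]
spread_out = [Death(0), Death(1), Death(2), Death(30)]

print(meets_definition(four_victims))
print(meets_definition(three_plus_offender))
print(meets_definition(spread_out))
```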

Methodology

Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.

Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.

In case(s) of contradiction among sources, official law enforcement or court records were used, when available, followed by the most recent media or academic source.

Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.

Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.

This project started at USA TODAY in 2012.

Contacts

Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.
International Data Base
dknet.org
neuinfo.org
+2more
Updated Jan 29, 2022
+ more versions
Cite
(2022). International Data Base [Dataset]. http://identifiers.org/RRID:SCR_013139
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_013139
Dataset updated
Jan 29, 2022
Description
A computerized data set of demographic, economic and social data for 227 countries of the world. Information presented includes population, health, nutrition, mortality, fertility, family planning and contraceptive use, literacy, housing, and economic activity data. Tabular data are broken down by such variables as age, sex, and urban/rural residence. Data are organized as a series of statistical tables identified by country and table number. Each record consists of the data values associated with a single row of a given table. There are 105 tables with data for 208 countries. The second file is a note file, containing text of notes associated with various tables. These notes provide information such as definitions of categories (i.e. urban/rural) and how various values were calculated.

The IDB was created in the U.S. Census Bureau's International Programs Center (IPC) to help IPC staff meet the needs of organizations that sponsor IPC research. The IDB provides quick access to specialized information, with emphasis on demographic measures, for individual countries or groups of countries. The IDB combines data from country sources (typically censuses and surveys) with IPC estimates and projections to provide information dating back as far as 1950 and as far ahead as 2050. Because the IDB is maintained as a research tool for IPC sponsor requirements, the amount of information available may vary by country. As funding and research activity permit, the IPC updates and expands the data base content.

Types of data include:
* Population by age and sex
* Vital rates, infant mortality, and life tables
* Fertility and child survivorship
* Migration
* Marital status
* Family planning

Data characteristics:
* Temporal: Selected years, 1950-present, projected demographic data to 2050.
* Spatial: 227 countries and areas.
* Resolution: National population, selected data by urban/rural residence, selected data by age and sex.

Sources of data include:
* U.S. Census Bureau
* International projects (e.g., the Demographic and Health Survey)
* United Nations agencies

Links:
* ICPSR: http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/08490
Social Insurance Programs in Richest Quintile
kaggle.com
Updated Jan 7, 2023
Cite
The Devastator (2023). Social Insurance Programs in Richest Quintile [Dataset]. https://www.kaggle.com/datasets/thedevastator/coverage-of-social-insurance-programs-in-richest
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jan 7, 2023
Dataset provided by
Kaggle
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Coverage of Social Insurance Programs in Richest Quintile

Percent of Population Eligible

By data.world's Admin [source]

About this dataset

This dataset offers a unique insight into the coverage of social insurance programs for the wealthiest quintile of populations around the world. It reveals how many individuals in each country are receiving support from old age contributory pensions, disability benefits, and social security and health insurance benefits such as occupational injury benefits, paid sick leave, maternity leave, and more. This data provides a resource for understanding the health and well-being of those most financially privileged in society, who often have greater impact on decision making than other groups. With figures dated 2019-05-11, this dataset can help uncover where there is work to be done for improved healthcare provision in each country across the world.

How to use the dataset

Understand the context: Before you begin analyzing this dataset, it is important to understand the information that it provides. Take some time to read the description of what is included in the dataset, including a clear understanding of the definitions and scope of coverage provided with each data point.

Examine the data: Once you have a general understanding of this dataset's contents, take some time to explore its contents in more depth. What specific questions does this dataset help answer? What kind of insights does it provide? Are there any missing pieces?

Clean & Prepare Data: After you've preliminarily examined its content, start preparing your data for further analysis and visualization. Clean up any formatting issues or irregularities present in your data set by correcting typos and eliminating unnecessary rows or columns before working with your chosen programming language (I prefer R for data manipulation tasks). Additionally, consider performing necessary transformations such as sorting or averaging values if appropriate for the findings you wish to draw from your analysis.

Visualize Results: Once you've cleaned and prepared your data, use visualizations such as charts, graphs or tables to reveal patterns that support specific conclusions about how insurance coverage under social programs varies among different groups within society's quintiles (based on age groups, etc.). This type of visualization allows those who aren't familiar with programming to process complex information more quickly and accurately than when it is displayed numerically in tabular form only!

Final Analysis & Export Results: Finally, export your visuals into presentation-ready formats (e.g., PDFs) which can be shared with colleagues! Additionally, use these results as part of a narrative conclusion report providing an accurate assessment and meaningful interpretation of how social insurance programs vary between different members within society's quintiles (i.e., richest vs. poorest), along with potential policy implications relevant for implementing effective strategies that improve access accordingly!

Research Ideas

Analyzing the effectiveness of social insurance programs by comparing the coverage levels across different geographic areas or socio-economic groups;

Estimating the economic impact of social insurance programs on local and national economies by tracking spending levels and revenues generated;

Identifying potential problems with access to social insurance benefits, such as racial or gender disparities in benefit coverage

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: coverage-of-social-insurance-programs-in-richest-quintile-of-population-1.csv

Acknowledgements

If you use this dataset in your research, please credit the original authors and data.world's Admin.
Statewide Death Profiles
data.chhs.ca.gov
data.ca.gov
+3more
csv, zip
Updated Oct 2, 2025
Cite
California Department of Public Health (2025). Statewide Death Profiles [Dataset]. https://data.chhs.ca.gov/dataset/statewide-death-profiles
Explore at:
Available download formats: csv(164006), csv(200270), csv(2026589), csv(5401561), csv(463460), csv(5034), csv(16301), csv(4689434), csv(419332), zip, csv(429224)
Dataset updated
Oct 2, 2025
Dataset authored and provided by
California Department of Public Health (https://www.cdph.ca.gov/)
Description
This dataset contains counts of deaths for California as a whole based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.

The final data tables include both deaths that occurred in California regardless of the place of residence (by occurrence) and deaths to California residents (by residence), whereas the provisional data table only includes deaths that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.
ARCHIVED: COVID-19 Cases by Population Characteristics Over Time
data.sfgov.org
healthdata.gov
+1more
csv, xlsx, xml
Updated Sep 11, 2023
+ more versions
Cite
(2023). ARCHIVED: COVID-19 Cases by Population Characteristics Over Time [Dataset]. https://data.sfgov.org/Health-and-Social-Services/ARCHIVED-COVID-19-Cases-by-Population-Characterist/j7i3-u9ke
Explore at:
Available download formats: xlsx, xml, csv
Dataset updated
Sep 11, 2023
License
ODC Public Domain Dedication and Licence (PDDL) v1.0 (http://www.opendatacommons.org/licenses/pddl/1.0/)
License information was derived automatically
Description
A. SUMMARY This archived dataset includes data for population characteristics that are no longer being reported publicly. The date on which each population characteristic type was archived can be found in the field “data_loaded_at”.

B. HOW THE DATASET IS CREATED Data on the population characteristics of COVID-19 cases are from:
* Case interviews
* Laboratories
* Medical providers

These multiple streams of data are merged, deduplicated, and undergo data verification processes.

Race/ethnicity
* We include all race/ethnicity categories that are collected for COVID-19 cases.
* The population estimates for the "Other" or “Multi-racial” groups should be considered with caution. The Census definition is likely not exactly aligned with how the City collects this data. For that reason, we do not recommend calculating population rates for these groups.

Gender
* The City collects information on gender identity using these guidelines.

Skilled Nursing Facility (SNF) occupancy
* A Skilled Nursing Facility (SNF) is a type of long-term care facility that provides care to individuals, generally in their 60s and older, who need functional assistance in their daily lives.
* This dataset includes data for COVID-19 cases reported in Skilled Nursing Facilities (SNFs) through 12/31/2022, archived on 1/5/2023. These data were identified where “Characteristic_Type” = ‘Skilled Nursing Facility Occupancy’.

Sexual orientation
* The City began asking adults 18 years old or older for their sexual orientation identification during case interviews as of April 28, 2020. Sexual orientation data prior to this date is unavailable.
* The City doesn’t collect or report information about sexual orientation for persons under 12 years of age.
* Case investigation interviews transitioned to the California Department of Public Health, Virtual Assistant information gathering beginning December 2021. The Virtual Assistant is only sent to adults who are 18+ years old.
* Learn more about our data collection guidelines pertaining to sexual orientation: https://www.sfdph.org/dph/files/PoliciesProcedures/COM9_SexualOrientationGuidelines.pdf

Comorbidities
* Underlying conditions are reported when a person has one or more underlying health conditions at the time of diagnosis or death.

Homelessness
Persons are identified as homeless based on several data sources:
* self-reported living situation
* the location at the time of testing
* Department of Public Health homelessness and health databases

Residents in Single-Room Occupancy hotels are not included in these figures. These methods serve as an estimate of persons experiencing homelessness. They may not meet other homelessness definitions.

Single Room Occupancy (SRO) tenancy
* SRO buildings are defined by the San Francisco Housing Code as having six or more "residential guest rooms" which may be attached to shared bathrooms, kitchens, and living spaces.
* The details of a person's living arrangements are verified during case interviews.

Transmission Type
* Information on transmission of COVID-19 is based on case interviews with individuals who have a confirmed positive test. Individuals are asked if they have been in close contact with a known COVID-19 case. If they answer yes, transmission category is recorded as contact with a known case. If they report no contact with a known case, transmission category is recorded as community transmission. If the case is not interviewed or was not asked the question, they are counted as unknown.
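The transmission-type rule described above amounts to a three-way decision. A minimal sketch follows; the function name, its parameters and the exact label strings are illustrative assumptions, not the City's actual coding.

```python
from typing import Optional


def transmission_category(interviewed: bool, contact_with_known_case: Optional[bool]) -> str:
    """Classify a confirmed case; None means the contact question was not asked."""
    if not interviewed or contact_with_known_case is None:
        return "unknown"
    if contact_with_known_case:
        return "contact with a known case"
    return "community transmission"


print(transmission_category(True, True))
print(transmission_category(True, False))
print(transmission_category(False, None))
```

Note that "unknown" here reflects missing interview data, not an inconclusive answer.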

C. UPDATE PROCESS This dataset has been archived and will no longer update as of 9/11/2023.

D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

This dataset includes many different types of characteristics. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of cases on each date.

New cases are the count of cases within that characteristic group where the positive tests were collected on that specific specimen collection date. Cumulative cases are the running total of all San Francisco cases in that characteristic group up to the specimen collection date listed.
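The relationship between new and cumulative cases can be illustrated by filtering to one characteristic group and taking a running total by specimen collection date. The rows are made up, and the lowercase key names are assumed stand-ins for the dataset's columns.

```python
from itertools import accumulate

# Made-up rows; key names are assumptions modeled on the column descriptions.
ROWS = [
    {"characteristic_type": "Age Group", "characteristic_group": "0-4",
     "specimen_collection_date": "2020-03-02", "new_cases": 1},
    {"characteristic_type": "Age Group", "characteristic_group": "0-4",
     "specimen_collection_date": "2020-03-01", "new_cases": 2},
    {"characteristic_type": "Age Group", "characteristic_group": "5-11",
     "specimen_collection_date": "2020-03-01", "new_cases": 5},
    {"characteristic_type": "Gender", "characteristic_group": "Female",
     "specimen_collection_date": "2020-03-01", "new_cases": 4},
    {"characteristic_type": "Age Group", "characteristic_group": "0-4",
     "specimen_collection_date": "2020-03-03", "new_cases": 3},
]


def cumulative_cases(rows, characteristic_type, characteristic_group):
    """Return (date, new_cases, cumulative_cases) tuples for one group, oldest first."""
    selected = sorted(
        (r for r in rows
         if r["characteristic_type"] == characteristic_type
         and r["characteristic_group"] == characteristic_group),
        key=lambda r: r["specimen_collection_date"],  # ISO dates sort correctly as strings
    )
    totals = accumulate(r["new_cases"] for r in selected)
    return [(r["specimen_collection_date"], r["new_cases"], t)
            for r, t in zip(selected, totals)]


series = cumulative_cases(ROWS, "Age Group", "0-4")
for row in series:
    print(row)
```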

This data may not be immediately available for recently reported cases. Data updates as more information becomes available.

To explore data on the total number of cases, use the ARCHIVED: COVID-19 Cases Over Time dataset.

E. CHANGE LOG
9/11/2023 - data on COVID-19 cases by population characteristics over time are no longer being updated. The date on which each population characteristic type was archived can be found in the field “data_loaded_at”.
6/6/2023 - data on cases by transmission type have been removed. See section ARCHIVED DATA for more detail.
5/16/2023 - data on cases by sexual orientation, comorbidities, homelessness, and single room occupancy have been removed. See section ARCHIVED DATA for more detail.
4/6/2023 - the State implemented system updates to improve the integrity of historical data.
2/21/2023 - system updates to improve reliability and accuracy of cases data were implemented.
1/31/2023 - updated “population_estimate” column to reflect the 2020 Census Bureau American Community Survey (ACS) San Francisco Population estimates.
1/5/2023 - data on SNF cases removed. See section ARCHIVED DATA for more detail.
3/23/2022 - ‘Native American’ changed to ‘American Indian or Alaska Native’ to align with the census.
1/22/2022 - system updates to improve timeliness and accuracy of cases and deaths data were implemented.
7/15/2022 - reinfections added to cases dataset. See section SUMMARY for more information on how reinfections are identified.
Crash Data
catalog.data.gov
data.townofcary.org
+1more
Updated Oct 18, 2025
Cite
Cary (2025). Crash Data [Dataset]. https://catalog.data.gov/dataset/crash-data
Explore at:
Dataset updated
Oct 18, 2025
Dataset provided by
Cary
Description
This dataset contains crash information from the last five years to the current date. The data is based on the National Incident Based Reporting System (NIBRS). The data is dynamic, allowing for additions, deletions and modifications at any time, resulting in more accurate information in the database. Due to ongoing and continuous data entry, the numbers of records in subsequent extractions are subject to change.

About Crash Data

The Cary Police Department strives to make crash data as accurate as possible, but there is no avoiding the introduction of errors into this process, which relies on data furnished by many people and that cannot always be verified. As the data is updated on this site there will be instances of adding new incidents and updating existing data with information gathered through the investigative process.

Not surprisingly, crash data becomes more accurate over time, as new crashes are reported and more information comes to light during investigations.

This dynamic nature of crash data means that content provided here today will probably differ from content provided a week from now. Likewise, content provided on this site will probably differ somewhat from crime statistics published elsewhere by the Town of Cary, even though they draw from the same database.

About Crash Locations

Crash locations reflect the approximate locations of the crash. Certain crashes may not appear on maps if there is insufficient detail to establish a specific, mappable location.
Data for: The Pandemic Journaling Project, Phase One (PJP-1)
data.qdr.syr.edu
3gp +22
Updated Feb 15, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sarah S. Willen; Katherine A. Mason (2024). Data for: The Pandemic Journaling Project, Phase One (PJP-1) [Dataset]. http://doi.org/10.5064/F6PXS9ZK
Explore at:
Available download formats: jpeg, png, gif, tiff, webp, mp4a, mpga, wav, audio/vnd.dlna.adts, mp4, qt, webm, ogv, 3gp, wmv, html, pdf, docx, xlsx, txt, tsv, zip, bin
Unique identifier
https://doi.org/10.5064/F6PXS9ZK
Dataset updated
Feb 15, 2024
Dataset provided by
Qualitative Data Repository
Authors
Sarah S. Willen; Katherine A. Mason
License
https://qdr.syr.edu/policies/qdr-restricted-access-conditions
Time period covered
May 29, 2020 - May 31, 2022
Area covered
Europe, Mexico, United States, Central America, Canada
Description
Project Summary

This dataset contains all qualitative and quantitative data collected in the first phase of the Pandemic Journaling Project (PJP). PJP is a combined journaling platform and interdisciplinary, mixed-methods research study developed by two anthropologists, with support from a team of colleagues and students across the social sciences, humanities, and health fields. PJP launched in Spring 2020 as the COVID-19 pandemic was emerging in the United States. PJP was created in order to “pre-design an archive” of COVID-19 narratives and experiences open to anyone around the world. The project is rooted in a commitment to democratizing knowledge production, in the spirit of “archival activism” and using methods of “grassroots collaborative ethnography” (Willen et al. 2022; Wurtz et al. 2022; Zhang et al. 2020; see also Carney 2021). The motto on the PJP website encapsulates these commitments: “Usually, history is written only by the powerful. When the history of COVID-19 is written, let’s make sure that doesn’t happen.” (A version of this Project Summary with links to the PJP website and other relevant sites is included in the public documentation of the project at QDR.)

In PJP’s first phase (PJP-1), the project provided a digital space where participants could create weekly journals of their COVID-19 experiences using a smartphone or computer. The platform was designed to be accessible to as wide a range of potential participants as possible. Anyone aged 15 or older, living anywhere in the world, could create journal entries using their choice of text, images, and/or audio recordings. The interface was accessible in English and Spanish, but participants could submit text and audio in any language. PJP-1 ran on a weekly basis from May 2020 to May 2022.

Data Overview

This Qualitative Data Repository (QDR) project contains all journal entries and closed-ended survey responses submitted during PJP-1, along with accompanying descriptive and explanatory materials. The dataset includes individual journal entries and accompanying quantitative survey responses from more than 1,800 participants in 55 countries. Of nearly 27,000 journal entries in total, over 2,700 included images and over 300 are audio files. All data were collected via the Qualtrics survey platform. PJP-1 was approved as a research study by the Institutional Review Board (IRB) at the University of Connecticut.

Participants were introduced to the project in a variety of ways, including through the PJP website as well as professional networks, PJP’s social media accounts (on Facebook, Instagram, and Twitter), and media coverage of the project. Participants provided a single piece of contact information (an email address or mobile phone number), which was used to distribute weekly invitations to participate. This contact information has been stripped from the dataset and will not be accessible to researchers.

PJP uses a mixed-methods research approach and a dynamic cohort design. After enrolling in PJP-1 via the project’s website, participants received weekly invitations to contribute to their journals via their choice of email or SMS (text message). Each weekly invitation included a link to that week’s journaling prompts and accompanying survey questions. Participants could join at any point, and they could stop participating at any point as well. They also could stop participating and later restart. Retention was encouraged with a monthly raffle of three $100 gift cards. All individuals who had contributed that month were eligible.

Regardless of when they joined, all participants received the project’s narrative prompts and accompanying survey questions in the same order. In Week 1, before contributing their first journal entries, participants were presented with a baseline survey that collected demographic information, including political leanings, as well as self-reported data about COVID-19 exposure and physical and mental health status. Some of these survey questions were repeated at periodic intervals in subsequent weeks, providing quantitative measures of change over time that can be analyzed in conjunction with participants' qualitative entries. Surveys employed validated questions where possible.

The core of PJP-1 involved two weekly opportunities for participants to create journal entries in the format of their choice (text, image, and/or audio). Each week, journalers received a link with an invitation to create one entry in response to a recurring narrative prompt (“How has the COVID-19 pandemic affected your life in the past week?”) and a second journal entry in response to their choice of two more tightly focused prompts. Typically the pair of prompts included one focusing on subjective experience (e.g., the impact of the pandemic on relationships, sense of social connectedness, or mental health) and another with an external focus (e.g., key sources of scientific information, trust in government, or COVID-19’s economic impact). Each week,...
Metadatabase of available data on drivers, pressures, biodiversity,...
zenodo.org
bin
Updated Oct 30, 2024
Cite
Laetitia M. Navarro (2024). Metadatabase of available data on drivers, pressures, biodiversity, ecosystem services and conservation actions [Dataset]. http://doi.org/10.5281/zenodo.14008205
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.14008205
Dataset updated
Oct 30, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Laetitia M. Navarro
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We identified and documented 137 datasets and databases on European biodiversity, ecosystem services, the drivers and pressures affecting them, and the mechanisms put in place to address these. These datasets represent nearly 2000 variables and metrics that can be used directly by researchers, land managers and decision-makers, for example for spatial planning in conservation, or be further integrated into biodiversity and ecosystem services models.

This metadatabase and associated tables support Deliverable 3.1 of the NaturaConnect Horizon Europe project (D3.1 Report and data on the biodiversity, protected areas and environmental and socioeconomic data available for the project. Including data gap analysis).

Content

1. Typology.xlsx - Table presenting the typology used to classify and document the datasets and databases within the metadatabase. The typology used to classify those datasets and the variables and metrics within them is built on the DPSIR framework (Drivers, Pressures, State, Impact, Response), the Threats Classification Scheme (version 3.3) of the International Union for Conservation of Nature (IUCN), as well as the Essential Biodiversity Variables and Essential Ecosystem Services Variables frameworks (EBVs and EESVs respectively).

2. MetaDatabase.xlsx - MetaDatabase documenting the datasets and databases identified in the context of the NaturaConnect project. This metadatabase documents for each dataset or database:

General information on each entry, that is, its name and the corresponding component of the data typology (for instance, whether the data concerns biodiversity or pressures on biodiversity). This section also documents the type of information or metrics contained in the entry and their unit, as well as the realm (Terrestrial or Freshwater) covered by the data. In many cases, an entry contains data on more than one variable or product, in which case we labelled it as “multiple” in the general information and listed all individual metrics and their units in a separate table.

Biological information: if the entry relates to data on biodiversity or ecosystem services, this section is used to inform about the biological entity and taxonomic resolution of the data (e.g. species), the coverage of the biological entity (e.g. amphibians), and the coverage of Essential Variables (EBV or EESV – e.g. species traits).

Non-biological information: for entries that provide data on drivers, pressures or responses, we document the entity (e.g. type of pressure) and the coverage or scope of the entity.

Temporal information: we describe the temporal extent of each entry and their temporal resolution for those that are repeated measurements in time.

Spatial information: for the entries that are spatially explicit, this section of the metadatabase documents the spatial scope (e.g. global, national), the spatial extent (e.g. EU28, Spain), and the spatial resolution of the data.

Method: for each entry, we document whether the data is modelled, interpreted or raw, as well as the dependencies with other datasets. Specifically, we identify if the data is also shared or used in another dataset (either documented in the metadatabase or not).

Accessibility: this last part of the metadatabase documents the links to (and references of) the data, and, when appropriate, the scientific publication accompanying them. We also keep track of the curator and contact person as well as the last update of the entry. This section is also used to document the data format (e.g. NetCDF, csv), licensing and whether the data can be accessed via an Application Programming Interface (API) or other tool.

3. DetailedMetrics.xlsx - Table containing all the metrics and variables from the datasets documented in the metadatabase. The metrics are mapped to the data typology, and when appropriate to the corresponding Essential Biodiversity Variable or Essential Ecosystem Service Variable. This table documents the name of the metric, or field, as given in the source material, its type (e.g. number, categorical, characters) and when appropriate, its unit. When the information is provided in the source material, the table also contains a definition of the metric as well as the different options given in the case of categorical data.
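
The entry structure described above can be pictured as a record type. The following is a minimal Python sketch, not the project's actual schema: every field name is hypothetical, chosen only to mirror the sections listed for MetaDatabase.xlsx, and both sample entries are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MetadatabaseEntry:
    """One dataset or database, mirroring the sections of MetaDatabase.xlsx.

    All field names are hypothetical; the real column headers may differ.
    """
    name: str                                    # general information
    typology_component: str                      # e.g. "biodiversity", "pressure"
    realm: str                                   # "Terrestrial" or "Freshwater"
    metrics: List[str] = field(default_factory=list)
    taxonomic_resolution: Optional[str] = None   # biological information
    temporal_extent: Optional[str] = None        # temporal information
    spatial_scope: Optional[str] = None          # spatial information
    data_origin: Optional[str] = None            # "modelled", "interpreted", "raw"
    data_format: Optional[str] = None            # accessibility
    has_api: bool = False


def select_entries(entries, component, realm):
    """Return the entries matching a typology component and a realm."""
    return [e for e in entries
            if e.typology_component == component and e.realm == realm]


# Invented sample entries, for illustration only.
entries = [
    MetadatabaseEntry("Example amphibian atlas", "biodiversity", "Freshwater",
                      metrics=["species occurrence"],
                      taxonomic_resolution="species",
                      spatial_scope="national", data_origin="raw",
                      data_format="csv"),
    MetadatabaseEntry("Example land-use pressure map", "pressure", "Terrestrial",
                      metrics=["multiple"], spatial_scope="global",
                      data_origin="modelled", data_format="NetCDF",
                      has_api=True),
]

freshwater_biodiversity = select_entries(entries, "biodiversity", "Freshwater")
print([e.name for e in freshwater_biodiversity])
```

Keeping the optional sections as `None` by default reflects that, per the description, biological information applies only to biodiversity and ecosystem-service entries and spatial information only to spatially explicit ones.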

Method - Databases and Datasets identification

The entries of the metadatabase were identified through three main approaches.

First, a list of online catalogues and repositories was produced and scoped for relevant datasets or databases: European Environment Agency Datahub, European Environment Agency EIONET Central Data Repository, COPERNICUS Land Monitoring Service, Essential Biodiversity Variables - EBV data portal of the Group on Earth Observations Biodiversity Observation Network, Open Traits Network Catalogue, Open Environmental Data Cube Europe, NASA’s Earth Data, NASA’s SEDAC (Socioeconomic Data and Applications Center), EUR-Lex (access to European Union Law), JRC - ESDAC (European Soil Data Center), Database of European Vegetation Habitats and Flora, ESA (European Space Agency) Climate Office.

Second, a survey was sent out to all NaturaConnect consortium members in the third quarter of 2022 to identify both their needs and uses of data across the data typology. This allowed us to identify (and document) additional datasets either used or produced within the consortium.

Lastly, the research team occasionally added scientific publications of large-scale datasets, although it is important to highlight that these additions do not result from a systematic survey of the literature.
Movies Performance and Feature Statistics
kaggle.com
Updated Jan 16, 2023
Cite
The Devastator (2023). Movies Performance and Feature Statistics [Dataset]. https://www.kaggle.com/datasets/thedevastator/movies-performance-and-feature-statistics
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jan 16, 2023
Dataset provided by
Kaggle
Authors
The Devastator
Description
Movies Performance and Feature Statistics

Analyzing Box Office Performance, Rating and Audience Reactions

By Yashwanth Sharaff [source]

About this dataset

This dataset contains key characteristics of a variety of movies: basic information such as each movie's title, budget, MPAA rating, release date, genre, runtime and summary, as well as performance indicators such as gross revenue and rating count. With this dataset we can better understand the film industry and explore how different features and performance metrics relate to one another and to a movie's success. It can also help identify which features are key indicators of a high-grossing feature film.

How to use the dataset

To get the most out of this dataset, you need to understand what each column represents. The 'Title' column gives the title of the movie, which can be used to look the film up on streaming services and movie-information websites. The 'MPAA Rating' column lists the movie's Motion Picture Association (MPAA) rating: G (General Audiences), PG (Parental Guidance Suggested), PG-13 (Parents Strongly Cautioned), R (Under 17 Requires Accompanying Parent or Guardian), and so on. The 'Budget' column gives an approximate idea of how much the production cost, the 'Gross' column gives its earnings from theatrical release, and the 'Release Date' column gives when the film was, or will be, released. The 'Genre' and 'Runtime' columns give the type of movie and its length, and 'Rating Count' is the number of people who have rated or reviewed it. Finally, the 'Summary' column gives a brief overview of the film, which is worth reading before watching, especially with children.

So go ahead - start exploring this interesting dataset today!

Research Ideas

Creating a box office prediction model using budget, genre, release date and MPAA rating

Using the summary data to create a sentiment analysis tool for movie reviews

Building a recommendation engine for users based on their prior ratings and what other users with similar tastes have rated as highly

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

See the dataset description for more information.

Columns

File: movies.csv

| Column name | Description |
|:-------------|:-------------------------------------------------------------------------------|
| Title | The title of the movie. (String) |
| MPAA Rating | The Motion Picture Association of America (MPAA) rating of the movie. (String) |
| Budget | The budget of the movie in US dollars. (Integer) |
| Gross | The gross revenue of the movie in US dollars. (Integer) |
| Release Date | The date the movie was released. (Date) |
| Genre | The genre of the movie. (String) |
| Runtime | The length of the movie in minutes. (Integer) |
| Rating Count | The number of ratings the movie has received. (Integer) |
| Summary | A brief summary of the movie. (String) |
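
As a rough illustration of working with these columns, the sketch below parses rows with the listed column names and computes a gross-to-budget ratio per movie. It uses only the Python standard library. The two sample rows are invented, the date format is an assumption, and the helper names (`load_movies`, `gross_to_budget`) are hypothetical.

```python
import csv
import io

# Invented sample rows; only the column names come from the column list above.
SAMPLE_CSV = """Title,MPAA Rating,Budget,Gross,Release Date,Genre,Runtime,Rating Count,Summary
Example Film A,PG-13,1000000,3000000,2001-05-04,Action,120,1500,A placeholder summary.
Example Film B,R,2000000,1000000,2003-10-17,Drama,95,800,Another placeholder summary.
"""


def load_movies(handle):
    """Parse movie rows, converting the integer columns to int."""
    movies = []
    for row in csv.DictReader(handle):
        for column in ("Budget", "Gross", "Runtime", "Rating Count"):
            row[column] = int(row[column])
        movies.append(row)
    return movies


def gross_to_budget(movie):
    """Return gross revenue divided by budget, or None if the budget is zero."""
    if movie["Budget"] == 0:
        return None
    return movie["Gross"] / movie["Budget"]


movies = load_movies(io.StringIO(SAMPLE_CSV))
ratios = {m["Title"]: gross_to_budget(m) for m in movies}
print(ratios)
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened `movies.csv`. Rows with blank or non-numeric values would need extra handling that this sketch omits.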

Acknowledgements

If you use this dataset in your research, please credit Yashwanth Sharaff.
Real-time Covid 19 Data
kaggle.com
zip
Updated Aug 11, 2020
Cite
Gaurav Dutta (2020). Real-time Covid 19 Data [Dataset]. https://www.kaggle.com/gauravduttakiit/covid-19
Explore at:
Available download formats: zip (5221838 bytes)
Dataset updated
Aug 11, 2020
Authors
Gaurav Dutta
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Coronavirus disease 2019 (COVID-19) time series listing confirmed cases, reported deaths and reported recoveries. Data is disaggregated by country (and sometimes subregion). COVID-19 is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and has had a worldwide effect. On March 11, 2020, the World Health Organization (WHO) declared it a pandemic, pointing to the more than 118,000 cases of the illness in over 110 countries and territories around the world at the time.

This dataset includes time series data tracking the number of people affected by COVID-19 worldwide, including:

- confirmed tested cases of Coronavirus infection
- the number of people who have reportedly died while sick with Coronavirus
- the number of people who have reportedly recovered from it
Data from: UNESCO World Heritage Sites Dataset
kaggle.com
Updated Dec 19, 2023
Cite
The Devastator (2023). UNESCO World Heritage Sites Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/unesco-world-heritage-sites-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 19, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
Area covered
World
Description
UNESCO World Heritage Sites Dataset


By Throwback Thursday [source]

About this dataset

How to use the dataset

Here are some tips on how to make the most out of this dataset:

Data Exploration:

Begin by understanding the structure and contents of the dataset. Evaluate the number of rows (sites) and columns (attributes) available.

Check for missing values or inconsistencies in data entry that may impact your analysis.

Assess column descriptions to understand what information is included in each attribute.

Geographical Analysis:

Leverage geographical features such as latitude and longitude coordinates provided in this dataset.

Plot these sites on a map using any mapping software or library like Google Maps or Folium for Python. Visualizing their distribution can provide insights into patterns based on location, climate, or cultural factors.

Analyzing Attributes:

Familiarize yourself with different attributes available for analysis. Possible attributes include Name, Description, Category, Region, Country, etc.

Understand each attribute's format and content type (categorical, numerical) for better utilization during data analysis.

Exploring Categories & Regions:

Look at unique categories mentioned in the Category column (e.g., Cultural Site, Natural Site) to explore specific interests. This could help identify clusters within particular heritage types across countries/regions worldwide.

Analyze regions with high concentrations of heritage sites using data visualizations like bar plots or word clouds based on frequency counts.

Identify Trends & Patterns:

Discover recurring themes across various sites by analyzing descriptive text attributes such as names and descriptions.

Identify patterns and correlations between attributes by performing statistical analysis or utilizing machine learning techniques.

Comparison:

Compare different attributes to gain a deeper understanding of the sites.

For example, analyze the number of heritage sites per country/region or compare the distribution between cultural and natural heritage sites.

Additional Data Sources:

Use this dataset as a foundation to combine it with other datasets for in-depth analysis. There are several sources available that provide additional data on UNESCO World Heritage Sites, such as travel blogs, official tourism websites, or academic research databases.

Remember to cite this dataset appropriately if you use it in
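The exploration and comparison tips above (checking for missing values, counting sites per country or per category) can be sketched with the standard library alone. This is only an illustration: the rows are invented, and the column names `Name`, `Category`, and `Country` are assumptions taken from the "possible attributes" mentioned above, so the real file may use different names.

```python
from collections import Counter

# Hypothetical sample rows; the real file's columns and values may differ.
sites = [
    {"Name": "Site A", "Category": "Cultural", "Country": "Italy"},
    {"Name": "Site B", "Category": "Natural", "Country": "Italy"},
    {"Name": "Site C", "Category": "Cultural", "Country": "Peru"},
    {"Name": "Site D", "Category": "Cultural", "Country": ""},  # missing country
]


def count_by(rows, column):
    """Count rows per value of `column`, skipping empty or absent values."""
    return Counter(row[column] for row in rows if row.get(column))


def missing_values(rows, column):
    """Return how many rows have no value for `column`."""
    return sum(1 for row in rows if not row.get(column))


per_country = count_by(sites, "Country")
per_category = count_by(sites, "Category")

print(per_country.most_common())
print(per_category.most_common())
print("rows without a country:", missing_values(sites, "Country"))
```

With the real data, the same helpers could be applied to rows loaded through `csv.DictReader`, once the actual column names have been checked.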

Research Ideas

Travel Planning: This dataset can be used to identify and plan visits to UNESCO World Heritage sites around the world. It provides information about the location, category, and date of inscription for each site, allowing users to prioritize their travel destinations based on personal interests or preferences.

Cultural Preservation: Researchers or organizations interested in cultural preservation can use this dataset to analyze trends in UNESCO World Heritage site listings over time. By studying factors such as geographical distribution, types of sites listed, and inscription dates, they can gain insights into patterns of cultural heritage recognition and protection.

Statistical Analysis: The dataset can be used for statistical analysis to explore various aspects related to UNESCO World Heritage sites. For example, it could be used to examine the correlation between a country's economic indicators (such as GDP per capita) and the number or type of World Heritage sites it possesses. This analysis could provide insights into the relationship between economic development and cultural preservation efforts at a global scale.

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

See the dataset description for more information.

Columns

Acknowledgements

If you use this dataset in your research, please credit the original authors and Throwback Thursday.
Albanian Population Distribution Data - United States States (2019-2023)
neilsberg.com
csv, json
Updated Oct 1, 2025
Cite
Neilsberg Research (2025). Albanian Population Distribution Data - United States States (2019-2023) [Dataset]. https://www.neilsberg.com/insights/lists/albanian-population-in-united-states-by-state/
Explore at:
Available download formats: json, csv
Dataset updated
Oct 1, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Variables measured
Albanian Population Count, Albanian Population Percentage, Albanian Population Share of United States
Measurement technique
To measure the rank and respective trends, we initially gathered data from the five most recent American Community Survey (ACS) 5-Year Estimates. We then analyzed and categorized the data for each of the origins / ancestries identified by the U.S. Census Bureau. It is possible that a small population exists but was not reported or captured due to limitations or variations in Census data collection and reporting. We ensured that the population estimates used in this dataset pertain exclusively to the identified origins / ancestries and do not rely on any ethnicity classification, unless explicitly required. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

This list ranks the 50 states in the United States by Albanian population, as estimated by the United States Census Bureau. It also highlights population changes in each state over the past five years.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 5-Year Estimates, including:

2019-2023 American Community Survey 5-Year Estimates

2014-2018 American Community Survey 5-Year Estimates

2009-2013 American Community Survey 5-Year Estimates

Variables / Data Columns

Rank by Albanian Population: This column displays the rank of each state in the United States by its Albanian population, using the most recent ACS data available.

State: The State for which the rank is shown in the previous column.

Albanian Population: The Albanian population of the state is shown in this column.

% of Total State Population: This shows what percentage of the total state population identifies as Albanian. Please note that the sum of all percentages may not equal 100% due to rounding of values.

% of Total United States Albanian Population: This tells us how much of the entire United States Albanian population lives in that state. Please note that the sum of all percentages may not equal 100% due to rounding of values.

5 Year Rank Trend: This column displays the rank trend across the last 5 years.
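As a rough illustration of how the rank and the two percentage columns described above relate to the raw counts, here is a minimal sketch. The state names and figures are invented placeholders (not ACS estimates), and the function and key names are hypothetical, not part of the Neilsberg files.

```python
# Hypothetical counts for illustration only (not actual ACS estimates).
group_pop = {"State A": 5000, "State B": 3000, "State C": 2000}
state_total_pop = {"State A": 1_000_000, "State B": 300_000, "State C": 4_000_000}


def build_ranking(group_counts, total_counts):
    """Rank states by group population and derive the two percentage columns."""
    national_total = sum(group_counts.values())
    ordered = sorted(group_counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    for rank, (state, count) in enumerate(ordered, start=1):
        rows.append({
            "rank": rank,
            "state": state,
            "population": count,
            # Share of the state's own residents who belong to the group.
            "pct_of_state": round(100 * count / total_counts[state], 2),
            # Share of the nationwide group population living in this state.
            "pct_of_national_group": round(100 * count / national_total, 2),
        })
    return rows


ranking = build_ranking(group_pop, state_total_pop)
for row in ranking:
    print(row)
```

Because each percentage is rounded independently, the national shares can add up to slightly more or less than 100, which is the rounding caveat noted in the column descriptions. Ties in population are not given equal ranks in this sketch.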

Good to know

Margin of Error

Data in the dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
SUPERSEDED - Views on sharing mental and physical health data among people...
finddatagovscot.dtechtive.com
dtechtive.com
pdf, txt, xlsx
Updated Oct 11, 2021
Cite
University of Edinburgh. Centre for Clinical Brain Sciences (2021). SUPERSEDED - Views on sharing mental and physical health data among people with and without experience of mental illness [Dataset]. http://doi.org/10.7488/ds/3146
Explore at:
Available download formats: txt (0.001 MB), txt (0.0166 MB), pdf (3.249 MB), xlsx (0.8737 MB)
Unique identifier
https://doi.org/10.7488/ds/3146
Dataset updated
Oct 11, 2021
Dataset provided by
University of Edinburgh. Centre for Clinical Brain Sciences
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
UNITED KINGDOM
Description
This dataset contains responses from an online survey of 2187 participants primarily located in the UK. All participants stated that they had used the UK National Health Service (NHS) at some time in their lives. The data were collected between December 2018 and August 2019.

Participants' views on data sharing - this dataset contains information about people's willingness to share mental and physical health data for research purposes. It also includes information on willingness to share other types of data, such as financial information. The dataset includes participants' responses to questions relating to mental health data sharing, including the trustworthiness of organisations which use such data, how much the presence of different governance measures (such as deidentification, opt-out, etc.) would alter their views, and whether they would be less likely to access NHS mental health services if they knew their data might be shared with researchers.

Participants' satisfaction and interaction with UK mental and physical health services - the dataset includes information regarding participants' views on and interaction with NHS services. This includes ratings of satisfaction at first contact and in the previous 12 months, frequency of use, and type of treatment received.

Information about participants - the dataset includes information about participants' mental and physical health, including whether or not they have experience with specific mental health conditions, and how they would rate their mental and physical health at the time of the survey. There is also basic demographic information about the participants (e.g. age, gender, location etc.).

This item has been replaced by the one which can be found at https://hdl.handle.net/10283/4467
Global social media subscriptions comparison 2023
statista.com
es.statista.com
Cite
Stacy Jo Dixon, Global social media subscriptions comparison 2023 [Dataset]. https://www.statista.com/topics/1164/social-networks/
Explore at:
Dataset provided by
Statista (http://statista.com/)
Authors
Stacy Jo Dixon
Description
Social media companies are starting to offer users the option to subscribe to their platforms in exchange for monthly fees. Until recently, social media has been predominantly free to use, with tech companies relying on advertising as their main revenue generator. However, advertising revenues have been dropping following the COVID-induced boom. As of July 2023, Meta Verified is the most costly of the subscription services, setting users back almost 15 U.S. dollars per month on iOS or Android. Twitter Blue costs between eight and 11 U.S. dollars per month and ensures users will receive the blue check mark, and have the ability to edit tweets and have NFT profile pictures. Snapchat+, drawing in four million users as of the second quarter of 2023, boasts a Story re-watch function, custom app icons, and a Snapchat+ badge.
NYC Parks events listings
kaggle.com
Updated Dec 14, 2022
Cite
The Devastator (2022). NYC Parks events listings [Dataset]. https://www.kaggle.com/datasets/thedevastator/uncovering-the-hidden-event-spaces-of-nyc-parks
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 14, 2022
Dataset provided by
Kaggle
Authors
The Devastator
Description
Uncovering the Hidden Event Spaces of NYC Parks!

Mapping Event Locations, Accessibility, and Boroughs

By data.world's Admin [source]

About this dataset

Immerse yourself in NYC Parks events listings! This comprehensive dataset makes available the most recent records from 2013 and beyond, detailing information on events taking place in public parks throughout New York City. Beyond basic event data such as category, dates and times of activity, this dataset also offers further details such as organisers, labels, images associated with events or even YouTube video links related to them. Whether you are looking for a peaceful gathering hour or a thrilling outdoor adventure experience, this dataset provides you with all the necessary information on NYC Parks event listing!

How to use the dataset

In this guide, we'll walk you through how to use this dataset to find information about event spaces in NYC parks.

First, if you already know what type of event space or park you would like to explore, use the search bar at the top left of the page. You can search for a specific park or a city/borough name there. Clicking on any of the resulting options brings up the relevant information on accessing an event space within that area.

If you're looking for more general information about event spaces across NYC parks, scroll down to the summary table, which briefly describes all records in this dataset along with their related columns (e.g., Name, Latitude, etc.). The accessible column is particularly important: it tells users which areas are physically accessible, with 'F' (False) marking a place that is not easily accessible within the given park/city/borough area covered by this dataset.

You can modify your query parameters by selecting the columns listed on the top interface shelf to refine your results based on your needs (for example, if you need only those event spaces that are physically accessible). Filters on multiple columns can also be combined to make searching easier and more accurate (for example, Brooklyn plus an accessibility filter set to false), so several research criteria can be applied at once.
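The combined filter described above (a borough plus the accessibility flag) can also be reproduced offline. The sketch below is an assumption-laden illustration: the sample rows are invented, `borough` and `accessible` are assumed column names, and any value other than 'F' is treated as accessible, which the source does not confirm.

```python
import csv
import io

# Hypothetical excerpt; column names and values are assumptions for illustration.
sample_csv = """name,borough,accessible
Event A,Brooklyn,T
Event B,Brooklyn,F
Event C,Queens,T
"""


def is_accessible(row):
    """Treat 'F' as not accessible; any other value counts as accessible."""
    return row["accessible"].strip().upper() != "F"


def filter_events(rows, borough=None, accessible=None):
    """Keep rows matching every filter that is not None."""
    result = []
    for row in rows:
        if borough is not None and row["borough"] != borough:
            continue
        if accessible is not None and is_accessible(row) != accessible:
            continue
        result.append(row)
    return result


rows = list(csv.DictReader(io.StringIO(sample_csv)))

# "Brooklyn + accessibility false", as in the example mentioned in the text.
brooklyn_inaccessible = filter_events(rows, borough="Brooklyn", accessible=False)
print([row["name"] for row in brooklyn_inaccessible])
```

Leaving a filter argument as `None` skips that criterion, so the same function covers both single-column and combined queries.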

Finally, clicking the “Outer Join + Filter” button at the top right, above the table, takes you into the advanced query editing mode, where further filtering is possible. For example, you could view particular boroughs that have a Location 1 value or an address containing complete physical address lines without postal codes.

For more detailed instructions, please refer to our Data Documentation section. Team members are available 24 hours a day to answer any questions that arise. We invite everyone to take part in the exploration and to let us know what you would like to hear about most. Happy exploring and discovering!

Research Ideas

Creating map visualizations or heat maps to highlight event density in neighborhoods within the five boroughs of NYC.

Analyzing trends over time of event categories within the different boroughs (e.g., how has the number of sports events increased/decreased in comparison to cultural events?).

Generating dynamic reports that identify the most accessible NYC parks for those with mobility impairments and create easy-to-use indexes that can be used as a reference when organizing an outdoor activity or outing

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

See the dataset description for more information.

Columns

File: nyc-parks-events-listing-event-locations-1.csv

| Column name | Description |
|:------------|:------------|
| name | The name of the event. (String) |
| lat | The latitude coordinate of the event location. (Float) |
| long | The longitude coordinate of the event location. (Float) |
| address | The address of the event location. (String) |
| zip | The zip code of the event location. (...
Data from: Population estimation from mobile network traffic metadata
zenodo.org
application/gzip
Updated Jan 24, 2020
Cite
Ghazaleh Khodabandelou; Vincent Gauthier; Mounim El Yacoubi; Marco Fiore (2020). Population estimation from mobile network traffic metadata [Dataset]. http://doi.org/10.5281/zenodo.1067032
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.1067032
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Ghazaleh Khodabandelou; Vincent Gauthier; Mounim El Yacoubi; Marco Fiore
License
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Description
Please cite our paper if you publish material based on those datasets

G. Khodabandelou, V. Gauthier, M. El-Yacoubi, M. Fiore, "Estimation of Static and Dynamic Urban Populations with Mobile Network Metadata", in IEEE Trans. on Mobile Computing, 2018 (in Press). 10.1109/TMC.2018.2871156

Abstract

Communication-enabled devices that are physically carried by individuals are today pervasive, which opens unprecedented opportunities for collecting digital metadata about the mobility of large populations. In this paper, we propose a novel methodology for the estimation of people density at metropolitan scales, using subscriber presence metadata collected by a mobile operator. We show that our approach suits the estimation of static population densities, i.e., of the distribution of dwelling units per urban area contained in traditional censuses. Specifically, it achieves higher accuracy than that granted by previous equivalent solutions. In addition, our approach enables the estimation of dynamic population densities, i.e., the time-varying distributions of people in a conurbation. Our results build on significant real-world mobile network metadata and relevant ground-truth information in multiple urban scenarios.

Dataset Columns

This dataset covers one month of data, collected in April 2015, for three Italian cities: Rome, Milan, and Turin. The raw data was provided during the Telecom Italia Big Data Challenge (http://www.telecomitalia.com/tit/en/innovazione/archivio/big-data-challenge-2015.html).

1. grid_id: the coordinate of the grid can be retrieved with the shapefile of a given city
2. date: format Y-M-D H:M:S
4. landuse_label: the land use label, computed using the method described in [2]
5. population: Census population of a given grid block as defined by the Istituto nazionale di statistica (ISTAT https://www.istat.it/en/censuses) in 2011
6. estimation: Dynamic population density estimation (in persons), the result of the method described in [1]
7. area: surface of the "grid id" considered in km^2
8. geometry: the shape of the area considered with the EPSG:3003 coordinate system (only with quilt)
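To make the column descriptions above concrete, the following sketch parses the `date` field and derives per-km² densities from `population`, `estimation`, and `area`. The rows are invented, and the `%Y-%m-%d %H:%M:%S` pattern is an assumption about how "Y-M-D H:M:S" is actually written in the files; the derived key names are hypothetical.

```python
from datetime import datetime

# Hypothetical rows using the documented column names; all values are invented.
rows = [
    {"grid_id": 1, "date": "2015-04-01 08:00:00",
     "population": 1200, "estimation": 900.0, "area": 0.5},
    {"grid_id": 1, "date": "2015-04-01 14:00:00",
     "population": 1200, "estimation": 2100.0, "area": 0.5},
]

# Assumed concrete form of the documented "Y-M-D H:M:S" format.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def enrich(row):
    """Add a parsed timestamp and densities in persons per km^2."""
    return {
        **row,
        "timestamp": datetime.strptime(row["date"], DATE_FORMAT),
        # Static census density: residents per km^2 of the grid cell.
        "census_density": row["population"] / row["area"],
        # Dynamic density: estimated persons present per km^2 at this time.
        "estimated_density": row["estimation"] / row["area"],
    }


enriched = [enrich(row) for row in rows]
for row in enriched:
    print(row["timestamp"].hour, row["census_density"], row["estimated_density"])
```

Comparing `estimated_density` across hours for the same `grid_id` gives a simple view of how presence in a cell departs from its census baseline during the day.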

Note

Due to legal constraints, we cannot directly share the original data from the Telecom Italia Big Data Challenge that we used to build this dataset.

Easy access to this dataset with quilt

Install the dataset repository:

$ quilt install vgauthier/DynamicPopEstimate

Use the dataset with a pandas DataFrame

>>> from quilt.data.vgauthier import DynamicPopEstimate
>>> import pandas as pd
>>> df = pd.DataFrame(DynamicPopEstimate.rome())

Use the dataset with a GeoPandas GeoDataFrame

>>> from quilt.data.vgauthier import DynamicPopEstimate
>>> import geopandas as gpd
>>> df = gpd.GeoDataFrame(DynamicPopEstimate.rome())

References

[1] G. Khodabandelou, V. Gauthier, M. El-Yacoubi, M. Fiore, "Population estimation from mobile network traffic metadata", in proc of the 17th International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM), pp. 1 - 9, 2016.

[2] A. Furno, M. Fiore, R. Stanica, C. Ziemlicki, and Z. Smoreda, "A tale of ten cities: Characterizing signatures of mobile traffic in urban areas," IEEE Transactions on Mobile Computing, Volume: 16, Issue: 10, 2017.
World Income Inequality Database
kaggle.com
zip
Updated Oct 20, 2020
Cite
Arman (2020). World Income Inequality Database [Dataset]. https://www.kaggle.com/mannmann2/world-income-inequality-database
Explore at:
Available download formats: zip (693569 bytes)
Dataset updated
Oct 20, 2020
Authors
Arman
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
World
Description
Source: https://www.wider.unu.edu/database/wiid

User Guide: https://www.wider.unu.edu/sites/default/files/WIID/PDF/WIID-User_Guide_06MAY2020.pdf

The World Income Inequality Database (WIID) contains information on income inequality in various countries and is maintained by the United Nations University-World Institute for Development Economics Research (UNU-WIDER). The database was originally compiled during 1997-99 for the research project Rising Income Inequality and Poverty Reduction, directed by Giovanni Andrea Cornia. A revised and updated version of the database was published in June 2005 as part of the project Global Trends in Inequality and Poverty, directed by Tony Shorrocks and Guang Hua Wan. The database was revised in 2007 and a new version was launched in May 2008.

The database contains data on inequality in the distribution of income in various countries. The central variable in the dataset is the Gini index, a measure of income distribution in a society. In addition, the dataset contains information on income shares by quintile or decile. The database contains data for 159 countries, including some historical entities. The temporal coverage varies substantially across countries. For some countries there is only one data entry; in other cases there are over 100 data points. The earliest entry is from 1867 (United Kingdom), the latest from 2003. The majority of the data (65%) cover the years from 1980 onwards. The 2008 update (version WIID2c) includes some major updates and quality improvements, which led to a reduced number of variables in the new version. The new version has 334 new observations and several revisions/corrections made in 2007 and 2008.
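For readers unfamiliar with the Gini index mentioned above, here is a minimal sketch of one standard way to compute a Gini coefficient from a list of individual incomes. It illustrates the measure itself and is not the procedure used to build WIID; the function name and sample incomes are hypothetical.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes: 0 means perfect equality.

    Uses the sorted-rank formula
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    where x_1 <= ... <= x_n and i runs from 1 to n.
    """
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one income and a positive total")
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


equal = gini([1, 1, 1, 1])        # everyone earns the same
concentrated = gini([0, 0, 0, 1])  # one person holds all income
print(equal, concentrated)
```

Multiplying the coefficient by 100 gives the 0 to 100 scale on which Gini indices are commonly reported. For a finite sample of size n, the maximum value of this formula is (n - 1) / n rather than exactly 1.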
Covid-19 in italy
kaggle.com
Updated Apr 18, 2020
Cite
Hwaida Alsiari (2020). Covid-19 in italy [Dataset]. https://www.kaggle.com/hwaidaalsiari/covid19-in-italy/kernels
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 18, 2020
Dataset provided by
Kaggle
Authors
Hwaida Alsiari
Area covered
Italy
Description
Context

This data was gathered as part of the data mining project for the General Assembly Data Science course, using the API from https://rapidapi.com/astsiatsko/api/coronavirus-monitor.

Covid-19

Covid-19 is a contagious coronavirus disease that originated in Wuhan, China. This new strain of the virus has struck fear in many countries as cities are quarantined and hospitals are overcrowded. This dataset will help us understand how Covid-19 spread in Italy.

On March 8, 2020, Italy’s prime minister announced a sweeping coronavirus quarantine early Sunday, restricting the movements of about a quarter of the country’s population in a bid to limit contagions at the epicenter of Europe’s outbreak.

Highlights:

- Spread over time in Italy
- Try to predict the spread of COVID-19 ahead of time to take preventive measures

Content

id: id number

total_cases: the total number of coronavirus cases

new_cases: the number of new coronavirus cases at this date and time

active_cases: the number of active coronavirus cases

total_deaths: the total number of deaths from coronavirus

new_deaths: the number of new deaths at this date and time

total_recovered: the number of people who have recovered from coronavirus

serious_critical: the number of people with coronavirus in serious or critical condition

total_cases_per1m: the number of confirmed cases per 1 million people

record_date: Date of notification - YYYY-MM-DD HH:MM:SS
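One way to sanity-check the cumulative and daily columns listed above is to derive the day-over-day increase from `total_cases` and compare it with `new_cases`. The sketch below assumes `record_date` follows the documented YYYY-MM-DD HH:MM:SS pattern; the three records are invented placeholders, not values from the dataset.

```python
from datetime import datetime

# Hypothetical records using the documented field names; values are invented.
records = [
    {"record_date": "2020-03-10 10:00:00", "total_cases": 175},
    {"record_date": "2020-03-08 10:00:00", "total_cases": 100},
    {"record_date": "2020-03-09 10:00:00", "total_cases": 130},
]

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_date(row):
    """Parse the record_date string into a datetime."""
    return datetime.strptime(row["record_date"], DATE_FORMAT)


def daily_increase(rows):
    """Return (date, increase) pairs between consecutive records, oldest first."""
    ordered = sorted(rows, key=parse_date)
    increases = []
    for previous, current in zip(ordered, ordered[1:]):
        day = parse_date(current).date().isoformat()
        increases.append((day, current["total_cases"] - previous["total_cases"]))
    return increases


print(daily_increase(records))
```

If the API reports several snapshots per day, the records would first need to be reduced to one per date (for example, the latest snapshot) before differencing.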

Inspiration

https://www.livescience.com/why-italy-coronavirus-deaths-so-high.html
