This dataset provides highly detailed (block-level) views of various demographics for Manhattan, New York City. It includes information on age, race, sex, income, housing, and various other attributes. The data comes from the 2000 US Census and was joined to the Census TIGER/Line files to create the output. Enjoy!
This dataset shows the number of hospital admissions for influenza-like illness or pneumonia, or that include the ICD-10-CM code (U07.1) for 2019 novel coronavirus. Influenza-like illness is defined as a mention of either: fever and cough, fever and sore throat, fever and shortness of breath or difficulty breathing, or influenza. Patients who were subsequently assigned only an ICD-10-CM code for influenza are excluded. Pneumonia is defined as mention or diagnosis of pneumonia. Baseline data represents the average number of people with COVID-19-like illness who are admitted to the hospital during this time of year based on historical counts. The average is based on the daily average from the rolling same week (same day +/- 3 days) from the prior 3 years. Percent change data represents the change in count of people admitted compared to the previous day. Data sources include all hospital admissions from emergency department visits in NYC. Data are collected electronically and transmitted to the NYC Health Department hourly. This dataset is updated daily. All identifying health information is excluded from the dataset.
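As a rough illustration of the baseline and percent-change figures described above, the following sketch computes both from a mapping of dates to daily admission counts. The `counts` structure and the function names are hypothetical, and the Health Department's actual calculation may differ in details such as leap-day handling or missing days.

```python
from datetime import date, timedelta

def baseline(counts, day, years_back=3, window=3):
    """Average daily count over the same day +/- `window` days in each prior year."""
    values = []
    for k in range(1, years_back + 1):
        try:
            center = day.replace(year=day.year - k)
        except ValueError:
            # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28 (assumption).
            center = day.replace(year=day.year - k, day=28)
        for offset in range(-window, window + 1):
            d = center + timedelta(days=offset)
            if d in counts:
                values.append(counts[d])
    return sum(values) / len(values) if values else None

def percent_change(counts, day):
    """Percent change in admissions compared to the previous day."""
    prev = counts.get(day - timedelta(days=1))
    cur = counts.get(day)
    if prev in (None, 0) or cur is None:
        return None  # undefined without a non-zero previous-day count
    return 100.0 * (cur - prev) / prev
```

Both functions return `None` when the required days are missing rather than guessing a value.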
This dataset includes all valid felony, misdemeanor, and violation crimes reported to the New York City Police Department (NYPD) for all complete quarters so far this year (2017). For additional details, please see the attached data dictionary in the ‘About’ section.
By data.world's Admin [source]
Immerse yourself in NYC Parks events listings! This comprehensive dataset makes available the most recent records from 2013 and beyond, detailing information on events taking place in public parks throughout New York City. Beyond basic event data such as category, dates and times of activity, this dataset also offers further details such as organisers, labels, images associated with events or even YouTube video links related to them. Whether you are looking for a peaceful gathering hour or a thrilling outdoor adventure experience, this dataset provides you with all the necessary information on NYC Parks event listing!
In this guide, we’ll walk you through how to use this dataset to find information about event spaces in NYC parks.
First, if you already know what type of event space or park you want to explore, use the search bar at the top left of the page. You can search for a specific park or a city/borough name. Clicking any of the results brings up information on accessing an event space within that area.
If you’re looking for more general information about event spaces across NYC parks, scroll down to the summary table, which briefly describes all records in this dataset along with their columns (e.g., Name, Latitude). The accessible column is particularly important: it tells users which areas are physically accessible, with 'F' (False) marking a place that is not easily accessible within the park/city/borough area covered by this dataset.
You can refine your results by selecting columns listed on the top interface shelf (for example, if you need only event spaces that are physically accessible). Filters on multiple columns can also be combined (for example, Brooklyn plus an accessibility filter of false) to make searches easier and more accurate.
Finally, clicking the “Outer Join + Filter” button at the top right, above the table, opens the advanced query editing mode, where further filtering is possible; for example, showing particular boroughs that have a Location 1 value OR an address containing complete physical address lines without postal codes.
For more detailed instructions, please refer to our Data Documentation section. Team members are available 24 hours a day to answer questions should you need help. We invite everyone to take part in the exploration and to let us know what you would most like to hear about. Happy exploring and discovering!
- Creating map visualizations or heat maps to highlight event density in neighborhoods within the five boroughs of NYC.
- Analyzing trends over time of event categories within the different boroughs (e.g., how has the number of sports events increased/decreased in comparison to cultural events?).
- Generating dynamic reports that identify the most accessible NYC parks for those with mobility impairments and create easy-to-use indexes that can be used as a reference when organizing an outdoor activity or outing
If you use this dataset in your research, please credit the original authors.
See the dataset description for more information.
File: nyc-parks-events-listing-event-locations-1.csv

| Column name | Description |
|:------------|:------------|
| name | The name of the event. (String) |
| lat | The latitude coordinate of the event location. (Float) |
| long | The longitude coordinate of the event location. (Float) |
| address | The address of the event location. (String) |
| zip | The zip code of the event location. (... |
This dataset shows daily confirmed and probable cases of COVID-19 in New York City by date of specimen collection. Total cases has been calculated as the sum of daily confirmed and probable cases. Seven-day averages of confirmed, probable, and total cases are also included in the dataset. A person is classified as a confirmed COVID-19 case if they test positive with a nucleic acid amplification test (NAAT, also known as a molecular test; e.g. a PCR test). A probable case is a person who meets the following criteria with no positive molecular test on record: a) test positive with an antigen test, b) have symptoms and an exposure to a confirmed COVID-19 case, or c) died and their cause of death is listed as COVID-19 or similar. As of June 9, 2021, people who meet the definition of a confirmed or probable COVID-19 case >90 days after a previous positive test (date of first positive test) or probable COVID-19 onset date will be counted as a new case. Prior to June 9, 2021, new cases were counted ≥365 days after the first date of specimen collection or clinical diagnosis. Any person with a residence outside of NYC is not included in counts. Data is sourced from electronic laboratory reporting from the New York State Electronic Clinical Laboratory Reporting System to the NYC Health Department. All identifying health information is excluded from the dataset.
These data are used to evaluate the overall number of confirmed and probable cases by day (seven day average) to track the trajectory of the pandemic. Cases are classified by the date that the case occurred. NYC COVID-19 data include people who live in NYC. Any person with a residence outside of NYC is not included.
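The total and seven-day average columns described above can be sketched as follows. This is a minimal illustration, assuming rows sorted by date and a trailing seven-day window; the key names (`confirmed`, `probable`, `total`, `total_7day_avg`) are hypothetical and the published dataset may define its window differently.

```python
def add_totals_and_averages(rows, window=7):
    """Add a total (confirmed + probable) and a trailing average of the total.

    rows: list of dicts sorted by date, each with 'confirmed' and 'probable'
    counts (hypothetical key names).
    """
    out = []
    for i, row in enumerate(rows):
        new = dict(row, total=row["confirmed"] + row["probable"])
        out.append(new)
        if i + 1 >= window:
            recent = out[i + 1 - window : i + 1]
            new["total_7day_avg"] = sum(r["total"] for r in recent) / window
        else:
            new["total_7day_avg"] = None  # not enough history yet
    return out
```

The first six days carry no average because a full seven-day window is not yet available.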
This dataset includes the daily number of families and individuals residing in the Department of Homeless Services (DHS) shelter system and the daily number of families applying to the DHS shelter system.
This dataset includes data starting from 01/03/2021. For older records, please refer to https://data.cityofnewyork.us/d/dwrg-kzni
This file shows bars and clubs in the New York City MSA. Locations were pulled from multiple data sources. This isn't a full listing of bars in the NYC area, but all bars do have a user rating with them. This dataset has been migrated from our Geocommons platform, and lacks a description from the original posting user. This is not a Fortiusone provided dataset. Please keep this in mind, and make of the dataset what you will. Thank you for visiting Finder!
This dataset contains information on antibody testing for COVID-19: the number of people who received a test, the number of people with positive results, the percentage of people tested who tested positive, and the rate of testing per 100,000 people, stratified by ZIP Code Tabulation Area (ZCTA) neighborhood poverty group. These data can also be accessed here: https://github.com/nychealth/coronavirus-data/blob/master/totals/antibody-by-poverty.csv Exposure to COVID-19 can be detected by measuring antibodies to the disease in a person’s blood, which can indicate that a person may have had an immune response to the virus. Antibodies are proteins produced by the body’s immune system that can be found in the blood. People can test positive for antibodies after they have been exposed, sometimes when they no longer test positive for the virus itself. It is important to note that the science around COVID-19 antibody tests is evolving rapidly and there is still much uncertainty about what individual antibody test results mean for a single person and what population-level antibody test results mean for understanding the epidemiology of COVID-19 at a population level. These data only provide information on people tested. People receiving an antibody test do not reflect all people in New York City; therefore, these data may not reflect antibody prevalence among all New Yorkers. Increasing instances of screening programs further impact the generalizability of these data, as screening programs influence who and how many people are tested over time. Examples of screening programs in NYC include: employers screening their workers (e.g., hospitals), and long-term care facilities screening their residents. 
In addition, there may be potential biases toward people receiving an antibody test who have a positive result because people who were previously ill are preferentially seeking testing, in addition to the testing of persons with higher exposure (e.g., health care workers, first responders). Neighborhood-level poverty groups were classified in a manner consistent with Health Department practices to describe and monitor disparities in health in NYC. Neighborhood poverty measures are defined as the percentage of people earning below the Federal Poverty Threshold (FPT) within a ZCTA. The standard cut-points for defining categories of neighborhood-level poverty in NYC are:
• Low: <10% of residents in ZCTA living below the FPT
• Medium: 10% to <20%
• High: 20% to <30%
• Very high: ≥30% residents living below the FPT
The ZCTAs used for classification reflect the first non-missing address within NYC for each person reported with an antibody test result. Rates were calculated using interpolated intercensal population estimates updated in 2019. These rates differ from previously reported rates based on the 2000 Census or previous versions of population estimates. The Health Department produced these population estimates based on estimates from the U.S. Census Bureau and NYC Department of City Planning. Rates for poverty were calculated using direct standardization for age at diagnosis and weighting by the US 2000 standard population. Antibody tests are categorized based on the date of specimen collection and are aggregated by full weeks starting each Sunday and ending on Saturday. For example, a person whose blood was collected for antibody testing on Wednesday, May 6 would be categorized as tested during the week ending May 9. A person tested twice in one week would only be counted once in that week. This dataset includes testing data beginning April 5, 2020.
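The neighborhood poverty cut-points given above translate directly into a small classifier. The function name is hypothetical; the thresholds are the ones stated in the description.

```python
def poverty_group(pct_below_fpt):
    """Classify a ZCTA by the percentage of residents living below the FPT."""
    if pct_below_fpt < 10:
        return "Low"
    if pct_below_fpt < 20:
        return "Medium"
    if pct_below_fpt < 30:
        return "High"
    return "Very high"
```

For example, a ZCTA where 24.5% of residents live below the FPT falls in the "High" group.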
Data are updated daily, and the dataset preserves historical records and source data changes, so each extract date reflects the current copy of the data as of that date. For example, an extract date of 11/04/2020 and extract date of 11/03/2020 will both contain all records as they were as of that extract date. Without filtering or grouping by extract date, an analysis will almost certain
https://creativecommons.org/publicdomain/zero/1.0/
This directory contains data on over 4.5 million Uber pickups in New York City from April to September 2014, and 14.3 million more Uber pickups from January to June 2015. Trip-level data on 10 other for-hire vehicle (FHV) companies, as well as aggregated data for 329 FHV companies, is also included. All the files are as they were received on August 3, Sept. 15 and Sept. 22, 2015.
FiveThirtyEight obtained the data from the NYC Taxi & Limousine Commission (TLC) by submitting a Freedom of Information Law request on July 20, 2015. The TLC has sent us the data in batches as it continues to review trip data Uber and other FHV companies have submitted to it. The TLC's correspondence with FiveThirtyEight is included in the files TLC_letter.pdf, TLC_letter2.pdf and TLC_letter3.pdf. TLC records requests can be made here.
This data was used for four FiveThirtyEight stories: Uber Is Serving New York’s Outer Boroughs More Than Taxis Are, Public Transit Should Be Uber’s New Best Friend, Uber Is Taking Millions Of Manhattan Rides Away From Taxis, and Is Uber Making NYC Rush-Hour Traffic Worse?.
The dataset contains, roughly, four groups of files:
There are six files of raw data on Uber pickups in New York City from April to September 2014. The files are separated by month and each has the following columns:
Date/Time: The date and time of the Uber pickup
Lat: The latitude of the Uber pickup
Lon: The longitude of the Uber pickup
Base: The TLC base company code affiliated with the Uber pickup
These files are named:
uber-raw-data-apr14.csv
uber-raw-data-aug14.csv
uber-raw-data-jul14.csv
uber-raw-data-jun14.csv
uber-raw-data-may14.csv
uber-raw-data-sep14.csv
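A minimal sketch of reading one of these monthly files and counting pickups by hour of day. The sample rows are made up for illustration, and the timestamp format (`month/day/year hour:minute:second`) is an assumption that should be checked against the actual files.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Made-up sample in the column layout described above (values are illustrative).
SAMPLE = '''"Date/Time","Lat","Lon","Base"
"4/1/2014 0:11:00",40.769,-73.9549,"B02512"
"4/1/2014 0:17:00",40.7267,-74.0345,"B02512"
"4/1/2014 17:21:00",40.7316,-73.9873,"B02598"
'''

def pickups_by_hour(fileobj, fmt="%m/%d/%Y %H:%M:%S"):
    """Count pickups per hour of day; `fmt` is an assumed timestamp format."""
    counts = Counter()
    for row in csv.DictReader(fileobj):
        ts = datetime.strptime(row["Date/Time"], fmt)
        counts[ts.hour] += 1
    return counts

hourly = pickups_by_hour(io.StringIO(SAMPLE))
```

To use a real file, replace `io.StringIO(SAMPLE)` with `open("uber-raw-data-apr14.csv", newline="")`.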
Also included is the file uber-raw-data-janjune-15.csv. This file has the following columns:
Dispatching_base_num: The TLC base company code of the base that dispatched the Uber
Pickup_date: The date and time of the Uber pickup
Affiliated_base_num: The TLC base company code affiliated with the Uber pickup
locationID: The pickup location ID affiliated with the Uber pickup
The Base codes are for the following Uber bases:
B02512: Unter
B02598: Hinter
B02617: Weiter
B02682: Schmecken
B02764: Danach-NY
B02765: Grun
B02835: Dreist
B02836: Drinnen
For coarse-grained location information from these pickups, the file taxi-zone-lookup.csv shows the taxi Zone (essentially, neighborhood) and Borough for each locationID.
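The lookup described above amounts to a simple join on the location ID. The sketch below is illustrative only: the lookup header names (`LocationID`, `Borough`, `Zone`) are assumptions, the zone rows are made up, and the function names are hypothetical. The base-code mapping is taken from the list in the description.

```python
import csv
import io

# Base codes as listed in the dataset description.
UBER_BASES = {
    "B02512": "Unter", "B02598": "Hinter", "B02617": "Weiter",
    "B02682": "Schmecken", "B02764": "Danach-NY", "B02765": "Grun",
    "B02835": "Dreist", "B02836": "Drinnen",
}

# Made-up lookup rows; the header names are assumed, not confirmed.
ZONE_SAMPLE = """LocationID,Borough,Zone
101,Example Borough,Example Zone A
102,Example Borough,Example Zone B
"""

def load_zones(fileobj):
    """Map each location ID to its (Borough, Zone) pair."""
    return {row["LocationID"]: (row["Borough"], row["Zone"])
            for row in csv.DictReader(fileobj)}

def describe_pickup(row, zones):
    """Attach the base name, borough, and zone to one 2015 pickup row."""
    borough, zone = zones.get(row["locationID"], (None, None))
    code = row["Dispatching_base_num"]
    return {"base": UBER_BASES.get(code, code), "borough": borough, "zone": zone}
```

Unknown location IDs yield `None` for borough and zone, and unknown base codes are passed through unchanged.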
The dataset also contains 10 files of raw data on pickups from 10 for-hire vehicle (FHV) companies. The trip information varies by company, but can include day of trip, time of trip, pickup location, driver's for-hire license number, and vehicle's for-hire license number.
These files are named:
American_B01362.csv
Diplo_B01196.csv
Highclass_B01717.csv
Skyline_B00111.csv
Carmel_B00256.csv
Federal_02216.csv
Lyft_B02510.csv
Dial7_B00887.csv
Firstclass_B01536.csv
Prestige_B01338.csv
There is also a file other-FHV-data-jan-aug-2015.csv containing daily pickup data for 329 FHV companies from January 2015 through August 2015.
The file Uber-Jan-Feb-FOIL.csv contains aggregated daily Uber trip statistics in January and February 2015.
This dataset contains information on antibody testing for COVID-19: the number of people who received a test, the number of people with positive results, the percentage of people tested who tested positive, and the rate of testing per 100,000 people, stratified by modified ZIP Code Tabulation Area (ZCTA) of residence. Modified ZCTA reflects the first non-missing address within NYC for each person reported with an antibody test result. This unit of geography is similar to ZIP codes but combines census blocks with smaller populations to allow more stable estimates of population size for rate calculation. It can be challenging to map data that are reported by ZIP Code. A ZIP Code doesn’t refer to an area, but rather a collection of points that make up a mail delivery route. Furthermore, there are some buildings that have their own ZIP Code, and some non-residential areas with ZIP Codes. To deal with the challenges of ZIP Codes, the Health Department uses ZCTAs which solidify ZIP codes into units of area. Often, data reported by ZIP code are actually mapped by ZCTA. The ZCTA geography was developed by the U.S. Census Bureau. These data can also be accessed here: https://github.com/nychealth/coronavirus-data/blob/master/totals/antibody-by-modzcta.csv Exposure to COVID-19 can be detected by measuring antibodies to the disease in a person’s blood, which can indicate that a person may have had an immune response to the virus. Antibodies are proteins produced by the body’s immune system that can be found in the blood. People can test positive for antibodies after they have been exposed, sometimes when they no longer test positive for the virus itself. It is important to note that the science around COVID-19 antibody tests is evolving rapidly and there is still much uncertainty about what individual antibody test results mean for a single person and what population-level antibody test results mean for understanding the epidemiology of COVID-19 at a population level.
These data only provide information on people tested. People receiving an antibody test do not reflect all people in New York City; therefore, these data may not reflect antibody prevalence among all New Yorkers. Increasing instances of screening programs further impact the generalizability of these data, as screening programs influence who and how many people are tested over time. Examples of screening programs in NYC include: employers screening their workers (e.g., hospitals), and long-term care facilities screening their residents.
In addition, there may be potential biases toward people receiving an antibody test who have a positive result because people who were previously ill are preferentially seeking testing, in addition to the testing of persons with higher exposure (e.g., health care workers, first responders).
Rates were calculated using interpolated intercensal population estimates updated in 2019. These rates differ from previously reported rates based on the 2000 Census or previous versions of population estimates. The Health Department produced these population estimates based on estimates from the U.S. Census Bureau and NYC Department of City Planning.
Antibody tests are categorized based on the date of specimen collection and are aggregated by full weeks starting each Sunday and ending on Saturday. For example, a person whose blood was collected for antibody testing on Wednesday, May 6 would be categorized as tested during the week ending May 9. A person tested twice in one week would only be counted once in that week. This dataset includes testing data beginning April 5, 2020.
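The weekly aggregation rule above (weeks run Sunday through Saturday, and a person counts once per week) can be sketched like this. The function names and the `(person_id, date)` input shape are hypothetical.

```python
from datetime import date, timedelta

def week_ending(d):
    """Return the Saturday that closes the Sunday-to-Saturday week containing d."""
    # date.weekday(): Monday=0 ... Saturday=5, Sunday=6
    return d + timedelta(days=(5 - d.weekday()) % 7)

def people_tested_per_week(tests):
    """Count distinct people per week from (person_id, collection_date) pairs."""
    seen = {(person, week_ending(d)) for person, d in tests}
    counts = {}
    for _, week in seen:
        counts[week] = counts.get(week, 0) + 1
    return counts
```

With this rule, a specimen collected on Wednesday, May 6, 2020 is assigned to the week ending May 9, 2020, matching the example in the description.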
Data are updated daily, and the dataset preserves historical records and source data changes, so each extract date reflects the current copy of the data as of that date. For example, an extract date of 11/04/2020 and extract date of 11/03/2020 will both contain all records as they were as of that extract date. Without filtering or grouping by extract date, an analysis will almost certainly be miscalculating or counting the same values multiple times. To analyze the most current data, only use the latest extract date. Antibody tests that are missing dates are not included in the dataset; as dates are identified, these events are added. Lags between occurrence and report of cases and tests can be assessed by comparing counts and rates across multiple data extract dates.
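Because every extract date carries a full copy of the data, an analysis would typically keep only the rows from the latest extract. A minimal sketch follows; the column name `extract_date` and the `MM/DD/YYYY` format are assumptions based on the example dates in the description.

```python
from datetime import datetime

def latest_extract(rows, key="extract_date", fmt="%m/%d/%Y"):
    """Keep only the rows belonging to the most recent extract date."""
    parsed = [(datetime.strptime(r[key], fmt), r) for r in rows]
    latest = max(d for d, _ in parsed)
    return [r for d, r in parsed if d == latest]
```

Comparing the same week across several extract dates, instead of discarding the older ones, is how the reporting lags mentioned above could be examined.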
For further details, visit:
• https://www1.nyc.gov/site/doh/covid/covid-19-data.page
• https://github.com/nychealth/coronavirus-data
• https://data.cityofnewyork.us/Health/Modified-Zip-Code-Tabulation-Areas-MODZCTA-/pri4-ifjk
List of every shooting incident that occurred in NYC during the current calendar year. This data is manually extracted every quarter and reviewed by the Office of Management Analysis and Planning before being posted on the NYPD website. Each record represents a shooting incident in NYC and includes information about the event, the location and time of occurrence. In addition, information related to suspect and victim demographics is also included. This data can be used by the public to explore the nature of police enforcement activity. Please refer to the attached data footnotes for additional information about this dataset.
This dataset was created by wgdesign2
Released under Data files © Original Authors
Note: As of November 10, 2023, this dataset has been archived. For the current version of this data, please visit: https://health.data.ny.gov/d/gikn-znjh
This dataset reports daily on the number of people vaccinated by New York providers with at least one dose and with a complete COVID-19 vaccination series overall since December 14, 2020. New York providers include hospitals, mass vaccination sites operated by the State or local governments, pharmacies, and other providers registered with the State to serve as points of distribution.
This dataset is created by the New York State Department of Health from data reported to the New York State Immunization Information System (NYSIIS) and the New York City Citywide Immunization Registry (NYC CIR). County-level vaccination data is based on data reported to NYSIIS and NYC CIR by the providers administering vaccines. Residency is self-reported by the individual being vaccinated. This data does not include vaccine administered through Federal entities or performed outside of New York State to New York residents. NYSIIS and CIR data is used for county-level statistics. New York State Department of Health requires all New York State vaccination providers to report all COVID-19 vaccination administration data to NYSIIS and NYC CIR within 24 hours of administration.
https://creativecommons.org/publicdomain/zero/1.0/
As a participant in NYC Taxi Trip Duration, I'm providing additional data to help extract many new useful features.
To do so, I'm using a high-performance routing engine designed to run on OpenStreetMap data.
Since I have the whole blind test data, I also decided to share a small amount of information on erroneous samples (less than 0.15%), so competitors can focus on matching real-world data and not try to fit randomness.
Note: The steps files are big, so I split them into two parts. Part 1, Part 2
Description of different tables used.
id: Record id
distance: Route distance (m)
duration: OSRM trip duration (s)
motorway, trunk, primary, secondary, tertiary, unclassified, residential:
The proportion spent on different kind of roads (% of total distance)
nTrafficSignals: The number of traffic signals.
nCrossing: The number of pedestrian crossings.
nStop: The number of stop signs.
nIntersection: The number of intersections. Note for OSRM users: "intersection" here has a different meaning than the one used in OSRM.
*An intersection can be a crossroad, but not a highway exit...
srcCounty, dstCounty: Pickup/Dropoff county.
NA: Not in NYC
1: Brooklyn
2: Queens
3: Staten Island
4: Manhattan
5: Bronx
For each trip we saved all the ways (route portion).
id: train/test id.
wayId: Way id, you can check the way using www.openstreetmap.org/way/*wayId*
portion: The proportion of the total distance
It contains the encoded nodes (lon/lat coordinates) of the ways used.
wayId: The way identification.
polyline: Encoded polylines.
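If the `polyline` column follows the widely used encoded polyline format, it could be decoded roughly as follows. This is a sketch under assumptions: the precision (5 decimal places here, while some OSRM outputs use 6) and the coordinate order inside the string (the standard format stores latitude first) are not confirmed by the description.

```python
def decode_polyline(encoded, precision=5):
    """Decode an encoded polyline string into a list of (lat, lon) tuples."""
    coords = []
    index = lat = lon = 0
    factor = 10 ** precision
    while index < len(encoded):
        deltas = []
        for _ in range(2):  # one delta for latitude, one for longitude
            shift = result = 0
            while True:
                byte = ord(encoded[index]) - 63
                index += 1
                result |= (byte & 0x1F) << shift
                shift += 5
                if byte < 0x20:  # continuation bit not set: last chunk
                    break
            # The lowest bit carries the sign; the rest is the magnitude.
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lon += deltas[1]
        coords.append((lat / factor, lon / factor))
    return coords
```

If decoded points land far from New York City, try `precision=6` or swap the tuple order before concluding the data is wrong.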
id: Same as original data.
bug: Kind of bug (0=none)
Trip duration higher than 1 day;
Drop off on the day after pickup, and trip duration higher than 6h;
Drop off time at 00:00:00 and vendor_id eq 2.
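The three rules above can be expressed as a small checker. This is only a sketch: the rule labels are made up for illustration, the mapping from rules to the numeric `bug` codes is not given in the description, and "the day after pickup" is read literally as exactly one calendar day later.

```python
from datetime import datetime, time

def suspicious_rules(pickup, dropoff, vendor_id):
    """Return labels (hypothetical names) for each rule that a trip matches."""
    duration = (dropoff - pickup).total_seconds()
    hits = []
    if duration > 24 * 3600:
        hits.append("duration_over_1_day")
    if (dropoff.date() - pickup.date()).days == 1 and duration > 6 * 3600:
        hits.append("next_day_over_6h")
    if dropoff.time() == time(0, 0, 0) and vendor_id == 2:
        hits.append("midnight_dropoff_vendor_2")
    return hits
```

An empty list corresponds to a trip that none of the three rules flags.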
trip_duration: Taxi trip duration
This dataset displays all the hazardous waste sites in the United States and its Territories as of 5.08. The data comes from the Agency for Toxic Substances and Disease Registry (ATSDR). The dataset contains information about the site: Site ID Site Name CERCLIS # Address City State County Latitude Longitude Population Region # Congressional Districts Federal Facility National Priorities List Status Ownership Status Classification. For more information, go to the Agency for Toxic Substances and Disease Registry (ATSDR) website at http://www.atsdr.cdc.gov
Note: Data elements were retired from HERDS on 10/6/23 and this dataset was archived.
This dataset includes the cumulative number and percent of healthcare facility-reported fatalities for patients with lab-confirmed COVID-19 disease by reporting date and age group. This dataset does not include fatalities related to COVID-19 disease that did not occur at a hospital, nursing home, or adult care facility. The primary goal of publishing this dataset is to provide users with information about healthcare facility fatalities among patients with lab-confirmed COVID-19 disease.
The information in this dataset is also updated daily on the NYS COVID-19 Tracker at https://www.ny.gov/covid-19tracker.
The data source for this dataset is the daily COVID-19 survey through the New York State Department of Health (NYSDOH) Health Electronic Response Data System (HERDS). Hospitals, nursing homes, and adult care facilities are required to complete this survey daily. The information from the survey is used for statewide surveillance, planning, resource allocation, and emergency response activities. Hospitals began reporting for the HERDS COVID-19 survey in March 2020, while Nursing Homes and Adult Care Facilities began reporting in April 2020. It is important to note that fatalities related to COVID-19 disease that occurred prior to the first publication dates are also included.
The fatality numbers in this dataset are calculated by assigning age groups to each patient based on the patient age, then summing the patient fatalities within each age group, as of each reporting date. The statewide total fatality numbers are calculated by summing the number of fatalities across all age groups, by reporting date. The fatality percentages are calculated by dividing the number of fatalities in each age group by the statewide total number of fatalities, by reporting date. The fatality numbers represent the cumulative number of fatalities that have been reported as of each reporting date.
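The calculation described above, for a single reporting date, reduces to a sum and a division. In this sketch the function name and the age-group labels used in the example are hypothetical.

```python
def fatality_percentages(counts_by_age_group):
    """Return (statewide_total, {age_group: percent_of_total}) for one reporting date."""
    total = sum(counts_by_age_group.values())
    if total == 0:
        return 0, {group: None for group in counts_by_age_group}
    percents = {group: 100.0 * n / total
                for group, n in counts_by_age_group.items()}
    return total, percents
```

For instance, cumulative counts of 1 and 3 in two age groups give a statewide total of 4 and percentages of 25.0 and 75.0.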
http://opendatacommons.org/licenses/dbcl/1.0/
Who amongst us doesn't small talk about the weather every once in a while?
The goal of this dataset is to elevate this small talk to medium talk.
Just kidding, I actually originally decided to collect this dataset in order to demonstrate basic signal processing concepts, such as filtering, Fourier transform, auto-correlation, cross-correlation, etc. (for a data analysis course I'm currently preparing).
I wanted to demonstrate these concepts on signals that we all have intimate familiarity with and hope that this way these concepts will be better understood than with just made up signals.
The weather is excellent for demonstrating these kinds of concepts as it contains periodic temporal structure with two very different periods (daily and yearly).
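To make the periodicity point concrete, here is a small auto-correlation sketch on a synthetic hourly signal with a 24-hour cycle. The signal is a made-up stand-in for temperature, not data from this dataset, and the function names are hypothetical.

```python
import math

def synthetic_daily_signal(n_days, amplitude=10.0):
    """Hourly samples of a pure 24-hour sine wave (illustrative stand-in)."""
    return [amplitude * math.sin(2 * math.pi * h / 24) for h in range(24 * n_days)]

def autocorrelation(x, lag):
    """Normalized auto-correlation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var

signal = synthetic_daily_signal(60)
```

At a lag of 24 hours the auto-correlation of this signal is close to 1, and at 12 hours it is close to -1, which is the signature of a daily cycle one would look for in the real temperature series.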
The dataset contains ~5 years of high temporal resolution (hourly measurements) data of various weather attributes, such as temperature, humidity, air pressure, etc.
This data is available for 30 US and Canadian Cities, as well as 6 Israeli cities.
I've organized the data according to a common time axis for easy use.
Each attribute has its own file and is organized such that the rows are the time axis (it's the same time axis for all files), and the columns are the different cities (it's the same city ordering for all files as well).
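Given that layout (one row per timestamp, one column per city), pulling out the series for a single city could look like the sketch below. The name of the time column (`datetime`), the sample values, and the function name are assumptions for illustration.

```python
import csv
import io

# Made-up sample in the described layout; real header names may differ.
SAMPLE = """datetime,Vancouver,New York
2012-10-01 13:00:00,284.6,288.2
2012-10-01 14:00:00,,288.3
"""

def city_series(fileobj, city, time_col="datetime"):
    """Return [(timestamp, value or None)] for one city column."""
    series = []
    for row in csv.DictReader(fileobj):
        raw = row[city]
        series.append((row[time_col], float(raw) if raw else None))
    return series

new_york = city_series(io.StringIO(SAMPLE), "New York")
```

Because all attribute files share the same time axis and city ordering, the same function can be applied to each file and the results aligned row by row.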
Additionally, for each city we also have the country, latitude and longitude information in a separate file.
The dataset was acquired using the Weather API on the OpenWeatherMap website, and is available under the ODbL License.
Weather data is both intrinsically interesting, and also potentially useful when correlated with other types of data.
For example, wildfire spread is potentially related to weather conditions, demand for cabs is famously known to be correlated with weather conditions (here, here and here you can find NYC cab ride data), and use of city bikes is probably also correlated with weather in interesting ways (check out this Austin dataset, this SF dataset, this Montreal dataset, and this NYC dataset).
Traffic is also probably related to weather.
Another potentially interesting source of correlation is between weather and crime. Here are a few crime datasets on Kaggle for cities present in this weather dataset: Chicago, Philadelphia, Los Angeles, Vancouver, Austin, NYC.
There are many other potentially interesting connections between everyday life and the weather that we can explore together with the help of this dataset. Have fun!