Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset compiles a comprehensive database of 90,327 street segments in New York City, covering their street design features, streetscape design, Vision Zero treatments, and neighborhood land use. It is provided at two scales: street and street segment group (an aggregation of streets of the same type at the neighborhood level). The dataset is derived entirely from publicly available data, most of it from NYC Open Data. The detailed methods can be found in the published paper, Pedestrian and Car Occupant Crash Casualties Over a 9-Year Span of Vision Zero in New York City. To use it, please refer to the metadata file for more information and cite our work. A full list of raw data sources can be found below:
Motor Vehicle Collisions – NYC Open Data: https://data.cityofnewyork.us/Public-Safety/Motor-Vehicle-Collisions-Crashes/h9gi-nx95
Citywide Street Centerline (CSCL) – NYC Open Data: https://data.cityofnewyork.us/City-Government/NYC-Street-Centerline-CSCL-/exjm-f27b
NYC Building Footprints – NYC Open Data: https://data.cityofnewyork.us/Housing-Development/Building-Footprints/nqwf-w8eh
Practical Canopy for New York City: https://zenodo.org/record/6547492
New York City Bike Routes – NYC Open Data: https://data.cityofnewyork.us/Transportation/New-York-City-Bike-Routes/7vsa-caz7
Sidewalk Widths NYC (originally from Sidewalk – NYC Open Data): https://www.sidewalkwidths.nyc/
LION Single Line Street Base Map - The NYC Department of City Planning (DCP): https://www.nyc.gov/site/planning/data-maps/open-data/dwn-lion.page
NYC Planimetric Database Median – NYC Open Data: https://data.cityofnewyork.us/Transportation/NYC-Planimetrics/wt4d-p43d
NYC Vision Zero Open Data (multiple datasets covering all the implementations): https://www.nyc.gov/content/visionzero/pages/open-data
NYS Traffic Data - New York State Department of Transportation Open Data: https://data.ny.gov/Transportation/NYS-Traffic-Data-Viewer/7wmy-q6mb
Smart Location Database - US Environmental Protection Agency: https://www.epa.gov/smartgrowth/smart-location-mapping
Race and ethnicity in area - American Community Survey (ACS): https://www.census.gov/programs-surveys/acs
Publication Date: April 2025. This polygon layer is updated annually.
This layer contains 2024 parcel data only for NY State counties which gave NYS ITS Geospatial Services permission to share this data with the public. Work to obtain parcel data from additional counties, as well as permission to share the data, is ongoing. To date, 36 counties have provided Geospatial Services permission to share their parcel data with the public. Parcel data for counties which do not allow Geospatial Services to redistribute their data must be obtained directly from those counties. Geospatial Services' goal is to eventually include parcel data for all counties in New York State.
Parcel geometry was incorporated as received from County Real Property Departments. No attempt was made to edge-match parcels along adjacent counties. County attribute values were populated using 2024 Assessment Roll tabular data Geospatial Services obtained from the NYS Department of Tax and Finance’s Office of Real Property Tax Services (ORPTS). Tabular assessment data was joined to the county provided parcel geometry using the SWIS & SBL or SWIS & PRINT KEY unique identifier for each parcel.
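The join described above can be sketched in a few lines. This is only an illustration under stated assumptions: the field names (`SWIS`, `SBL`, `PRINT_KEY`, `TOTAL_AV`, `geometry`) and all sample values are hypothetical, and the fallback order (SBL first, then PRINT KEY) is a guess at how the two identifiers might be combined, not the documented Geospatial Services procedure.

```python
def join_assessment(parcels, roll):
    """Attach assessment-roll attributes to parcel records (left join).

    Tries the (SWIS, SBL) key first, then falls back to (SWIS, PRINT_KEY).
    Field names are illustrative assumptions, not the published schema.
    """
    by_sbl = {(r["SWIS"], r["SBL"]): r for r in roll if r.get("SBL")}
    by_print_key = {
        (r["SWIS"], r["PRINT_KEY"]): r for r in roll if r.get("PRINT_KEY")
    }

    joined = []
    for parcel in parcels:
        match = by_sbl.get((parcel["SWIS"], parcel.get("SBL")))
        if match is None:
            match = by_print_key.get((parcel["SWIS"], parcel.get("PRINT_KEY")))
        # Unmatched parcels keep their geometry only, as in a left join.
        joined.append({**(match or {}), **parcel})
    return joined


# Made-up sample records for demonstration only.
parcels = [
    {"SWIS": "010100", "SBL": "A-1", "geometry": "POLYGON A"},
    {"SWIS": "010100", "PRINT_KEY": "76.1-1-2", "geometry": "POLYGON B"},
    {"SWIS": "999999", "SBL": "Z-9", "geometry": "POLYGON C"},
]
roll = [
    {"SWIS": "010100", "SBL": "A-1", "PRINT_KEY": "76.1-1-1", "TOTAL_AV": 150000},
    {"SWIS": "010100", "SBL": "A-2", "PRINT_KEY": "76.1-1-2", "TOTAL_AV": 98000},
]

for row in join_assessment(parcels, roll):
    print(row["geometry"], row.get("TOTAL_AV"))
```

Because the SWIS code is part of both keys, identifiers that repeat across municipalities do not collide in this sketch.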
Detailed information about assessment attributes can be found in the ORPTS Assessor’s Manuals available here: https://www.tax.ny.gov/research/property/assess/manuals/assersmanual.htm. New York City data comes from NYC MapPluto which can be found here: https://www1.nyc.gov/site/planning/data-maps/open-data/dwn-pluto-mappluto.page.
This layer displays when zoomed in below 1:37,051-scale.
This map service is available to the public.
Geometry accuracy varies by contributing county.
Thanks to the following counties that specifically authorized Geospatial Services to share their GIS tax parcel data with the public: Albany, Cayuga, Chautauqua, Cortland, Erie, Genesee, Greene, Hamilton, Lewis, Livingston, Montgomery, New York City (Bronx, Kings, New York, Queens, Richmond), Oneida, Onondaga, Ontario, Orange, Oswego, Otsego, Putnam, Rensselaer, Rockland, Schuyler, Steuben, St Lawrence, Suffolk, Sullivan, Tioga, Tompkins, Ulster, Warren, Wayne, and Westchester.
The State of New York, acting through the New York State Office of Information Technology Services, makes no representations or warranties, express or implied, with respect to the use of or reliance on the Data provided. The User accepts the Data provided “as is” with no guarantees that it is error free, complete, accurate, current or fit for any particular purpose and assumes all risks associated with its use. The State disclaims any responsibility or legal liability to Users for damages of any kind, relating to the providing of the Data or the use of it. Users should be aware that temporal changes may have occurred since this Data was created.
This dataset was created by Jamie Allen
Publication Date: April 2025. Updated annually, or as needed. The data can be downloaded here: https://gis.ny.gov/parcels#data-download.
This feature service only contains parcel data for NYS State-owned tax parcels. 2024 parcel geometry was incorporated as received from County Real Property Departments. No attempt was made to edge-match parcels along adjacent counties. County attribute values were populated using 2024 Assessment Roll tabular data NYS ITS Geospatial Services obtained from the NYS Department of Tax and Finance’s Office of Real Property Tax Services (ORPTS). Tabular assessment data was joined to the county provided parcel geometry using the SWIS & SBL or SWIS & PRINT KEY unique identifier for each parcel.
Detailed information about assessment attributes can be found in the ORPTS Assessor’s Manuals available here: https://www.tax.ny.gov/research/property/assess/manuals/assersmanual.htm. New York City data comes from NYC MapPluto which can be found here: https://www1.nyc.gov/site/planning/data-maps/open-data/dwn-pluto-mappluto.page.
The State-owned tax parcel polygons in this file are the result of a best effort selection based on the Primary Owner and Additional Owner fields in the NYS Statewide Tax Parcels file, Geospatial Services data, State agency data, and online research. These same data and information are also the basis for a best effort assignment of a NYS agency name (listed in the NYS Name field) to most tax parcel polygons where the Owner Type field value is State. The NYS Name Source field consists of a code that describes how the State-owned designation and agency name were determined, if this was verified, and the means of verification, if applicable.
This map service is available to the public.
The State of New York, acting through the New York State Office of Information Technology Services, makes no representations or warranties, express or implied, with respect to the use of or reliance on the Data provided. The User accepts the Data provided “as is” with no guarantees that it is error free, complete, accurate, current or fit for any particular purpose and assumes all risks associated with its use. The State disclaims any responsibility or legal liability to Users for damages of any kind, relating to the providing of the Data or the use of it. Users should be aware that temporal changes may have occurred since this Data was created.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains trip records for yellow and green taxis operating in New York City. Each trip includes detailed information such as pickup and dropoff times and locations, passenger count, trip distance, payment type, fare amount, and various surcharges. The data can be used for urban mobility research, fare prediction, traffic analysis, and more.
Green taxi (LPEP) fields:
VendorID: LPEP provider ID (e.g., CMT, Curb, Myle)
lpep_pickup_datetime, lpep_dropoff_datetime: Pickup and dropoff times
passenger_count, trip_distance
RatecodeID: Final rate applied
store_and_fwd_flag: Whether the trip was stored in vehicle memory
PULocationID, DOLocationID: Pickup and dropoff TLC taxi zones
fare_amount, extra, mta_tax, tip_amount, tolls_amount, improvement_surcharge, total_amount
payment_type, trip_type, congestion_surcharge, cbd_congestion_fee
Yellow taxi (TPEP) fields:
VendorID: TPEP provider ID (e.g., CMT, Curb, Myle, Helix)
tpep_pickup_datetime, tpep_dropoff_datetime
airport_fee
For more information, refer to the NYC TLC website: http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
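Since the field list above uses `lpep_`-prefixed timestamps for one record type and `tpep_`-prefixed timestamps for the other, a combined analysis usually needs a shared schema first. The following is a minimal sketch, assuming records are plain dictionaries with already-parsed `datetime` values; the helper names, the `taxi_type` label, and the sample record are hypothetical and not part of the dataset.

```python
from datetime import datetime

# Prefixes come from the field list; the green/yellow labels are added for illustration.
PREFIXES = {"lpep_": "green", "tpep_": "yellow"}


def normalize_trip(record: dict) -> dict:
    """Map an lpep_* or tpep_* record onto shared pickup/dropoff column names."""
    for prefix, taxi_type in PREFIXES.items():
        pickup_key = f"{prefix}pickup_datetime"
        dropoff_key = f"{prefix}dropoff_datetime"
        if pickup_key in record and dropoff_key in record:
            out = {k: v for k, v in record.items()
                   if k not in (pickup_key, dropoff_key)}
            out["pickup_datetime"] = record[pickup_key]
            out["dropoff_datetime"] = record[dropoff_key]
            out["taxi_type"] = taxi_type
            return out
    raise ValueError("record has neither lpep_ nor tpep_ timestamp columns")


def trip_minutes(trip: dict) -> float:
    """Trip duration in minutes, computed from the normalized timestamps."""
    delta = trip["dropoff_datetime"] - trip["pickup_datetime"]
    return delta.total_seconds() / 60


# Made-up record for demonstration only.
green = normalize_trip({
    "lpep_pickup_datetime": datetime(2025, 1, 1, 8, 0),
    "lpep_dropoff_datetime": datetime(2025, 1, 1, 8, 15),
    "trip_distance": 2.1,
})
print(green["taxi_type"], trip_minutes(green))
```

Fields that exist for only one record type (for example `trip_type` or `airport_fee`) are passed through unchanged, so downstream code should treat them as optional.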
https://www.incomebyzipcode.com/terms#TERMS
A dataset listing the richest zip codes in New York per the most current US Census data, including information on rank and average income.
This dataset provides information about the number of properties, residents, and average property values for Doe Lane cross streets in South Setauket, NY.
This profile shows a detailed summary of recipients, units of service, paid claim dollars, dollars per individual, and dollars per unit of service for services reimbursed by Medicaid Fee for Service billing as well as Medicaid Managed Care Plans, for chemical dependence and non-chemical dependence services received per state fiscal year (SFY).
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This project focuses on analyzing the New York City 'Taxi Data', specifically vehicle-for-hire (Uber/Lyft) data from January 1, 2022, to August 31, 2023. The project will encompass a comprehensive statistical analysis followed by the development of a machine learning model. The model is intended for open use.
rideshare_data.parquet: a Parquet file with roughly 300M rows x 19 columns. Contains ride-by-ride entries for NYC rideshare services.

| Field | Data Type | Description |
|---|---|---|
| Business | Object | Vehicle-for-Hire company operating the ride. Options include Juno, Uber, Via, Lyft. |
| Pickup Location | Integer (64-bit) | The Taxi and Limousine Commission (TLC) Taxi Zone where the journey commenced. Refer to 'taxi_zone_lookup.csv' for details. |
| Dropoff Location | Integer (64-bit) | The TLC Taxi Zone where the journey concluded. Reference 'taxi_zone_lookup.csv' for more information. |
| Trip Length | Floating point 64 | The total distance of the trip in miles. |
| Request to Dropoff | Time Delta (ns) | The time elapsed from the ride request to the dropoff, representing the total time from the passenger's perspective. |
| Request to Pickup | Time Delta (ns) | The time taken from the ride request to the passenger pickup. |
| Total Ride Time | Time Delta (ns) | The duration between the passenger pickup and dropoff, indicating the total time spent in the car. |
| On Scene to Pickup | Time Delta (ns) | The time duration between the driver's arrival on the scene and the passenger pickup, reflecting how long the driver waited. |
| On Scene to Dropoff | Time Delta (ns) | Time from the driver's arrival on the scene to the passenger dropoff, indicating the driver's total time commitment for the passenger. |
| Time of Day | Object | Categorization of the time of day as morning (0600-1100), afternoon (1200-1600), evening (1700-1900), or night (other times). |
| Date | Object | The date when the ride was requested. |
| Hour of Day | Integer (32-bit) | The hour of the day when the ride was requested, where 0 represents midnight and 23 represents 11 PM. |
| Week of Year | Unsigned Integer (32-bit) | The ISO week number of the year when the ride was requested. |
| Month of Year | Integer (32-bit) | The ISO month number of the year when the ride was requested, with January as 1. |
| Passenger Fare | Floating point 64 | The total fare paid by the passenger in USD, inclusive of all charges such as base fare, tips, tolls, taxes, surcharges, and applicable fees. |
| Driver Total Pay | Floating point 64 | The complete payment received by the driver, encompassing base pay and tips. |
| Rideshare Profit | Floating point 64 | The difference between the passenger fare and the driver's total pay, representing the platform's profit. |
| Hourly Rate | Floating point 64 | The calculated hourly rate based on 'on_scene_hours', including the duration from the driver's arrival to the final drop-off. |
| Dollars per Mile | Floating point 64 | The driver's earnings per mile, calculated as the total driver pay divided by the trip length. |
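The derived fields in the table can be reproduced from the raw ones. The sketch below follows the descriptions in the table, with two labeled assumptions: the hourly rate is taken to be driver pay divided by the on-scene-to-dropoff duration in hours, and the time-of-day ranges are treated as inclusive whole-hour buckets (for example, 0600-1100 covers hours 6 through 11). Function and key names are hypothetical, and the project's own code may differ.

```python
from datetime import timedelta


def time_of_day(hour: int) -> str:
    """Bucket an hour (0-23) using the ranges given in the table."""
    if 6 <= hour <= 11:
        return "morning"
    if 12 <= hour <= 16:
        return "afternoon"
    if 17 <= hour <= 19:
        return "evening"
    return "night"


def derived_metrics(passenger_fare: float, driver_total_pay: float,
                    trip_length_miles: float,
                    on_scene_to_dropoff: timedelta) -> dict:
    """Compute profit, hourly rate, and dollars per mile for one ride."""
    on_scene_hours = on_scene_to_dropoff.total_seconds() / 3600
    return {
        # Passenger fare minus driver pay, as described for Rideshare Profit.
        "rideshare_profit": passenger_fare - driver_total_pay,
        # Assumption: driver pay per on-scene hour; None when duration is zero.
        "hourly_rate": (driver_total_pay / on_scene_hours
                        if on_scene_hours > 0 else None),
        # Driver pay divided by trip length; None for zero-length trips.
        "dollars_per_mile": (driver_total_pay / trip_length_miles
                             if trip_length_miles > 0 else None),
    }


# Made-up ride for demonstration only.
print(time_of_day(8), derived_metrics(30.0, 20.0, 5.0, timedelta(minutes=30)))
```

Returning `None` for zero durations or distances avoids division errors; with roughly 300M rows, such edge cases are likely to appear somewhere in the data.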
Contributions to this project are welcome. We are looking for collaborators to build and train a model.
Distributed under the MIT license
Project Link: https://github.com/aweymouth13/rideshare_analysis
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
The U.S. Geological Survey (USGS) is providing a point feature class containing a compilation of geologic well records (n=221) obtained from: 1) previous U.S. Geological Survey groundwater investigations, 2) the U.S. Geological Survey's National Water Information System (NWIS), 3) the New York State Department of Environmental Conservation (NYSDEC) Water Well Contractor Program, and 4) the New York State Department of Transportation (NYSDOT). The wells are located within the Binghamton East 1:24,000 quadrangle of south-central Broome County, New York, 2014-2020. The shapefile was created and intended for use with geographic information system (GIS) software. A companion report, USGS Scientific Investigations Report 2021-5026 (Van Hoesen and others, 2021; https://doi.org/10.3133/sir20215026) further describes data collection and map preparation.
A study was conducted to provide detailed mapping of glacial aquifers associated with the Fairport-Lyons channel system in Wayne, Ontario, and Seneca Counties, New York. The study was part of the cooperative Detailed Aquifer Mapping Program between the U.S. Geological Survey (USGS) and the New York State Department of Environmental Conservation (NYSDEC). The objective of the study was to characterize the hydrogeology of the Fairport-Lyons channel and inter-drumlin aquifer system and to present the results as an electronic 1:24,000 scale map, hydrogeologic sections, and a summary report. The spatial extent and hydrogeologic framework of this valley-fill aquifer was delineated using existing data, including soils maps, well records, geologic logs, topographic data, and published reports. This data release contains digital datasets for the areas of surficial sand and gravel with water-resource potential, eskerform features, areas of sand and gravel or sand pits, delineated areas of postglacial and glacial deposits, and records of selected wells within the study area and traces of the hydrogeologic sections referred to in the study. These digital datasets support USGS SIR 2021-5086, "Hydrogeology of aquifers within the Fairport-Lyons channel system and adjacent areas in Wayne, Ontario, and Seneca Counties, New York."
The dataset provides information about projects in the MTA's 5-Year Capital Programs from 2005-2009 to the current one. It provides quarterly updates to budgets, scopes, and schedules for planned and ongoing projects. This dataset organizes information at the agency level.
Comprehensive demographic dataset for East Setauket, NY, US including population statistics, household income, housing units, education levels, employment data, and transportation with year-over-year changes.
The Statewide Planning and Research Cooperative System (SPARCS) Inpatient De-identified File contains discharge level detail on patient characteristics, diagnoses, treatments, services and charges. This data file contains basic record level detail for the discharge. The de-identified data file does not contain data that is protected health information (PHI) under HIPAA. The health information is not individually identifiable; all data elements considered identifiable have been redacted. For example, the direct identifiers regarding a date have the day and month portion of the date removed. A downloadable file of this dataset is available at: https://health.data.ny.gov/Health/Hospital-Inpatient-Discharges-SPARCS-De-Identified/mpue-vn67. For more information, including changes to the data from previous years, please visit http://www.health.ny.gov/statistics/sparcs/access/. The "About" tab contains additional details concerning this dataset.
By Health Data New York [source]
The Statewide Planning and Research Cooperative System (SPARCS) Inpatient De-identified dataset contains discharge level detail on hospital inpatient discharges in New York State during the year 2010. It covers patient characteristics such as age group, gender, race, and ethnicity, as well as diagnoses, treatments, services, and charges; all data elements except those considered identifiable have been made available within this dataset. This data does not contain any protected health information (PHI) under the Health Insurance Portability and Accountability Act (HIPAA). The detail in this data can give insight into many aspects of hospital care. Before using or referencing any data from this dataset, it is important to read and understand the Terms of Service, which can be found at [link].
This guide offers some tips on using this dataset efficiently and effectively:
- Familiarize yourself with the data: Before diving in, read through the columns included in the dataset and any relevant documentation so that you know exactly what you are looking at.
- Clean and process the data: When working with a raw dataset such as this one, make sure the data has been properly cleaned and structured before using it for further analysis or machine learning models. Contaminated records, for example, can distort your conclusions if they are not identified. Take care when handling missing values as well: weigh whether to use or exclude them and, where applicable, look for patterns that may explain why they are absent from certain records.
- Explore your hypotheses and goals further: Once you understand what the data has to offer, explore your hypotheses by analysing correlations between factors across different dimensions (taking various columns into consideration). Visualisation tools such as Tableau can be used here, but take care: visualisations can easily introduce bias into an analysis of a large dataset without the analyst realising or intending it.
- Use actionable insights from your findings: Once the initial exploration is complete, put the insights to productive use. Share your findings, collaborate closely with key stakeholders where applicable, and present actionable insights along with potential optimization strategies, aiming for a greater understanding of the issues and opportunities affecting business practices.
- Identifying health disparities in hospital inpatient discharges across New York State – This dataset can be used to understand the regional variations between communities across NY and diagnose which areas need more healthcare coverage for certain diagnosis codes or procedure codes.
- Knowing patient needs ahead of time based on demographics, diagnosis, and procedures – With this dataset, health professionals will be able to get an idea of what kind of treatments most patients look for when they come down with a particular illness or injury. This will allow them to better prepare the necessary equipment, medicine and resources needed beforehand so that they don't have to search while the patient is already at their facility waiting for treatment.
- Improving cost efficiency by looking at correlations between different payment sources – With this dataset, hospitals could identify any patterns or correlations between different payment sources (such as Medicaid and private insurance) that could be used toward improving cost efficiency during inpatient visits by optimizing resource allocation according to source ...
Privately owned public spaces, also known by the acronym POPS, are outdoor and indoor spaces provided for public enjoyment by private owners in exchange for bonus floor area or waivers, an incentive first introduced into New York City's zoning regulations in 1961. To find out more about POPS, visit the Department of City Planning's website at http://nyc.gov/pops. This database contains detailed information about each privately owned public space in New York City.
Data Source: Privately Owned Public Space Database (2018), owned and maintained by the New York City Department of City Planning and created in collaboration with Jerold S. Kayden and The Municipal Art Society of New York. All previously released versions of this data are available on the DCP Website: BYTES of the BIG APPLE. Current version: 25v2
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Cohoes Hispanic or Latino population. It includes the distribution of the Hispanic or Latino population, of Cohoes, by their ancestries, as identified by the Census Bureau. The dataset can be utilized to understand the origin of the Hispanic or Latino population of Cohoes.
Key observations
Among the Hispanic population in Cohoes, regardless of the race, the largest group is of Puerto Rican origin, with a population of 730 (52.37% of the total Hispanic population).
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Origin for Hispanic or Latino population include:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Cohoes Population by Race & Ethnicity.
This dataset provides information about the number of properties, residents, and average property values for New Street cross streets in Setauket, NY.
https://www.myvisajobs.com/terms-of-service/
A dataset that explores Green Card sponsorship trends, salary data, and employer insights for East Setauket, NY, in the U.S.
In collaboration with state, regional, and federal partners, New York Department of State (DOS) employed participatory methods to gather detailed information about the characteristics and locations of New York’s offshore recreational uses in order to better understand how and where New York residents are using and enjoying ocean resources. These data are intended to support New York's marine spatial planning efforts.
DOS staff worked with NOAA’s Coastal Services Center (CSC) to design and develop a participatory mapping process. Leaders from 30 partner organizations and other knowledgeable individuals were invited to participate in one of five offshore use workshops conducted during the summer of 2011: two each in Riverhead and Baldwin, and one in Manhattan. At the workshops, DOS and CSC trained organizational contacts and knowledgeable individuals to work with their colleagues, constituents, and memberships to collect ocean use information. At the conclusion of the workshops, participants were provided with information-collecting kits containing navigation charts, information tables, guidance for meeting with their members and collecting information, sample charts and tables, and copies of several one-pagers explaining DOS’s marine spatial planning process, ocean uses, offshore habitats, and offshore renewable energy development.
Workshop participants collected ocean use information from their peers over several months, and the marked-up charts with corresponding information tables were returned to DOS, representing over 130 records of new ocean use information. DOS digitized the geographic information provided by ocean users and created an aggregate dataset, including linked attribute data characterizing each mapped use area. DOS staff returned to the organizations that provided ocean use information to “ground truth” the digitized data during the winter of 2011 and through the spring of 2012. These geographic data were updated/corrected based on participant feedback.