I wanted to find a better way to provide live traffic updates. We don't all have access to the data from traffic monitoring sensors or whatever gets uploaded from people's smartphones to Apple, Google, etc., and I question how accurate the traffic congestion shown on Google Maps or other apps is. So I figured that since buses sit in the same traffic and many buses stream their GPS location and other data live, they would be an ideal source of traffic data. I investigated the data streams available from many bus companies around the world and found the MTA in NYC to be very reliable.
This dataset is from the NYC MTA buses data stream service. Each row records the bus location, route, bus stop and more, sampled at roughly 10-minute increments. The scheduled arrival time from the bus schedule is also included, to give an indication of where the bus should be (how much behind schedule, or on time, or even ahead of schedule).
Data is recorded from the MTA SIRI Real Time data feed and the MTA GTFS Schedule data.
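As a rough sketch of how the scheduled arrival time could be turned into a "minutes behind schedule" figure, the Python below compares an observed or expected arrival timestamp with the scheduled time. The argument names, the timestamp formats and the idea that each row carries a service date are assumptions for illustration, not confirmed details of this dataset. GTFS schedule times may use hours of 24 or more for trips that run past midnight, so the schedule time is handled as an offset from the service date.

```python
from datetime import datetime, timedelta


def parse_schedule_time(service_date: str, hhmmss: str) -> datetime:
    """Combine a service date (YYYY-MM-DD) with an HH:MM:SS schedule time.

    GTFS permits hours >= 24 for trips that continue past midnight, so the
    time is added to the date as an offset instead of being parsed directly.
    """
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    day = datetime.strptime(service_date, "%Y-%m-%d")
    return day + timedelta(hours=hours, minutes=minutes, seconds=seconds)


def schedule_deviation_minutes(arrival: str, service_date: str, scheduled: str) -> float:
    """Return minutes of deviation: positive = behind schedule, negative = ahead.

    `arrival` is a hypothetical observed/expected arrival timestamp in
    'YYYY-MM-DD HH:MM:SS' form; adjust the format to match the real file.
    """
    observed = datetime.strptime(arrival, "%Y-%m-%d %H:%M:%S")
    planned = parse_schedule_time(service_date, scheduled)
    return (observed - planned).total_seconds() / 60.0


# Illustrative values only (not taken from the dataset).
late = schedule_deviation_minutes("2017-06-01 08:10:00", "2017-06-01", "08:04:30")
early = schedule_deviation_minutes("2017-06-02 00:03:00", "2017-06-01", "24:06:00")
```

Aggregating this deviation by route and hour of day would give one possible proxy for congestion, although bus delays also reflect dwell times and scheduling slack.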
I want to see what exploration and discovery people come up with from this data. Feel free to download this dataset for your own use; however, I would appreciate as many Kernels on Kaggle as we can get.
Based on the interest this generates I plan to collect more data for subsequent months down the track.
OPT provides transportation service to many different kinds of locations. Many of these locations are schools but they also include offices or other sites that may be part of certain students’ educational plans. The schools may be public, private or religious. OPT provides busing to some Pre-K sites for students who have an IEP for curb-to-curb busing because of medical condition. Transportation service is not limited to school bus service; it includes distribution of MetroCards and approved reimbursement services. Bus service can be conducted on a yellow school bus, an ambulance, or even a coach bus. Yellow school buses are available in a number of sizes and seating configurations. This dataset includes schools, offices or Pre-K/EI sites that currently receive any transportation services from OPT. These sites may be within the New York City limits or up to fifty miles from the city limits in the states of New York, New Jersey or Connecticut. This dataset does not include field trip destinations.
This dataset provides the load percentage for each express bus route at its maximum load point (the bus stop where the highest number of passengers are on the bus) by direction, hour, day type (weekday, weekends and holidays), aggregated by week.
One of OPT’s main functions is to plan efficient and fiscally responsible school bus routes. OPT staff use a variety of systems to generate and share bus route information with bus vendors and the public. Specific bus route paths cannot be publicly disclosed because they could reveal personally identifiable information about individual students. In this dataset, OPT has provided all the route information that does not risk disclosing personally identifiable information. School-age service for students in grades K through 12 is contracted with bus vendors on a per-route basis. OPT also manages bus service for Pre-K students who require curb-to-curb service as per a student’s Individualized Education Plan (IEP). This Pre-K bus service is contracted on a per-student basis, instead of per route. As a consequence of this difference, OPT does not design bus routes for Pre-K service, so those routes are not included in this dataset. There are a variety of different vehicles used on routes that serve students requiring curb-to-curb service because an Individualized Education Plan (IEP) indicates specific transportation needs. The standard bus is the only vehicle used for general education routes with students eligible for bus service but who do not have an IEP. Users may occasionally see a route without a garage assignment. Because this dataset is derived from a snapshot of a transactional system, there may be routes that are in the process of being assigned to a garage. In those cases, the garage information will appear as NULL until the assignment is complete.
OPT maintains an inventory of all the vehicles used by contracted bus vendors in service to the Department of Education. The vehicles inventory can include school buses, ambulances or coach buses. This dataset details the active vehicle inventory for each bus vendor. There are a variety of different vehicles used to serve students requiring curb-to-curb service because an Individualized Education Plan (IEP) indicates specific transportation needs. The standard bus is the only vehicle used for general education students eligible for bus service but who do not have an IEP.
This dataset provides subway and bus mask compliance statistics from MTA surveys that took place between June 2020 and April 2022. It provides the number of observations in each survey, and the percentages for the number of people with no mask, wearing a mask, wearing a mask incorrectly, and wearing a mask correctly.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Transportation Sites’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/60288b7b-a1c5-4fc5-889c-5d544bd6ed4a on 13 February 2022.
--- Dataset description provided by original source is as follows ---
OPT provides transportation service to many different kinds of locations. Many of these locations are schools but they also include offices or other sites that may be part of certain students’ educational plans. The schools may be public, private or religious. OPT provides busing to some Pre-K sites for students who have an IEP for curb-to-curb busing because of medical condition. Transportation service is not limited to school bus service; it includes distribution of MetroCards and approved reimbursement services. Bus service can be conducted on a yellow school bus, an ambulance, or even a coach bus. Yellow school buses are available in a number of sizes and seating configurations. This dataset includes schools, offices or Pre-K/EI sites that currently receive any transportation services from OPT. These sites may be within the New York City limits or up to fifty miles from the city limits in the states of New York, New Jersey or Connecticut. This dataset does not include field trip destinations.
--- Original source retains full ownership of the source dataset ---
Students are reported as those who were planned and routed, but may or may not have used the service provided. Student stop assignment is done at the school; OPT creates the routes. Routes were determined by those routed with students assigned. Vehicles are identified by school bus company reported data. This data is subject to data entry error and is dependent on the school bus companies to maintain. All transportation sites were included, and reported by type. A transportation site was defined by a unique location identified by a street address. Please note that multiple students or schools can be located at the same address and would therefore represent multiple stops at the same site.
For the 2020-2021 school year, only students who were enrolled in blended learning were assigned to bus routes. Therefore, students who are routed are assumed to be enrolled in blended learning. Students are reported as those who were planned and routed, but may or may not have used the service provided. Routes were determined by those routed with a vendor and student(s) assigned. Vehicles are identified by school bus company reported data. This data is subject to data entry error and is dependent on the school bus companies to maintain. All transportation sites were included, and reported by type. A transportation site was defined by a unique location identified by a street address. Please note: multiple students or schools can be located at the same address and would therefore represent a stop for multiple students at the same site. MetroCard passes are assigned at the school. For public and charter schools, OPT reports students assigned to MetroCard, which includes those assigned a MetroCard by serial number or "T'd" for MetroCard, meaning the school indicated the student as eligible for a MetroCard but did not yet assign a serial number. For non-public schools, OPT reports on students who are eligible for MetroCards if they are not assigned to yellow bus service. Some special education schools do not classify students by grade level. Students attending such schools are identified as “NG.” Service type is representative only of the type of busing a student is assigned to and is not reflective of the educational classification of a student. Students who are classified as special education but do not have an IEP mandate for specific transportation accommodations may be assigned to general education busing. General education students under certain circumstances including, but not limited to, temporary housing status, orders of protection and medical conditions may be assigned to special education busing.
OPT assigns students to busing but cannot confirm student ridership. This reporting would have to be captured at the school level. Students in temporary housing (STH) situations other than those reported to be living in a DHS shelter* were excluded from the report due to data quality issues. While the DOE student data system (ATS) has a housing indicator flag, it is often inaccurate and unreliable. Therefore, OPT cannot reliably report on the temporary housing status of students other than those residing in DHS shelters, where a daily report indicating shelter status is provided. *Transportation data for students in Foster Care is also provided. Please see section 21-993 B7 for reporting. Stop-to-school bus service is typically not provided to students after the 6th grade. Those assigned to this service are students who are granted individual or school-wide exceptions. In addition to students who applied for transportation because of foster care placement, also reported were students known to be in foster care who received busing or a MetroCard, to provide a more complete report of students in foster care who received service. Students in foster care were identified by a monthly data feed from ACS to OPT via DIIT. The source is subject to error due to the data matching used to identify OSIS based on name and date of birth.
https://creativecommons.org/publicdomain/zero/1.0/
This directory contains data on over 4.5 million Uber pickups in New York City from April to September 2014, and 14.3 million more Uber pickups from January to June 2015. Trip-level data on 10 other for-hire vehicle (FHV) companies, as well as aggregated data for 329 FHV companies, is also included. All the files are as they were received on August 3, Sept. 15 and Sept. 22, 2015.
FiveThirtyEight obtained the data from the NYC Taxi & Limousine Commission (TLC) by submitting a Freedom of Information Law request on July 20, 2015. The TLC has sent us the data in batches as it continues to review trip data Uber and other FHV companies have submitted to it. The TLC's correspondence with FiveThirtyEight is included in the files TLC_letter.pdf, TLC_letter2.pdf and TLC_letter3.pdf. TLC records requests can be made here.
This data was used for four FiveThirtyEight stories: Uber Is Serving New York’s Outer Boroughs More Than Taxis Are, Public Transit Should Be Uber’s New Best Friend, Uber Is Taking Millions Of Manhattan Rides Away From Taxis, and Is Uber Making NYC Rush-Hour Traffic Worse?.
The dataset contains, roughly, four groups of files:
There are six files of raw data on Uber pickups in New York City from April to September 2014. The files are separated by month and each has the following columns:
Date/Time: The date and time of the Uber pickup
Lat: The latitude of the Uber pickup
Lon: The longitude of the Uber pickup
Base: The TLC base company code affiliated with the Uber pickup
These files are named:
uber-raw-data-apr14.csv
uber-raw-data-aug14.csv
uber-raw-data-jul14.csv
uber-raw-data-jun14.csv
uber-raw-data-may14.csv
uber-raw-data-sep14.csv
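To show how the four columns above might be used, here is a small Python sketch that counts pickups per hour of day from one of the monthly files. The timestamp format (`4/1/2014 0:11:00` style, month first) and the sample rows are assumptions made for illustration; check the real files before relying on them.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Assumed timestamp layout, e.g. "4/1/2014 0:11:00"; adjust if the files differ.
TIMESTAMP_FORMAT = "%m/%d/%Y %H:%M:%S"


def pickups_per_hour(csv_file) -> Counter:
    """Count rows per hour of day using the Date/Time column."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        when = datetime.strptime(row["Date/Time"], TIMESTAMP_FORMAT)
        counts[when.hour] += 1
    return counts


# Made-up rows in the documented column layout, standing in for a real file.
SAMPLE = (
    '"Date/Time","Lat","Lon","Base"\n'
    '"4/1/2014 0:11:00",40.7690,-73.9549,"B02512"\n'
    '"4/1/2014 0:47:00",40.7267,-74.0345,"B02512"\n'
    '"4/1/2014 17:05:00",40.7316,-73.9873,"B02598"\n'
)

hourly = pickups_per_hour(io.StringIO(SAMPLE))
```

With a real file, the same function can be called on `open("uber-raw-data-apr14.csv", newline="")` instead of the in-memory sample.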
Also included is the file uber-raw-data-janjune-15.csv. This file has the following columns:
Dispatching_base_num: The TLC base company code of the base that dispatched the Uber
Pickup_date: The date and time of the Uber pickup
Affiliated_base_num: The TLC base company code affiliated with the Uber pickup
locationID: The pickup location ID affiliated with the Uber pickup
The Base codes are for the following Uber bases:
B02512: Unter
B02598: Hinter
B02617: Weiter
B02682: Schmecken
B02764: Danach-NY
B02765: Grun
B02835: Dreist
B02836: Drinnen
For coarse-grained location information from these pickups, the file taxi-zone-lookup.csv shows the taxi Zone (essentially, neighborhood) and Borough for each locationID.
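The lookup described above amounts to a join on the location ID. The Python sketch below builds a dictionary from the lookup file and counts pickups per borough. The exact header spellings (`LocationID` in the lookup file, `locationID` in the trip file) are assumptions, so both are passed as parameters, and the sample rows use placeholder values, not real zones.

```python
import csv
import io
from collections import Counter


def load_zone_lookup(lookup_file, id_column="LocationID"):
    """Map each location ID to a (Borough, Zone) pair."""
    return {
        row[id_column]: (row["Borough"], row["Zone"])
        for row in csv.DictReader(lookup_file)
    }


def pickups_by_borough(trips_file, zones, id_column="locationID"):
    """Count pickups per borough; IDs missing from the lookup go to 'Unknown'."""
    counts = Counter()
    for row in csv.DictReader(trips_file):
        borough, _zone = zones.get(row[id_column], ("Unknown", "Unknown"))
        counts[borough] += 1
    return counts


# Placeholder data in the assumed layouts (not real zone names or trips).
LOOKUP_SAMPLE = "LocationID,Borough,Zone\n1,Borough A,Zone X\n2,Borough B,Zone Y\n"
TRIPS_SAMPLE = (
    "Dispatching_base_num,Pickup_date,Affiliated_base_num,locationID\n"
    "B02617,2015-05-17 09:47:00,B02617,1\n"
    "B02617,2015-05-17 09:48:00,B02617,1\n"
    "B02598,2015-05-17 09:50:00,B02598,2\n"
    "B02598,2015-05-17 09:55:00,B02598,99\n"
)

zones = load_zone_lookup(io.StringIO(LOOKUP_SAMPLE))
by_borough = pickups_by_borough(io.StringIO(TRIPS_SAMPLE), zones)
```

Keeping unmatched IDs in an explicit "Unknown" bucket makes gaps in the lookup visible instead of silently dropping rows.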
The dataset also contains 10 files of raw data on pickups from 10 for-hire vehicle (FHV) companies. The trip information varies by company, but can include day of trip, time of trip, pickup location, driver's for-hire license number, and vehicle's for-hire license number.
These files are named:
American_B01362.csv
Diplo_B01196.csv
Highclass_B01717.csv
Skyline_B00111.csv
Carmel_B00256.csv
Federal_02216.csv
Lyft_B02510.csv
Dial7_B00887.csv
Firstclass_B01536.csv
Prestige_B01338.csv
There is also a file other-FHV-data-jan-aug-2015.csv containing daily pickup data for 329 FHV companies from January 2015 through August 2015.
The file Uber-Jan-Feb-FOIL.csv contains aggregated daily Uber trip statistics in January and February 2015.
These surveys were conducted to collect data on travel origins and destinations, trip purposes, and travel characteristics of New York City Transit, Metro-North Railroad, and Long Island Rail Road customers with the aim of upgrading the MTA's travel forecasting tools and gaining a better understanding of how people travel.
--LIRR origin-destination survey (2012-14)
--Metro-North origin-destination survey (2007)
--Metro-North origin-destination survey (2017)
--MTA New York City travel survey (2008)
--MTA New York City travel survey (2018)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
New York has become one of the worst-affected COVID-19 hotspots and a pandemic epicenter due to the ongoing crisis. This paper identifies the impact of the pandemic and the effectiveness of government policies on human mobility by analyzing multiple datasets available at both macro and micro levels for New York City. Using data sources related to population density, aggregated population mobility, public rail transit use, vehicle use, hotspot and non-hotspot movement patterns, and human activity agglomeration, we analyzed the inter-borough and intra-borough movement for New York City by aggregating the data at the borough level. We also assessed the internodal population movement amongst hotspot and non-hotspot points of interest for the months of March and April 2020. Results indicate a drop of about 80% in people’s mobility in the city, beginning in mid-March. The movement to and from Manhattan showed the most disruption for both public transit and road traffic. The city saw its first case on March 1, 2020, but disruptions in mobility can be seen only after the second week of March, when the shelter-in-place orders were put into effect. Owing to people working from home and adhering to stay-at-home orders, Manhattan saw the largest disruption to both inter- and intra-borough movement. But the risk of spread of infection in Manhattan turned out to be high because of higher hotspot-linked movements. The stay-at-home restrictions also led to an increased population density in Brooklyn and Queens as people were not commuting to Manhattan. Insights obtained from this study would help policymakers better understand human behavior and people's response to the news and governmental policies.
https://creativecommons.org/publicdomain/zero/1.0/
The New York State Department of Transportation (NYSDOT) traffic control devices data set is a list of all of the traffic devices that are either owned or maintained by NYSDOT. The devices included are of various types such as traffic signals, street lights, beacons, flashers, navigational lights, Intelligent Transportation Systems (ITS), etc. Devices in the 5 boroughs of New York City are not owned or maintained by NYSDOT and are therefore not represented in this dataset.
This is a dataset hosted by the State of New York. The state has an open data platform found here and they update their information according to the amount of data that is brought in. Explore New York State using Kaggle and all of the data sources available through the State of New York organization page!
This dataset is maintained using Socrata's API and Kaggle's API. Socrata has assisted countless organizations with hosting their open data and has been an integral part of the process of bringing more data to the public.
Cover photo by Dadan Fitrayana on Unsplash
Unsplash Images are distributed under a unique Unsplash License.
This dataset provides data on how fast buses are traveling between pairs of subsequent timepoints (the major stops on a bus route) for every bus route in the system. It will provide the average speed in miles per hour between pairs of timepoints, the average travel time in minutes, the road distance in miles, and the number of bus trips for each bus route aggregated by month, day of the week, and hour of day. It also provides the type of bus trip (Local, Limited, SBS, Express, School), the borough of the bus route, the names of the timepoints, their coordinates, and the stop sequence order of the first two timepoints in a pair.
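The relationship between the speed, travel time and distance fields described above can be sketched in Python as follows. This is only an illustration of the arithmetic; how the published averages were actually computed is not stated, and the tuple layout used for segments is a hypothetical choice. When combining several timepoint pairs, dividing total distance by total time (weighted by trip count) avoids the bias that comes from averaging the per-segment speeds directly.

```python
def speed_mph(distance_miles: float, travel_time_minutes: float) -> float:
    """Speed in miles per hour from a distance in miles and a time in minutes."""
    if travel_time_minutes <= 0:
        raise ValueError("travel time must be positive")
    return distance_miles * 60.0 / travel_time_minutes


def combined_speed_mph(segments) -> float:
    """Trip-weighted speed over several timepoint pairs.

    `segments` is a list of (distance_miles, avg_travel_time_minutes, trips)
    tuples, a hypothetical layout chosen for this sketch.
    """
    total_miles = sum(dist * trips for dist, _minutes, trips in segments)
    total_minutes = sum(minutes * trips for _dist, minutes, trips in segments)
    return speed_mph(total_miles, total_minutes)


# Illustrative numbers only.
single = speed_mph(1.5, 9.0)
combined = combined_speed_mph([(1.0, 6.0, 10), (2.0, 6.0, 10)])
```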
This dataset provides data showing the number of vehicles (including cars, buses, trucks and motorcycles) that pass through each of the bridges and tunnels operated by the MTA each hour of the day. The data is updated weekly.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset represents a detailed compilation of trips made using yellow taxis in New York City. The data encapsulates a wide range of information, from pickup and drop_off times to fare amounts and payment types, offering a comprehensive view into urban mobility and the economics of taxi rides within the city. This dataset is invaluable for anyone interested in urban transportation trends, fare analysis, geographic movement patterns within New York City, and the study of temporal variations in taxi usage.
VendorID: A code indicating the provider associated with the trip record.
tpep_pickup_datetime: The date and time when the meter was engaged.
tpep_dropoff_datetime: The date and time when the meter was disengaged.
passenger_count: The number of passengers in the vehicle. This is a driver-entered value.
trip_distance: The distance of the trip measured in miles.
RatecodeID: The final rate code in effect at the end of the trip.
store_and_fwd_flag: Indicates whether the trip record was held in vehicle memory before sending to the vendor, Y=store and forward, N=not a store and forward trip.
PULocationID: The Taxi and Limousine Commission (TLC) Taxi Zone ID for the pickup location.
DOLocationID: The Taxi and Limousine Commission (TLC) Taxi Zone ID for the dropoff location.
payment_type: A numeric code signifying how the passenger paid for the trip.
fare_amount: The time-and-distance fare calculated by the meter.
extra: Miscellaneous extras and surcharges.
mta_tax: $0.50 MTA tax that is automatically triggered based on the metered rate in use.
tip_amount: Tip amount – This field is automatically populated for credit card tips. Cash tips are not included.
tolls_amount: Total amount of all tolls paid in trip.
improvement_surcharge: $0.30 improvement surcharge assessed on trips at the flag drop. The surcharge began in 2015.
total_amount: The total amount charged to passengers. Does not include cash tips.
congestion_surcharge: A surcharge applied on trips that start, end, or pass through certain areas at specific times.
Several research questions or project ideas could be explored using the dataset. For example:
-Analyzing the impact of weather conditions on taxi usage.
-Exploring the correlation between trip distances and fares to identify pricing patterns.
-Investigating the effect of different times of day or days of the week on taxi demand.
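As one way to start on the distance-versus-fare idea above, the Python sketch below reads the trip_distance and fare_amount columns and computes a Pearson correlation coefficient. The filtering rule (dropping rows with non-positive distance or fare) and the sample rows are assumptions for illustration, not properties of the dataset.

```python
import csv
import io
import math


def load_distance_fare(csv_file):
    """Return (trip_distance, fare_amount) pairs, skipping non-positive values."""
    pairs = []
    for row in csv.DictReader(csv_file):
        distance = float(row["trip_distance"])
        fare = float(row["fare_amount"])
        if distance > 0 and fare > 0:
            pairs.append((distance, fare))
    return pairs


def pearson(pairs) -> float:
    """Pearson correlation coefficient of the (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


# Made-up rows where fare = 3 + 2.5 * distance; the last row is filtered out.
SAMPLE = (
    "trip_distance,fare_amount\n"
    "1.0,5.5\n"
    "2.0,8.0\n"
    "4.0,13.0\n"
    "0.0,3.0\n"
)

pairs = load_distance_fare(io.StringIO(SAMPLE))
r = pearson(pairs)
```

On real data the coefficient will be below 1 because the metered fare also depends on time, and flat-rate trips (see RatecodeID) may be worth analyzing separately.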
https://creativecommons.org/publicdomain/zero/1.0/
Data extracted from records of tickets on file with NYS DMV. The tickets were issued to motorists for violations of: NYS Vehicle & Traffic Law (VTL), Thruway Rules and Regulations, Tax Law, Transportation Law, Parks and Recreation Regulations, Local New York City Traffic Ordinances, and NYS Penal Law pertaining to the involvement of a motor vehicle in acts of assault, homicide, manslaughter and criminal negligence resulting in injury or death.
This is a dataset hosted by the State of New York. The state has an open data platform found here and they update their information according to the amount of data that is brought in. Explore New York State using Kaggle and all of the data sources available through the State of New York organization page!
This dataset is maintained using Socrata's API and Kaggle's API. Socrata has assisted countless organizations with hosting their open data and has been an integral part of the process of bringing more data to the public.
Cover photo by Eric Welch on Unsplash
Parking Permits for People with Disabilities (PPPD- City) are issued to people who have a disability that severely and permanently impairs mobility and requires the use of a private automobile for transportation. Non-drivers, such as children with qualifying disabilities, are also eligible for consideration.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
To better understand the movement of people and goods throughout the Highlands Region, and to support the development of Regional Master Plan policies and long-term planning goals, a Baseline Transportation and Transit Layer was developed. This layer includes all major private bus carriers in the Highlands Region that operate on a daily basis on any of the US, State or County routes used in the analysis. The presence of a routine bus route indicates the potential for transit opportunity. Spatial data were acquired from NJ Transit, Morris County and Somerset County. Bus route data, maps and other information were collected from Sussex, Hunterdon, Bergen and Passaic Counties. Private bus providers were contacted in order to verify the presence of existing routes including Coach Bus, Short Line Bus, Lakeland Bus Lines and Trans-Bridge Lines. There are several private bus carriers including but not limited to Coach Bus, Short Line Bus, Lakeland Bus Lines and Trans-Bridge Lines which predominately run independent routes to Northeast New Jersey and New York City.