https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides comprehensive information about various aspects of Smart Cities worldwide, including infrastructure, population, technological advancements, green initiatives, and economic data. The data is generated synthetically but resembles realistic trends and distributions to make it suitable for predictive analysis, machine learning, and data science projects.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Smart city movements are growing all over the world. The undertaking is expected to solve a plethora of problems arising from urbanization. Indonesia is one of the countries marching toward the development of sustainable smart cities. However, before the government can start a smart city project, it needs to assess the readiness of each target city. Data in this article illustrate the readiness of six major cities in Indonesia: Semarang, Makassar, Jakarta, Samarinda, Medan, and Surabaya. They represent the four biggest islands in Indonesia. The readiness assessment was based on three main elements and six Smart City Pillars taken from the Smart City Master Plan Preparation Guidance Book prepared by the Ministry of Communication and Information Technology of the Republic of Indonesia. Those elements serve as a checklist to determine the readiness of the cities. Data for qualitative analysis were gathered through interviews and triangulated through secondary sources, such as publications from Statistics Indonesia and the assessment reports. The dataset containing information on the readiness assessment is presented in this article. The indices of the six regions' readiness assessment are presented in percentages.
https://creativecommons.org/publicdomain/zero/1.0/
Utilizing different types of IoT (Internet of Things) sensors to collect and manage data, combined with many other technical integrations into our city hubs, defines the future of data and automation embedded in urban living. Think of Smart Cities as a customer experience for the residents of a city.
The Leap Data team utilized globally recognized indices (formalized for the evaluation of Smart City initiatives) and developed a data model to interpret how Calgary and Edmonton stand in relation to global leaders in Smart City activities. The indices utilized to create these insights were developed exclusively from Open Datasets.
Smart City Index Methodology: https://www.imd.org/globalassets/wcc/docs/smart_city/smart_city_index_methodology_and_groups.pdf
Note: I only shared this dataset; I did not collect or parameterize the data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data collected during a study "Transparency of open data ecosystems in smart cities: Definition and assessment of the maturity of transparency in 22 smart cities" (Sustainable Cities and Society (SCS), vol.82, 103906) conducted by Martin Lnenicka (University of Pardubice), Anastasija Nikiforova (University of Tartu), Mariusz Luterek (University of Warsaw), Otmane Azeroual (German Centre for Higher Education Research and Science Studies), Dandison Ukpabi (University of Jyväskylä), Visvaldis Valtenbergs (University of Latvia), Renata Machova (University of Pardubice).
This study inspects smart cities’ data portals and assesses their compliance with transparency requirements for open (government) data by means of the expert assessment of 34 portals representing 22 smart cities, with 36 features.
It is being made public both to act as supplementary data for the paper and so that other researchers can use these data in their own work, potentially contributing to the improvement of current data ecosystems and helping to build sustainable, transparent, citizen-centered, and socially resilient open data-driven smart cities.
Purpose of the expert assessment
The data in this dataset were collected by applying the benchmarking framework for assessing the compliance of open (government) data portals with the principles of transparency-by-design, proposed by Lněnička and Nikiforova (2021)*, to 34 portals that can be considered part of open data ecosystems in smart cities. Experts assessed each portal against 36 features, which allows the portals to be ranked and their maturity levels discussed and, based on the results of the assessment, the components and unique models that form the open data ecosystem in the smart city context to be defined.
Methodology
Sample selection: the capitals of the Member States of the European Union and countries of the European Economic Area were selected to ensure a more coherent political and legal framework. They were cross-referenced with their rank in 5 smart city rankings: IESE Cities in Motion Index, Top 50 smart city governments (SCG), IMD smart city index (SCI), global cities index (GCI), and sustainable cities index (SCI). A purposive sampling method and a systematic search for portals were then carried out to identify relevant websites for each city using two complementary techniques: browsing and searching.
To evaluate the transparency maturity of data ecosystems in smart cities, we used the transparency-by-design framework (Lněnička & Nikiforova, 2021)*. The benchmarking supposes the collection of quantitative data, which makes this task an acceptability task. Each sub-dimension/feature was assessed using a six-point Likert scale, where strong agreement is assessed with 6 points and strong disagreement with 1 point. Each sub-dimension was supplied with a description to ensure a common understanding, a drop-down list for selecting the level at which the respondent (dis)agrees, and an optional comment field. This formed a protocol to be completed for every portal.
Each website (portal) was evaluated by experts. A person is considered an expert if they work with open (government) data and data portals daily, i.e., it is a key part of their job; experts can be public officials, researchers, or members of independent organizations. In other words, compliance with the expert profile according to the International Certification of Digital Literacy (ICDL) and its derivation proposed in Lněnička et al. (2021)* is expected to be met.
When all individual protocols were collected, mean values and standard deviations (SD) were calculated, and if statistical contradictions/inconsistencies were found, reassessment took place to ensure individual consistency and interrater reliability among experts' answers.
*Lnenicka, M., & Nikiforova, A. (2021). Transparency-by-design: What is the role of open data portals? Telematics and Informatics, 61, 101605.
*Lněnička, M., Machova, R., Volejníková, J., Linhartová, V., Knezackova, R., & Hub, M. (2021). Enhancing transparency through open government data: the case of data portals and their features and capabilities. Online Information Review.
Test procedure
(1) Assess each dimension using its sub-dimensions, mapping out the achievement of each indicator.
(2) Aggregate all sub-dimensions in one dimension and calculate the average based on the number of sub-dimensions. The resulting average is the dimension value, giving eight values per portal.
(3) Calculate the average of all dimension values and map it to the maturity level. This value is also used to rank the portals.
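The aggregation in the test procedure can be illustrated with a minimal sketch. The portal names, dimension names, and scores below are hypothetical placeholders (the study uses eight dimensions per portal; two are shown for brevity), and the mapping from the overall value to a maturity level is left out because the listing does not give its thresholds.

```python
from statistics import mean

# Hypothetical six-point Likert scores (1 = strong disagreement, 6 = strong
# agreement), grouped by portal and dimension. All names are placeholders.
portal_scores = {
    "portal_A": {"dimension_1": [6, 5, 4], "dimension_2": [3, 4]},
    "portal_B": {"dimension_1": [2, 3, 3], "dimension_2": [5, 5]},
}


def dimension_averages(dimensions):
    """Step 2: average the sub-dimension scores within each dimension."""
    return {name: mean(scores) for name, scores in dimensions.items()}


def overall_score(dimensions):
    """Step 3: average the dimension values into one value per portal."""
    return mean(list(dimension_averages(dimensions).values()))


def rank_portals(scores_by_portal):
    """Order portals from the highest to the lowest overall value."""
    totals = {name: overall_score(dims) for name, dims in scores_by_portal.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = rank_portals(portal_scores)
for position, (portal, score) in enumerate(ranking, start=1):
    print(position, portal, round(score, 2))
```

Averaging dimension values rather than pooling all sub-dimension scores gives every dimension equal weight, regardless of how many sub-dimensions it contains.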
Description of the data in this data set
Sheet#1 "comparison_overall" provides results by portal
Sheet#2 "comparison_category" provides results by portal and category
Sheet#3 "category_subcategory" provides the list of categories and their elements
Format of the file
.xls
Licenses or restrictions
CC-BY
For more info, see README.txt
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is a simulated, multivariate representation of smart city development indicators across 27 global cities from 2007 to 2023. It has been designed to support academic research focused on the intersection of IoT (Internet of Things), artificial intelligence (AI), social safety, and ecosystem restoration within the context of smart urban planning.
The dataset includes yearly observations for each city and covers key variables such as:
IoT device integration
AI adoption scores
Crime rates
Emergency response times
Air quality indices (AQI)
Green cover percentages
Public safety satisfaction
Smart policy implementation status
These indicators were selected to reflect the core elements of sustainable and secure smart city infrastructure. To enhance realism, the data values and variable distributions were modeled based on statistical patterns observed in global urban datasets from trusted public repositories.
A significant reference for constructing this dataset was the World Bank Open Data platform, which provides detailed development indicators on urbanization, environmental health, and governance. Variables like air quality, urban green space, and emergency response have been conceptually derived from regional patterns and indicators reported by the World Bank and other intergovernmental sources.
Reference Acknowledgment: While the dataset is synthetic and self-generated for research and analysis purposes, its structure and baseline trends are informed by international data sources including the World Bank Open Data and the World Urbanization Prospects by the United Nations.
This dataset is suitable for multivariate statistical analysis, interaction modeling, and policy simulation studies focusing on smart governance, urban safety, and sustainable development.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The dataset comprises 5,000+ images of trash cans captured in various outdoor environments at different times of day and under diverse weather conditions. This extensive collection is designed for research in waste classification and detection methods, helping researchers advance deep learning and machine learning applications, specifically in image recognition and instance segmentation.
By utilizing this dataset, researchers can enhance recommendation systems, optimize processes, and automate the operations of community services in smart cities.
Each image in the dataset is accompanied by an XML annotation that provides detailed information about the type of trash bin and its capacity status: full, empty, or scattered. The dataset is particularly valuable for developing and testing segmentation models and for evaluating detection performance in real-world scenarios.
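As an illustration of reading such annotations, the sketch below parses one XML record with Python's standard library. The listing does not document the annotation schema, so the Pascal VOC-style layout and every tag name here (`filename`, `object`, `name`, `fill_state`, `bndbox`) are assumptions; they need to be adjusted to the real files.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation. The real schema is not documented in the listing,
# so this Pascal VOC-style structure and its tag names are assumptions.
SAMPLE_XML = """
<annotation>
  <filename>bin_0001.jpg</filename>
  <object>
    <name>trash_bin</name>
    <fill_state>full</fill_state>
    <bndbox><xmin>34</xmin><ymin>50</ymin><xmax>210</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>
"""


def parse_annotation(xml_text):
    """Return one record per annotated object in a single annotation file."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        records.append({
            "file": root.findtext("filename"),
            "label": obj.findtext("name"),
            "fill_state": obj.findtext("fill_state"),
            # Pixel coordinates in (xmin, ymin, xmax, ymax) order.
            "bbox": tuple(
                int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
            ),
        })
    return records


records = parse_annotation(SAMPLE_XML)
print(records)
```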
The dataset supports various image classification and object detection tasks, enabling researchers and developers to enhance their understanding of waste management systems and improve litter detection technologies.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
License Plate Recognition - 118 798 Images
The dataset combines annotated license plates from real-world traffic across the UK, offering 118 798 images for license plate recognition, license plate detection, and OCR tasks, enabling advancements in autonomous vehicles, traffic management, and smart city systems.
Dataset characteristics:
Characteristic: Data
Description: License plate images with labeling for OCR tasks
Data types: Image
Tasks: … See the full description on the dataset page: https://huggingface.co/datasets/ud-smart-city/united-kingdom-license-plate-dataset.
Globally available, ON-DEMAND noise pollution maps generated from real-world measurements (our sample dataset) and AI interpolation. Unlike any other available noise-level data sets! GIS-ready, high-resolution visuals for real estate platforms, government dashboards, and smart city applications.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset simulates a fully instrumented smart city in the year 2032, where every movement, environmental reading, and energy interaction is tracked through an advanced urban IoT network. It contains 10,000 unique citizen events generated across fictional districts like Quantum Bay, Neon Habitat, AeroTech District, Solaris Sector, and Hydra Loop.
Each row captures a real-time snapshot of the city: mobility choices such as HyperLoop, HoverCab, MagRail, E-Bike, energy usage from Solar, Wind, FusionCell, HydroGrid, environmental signals like air quality, temperature, noise, and behavioral patterns such as commuting, shopping, leisure, work, health, and more. The data is synthetic but built to mimic real-world smart-city sensors, making it safe, clean, and ideal for machine-learning experimentation.
The dataset blends numerical, categorical, and time-series features, enabling broad exploration across forecasting, clustering, anomaly detection, behavioral modeling, and simulation tasks. Its futuristic theme makes it visually appealing and highly engaging for Kaggle users while still serving as a robust ML playground.
Dataset Features
10,000 rows • 14 columns
Timestamps entirely from 2032
Environmental signals (temperature, air quality, noise levels)
Energy consumption in kWh
Transport modes across next-gen mobility systems
Activity types reflecting citizen routines
Predicted crowd density
Security alert flags
Clean, structured, ready-to-use CSV format
Ideal For
Time-series prediction
Urban analytics
IoT simulations
Anomaly detection
Energy forecasting
Behavioral clustering
Environmental modeling
ML experimentation & education
Synthetic-data research
GenAI evaluation frameworks
This dataset aims to combine the appeal of a sci-fi world with the practicality of real machine-learning structure, making it both interesting to explore and powerful for modeling.
http://opendatacommons.org/licenses/dbcl/1.0/
In January 2017, PAVIC ran a survey focused on the Smart City and collected data from 1076 people. The survey was fully anonymous and was aimed at improving citizens' lives in the future Smart City.
The idea of the survey is to obtain a precise insight into citizens' reactions to different recommendations in two different contexts. Specifically, respondents were asked to choose, from a set of 18 recommendations, those they would be most interested in if they were proposed in two different contexts: on a sunny and warm (20°C) Saturday afternoon in spring (referred to as the "Sun" context) and on a rainy and cold (8°C) Saturday afternoon in winter (referred to as the "Rain" context). The recommendations concerned various subjects (social or cultural events, discounts in restaurants, useful city information, etc.), and people were asked in each context which they would like to receive as push notifications on their phones. For each context, respondents could give several responses or none.
The following is the precise text of the questions submitted to the respondents:
- for the "Sun" dataset: A Saturday in spring around 4pm with a comfy temperature of 20°C or 68°F. You are downtown in the city for the afternoon and your mobile application can send you personalized services/activities notifications in real-time. Which of the following activities would you want to receive in your notifications? (Several may be chosen).
- for the "Rain" dataset: A rainy Saturday in the winter around 4pm with a cool temperature of 8°C or 46°F. You are downtown in the city for the afternoon and your mobile application can send you personalized services/activities notifications in real-time. Which of the following activities would you want to receive in your notifications? (Several may be chosen).
These datasets could allow future applications both to simulate recommendation system algorithms and to deduce clusters from the collected profiles.
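One possible starting point for the clustering and recommendation uses mentioned above is to encode each respondent's multiple-choice answers as a binary vector over the 18 recommendations and compare respondents with a set-overlap measure. The sketch below does this with Jaccard similarity. The respondent IDs, the chosen indices, and the choice of Jaccard are illustrative assumptions, not part of the published dataset or its documentation.

```python
N_RECOMMENDATIONS = 18  # size of the recommendation set described in the survey


def to_binary_vector(chosen, n=N_RECOMMENDATIONS):
    """Encode a set of chosen recommendation indices (0-based) as a 0/1 list."""
    return [1 if i in chosen else 0 for i in range(n)]


def jaccard(a, b):
    """Jaccard similarity of two 0/1 vectors; defined as 1.0 when both are empty."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


# Hypothetical answers: respondent -> context -> indices of chosen items.
responses = {
    "r1": {"sun": {0, 3, 7}, "rain": {3, 12}},
    "r2": {"sun": {0, 7}, "rain": set()},  # "no response" is allowed
}

sun_r1 = to_binary_vector(responses["r1"]["sun"])
sun_r2 = to_binary_vector(responses["r2"]["sun"])
similarity = jaccard(sun_r1, sun_r2)  # 2 shared items out of 3 chosen by either
print(round(similarity, 3))
```

The resulting pairwise similarities can feed a standard clustering routine, with the "Sun" and "Rain" vectors kept separate or concatenated depending on the question being asked.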
The world’s largest noise complaint dataset with over 160K reports including labeled noise sources. Ideal for AI training in acoustic event detection and urban noise analysis. Available via CSV, S3, and API (coming soon). GDPR-compliant.
Noise Complaint Dataset: Acoustic Source Detection
Silencio offers the world’s largest noise complaint dataset, consisting of over 160,000 geolocated and categorized noise complaints collected directly through our mobile app. This unique dataset includes not only the location and time of each complaint but also the source of the noise (e.g., traffic, construction, nightlife, neighbors), making it a rare resource for research and monitoring focused on acoustic event classification, noise source identification, and urban sound analysis.
Unlike standard sound datasets, which often lack real-world context or human-labeled sources, Silencio’s dataset is built entirely from user-submitted reports, providing authentic, ground-truth labels for research. It is ideal for training models in sound recognition, urban noise prediction, acoustic scene analysis, and noise impact assessment.
Combined with Silencio’s Street Noise-Level Dataset, this complaint dataset allows researchers to correlate objective measurements with subjective community-reported noise events, opening up possibilities for multi-modal AI models that link noise intensity with human perception.
Data delivery options include:
• CSV exports
• S3 bucket delivery
• (Upcoming) API access
All data is fully anonymized, GDPR-compliant, and available as both historical and updated datasets. We are open to early-access partnerships and custom formatting to meet AI research needs.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Where should we live in the next 10 years? Where should we settle down without relying on public transport? Which city should we move to without fearing losing our homes?
As weather patterns become more unpredictable, with aggressive changes in temperature, I collected the data below to see whether any city could help answer the questions above. I am curious whether cities that typically have great infrastructure for walking, biking, or public transit will be better prepared than those that are more car-centric. Whichever you prefer, the data can give a sense of where people might be migrating, and to which areas.
Here's how the data was collected:
The columns have different rating systems. The counties have all major climate risks expected in the future, while corresponding cities in each county have walking, transit and biking scores to assess livability without cars.
Understanding County Climate Risks
The counties are rated on a 0-10 scale (0 = lowest, 10 = highest), based on RCP 8.5 levels. Each rating is explained below.
1) Heat: Heat is one of the largest drivers changing the niche of human habitability. Rhodium Group researchers estimate that, between 2040 and 2060, many counties will face extremely high temperatures for half a year. The measure shows how many weeks per year temperatures are expected to rise above 95 degrees (0 = 0 weeks, 10 = 26 weeks).
2) Wet Bulb: Wet bulb temperatures occur when heat meets excessive humidity. This is commonplace across cities with urban heat island effects (dense concentration of pavement, less nature, higher chances of absorbing heat). That combination creates wet bulb temperatures, where 82 degrees can feel like southern Alabama on its hottest day, making it dangerous to work outdoors and for children to play school sports. As wet bulb temperatures increase even higher, so will the risk of heat stroke, and even death. The measure shows how many days per year a county will experience high wet bulb temperatures, from 2040 to 2060 (0 = 0 days, 10 = 70 days).
3) Farm Crop Yield: With rising temperatures, it will become more difficult to grow food. Corn and soy are the most prevalent crops in the U.S. and the basis for livestock feed and other staple foods, and they have critical economic significance. Because of their broad regional spread, they offer the best proxy for predicting how farming will be affected by rising temperatures and changing water supplies. As corn and soy production becomes more sensitive to heat than to drought, the US will see a huge continental divide: cooler counties will gain capacity to produce, while counties that are currently warmer will lose all ability to produce basic crops. The measure shows the expected percent decline in yields from 2040 to 2060 (0 = -20.5% decline, i.e. a yield increase; 10 = 92% decline).
4) Sea Level Rise: As sea levels rise, the share of property submerged by high tides increases dramatically, affecting a small sliver of the nation's land but a disproportionate share of its population. The rating measures how much of the property in the county will fall below high tide from 2040 to 2060 (0 = 0%, 10 = 25%).
5) Very Large Fires: With heat and ever more prevalent drought, the likelihood that very large wildfires (ones that burn over 12,000 acres) will affect U.S. regions increases substantially, particularly in the West, Northwest, and the Rocky Mountains. The rating gives the average number of very large fires expected per year from 2040 to 2071 (0 = N/A, 10 = 2.45).
6) Economic Damages: Rising energy costs, lower labor productivity, poor crop yields and increasing cr...
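To read a rating back in physical units, the scale endpoints quoted above can be used as anchors. The sketch below assumes the ratings are linear between those endpoints, which the description does not confirm; the metric keys are hypothetical names, and the fire rating is omitted because its lower endpoint is given as N/A.

```python
# Endpoints (value at rating 0, value at rating 10) as quoted in the
# description. The dictionary keys are hypothetical names for illustration.
SCALE_ENDPOINTS = {
    "heat_weeks_above_95F": (0.0, 26.0),
    "wet_bulb_days": (0.0, 70.0),
    "crop_yield_decline_pct": (-20.5, 92.0),
    "sea_level_property_pct": (0.0, 25.0),
}


def rating_to_value(metric, rating):
    """Map a 0-10 rating to physical units, ASSUMING a linear scale."""
    if not 0 <= rating <= 10:
        raise ValueError("rating must lie between 0 and 10")
    low, high = SCALE_ENDPOINTS[metric]
    return low + (high - low) * rating / 10


print(rating_to_value("heat_weeks_above_95F", 5))    # halfway along the heat scale
print(rating_to_value("crop_yield_decline_pct", 0))  # negative decline = yield gain
```

If the original ratings were binned or rounded rather than linear, the converted values are only rough guides.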
This dataset contains synthetic time-series data from smart city infrastructure sensors in large urban areas. It simulates data collected from various domains such as traffic density, energy consumption, waste occupancy rates, and environmental noise levels. Designed to reflect real-world dynamics, it serves as an ideal resource for various research and development projects including urban planning, IoT analytics, resource optimization, and anomaly detection. The dataset includes 245,616 hourly observations between 2023-01-01 00:00 and 2024-12-31 23:00.
Dataset Details
The dataset simulates data collected from a total of 7 sensors across two major Turkish cities. Fixed metadata, such as geographic location, sensor type, street type, and nearby significant services, are defined for each sensor.
Column Descriptions:
-City ID/Name: Unique identifier or name of the city where the observation was made. (Data Type: string)
-Sensor ID/Name: Unique identifier or name of the sensor from which the observation was made. (Data Type: string)
-Latitude: Geographic latitude coordinate of the sensor. (Between -90 and +90) (Data Type: float)
-Longitude: Geographic longitude coordinate of the sensor. (Between -180 and +180) (Data Type: float)
-Date/Time: Date and time when the observation was made. (In YYYY-MM-DD HH:MM:SS format) (Data Type: datetime)
-Sensor Type: Categorical type of the sensor (e.g., 'Traffic Counter', 'Energy Meter', 'Waste Sensor', 'Environmental Sensor'). (Data Type: string)
-Vehicle Count: Number of vehicles measured if the sensor is a traffic counter. (Greater than or equal to 0) (Data Type: int)
-kWh: Amount of energy consumed (kWh) if the sensor is an energy meter. (Greater than or equal to 0) (Data Type: float)
-Occupancy Rate: Occupancy rate (%) of the parking space if the sensor is a parking sensor. (Between 0 and 100) (Data Type: float)
-Noise Level: Measured noise level (dB) if the sensor is an environmental sensor. (Greater than or equal to 0) (Data Type: float)
-Street Type: Type of street where the sensor is located (e.g., 'Main Road', 'Residential Area'). (Data Type: string)
-Nearby Services: Important services or points near the sensor (e.g., 'School', 'Hospital', 'Mall'). (Data Type: string)
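The value ranges in the column descriptions lend themselves to a simple sanity check. The sketch below is one way to express it. The field names are taken from the description labels and may differ from the real CSV headers, so treat them as assumptions; missing values are skipped because a measurement column only applies to its own sensor type.

```python
import math

# (minimum, maximum) per field, following the column descriptions above.
# Field names mirror the description labels and are assumptions about the
# actual CSV headers.
RANGE_RULES = {
    "Latitude": (-90.0, 90.0),
    "Longitude": (-180.0, 180.0),
    "Vehicle Count": (0.0, math.inf),
    "kWh": (0.0, math.inf),
    "Occupancy Rate": (0.0, 100.0),
    "Noise Level": (0.0, math.inf),
}


def find_violations(row):
    """Return the fields of one observation that fall outside their range."""
    bad = []
    for field, (low, high) in RANGE_RULES.items():
        value = row.get(field)
        # Skip absent or NaN values: they are expected for other sensor types.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        if not low <= value <= high:
            bad.append(field)
    return bad


sample_row = {
    "Latitude": 41.0,
    "Longitude": 28.97,
    "Occupancy Rate": 120.0,       # deliberately out of range
    "Noise Level": float("nan"),   # missing reading, ignored
}
print(find_violations(sample_row))
```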
Data Generation Methodology
This dataset was created using complex synthetic data generation techniques. These techniques include:
-Time Series Models: Daily, weekly, and seasonal cycles were simulated for sensor data. For example, morning and evening peaks for traffic data, seasonal fluctuations for energy consumption, and weekly cycles for waste occupancy rates were modeled.
-Spatial Variations: The geographical locations of the sensors and environmental/infrastructure metadata (street type, nearby services) were designed to influence measured values.
-Inter-Feature Correlations: Logical correlations between different sensor types and environmental factors (e.g., traffic density and noise level) were incorporated into data generation.
-Realism Layers: Approximately 1-5% random missing values (NaN) and approximately 0.1-0.5% outliers were added to the dataset to simulate irregularities and sensor errors found in real-world data.
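The techniques listed above can be sketched in a few lines. The example below is not the generator used for this dataset; it is a minimal illustration that builds an hourly traffic series with morning and evening peaks and then adds missing values and outliers at rates inside the ranges quoted above. The peak hours, amplitudes, noise level, and the outlier multiplier are all made-up parameters.

```python
import math
import random


def simulate_hourly_traffic(hours, missing_rate=0.03, outlier_rate=0.003, seed=42):
    """Generate a synthetic hourly vehicle-count series.

    Illustrative only: every shape parameter here is an assumption, not a
    value taken from the dataset's actual generator.
    """
    rng = random.Random(seed)
    series = []
    for h in range(hours):
        hour_of_day = h % 24
        # Daily cycle: Gaussian-shaped bumps around 08:00 and 18:00.
        morning = math.exp(-((hour_of_day - 8) ** 2) / 8)
        evening = math.exp(-((hour_of_day - 18) ** 2) / 8)
        value = max(50 + 200 * (morning + evening) + rng.gauss(0, 10), 0.0)
        # Realism layer: a small share of missing readings and outliers.
        roll = rng.random()
        if roll < missing_rate:
            series.append(float("nan"))
        elif roll < missing_rate + outlier_rate:
            series.append(value * 5)
        else:
            series.append(value)
    return series


traffic = simulate_hourly_traffic(24 * 14)  # two weeks of hourly observations
valid = [v for v in traffic if not math.isnan(v)]
print(len(traffic), len(valid))
```

Weekly and seasonal cycles, spatial effects, and cross-sensor correlations would be layered on in the same way, by adding further terms to the deterministic part before the noise is applied.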
Simulated Scenarios
The dataset includes the following specific scenarios simulating real-world events and policies affecting urban infrastructure:
Traffic Volume Increase - Main Road Event (Istanbul): Due to a major event on Istanbul's main roads in June 2023, traffic and noise levels increased.
Energy Saving Campaign (Ankara): An energy-saving campaign implemented in Ankara during the first quarter of 2024 reduced energy consumption.
Holiday Period Occupancy Rate Decrease (All Cities - Parking Areas): A decrease in waste sensor occupancy rates was observed in parking areas across all cities during the summer holiday period in 2023.
Curfew Effect (Istanbul - Residential Areas): A night-time curfew implemented in residential areas of Istanbul in November 2023 reduced traffic and noise.
Sensor Malfunction - Traffic Counter (Istanbul - IST-TRAF-001): A malfunction occurred in the IST-TRAF-001 traffic counter in Istanbul during the first week of May 2024, resulting in data loss.
These scenarios enhance the dataset's analytical potential and its ability to address real-world problems.
Potential Use Cases
This dataset can be valuable for researchers, data scientists, and students across various disciplines:
Smart City Planning: Optimizing urban traffic flow, improving energy distribution, and streamlining waste collection routes.
IoT Analytics: Processing and analyzing sensor data, and developing predictive models.
Anomaly Detection: Identifying sensor malfunctions, unexpected traffic congestion, or energy fluctuations.
Infrastructure Monitoring and Maintenance: Monitoring sensor performance and developing predictive maintenance s...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In Smart Cities, technologies play an important role in efficiently managing the rapid growth of the world's industrialization. The deployment of surveillance cameras has proliferated to improve public safety and security. Many Closed-Circuit Television (CCTV) cameras have been installed to monitor and safeguard public spaces within cities. Despite advancements in technology, video and image processing still largely rely on manual observation. This manual analysis is time-consuming, prone to missing critical details, and costly in terms of labor and resources. Moreover, monitoring large video feeds for long periods leads to fatigue, loss of focus, and errors, particularly where continuous video surveillance is a necessity.
Road anomaly detection is one of the prominent computer vision problems that researchers have investigated to guarantee public safety. Road anomaly identification is increasingly difficult due to the variety and complexity of abnormalities.
Deep learning algorithms must be efficient, and they also need a large training dataset to recognize road anomalies in different environments. We propose a custom real-world dataset containing road anomaly images and videos, made available for public and private surveillance systems. Primary data were collected from diverse sites in Pakistan by recording videos and capturing images with mobile and surveillance cameras. The dataset encompasses five major categories of road anomalies: vehicle accidents, vehicle fire, fighting, snatching (at gunpoint), and potholes. It supports classification modeling while promoting improvement in both scientific research and practical application. The dataset also includes annotations with You Only Look Once (YOLO) based bounding boxes and class label files in text format for every image.
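YOLO label files conventionally store one object per line as a class index followed by the box center and size, all normalized to the image dimensions. The sketch below converts such a line to pixel corner coordinates. The class-index order shown is an assumption made for illustration; the real mapping is defined by the dataset's own class file.

```python
# Hypothetical class order; the dataset's own class file defines the real one.
CLASS_NAMES = ["vehicle_accident", "vehicle_fire", "fighting", "snatching", "pothole"]


def yolo_line_to_pixels(line, img_w, img_h):
    """Convert 'class x_center y_center width height' (normalized 0-1)
    into (class_name, (x_min, y_min, x_max, y_max)) in pixels."""
    class_id, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    box = (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)
    return CLASS_NAMES[int(class_id)], box


# A box centered in a 640x480 frame, covering 20% of the width and 40% of
# the height.
label, box = yolo_line_to_pixels("4 0.5 0.5 0.2 0.4", img_w=640, img_h=480)
print(label, box)
```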
Researchers can use the data to train and validate their anomaly detection algorithms and models, thus increasing public security and safety. This dataset focuses on natural environment scenes, with a detailed examination of safe transportation and its impact on broader environmental knowledge. The data can contribute to the responsible and ethical deployment of Artificial Intelligence technologies in surveillance security systems.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The FlowSync Commuter Survey 2025 is designed to capture the realities of daily urban commuting in India, with a focus on travel time, route choices, flexibility, and challenges faced by commuters.
With rising congestion, long commute hours, and the growing demand for flexible work arrangements, understanding commuter behavior is critical for urban mobility planning, corporate HR strategies, and smart transport solutions.
This dataset contains anonymized responses from participants who shared:
Their average daily commute time and routes
Willingness to shift travel times to off-peak slots
Preferred departure slots for flexibility in travel
Incentive preferences to encourage schedule adjustments
By combining quantitative commute metrics with qualitative preferences, this dataset provides a rare view into how people balance time, convenience, and incentives in their daily travel.
🔑 Key Features
Commute Duration (in minutes)
Route Information (origin → destination)
Travel Flexibility (willingness to shift commute time)
Preferred Evening Departure Slots
Incentive Preferences (transport stipends, flexible perks, etc.)
🚀 Use Cases
This dataset is ideal for data analysts, urban researchers, and AI/ML practitioners exploring real-world commuting behaviors. Some possible applications include:
Urban Planning & Smart Cities
Identifying high-congestion routes and times
Designing policies to encourage off-peak commuting
Transport Optimization
Studying how incentives affect commuting flexibility
Evaluating demand for staggered work hours
Corporate HR & Workplace Strategy
Supporting remote/hybrid policy decisions
Designing incentive models for flexible office timings
AI & Machine Learning Applications
Predictive models for commute time under different conditions
Clustering commuters based on behavior & preferences
Policy & Sustainability Research
Analyzing willingness to adopt eco-friendly modes
Supporting strategies to reduce peak-hour congestion
🌍 Impact
By publishing this dataset openly, FlowSync aims to contribute towards building smarter mobility solutions that improve commuter well-being, reduce traffic load, and promote sustainable transport policies for Indian cities.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comprises 1500 high-quality images depicting various forms of road surface damage collected from major cities in Bangladesh, specifically Dhaka, Mymensingh, and Chattogram. The dataset captures real-world conditions of urban and semi-urban road networks, providing valuable visual data for analysis in computer vision, machine learning, deep learning, and civil infrastructure research. The images are captured under diverse lighting conditions and angles, ensuring variability and practical utility for robust algorithm development.
Dataset Composition:
Total Images: 1500
Format: JPG
Image Resolution: Varied, high-resolution suitable for computer vision tasks.
Class-wise Distribution:
Asphalt Damage: 500 images
Crack: 500 images
Pothole: 500 images
Dataset Potential Applications:
Training, validation, and benchmarking for deep learning and machine learning algorithms focusing on road infrastructure assessment.
Development of computer vision-based automated systems for road damage detection and classification.
Research and development in intelligent transportation systems (ITS), smart city infrastructure management, and predictive road maintenance.
Analysis and testing of algorithms for damage severity assessment and automated cost estimation for repairs.
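For the training, validation, and benchmarking use case, one option is a stratified split that keeps the three classes balanced in every subset. The sketch below assumes the images have already been grouped by class; the file-name pattern and the 70/15/15 ratio are illustrative choices, not part of the dataset.

```python
import random


def stratified_split(paths_by_class, train_pct=70, val_pct=15, seed=0):
    """Split each class separately so every subset keeps the class balance.

    paths_by_class maps a class name to a list of image paths.
    Whatever is left after the train and validation shares goes to the test set.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for label, paths in paths_by_class.items():
        shuffled = list(paths)
        rng.shuffle(shuffled)
        n_train = len(shuffled) * train_pct // 100
        n_val = len(shuffled) * val_pct // 100
        splits["train"] += [(p, label) for p in shuffled[:n_train]]
        splits["val"] += [(p, label) for p in shuffled[n_train:n_train + n_val]]
        splits["test"] += [(p, label) for p in shuffled[n_train + n_val:]]
    return splits


# Hypothetical file names; only the class names and counts come from the description.
CLASSES = ["Asphalt Damage", "Crack", "Pothole"]
PATHS_BY_CLASS = {
    name: [f"{name}/{i:03d}.jpg" for i in range(500)] for name in CLASSES
}

if __name__ == "__main__":
    result = stratified_split(PATHS_BY_CLASS)
    print({subset: len(items) for subset, items in result.items()})
```

With 500 images per class, this yields 350 training, 75 validation, and 75 test images per class.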
Intended Users:
Researchers in civil engineering, transportation, and urban planning.
Machine learning and computer vision practitioners focusing on infrastructure monitoring and predictive maintenance.
Government bodies and policymakers interested in infrastructural health assessments and proactive maintenance planning.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the context of global climate change, green development has become the main goal of smart city construction. Most existing research suggests that smart cities enhance the green total factor productivity (GTFP) of cities. However, this study found that smart cities reduce GTFP in the short term and increase it in the long term. Based on this, this article selects three batches of smart cities in China from 2013 to 2019, and uses the Malmquist index model, common frontier function, and panel data method to analyze the GTFP model in the early stage of smart city construction in China. The study found that: (1) the GTFP of the three batches of smart cities in the early stage of construction was less than 1 and showed a downward trend, indicating that smart cities reduce the GTFP level of cities in the short term. (2) Technical efficiency is the main reason for the decline of GTFP in the early stage of smart city construction and the rise of GTFP in the medium and long term. Specifically, there is a U-shaped relationship between the technical efficiency of smart cities and their GTFP. For every 1% increase in technical efficiency in the later stages of smart cities, GTFP increases by 47.3%. (3) The GTFP in the process of smart city construction shows a trend of decreasing in the early stage and increasing in the middle and later stages. The GTFP level in the later stage of smart cities is greater than 1 and shows a fluctuating upward trend, indicating that smart cities improve the city’s GTFP level in the long run. In view of this, we should attach importance to ecological protection in the early stage of smart city construction and take effective measures to reduce carbon emissions during this period.
In that early stage, policies such as taxation can be implemented to encourage companies to adopt cleaner production technologies, strengthen the exchange of green technologies between cities, accelerate the flow of green knowledge, and reduce redundant construction of information infrastructure, thereby minimizing the decline in GTFP in the early stages of smart city construction. This study provides policy recommendations and decision-making references for further promoting the construction of new green and smart cities worldwide.
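To make the "less than 1" and "greater than 1" readings of GTFP concrete, the sketch below computes a standard Malmquist productivity index from four distance-function values and splits it into an efficiency-change term and a technical-change term. This is the textbook geometric-mean form, shown under the assumption that the distance-function values are already estimated; the study's exact specification (for example, how the common frontier is handled) may differ, and the numbers are invented for illustration.

```python
from math import sqrt


def malmquist_index(d_t_t, d_t_next, d_next_t, d_next_next):
    """Standard Malmquist productivity index between periods t and t+1.

    d_t_t       : distance of period-t data to the period-t frontier
    d_t_next    : distance of period-(t+1) data to the period-t frontier
    d_next_t    : distance of period-t data to the period-(t+1) frontier
    d_next_next : distance of period-(t+1) data to the period-(t+1) frontier

    Returns (index, efficiency_change, technical_change), where
    index = efficiency_change * technical_change.
    An index below 1 means productivity fell; above 1 means it rose.
    """
    efficiency_change = d_next_next / d_t_t
    technical_change = sqrt((d_t_next / d_next_next) * (d_t_t / d_next_t))
    return efficiency_change * technical_change, efficiency_change, technical_change


if __name__ == "__main__":
    # Invented values: efficiency drops while the frontier stays put.
    index, ec, tc = malmquist_index(0.9, 0.81, 0.9, 0.81)
    print(f"index={index:.3f} efficiency_change={ec:.3f} technical_change={tc:.3f}")
```

In this invented case the index is below 1 purely because of the efficiency-change term, which matches the kind of early-stage decline the abstract attributes to technical efficiency.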
Introduction: Traffic congestion remains a significant challenge in urban environments, and optimizing traffic signals plays a crucial role in easing traffic flow. This dataset is designed to aid researchers and developers working on intelligent traffic management systems. It provides data collected from three different sources, each offering unique insights into vehicle detection and traffic patterns.
Dataset Collection Strategy:
Kaggle Data Collection
1. Source: Curated datasets from Kaggle, including well-known vehicle detection collections.
2. Content: Images and labels of vehicles such as cars, buses, and bikes.
3. Purpose: Provides standardized data for baseline testing and model comparisons.
4. Format: JPEG images with associated YOLO-compatible label files (.txt).
Link: https://www.kaggle.com/datasets/tubasiddiqui/toy-cars-annotated-on-yolo-format
Custom Data Collection
1. Source: Synthetic and toy vehicle images created in controlled conditions.
2. Content: Miniature models of cars, buses, and motorcycles.
3. Purpose: Ensures a controlled environment for initial model training and testing, simulating various lighting and angle conditions.
4. Format: JPEG images with YOLO annotation files.
Real-Environment Data (Skardu City)
1. Source: Collected from various locations in Skardu city.
2. Content: Real-world images capturing vehicles in diverse scenarios, including intersections, narrow streets, and busy roads.
3. Purpose: Provides data reflecting real traffic conditions, environmental variations, and vehicle diversity, crucial for training robust models.
4. Format: High-resolution JPEG images with detailed annotation files.
Potential Applications:
Traffic Signal Optimization: Train machine learning models to adjust traffic signals dynamically based on real-time vehicle detection.
Autonomous Vehicle Navigation: Use real-world data to enhance the perception systems of self-driving cars.
Traffic Flow Analysis: Analyze congestion patterns and develop predictive models for traffic management.
Smart City Initiatives: Develop solutions to improve urban mobility and reduce traffic-related issues.
How to Use the Dataset:
1. Download: Interested users can download the dataset from this platform.
2. Training: Use the YOLO-compatible images and labels to train object detection models.
3. Testing and Validation: Validate your models on real-world data to assess performance under varying conditions.
Acknowledgments: We thank the team involved in data collection across Skardu city and the community contributions from Kaggle. This dataset aims to facilitate advancements in smart traffic systems and support innovative solutions for traffic management.
Contribute: Feedback and contributions are welcome! Let's collaborate to improve and expand this dataset for future research and practical applications.
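The YOLO label files mentioned above conventionally hold one object per line in the form `class_id x_center y_center width height`, with the four coordinates normalized to the range 0 to 1. As a minimal sketch, assuming this dataset's .txt files follow that convention, a label line can be converted to a pixel bounding box like this (the sample line and image size are made up):

```python
def yolo_to_pixel_box(label_line, image_width, image_height):
    """Convert one YOLO label line to (class_id, (x_min, y_min, x_max, y_max)) in pixels.

    Assumes the common YOLO text format: class id followed by the box center
    and size, all normalized by the image width and height.
    """
    class_id, x_center, y_center, width, height = label_line.split()
    x_center, y_center = float(x_center), float(y_center)
    width, height = float(width), float(height)

    x_min = (x_center - width / 2) * image_width
    y_min = (y_center - height / 2) * image_height
    x_max = (x_center + width / 2) * image_width
    y_max = (y_center + height / 2) * image_height
    return int(class_id), (x_min, y_min, x_max, y_max)


if __name__ == "__main__":
    # Made-up label: class 0, centered box covering a quarter of the width
    # and half of the height of a 640x480 image.
    print(yolo_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))
```

Which vehicle type each class id stands for depends on the dataset's own class list, which is not given in this description.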
Street Noise-Level: Statistically Interpolated + Processed Measurements
Connect with our experts for the world’s most comprehensive Street Noise-Level Dataset. Access hyper-local and global average noise levels (dBA) from public streets across more than 180 countries. This dataset, built using over 35 billion datapoints and developed in collaboration with leading acoustics professionals, provides unparalleled insight into real-world urban soundscapes. Unlike conventional noise models, which rely solely on simulations, our dataset combines real measurements with AI-powered interpolation to deliver statistically robust, highly accurate, and spatially complete noise-level data.
Power Your AI & Urban Analytics with Real-World Noise Insights
What makes this dataset unique? Silencio’s processed and interpolated Street Noise-Level Dataset is the largest and most precise global collection of acoustic data available. It integrates real user-collected measurements with AI-driven modeling, ensuring unmatched ground truth for AI training, urban intelligence, and noise-impact assessments.
Optimized for AI, Urban Planning & Research:
Empower your AI models and spatial analyses with rich, diverse, and realistic noise data. Ideal for sound recognition, smart cities, mobility modeling, noise mapping, real estate analysis, and sustainable urban planning.
Trusted & Compliant:
All data is collected via our mobile app, strictly anonymized, fully consented, and 100% GDPR-compliant — ensuring privacy and ethical integrity.
Historical & Up-to-Date:
Leverage both historical and continuously updated noise data to uncover trends, detect change, and power predictive models.
Hyper-Local & Global Coverage:
With coverage of more than 180 countries and high spatial granularity, the dataset provides insights from the city level down to street segments.
Seamless Integration:
Delivered via CSV exports or S3 bucket delivery (APIs coming soon) for easy integration into AI training pipelines, geospatial tools, or analytics platforms.
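When working with exported noise levels, note that decibel values are logarithmic, so averaging several dBA readings is usually done on the energy scale rather than arithmetically. The sketch below shows that calculation for a list of readings; it is a generic acoustics formula, not a description of how this dataset's averages were produced, and the sample readings are invented.

```python
from math import log10


def energy_average_db(levels_db):
    """Energy-average a list of decibel readings.

    Each level is converted to relative sound energy (10 ** (L / 10)),
    the energies are averaged, and the mean is converted back to decibels.
    """
    if not levels_db:
        raise ValueError("at least one reading is required")
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * log10(mean_energy)


if __name__ == "__main__":
    readings = [50.0, 70.0]  # invented dBA readings for one street segment
    print(f"arithmetic mean: {sum(readings) / len(readings):.1f} dBA")
    print(f"energy average:  {energy_average_db(readings):.1f} dBA")
```

Louder readings dominate the energy average, which is why it sits well above the arithmetic mean when the readings differ widely.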