https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Airline data holds immense importance as it offers insights into the functioning and efficiency of the aviation industry. It provides valuable information about flight routes, schedules, passenger demographics, and preferences, which airlines can leverage to optimize their operations and enhance customer experiences. By analyzing data on delays, cancellations, and on-time performance, airlines can identify trends and implement strategies to improve punctuality and mitigate disruptions. Moreover, regulatory bodies and policymakers rely on this data to ensure safety standards, enforce regulations, and make informed decisions regarding aviation policies. Researchers and analysts use airline data to study market trends, assess environmental impacts, and develop strategies for sustainable growth within the industry. In essence, airline data serves as a foundation for informed decision-making, operational efficiency, and the overall advancement of the aviation sector.
This dataset comprises diverse parameters relating to airline operations on a global scale. The dataset prominently incorporates fields such as Passenger ID, First Name, Last Name, Gender, Age, Nationality, Airport Name, Airport Country Code, Country Name, Airport Continent, Continents, Departure Date, Arrival Airport, Pilot Name, and Flight Status. These columns collectively provide comprehensive insights into passenger demographics, travel details, flight routes, crew information, and flight statuses. Researchers and industry experts can leverage this dataset to analyze trends in passenger behavior, optimize travel experiences, evaluate pilot performance, and enhance overall flight operations.
https://i.imgur.com/cUFuMeU.png" alt="">
The dataset provided here is a simulated example and was generated using the online platform found at Mockaroo. This web-based tool offers a service that enables the creation of customizable Synthetic datasets that closely resemble real data. It is primarily intended for use by developers, testers, and data experts who require sample data for a range of uses, including testing databases, filling applications with demonstration data, and crafting lifelike illustrations for presentations and tutorials. To explore further details, you can visit their website.
Cover Photo by: Kevin Woblick on Unsplash
Thumbnail by: Airplane icons created by Freepik - Flaticon
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Some say climate change is the biggest threat of our age while others say it’s a myth based on dodgy science. We are turning some of the data over to you so you can form your own view.
Even more than with other data sets that Kaggle has featured, there’s a huge amount of data cleaning and preparation that goes into putting together a long-time study of climate trends. Early data was collected by technicians using mercury thermometers, where any variation in the visit time impacted measurements. In the 1940s, the construction of airports caused many weather stations to be moved. In the 1980s, there was a move to electronic thermometers that are said to have a cooling bias.
Given this complexity, there are a range of organizations that collate climate trends data. The three most cited land and ocean temperature data sets are NOAA’s MLOST, NASA’s GISTEMP and the UK’s HadCrut.
We have repackaged the data from a newer compilation put together by the Berkeley Earth, which is affiliated with Lawrence Berkeley National Laboratory. The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature reports from 16 pre-existing archives. It is nicely packaged and allows for slicing into interesting subsets (for example by country). They publish the source data and the code for the transformations they applied. They also use methods that allow weather observations from shorter time series to be included, meaning fewer observations need to be thrown away.
In this dataset, we have include several files:
Global Land and Ocean-and-Land Temperatures (GlobalTemperatures.csv):
Other files include:
The raw data comes from the Berkeley Earth data page.
This parent dataset (collection of datasets) describes the general organization of data in the datasets for the 2009 and 2011 growing seasons (year) when sunflower (Helianthus annuus L.) was grown for seed grain at the USDA-ARS Conservation and Production Laboratory (CPRL), Soil and Water Management Research Unit (SWMRU), Bushland, Texas (Lat. 35.186714°, Long. -102.094189°, elevation 1170 m above MSL). Sunflower was grown for seed grain on two large, precision weighing lysimeters, each in the center of a 4.44 ha square field. The two fields were contiguous, arranged along a north-south axis, and were labeled northeast (NE), and southeast (SE). See the resource titled "Geographic Coordinates, USDA, ARS, Bushland, Texas" for UTM geographic coordinates for field and lysimeter locations. The fields were irrigated by a linear move sprinkler system equipped with spray applicators. Irrigations were managed to replenish soil water used by the crop on a weekly or more frequent basis as determined by soil profile water content readings made with a neutron probe from 0.10- to 2.4-m depth in the field. The number and spacing of neutron probe reading locations changed through the years (additional sites were added), which is one reason why subsidiary datasets and data dictionaries are needed. The lysimeters and fields were planted to the same plant density, row spacing, tillage depth (by hand on the lysimeters and by machine in the fields), and fertilizer and pesticide applications. The weighing lysimeters were used to measure relative soil water storage to 0.05 mm accuracy at 5-minute intervals, and the 5-minute change in soil water storage was used along with precipitation, dew and frost accumulation, and irrigation amounts to calculate crop evapotranspiration (ET), which is reported at 15-minute intervals. Each lysimeter was equipped with a suite of instruments to sense wind speed, air temperature and humidity, radiant energy (incoming and reflected, typically both shortwave and longwave), surface temperature, soil heat flux, and soil temperature, all of which are reported at 15-minute intervals. Instruments used changed from season to season, which is another reason that subsidiary datasets and data dictionaries for each season are required. Important conventions concerning the data-time correspondence, sign conventions, and terminology specific to the USDA ARS, Bushland, TX, field operations are given in the resource titled "Conventions for Bushland, TX, Weighing Lysimeter Datasets". There are six datasets in this collection. Common symbols and abbreviations used in the datasets are defined in the resource titled, "Symbols and Abbreviations for Bushland, TX, Weighing Lysimeter Datasets". Datasets consist of Excel (xlsx) files. Each xlsx file contains an Introductory tab that explains the other tabs, lists the authors, describes conventions and symbols used and lists any instruments used. The remaining tabs in a file consist of dictionary and data tabs. There is a dictionary tab for every data tab. The name of the dictionary tab contains the name of the corresponding data tab. Tab names are unique so that if individual tabs were saved to CSV files, each CSV file in the entire collection would have a different name. The six datasets, according to their titles, are as follows: Agronomic Calendars for the Bushland, Texas Sunflower Datasets Growth and Yield Data for the Bushland, Texas Sunflower Datasets Weighing Lysimeter Data for The Bushland, Texas Sunflower Datasets Soil Water Content Data for The Bushland, Texas, Large Weighing Lysimeter Experiments Evapotranspiration, Irrigation, Dew/frost - Water Balance Data for The Bushland, Texas Sunflower Datasets Standard Quality Controlled Research Weather Data – USDA-ARS, Bushland, Texas See the README for descriptions of each dataset. The land slope is <1% and topography is flat. The mean annual precipitation is ~470 mm, the 20-year pan evaporation record indicates ~2,600 mm Class A pan evaporation per year, and winds are typically from the South and Southwest. The climate is semi-arid with ~70% (350 mm) of the annual precipitation occurring from May to September, during which period the pan evaporation averages ~1520 mm. These datasets originate from research aimed at determining crop water use (ET), crop coefficients for use in ET-based irrigation scheduling based on a reference ET, crop growth, yield, harvest index, and crop water productivity as affected by irrigation method, timing, amount (full or some degree of deficit), agronomic practices, cultivar, and weather. Prior publications have described the facilities and research methods, and have focused on ET, crop coefficients, and crop water productivity. Crop coefficients have been used by ET networks for irrigation management. The data have utility for testing simulation models of crop ET, growth, and yield. Resources in this dataset: Resource Title: Geographic Coordinates of Experimental Assets, Weighing Lysimeter Experiments, USDA, ARS, Bushland, Texas. File Name: Geographic Coordinates, USDA, ARS, Bushland, Texas.xlsx. Resource Description: The file gives the UTM latitude and longitude of important experimental assets of the Bushland, Texas, USDA, ARS, Conservation & Production Research Laboratory (CPRL). Locations include weather stations [Soil and Water Management Research Unit (SWMRU) and CPRL], large weighing lysimeters, and corners of fields within which each lysimeter was centered. There were four fields designated NE, SE, NW, and SW, and a weighing lysimeter was centered in each field. The SWMRU weather station was adjacent to and immediately east of the NE and SE lysimeter fields. Resource Title: Conventions for Bushland, TX, Weighing Lysimeter Datasets. File Name: Conventions for Bushland, TX, Weighing Lysimeter Datasets.xlsx. Resource Description: Descriptions of conventions and terminology used in the Bushland, TX, weighing lysimeter research program. Resource Title: Symbols and Abbreviations for Bushland, TX, Weighing Lysimeter Datasets. File Name: Symbols and Abbreviations for Bushland, TX, Weighing Lysimeter Datasets.xlsx. Resource Description: Definitions of symbols and abbreviations used in the Bushland, TX, weighing lysimeter research datasets. Resource Title: README - Bushland Texas Sunflower collection. File Name: README_Bushland_sunflower_collection.pdf. Resource Description: Descriptions of the datasets in the Bushland Texas Sunflower collection.
Download Employee Travel Excel SheetThis dataset contains information about the employee travel expenses for the year 2021. Details are provided on the employee (name, title, department), the travel (dates, location, purpose) and the cost (expenses, recoveries). Expenses are broken down in separate tabs by Quarter (Q1, Q2, Q3 and Q4). Updated quarterly when expenses are prepared. Expenses for other years are available in separate datasets.
On an annual basis (individual hospital fiscal year), individual hospitals and hospital systems report detailed facility-level data on services capacity, inpatient/outpatient utilization, patients, revenues and expenses by type and payer, balance sheet and income statement.
Due to the large size of the complete dataset, a selected set of data representing a wide range of commonly used data items, has been created that can be easily managed and downloaded. The selected data file includes general hospital information, utilization data by payer, revenue data by payer, expense data by natural expense category, financial ratios, and labor information.
There are two groups of data contained in this dataset: 1) Selected Data - Calendar Year: To make it easier to compare hospitals by year, hospital reports with report periods ending within a given calendar year are grouped together. The Pivot Tables for a specific calendar year are also found here. 2) Selected Data - Fiscal Year: Hospital reports with report periods ending within a given fiscal year (July-June) are grouped together.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This parent dataset (collection of datasets) describes the general organization of data in the datasets for each growing season (year) when alfalfa (Medicago sativa L.) was grown as a reference evapotranspiration (ETr) crop at the USDA-ARS Conservation and Production Laboratory (CPRL), Soil and Water Management Research Unit (SWMRU), Bushland, Texas (Lat. 35.186714°, Long. -102.094189°, elevation 1170 m above MSL). Alfalfa was grown on two large, precision weighing lysimeters, calibrated to NIST standards (Howell et al., 1995). Each lysimeter was in the center of a 4.44 ha square field on which alfalfa was also grown (Evett et al., 2000). The two fields were contiguous and arranged with one (labeled northeast, NE) directly north of the other (labeled southeast, SE). See the resource "Geographic Coordinates, USDA, ARS, Bushland, Texas" for UTM geographic coordinates for field and lysimeter locations. Alfalfa was planted in Autumn 1995 and grown for hay in 1996, 1997, 1998, and 1999. The resource "Agronomic Calendar for the Bushland, Texas Alfalfa Datasets", gives a calendar listing by date the agronomic practices applied, severe weather, and activities (e.g. planting, thinning, fertilization, pesticide application, lysimeter maintenance, harvest) in and on lysimeters that could influence crop growth, water use, and lysimeter data. These include fertilizer and pesticide applications. There is one calendar, from before planting in autumn 1995 to after final harvest in 1999, for the NE and SE lysimeters and fields. There were 4 harvests each year except 1998 when 5 harvests were taken. Irrigation was by linear move sprinkler system equipped with pressure regulated low pressure sprays (mid-elevation spray application, MESA). Irrigations were managed to replenish soil water used by the crop on a weekly or more frequent basis as determined by soil profile water content readings via field-calibrated (Evett and Steiner, 1995) neutron probe from 0.10- to 2.4-m depth in the field. Lysimeters and fields were planted to the same plant density, row spacing, tillage depth (by hand on the lysimeters and by machine in the fields), and fertilizer and pesticide applications. Weighing lysimeters measured relative soil water storage to 0.05 mm accuracy at 5-min intervals, and the 5-min change in soil water storage was used along with precipitation, dew and frost accumulation, and irrigation amounts to calculate crop evapotranspiration (ET), reported at 15-min intervals. Each lysimeter was instrumented to sense wind speed, air temperature and humidity, radiant energy (incoming and reflected, typically both shortwave and longwave), surface temperature, soil heat flux, and soil temperature, all at 15-min intervals. Instruments used changed from season to season, thus subsidiary datasets and data dictionaries for each season are required. The Bushland weighing lysimeter research program is described by Evett et al. (2016), and lysimeter design is described by Marek et al. (1988). Important conventions concerning the data-time correspondence, sign conventions, and terminology specific to the USDA ARS, Bushland, TX, field operations are given in the resource "Conventions for Bushland, TX, Weighing Lysimeter Datasets". There are 5 datasets in this collection. Common symbols and abbreviations used are defined in the resource "Symbols and Abbreviations for Bushland, TX, Weighing Lysimeter Datasets". Datasets consist of Excel (xlsx) files. Each xlsx file contains an Introductory tab that explains the other tabs, lists the authors, describes conventions and symbols used, and lists instruments used. The remaining tabs in a file consist of dictionary and data tabs. The 5 datasets are:
Growth and Yield Data for the Bushland, Texas Alfalfa Datasets Weighing Lysimeter Data for The Bushland, Texas Alfalfa Datasets Soil Water Content Data for The Bushland, Texas, Large Weighing Lysimeter Experiments Evapotranspiration, Irrigation, Dew/frost - Water Balance Data for The Bushland, Texas Alfalfa Datasets Standard Quality Controlled Research Weather Data – USDA-ARS, Bushland, Texas
See README for descriptions of each dataset. The soil is a Pullman series fine, mixed, superactive, thermic Torrertic Paleustoll. Soil properties are given in the resource titled "Soil Properties for the Bushland, TX, Weighing Lysimeter Datasets". Land slope in the lysimeter fields is
We welcome any feedback on the structure of our data files, their usability, or any suggestions for improvements; please contact vehicles statistics.
Data tables containing aggregated information about vehicles in the UK are also available.
CSV files can be used either as a spreadsheet (using Microsoft Excel or similar spreadsheet packages) or digitally using software packages and languages (for example, R or Python).
When using as a spreadsheet, there will be no formatting, but the file can still be explored like our publication tables. Due to their size, older software might not be able to open the entire file.
df_VEH0120_GB: https://assets.publishing.service.gov.uk/media/6895d1963080e72710b2e2cf/df_VEH0120_GB.csv">Vehicles at the end of the quarter by licence status, body type, make, generic model and model: Great Britain (CSV, 59.1 MB)
Scope: All registered vehicles in Great Britain; from 1994 Quarter 4 (end December)
Schema: BodyType, Make, GenModel, Model, Fuel, LicenceStatus, [number of vehicles; 1 column per quarter]
df_VEH0120_UK: https://assets.publishing.service.gov.uk/media/6895d276586f9c9360656a18/df_VEH0120_UK.csv">Vehicles at the end of the quarter by licence status, body type, make, generic model and model: United Kingdom (CSV, 34.9 MB)
Scope: All registered vehicles in the United Kingdom; from 2014 Quarter 3 (end September)
Schema: BodyType, Make, GenModel, Model, Fuel, LicenceStatus, [number of vehicles; 1 column per quarter]
df_VEH0160_GB: https://assets.publishing.service.gov.uk/media/6895ef62586f9c9360656a2d/df_VEH0160_GB.csv">Vehicles registered for the first time by body type, make, generic model and model: Great Britain (CSV, 25.3 MB)
Scope: All vehicles registered for the first time in Great Britain; from 2001 Quarter 1 (January to March)
Schema: BodyType, Make, GenModel, Model, Fuel, [number of vehicles; 1 column per quarter]
df_VEH0160_UK: https://assets.publishing.service.gov.uk/media/6895f187e7be62b4f06431b1/df_VEH0160_UK.csv">Vehicles registered for the first time by body type, make, generic model and model: United Kingdom (CSV, 8.53 MB)
Scope: All vehicles registered for the first time in the United Kingdom; from 2014 Quarter 3 (July to September)
Schema: BodyType, Make, GenModel, Model, Fuel, [number of vehicles; 1 column per quarter]
In order to keep the datafile df_VEH0124 to a reasonable size, it has been split into 2 halves; 1 covering makes starting with A to M, and the other covering makes starting with N to Z.
df_VEH0124_AM: https://assets.publishing.service.gov.uk/media/68494acf91c75fd63dd3a3ae/df_VEH0124_AM.csv">Vehicles at the end of the year by licence status, body type, make (A to M), generic model, model, year of first use and year of manufacture: United Kingdom (CSV, 47.9 MB)
Scope: All licensed vehicles in the United Kingdom with Make starting with A to M; annually from 2014
Schema: BodyType, Make, GenModel, Model, YearFi
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Excel by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Excel. The dataset can be utilized to understand the population distribution of Excel by gender and age. For example, using this dataset, we can identify the largest age group for both Men and Women in Excel. Additionally, it can be used to see how the gender ratio changes from birth to senior most age group and male to female ratio across each age group for Excel.
Key observations
Largest age group (population): Male # 45-49 years (58) | Female # 5-9 years (55). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Scope of gender :
Please note that American Community Survey asks a question about the respondents current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are supposed to respond with the answer as either of Male or Female. Our research and this dataset mirrors the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Excel Population by Gender. You can refer the same here
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Excel township by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Excel township. The dataset can be utilized to understand the population distribution of Excel township by gender and age. For example, using this dataset, we can identify the largest age group for both Men and Women in Excel township. Additionally, it can be used to see how the gender ratio changes from birth to senior most age group and male to female ratio across each age group for Excel township.
Key observations
Largest age group (population): Male # 55-59 years (17) | Female # 20-24 years (22). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Scope of gender :
Please note that American Community Survey asks a question about the respondents current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are supposed to respond with the answer as either of Male or Female. Our research and this dataset mirrors the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Excel township Population by Gender. You can refer the same here
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cryptocurrency historical datasets from January 2012 (if available) to October 2021 were obtained and integrated from various sources and Application Programming Interfaces (APIs) including Yahoo Finance, Cryptodownload, CoinMarketCap, various Kaggle datasets, and multiple APIs. While these datasets used various formats of time (e.g., minutes, hours, days), in order to integrate the datasets days format was used for in this research study. The integrated cryptocurrency historical datasets for 80 cryptocurrencies including but not limited to Bitcoin (BTC), Ethereum (ETH), Binance Coin (BNB), Cardano (ADA), Tether (USDT), Ripple (XRP), Solana (SOL), Polkadot (DOT), USD Coin (USDC), Dogecoin (DOGE), Tron (TRX), Bitcoin Cash (BCH), Litecoin (LTC), EOS (EOS), Cosmos (ATOM), Stellar (XLM), Wrapped Bitcoin (WBTC), Uniswap (UNI), Terra (LUNA), SHIBA INU (SHIB), and 60 more cryptocurrencies were uploaded in this online Mendeley data repository. Although the primary attribute of including the mentioned cryptocurrencies was the Market Capitalization, a subject matter expert i.e., a professional trader has also guided the initial selection of the cryptocurrencies by analyzing various indicators such as Relative Strength Index (RSI), Moving Average Convergence/Divergence (MACD), MYC Signals, Bollinger Bands, Fibonacci Retracement, Stochastic Oscillator and Ichimoku Cloud. The primary features of this dataset that were used as the decision-making criteria of the CLUS-MCDA II approach are Timestamps, Open, High, Low, Closed, Volume (Currency), % Change (7 days and 24 hours), Market Cap and Weighted Price values. The available excel and CSV files in this data set are just part of the integrated data and other databases, datasets and API References that was used in this study are as follows: [1] https://finance.yahoo.com/ [2] https://coinmarketcap.com/historical/ [3] https://cryptodatadownload.com/ [4] https://kaggle.com/philmohun/cryptocurrency-financial-data [5] https://kaggle.com/deepshah16/meme-cryptocurrency-historical-data [6] https://kaggle.com/sudalairajkumar/cryptocurrencypricehistory [7] https://min-api.cryptocompare.com/data/price?fsym=BTC&tsyms=USD [8] https://min-api.cryptocompare.com/ [9] https://p.nomics.com/cryptocurrency-bitcoin-api [10] https://www.coinapi.io/ [11] https://www.coingecko.com/en/api [12] https://cryptowat.ch/ [13] https://www.alphavantage.co/
This dataset is part of the CLUS-MCDA (Cluster analysis for improving Multiple Criteria Decision Analysis) and CLUS-MCDAII Project: https://aimaghsoodi.github.io/CLUSMCDA-R-Package/ https://github.com/Aimaghsoodi/CLUS-MCDA-II https://github.com/azadkavian/CLUS-MCDA
These tables present high-level breakdowns and time series. A list of all tables, including those discontinued, is available in the table index. More detailed data is available in our data tools, or by downloading the open dataset.
We are proposing to make some changes to these tables in future, further details and a link to a feedback form can be found alongside the 2024 annual report.
The tables below are the latest final annual statistics for 2024, which are currently the latest available data. Provisional statistics for the first half of 2025 will be published in November 2025.
A list of all reported road collisions and casualties data tables and variables in our data download tool is available in the https://assets.publishing.service.gov.uk/media/68d3edb0ca266424b221b287/reported-road-casualties-gb-index-of-tables.ods">Tables index (ODS, 28.9 KB).
https://assets.publishing.service.gov.uk/media/68d42292b6c608ff9421b2d2/ras-all-tables-excel.zip">Reported road collisions and casualties data tables (zip file) (ZIP, 11.2 MB)
RAS0101: https://assets.publishing.service.gov.uk/media/68d3cdeeca266424b221b253/ras0101.ods">Collisions, casualties and vehicles involved by road user type since 1926 (ODS, 34.7 KB)
RAS0102: https://assets.publishing.service.gov.uk/media/68d3cdfee65dc716bfb1dcf3/ras0102.ods">Casualties and casualty rates, by road user type and age group, since 1979 (ODS, 129 KB)
RAS0201: https://assets.publishing.service.gov.uk/media/68d3ce0bc908572e81248c1f/ras0201.ods">Numbers and rates (ODS, 37.5 KB)
RAS0202: https://assets.publishing.service.gov.uk/media/68d3ce17b6c608ff9421b25e/ras0202.ods">Sex and age group (ODS, 178 KB)
RAS0203: https://assets.publishing.service.gov.uk/media/67600227b745d5f7a053ef74/ras0203.ods">Rates by mode, including air, water and rail modes (ODS, 24.2 KB) - this table will be updated for 2024 once data is available for other modes.
RAS0301: https://assets.publishing.service.gov.uk/media/68d3ce2b8c739d679fb1dcf6/ras0301.ods">Speed limit, built-up and non-built-up roads (ODS, 20.8 KB)
RAS0302: <span class="gem-c-attachmen
Excel file at postcode4 level of origins, destinations, preferences, motifs, percentages of freight transport in Randstad.
On 1 April 2025 responsibility for fire and rescue transferred from the Home Office to the Ministry of Housing, Communities and Local Government.
This information covers fires, false alarms and other incidents attended by fire crews, and the statistics include the numbers of incidents, fires, fatalities and casualties as well as information on response times to fires. The Ministry of Housing, Communities and Local Government (MHCLG) also collect information on the workforce, fire prevention work, health and safety and firefighter pensions. All data tables on fire statistics are below.
MHCLG has responsibility for fire services in England. The vast majority of data tables produced by the Ministry of Housing, Communities and Local Government are for England but some (0101, 0103, 0201, 0501, 1401) tables are for Great Britain split by nation. In the past the Department for Communities and Local Government (who previously had responsibility for fire services in England) produced data tables for Great Britain and at times the UK. Similar information for devolved administrations are available at https://www.firescotland.gov.uk/about/statistics/">Scotland: Fire and Rescue Statistics, https://statswales.gov.wales/Catalogue/Community-Safety-and-Social-Inclusion/Community-Safety">Wales: Community safety and https://www.nifrs.org/home/about-us/publications/">Northern Ireland: Fire and Rescue Statistics.
If you use assistive technology (for example, a screen reader) and need a version of any of these documents in a more accessible format, please email alternativeformats@communities.gov.uk. Please tell us what format you need. It will help us if you say what assistive technology you use.
Fire statistics guidance
Fire statistics incident level datasets
https://assets.publishing.service.gov.uk/media/686d2aa22557debd867cbe14/FIRE0101.xlsx">FIRE0101: Incidents attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 153 KB) Previous FIRE0101 tables
https://assets.publishing.service.gov.uk/media/686d2ab52557debd867cbe15/FIRE0102.xlsx">FIRE0102: Incidents attended by fire and rescue services in England, by incident type and fire and rescue authority (MS Excel Spreadsheet, 2.19 MB) Previous FIRE0102 tables
https://assets.publishing.service.gov.uk/media/686d2aca10d550c668de3c69/FIRE0103.xlsx">FIRE0103: Fires attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 201 KB) Previous FIRE0103 tables
https://assets.publishing.service.gov.uk/media/686d2ad92557debd867cbe16/FIRE0104.xlsx">FIRE0104: Fire false alarms by reason for false alarm, England (MS Excel Spreadsheet, 492 KB) Previous FIRE0104 tables
https://assets.publishing.service.gov.uk/media/686d2af42cfe301b5fb6789f/FIRE0201.xlsx">FIRE0201: Dwelling fires attended by fire and rescue services by motive, population and nation (MS Excel Spreadsheet, 192 KB) Previous FIRE0201 tables
<span class="gem
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
https://raw.githubusercontent.com/Masterx-AI/Project_Retail_Analysis_with_Walmart/main/Wallmart1.jpg" alt="">
One of the leading retail stores in the US, Walmart, would like to predict the sales and demand accurately. There are certain events and holidays which impact sales on each day. There are sales data available for 45 stores of Walmart. The business is facing a challenge due to unforeseen demands and runs out of stock some times, due to the inappropriate machine learning algorithm. An ideal ML algorithm will predict demand accurately and ingest factors like economic conditions including CPI, Unemployment Index, etc.
Walmart runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of all, which are the Super Bowl, Labour Day, Thanksgiving, and Christmas. The weeks including these holidays are weighted five times higher in the evaluation than non-holiday weeks. Part of the challenge presented by this competition is modeling the effects of markdowns on these holiday weeks in the absence of complete/ideal historical data. Historical sales data for 45 Walmart stores located in different regions are available.
The dataset is taken from Kaggle.
Attribution 3.0 (CC BY 3.0)https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
In mountainous rivers, large relatively immobile grains partly control the local and reach-averaged flow hydraulics and sediment fluxes. When the flow depth in low relative submergence conditions plunging flow and the highly three-dimensional flow field can cause spatial distributions of bed surface elevations and grain size distributions, therefore, causing a spatially variable sediment transport rate. We conducted a set of experiments to study how the bed surface responds to this spatial variability and in particular the effect relative submergence in the formation of sediment patches around simulated large boulders. Same average sediment transport capacity, upstream sediment supply, and initial bed thickness and grain size distribution were imposed in all experiments. The detailed flow field around the boulders was obtained using a combination of laboratory measurements and a 3D flow model based on the Volume of Fluid technique. The local shear stress field displayed substantial variability and controlled the bedload transport rates and direction in which sediment moved. The divergence in shear stress caused by the hemispheres promoted size-selective bedload deposition, which formed patches of coarse sediment upstream of the hemisphere. Sediment deposition caused a decrease in local shear stress, which combined with the coarser grain size, enhanced the stability of this patch. The region downstream of the hemispheres was largely controlled by a recirculation zone and had little to no change in grain size, bed elevation, and shear stress. The formation, development and stability of sediment patches in mountain streams is controlled by the shear stress divergence and magnitude and direction of the local shear stress field.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Traffic camera images from the New York State Department of Transportation (511ny.org) are used to create a hand-labeled dataset of images classified into to one of six road surface conditions: 1) severe snow, 2) snow, 3) wet, 4) dry, 5) poor visibility, or 6) obstructed. Six labelers (authors Sutter, Wirz, Przybylo, Cains, Radford, and Evans) went through a series of four labeling trials where reliability across all six labelers were assessed using the Krippendorff’s alpha (KA) metric (Krippendorff, 2007). The online tool by Dr. Freelon (Freelon, 2013; Freelon, 2010) was used to calculate reliability metrics after each trial, and the group achieved inter-coder reliability with KA of 0.888 on the 4th trial. This process is known as quantitative content analysis, and three pieces of data used in this process are shared, including: 1) a PDF of the codebook which serves as a set of rules for labeling images, 2) images from each of the four labeling trials, including the use of New York State Mesonet weather observation data (Brotzge et al., 2020), and 3) an Excel spreadsheet including the calculated inter-coder reliability metrics and other summaries used to asses reliability after each trial.
The broader purpose of this work is that the six human labelers, after achieving inter-coder reliability, can then label large sets of images independently, each contributing to the creation of larger labeled dataset used for training supervised machine learning models to predict road surface conditions from camera images. The xCITE lab (xCITE, 2023) is used to store camera images from 511ny.org, and the lab provides computing resources for training machine learning models.
Note: This web page provides data on health facilities only. To file a complaint against a facility, please see: https://www.cdph.ca.gov/Programs/CHCQ/LCP/Pages/FileAComplaint.aspx The California Department of Public Health (CDPH), Center for Health Care Quality, Licensing and Certification (L&C) Program licenses more than 30 types of healthcare facilities. The Electronic Licensing Management System (ELMS) is a CDPH data system created to manage state licensing-related data and enforcement actions. The “Health Facilities’ State Enforcement Actions” dataset includes summary information for state enforcement actions (state citations or administrative penalties) issued to California healthcare facilities. This file, a sub-set of the ELMS system data, includes state enforcement actions that have been issued from July 1, 1997 through June 30, 2024. Data are presented for each citation/penalty, and include information about the type of enforcement action, violation category, penalty amount, violation date, appeal status, and facility. The “LTC Citation Narrative” dataset contains the full text of citations that were issued to long-term care (LTC) facilities between January 1, 2012 – December 31, 2017. DO NOT DOWNLOAD in Excel as this file has large blocks of text which may truncate. For example, Excel 2007 and later display, and allow up to, 32,767 characters in each cell, whereas earlier versions of Excel allow 32,767 characters, but only display the first 1,024 characters. Please refer to instructions in “E_Citation_Access_DB_How_To_Docs”, about how to download and view data. These files enable providers and the public to identify facility non-compliance and quality issues. By making this information available, quality issues can be identified and addressed. Please refer to the background paper, “About Health Facilities’ State Enforcement Actions” for information regarding California state enforcement actions before using these data. Data dictionaries and data summary charts are also available. Note: The Data Dictionary at the bottom of the dataset incorrectly lists the data column formats as all Text. For proper format labels, please go here.
Go to http://on.ny.gov/1J8tPSN on the New York Lottery website for past Mega Millions results and payouts.
https://digital.nhs.uk/about-nhs-digital/terms-and-conditionshttps://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Following a review of our processes, NHS Digital has recently decided to bring forward the publishing date for Practice Level Prescribing data. This data is currently published on the 1st Friday of the month. The February 2017 data will be published on Friday 5 May 2017, however these dates are constantly under review and may move earlier. What does the data cover? General practice prescribing data is a list of all medicines, dressings and appliances that are prescribed and dispensed each month. A record will only be produced when this has occurred and there is no record for a zero total. For each practice in England, the following information is presented at presentation level for each medicine, dressing and appliance, (by presentation name): - the total number of items prescribed and dispensed - the total net ingredient cost - the total actual cost - the total quantity The data covers NHS prescriptions written in England and dispensed in the community in the UK. Prescriptions written in England but dispensed outside England are included. The data includes prescriptions written by GPs and other non-medical prescribers (such as nurses and pharmacists) who are attached to GP practices. GP practices are identified only by their national code, so an additional data file - linked to the first by the practice code - provides further detail in relation to the practice. Presentations are identified only by their BNF code, so an additional data file - linked to the first by the BNF code - provides the chemical name for that presentation. Warning: Large file size (over 1GB). Each monthly data set is large (over 10 million rows), but can be viewed in standard software such as Microsoft WordPad (save by right-clicking on the file name and selecting 'Save Target As', or equivalent on Mac OSX). It is then possible to select the required rows of data and copy and paste the information into another software application, such as a spreadsheet. Alternatively add-ons to existing software, such as the Microsoft PowerPivot add-on for Excel, to handle larger data sets, can be used. The Microsoft PowerPivot add-on for Excel is available using the link in the 'Related Links' section below. Once PowerPivot has been installed, to load the large files, please follow the instructions below. Note that it may take at least 20 to 30 minutes to load one monthly file. 1. Start Excel as normal 2. Click on the PowerPivot tab 3. Click on the PowerPivot Window icon (top left) 4. In the PowerPivot Window, click on the "From Other Sources" icon 5. In the Table Import Wizard e.g. scroll to the bottom and select Text File 6. Browse to the file you want to open and choose the file extension you require e.g. CSV Once the data has been imported you can view it in a spreadsheet.
The intention is to collect data for the calendar year 2009 (or the nearest year for which each business keeps its accounts. The survey is considered a one-off survey, although for accurate NAs, such a survey should be conducted at least every five years to enable regular updating of the ratios, etc., needed to adjust the ongoing indicator data (mainly VAGST) to NA concepts. The questionnaire will be drafted by FSD, largely following the previous BAS, updated to current accounting terminology where necessary. The questionnaire will be pilot tested, using some accountants who are likely to complete a number of the forms on behalf of their business clients, and a small sample of businesses. Consultations will also include Ministry of Finance, Ministry of Commerce, Industry and Labour, Central Bank of Samoa (CBS), Samoa Tourism Authority, Chamber of Commerce, and other business associations (hotels, retail, etc.).
The questionnaire will collect a number of items of information about the business ownership, locations at which it operates and each establishment for which detailed data can be provided (in the case of complex businesses), contact information, and other general information needed to clearly identify each unique business. The main body of the questionnaire will collect data on income and expenses, to enable value added to be derived accurately. The questionnaire will also collect data on capital formation, and will contain supplementary pages for relevant industries to collect volume of production data for selected commodities and to collect information to enable an estimate of value added generated by key tourism activities.
The principal user of the data will be FSD which will incorporate the survey data into benchmarks for the NA, mainly on the current published production measure of GDP. The information on capital formation and other relevant data will also be incorporated into the experimental estimates of expenditure on GDP. The supplementary data on volumes of production will be used by FSD to redevelop the industrial production index which has recently been transferred under the SBS from the CBS. The general information about the business ownership, etc., will be used to update the Business Register.
Outputs will be produced in a number of formats, including a printed report containing descriptive information of the survey design, data tables, and analysis of the results. The report will also be made available on the SBS website in “.pdf” format, and the tables will be available on the SBS website in excel tables. Data by region may also be produced, although at a higher level of aggregation than the national data. All data will be fully confidentialised, to protect the anonymity of all respondents. Consideration may also be made to provide, for selected analytical users, confidentialised unit record files (CURFs).
A high level of accuracy is needed because the principal purpose of the survey is to develop revised benchmarks for the NA. The initial plan was that the survey will be conducted as a stratified sample survey, with full enumeration of large establishments and a sample of the remainder.
National Coverage
The main statistical unit to be used for the survey is the establishment. For simple businesses that undertake a single activity at a single location there is a one-to-one relationship between the establishment and the enterprise. For large and complex enterprises, however, it is desirable to separate each activity of an enterprise into establishments to provide the most detailed information possible for industrial analysis. The business register will need to be developed in such a way that records the links between establishments and their parent enterprises. The business register will be created from administrative records and may not have enough information to recognize all establishments of complex enterprises. Large businesses will be contacted prior to the survey post-out to determine if they have separate establishments. If so, the extended structure of the enterprise will be recorded on the business register and a questionnaire will be sent to the enterprise to be completed for each establishment.
SBS has decided to follow the New Zealand simplified version of its statistical units model for the 2009 BAS. Future surveys may consider location units and enterprise groups if they are found to be useful for statistical collections.
It should be noted that while establishment data may enable the derivation of detailed benchmark accounts, it may be necessary to aggregate up to enterprise level data for the benchmarks if the ongoing data used to extrapolate the benchmark forward (mainly VAGST) are only available at the enterprise level.
The BAS's covered all employing units, and excluded small non-employing units such as the market sellers. The surveys also excluded central government agencies engaged in public administration (ministries, public education and health, etc.). It only covers businesses that pay the VAGST. (Threshold SAT$75,000 and upwards).
Sample survey data [ssd]
-Total Sample Size was 1240 -Out of the 1240, 902 successfully completed the questionnaire. -The other remaining 338 either never responded or were omitted (some businesses were ommitted from the sample as they do not meet the requirement to be surveyed) -Selection was all employing units paying VAGST (Threshold SAT $75,000 upwards)
WILL CONFIRM LATER!!
OSO LE MEA E LE FAASA...AEA :-)
Mail Questionnaire [mail]
Supplementary Pages Additional pages have been prepared to collect data for a limited range of industries. 1.Production data. To rebase and redevelop the Industrial Production Index (IPI), it is intended to collect volume of production information from a selection of large manufacturing businesses. The selection of businesses and products is critical to the usefulness of the IPI. The products must be homogeneous, and be of enough importance to the economy to justify collecting the data. Significance criteria should be established for the selection of products to include in the IPI, and the 2009 BAS provides an opportunity to collect benchmark data for a range of products known to be significant (based on information in the existing IPI, CPI weights, export data, etc.) as well as open questions for respondents to provide information on other significant products. 2.Tourism. There is a strong demand for estimates of tourism value added. To estimate tourism value added using the international standard Tourism Satellite Account methodology requires the use of an input-output table, which is beyond the capacity of SBS at present. However, some indicative estimates of the main parts of the economy influenced by tourism can be derived if the necessary data are collected. Tourism is a demand concept, based on defining tourists (the international standard includes both international and domestic tourists), what products are characteristically purchased by tourists, and which industries supply those products. Some questions targeted at those industries that have significant involvement with tourists (hotels, restaurants, transport and tour operators, vehicle hire, etc.), on how much of their income is sourced from tourism would provide valuable indicators of the size of the direct impact of tourism.
Partial imputation was done at the time of receipt of questionnaires, after follow-up procedures to obtain fully completed questionnaires have been followed. Imputation followed a process, i.e., apply ratios from responding units in the imputation cell to the partial data that was supplied. Procedures were established during the editing stage (a) to preserve the integrity of the questionnaires as supplied by respondents, and (b) to record all changes made to the questionnaires during editing. If SBS staff writes on the form, for example, this should only be done in red pen, to distinguish the alterations from the original information.
Additional edit checks were developed, including checking against external data at enterprise/establishment level. External data to be checked against include VAGST and SNPF for turnover and purchases, and salaries and wages and employment data respectively. Editing and imputation processes were undertaken by FSD using Excel.
NOT APPLICABLE!!
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Airline data holds immense importance as it offers insights into the functioning and efficiency of the aviation industry. It provides valuable information about flight routes, schedules, passenger demographics, and preferences, which airlines can leverage to optimize their operations and enhance customer experiences. By analyzing data on delays, cancellations, and on-time performance, airlines can identify trends and implement strategies to improve punctuality and mitigate disruptions. Moreover, regulatory bodies and policymakers rely on this data to ensure safety standards, enforce regulations, and make informed decisions regarding aviation policies. Researchers and analysts use airline data to study market trends, assess environmental impacts, and develop strategies for sustainable growth within the industry. In essence, airline data serves as a foundation for informed decision-making, operational efficiency, and the overall advancement of the aviation sector.
This dataset comprises diverse parameters relating to airline operations on a global scale. The dataset prominently incorporates fields such as Passenger ID, First Name, Last Name, Gender, Age, Nationality, Airport Name, Airport Country Code, Country Name, Airport Continent, Continents, Departure Date, Arrival Airport, Pilot Name, and Flight Status. These columns collectively provide comprehensive insights into passenger demographics, travel details, flight routes, crew information, and flight statuses. Researchers and industry experts can leverage this dataset to analyze trends in passenger behavior, optimize travel experiences, evaluate pilot performance, and enhance overall flight operations.
https://i.imgur.com/cUFuMeU.png" alt="">
The dataset provided here is a simulated example and was generated using the online platform found at Mockaroo. This web-based tool offers a service that enables the creation of customizable Synthetic datasets that closely resemble real data. It is primarily intended for use by developers, testers, and data experts who require sample data for a range of uses, including testing databases, filling applications with demonstration data, and crafting lifelike illustrations for presentations and tutorials. To explore further details, you can visit their website.
Cover Photo by: Kevin Woblick on Unsplash
Thumbnail by: Airplane icons created by Freepik - Flaticon