24 datasets found
  1. Niagara Open Data

    • catalog.civicdataecosystem.org
    Cite
    Niagara Open Data [Dataset]. https://catalog.civicdataecosystem.org/dataset/niagara-open-data
    Explore at:
    Description

    The Ontario government generates and maintains thousands of datasets. Since 2012, we have shared data with Ontarians via a data catalogue. Open data is data that is shared with the public. Click here to learn more about open data and why Ontario releases it. Ontario’s Open Data Directive states that all data must be open, unless there is good reason for it to remain confidential. Ontario’s Chief Digital and Data Officer also has the authority to make certain datasets available publicly. Datasets listed in the catalogue that are not open will have one of the following labels: If you want to use data you find in the catalogue, that data must have a licence – a set of rules that describes how you can use it. A licence: Most of the data available in the catalogue is released under Ontario’s Open Government Licence. However, each dataset may be shared with the public under other kinds of licences or no licence at all. If a dataset doesn’t have a licence, you don’t have the right to use the data. If you have questions about how you can use a specific dataset, please contact us. The Ontario Data Catalogue endeavors to publish open data in a machine-readable format. For machine-readable datasets, you can simply retrieve the file you need using the file URL. The Ontario Data Catalogue is built on CKAN, which means the catalogue has the following features you can use when building applications. APIs (Application programming interfaces) let software applications communicate directly with each other. If you are using the catalogue in a software application, you might want to extract data from the catalogue through the catalogue API. Note: All Datastore API requests to the Ontario Data Catalogue must be made server-side. The catalogue's collection of dataset metadata (and dataset files) is searchable through the CKAN API. The Ontario Data Catalogue has more than just CKAN's documented search fields. You can also search these custom fields.
You can also use the CKAN API to retrieve metadata about a particular dataset and check for updated files. Read the complete documentation for CKAN's API. Some of the open data in the Ontario Data Catalogue is available through the Datastore API. You can also search and access the machine-readable open data that is available in the catalogue. How to use the API feature: Read the complete documentation for CKAN's Datastore API. The Ontario Data Catalogue contains a record for each dataset that the Government of Ontario possesses. Some of these datasets will be available to you as open data. Others will not be available to you. This is because the Government of Ontario is unable to share data that would break the law or put someone's safety at risk. You can search for a dataset with a word that might describe a dataset or topic. Use words like “taxes” or “hospital locations” to discover what datasets the catalogue contains. You can search for a dataset from 3 spots on the catalogue: the homepage, the dataset search page, or the menu bar available across the catalogue. On the dataset search page, you can also filter your search results. You can select filters on the left hand side of the page to limit your search for datasets with your favourite file format, datasets that are updated weekly, datasets released by a particular organization, or datasets that are released under a specific licence. Go to the dataset search page to see the filters that are available to make your search easier. You can also do a quick search by selecting one of the catalogue’s categories on the homepage. These categories can help you see the types of data we have on key topic areas. When you find the dataset you are looking for, click on it to go to the dataset record. Each dataset record will tell you whether the data is available, and, if so, tell you about the data available. An open dataset might contain several data files. 
These files might represent different periods of time, different sub-sets of the dataset, different regions, language translations, or other breakdowns. You can select a file and either download it or preview it. Read the licence agreement to make sure you have permission to use it the way you want. Read more about previewing data. A non-open dataset may not be available for many reasons. Read more about non-open data. Read more about restricted data. Data that is non-open may still be subject to freedom of information requests. The catalogue has tools that enable all users to visualize the data in the catalogue without leaving the catalogue – no additional software needed. Have a look at our walk-through of how to make a chart in the catalogue. Get automatic notifications when datasets are updated. You can choose to get notifications for individual datasets, an organization’s datasets or the full catalogue. You don’t have to provide any personal information – just subscribe to our feeds with any feed reader you like, using the corresponding notification web addresses. Copy those addresses and paste them into your reader. Your feed reader will let you know when the catalogue has been updated. The catalogue provides open data in several file formats (e.g., spreadsheets, geospatial data). Learn about each format and how you can access and use the data each file contains. A file that has a list of items and values separated by commas without formatting (e.g. colours, italics, etc.) or extra visual features. This format provides just the data that you would display in a table. XLSX (Excel) files may be converted to CSV so they can be opened in a text editor. How to access the data: Open with any spreadsheet software application (e.g., Open Office Calc, Microsoft Excel) or text editor. Note: This format is considered machine-readable: it can be easily processed and used by a computer. Files that have visual formatting (e.g. 
bolded headers and colour-coded rows) can be hard for machines to understand; these elements make a file more human-readable and less machine-readable. A file that provides information without formatted text or extra visual features that may not follow a pattern of separated values like a CSV. How to access the data: Open with any word processor or text editor available on your device (e.g., Microsoft Word, Notepad). A spreadsheet file that may also include charts, graphs, and formatting. How to access the data: Open with a spreadsheet software application that supports this format (e.g., Open Office Calc, Microsoft Excel). Data can be converted to a CSV for a non-proprietary format of the same data without formatted text or extra visual features. A shapefile provides geographic information that can be used to create a map or perform geospatial analysis based on location, points/lines and other data about the shape and features of the area. It includes required files (.shp, .shx, .dbf) and might include corresponding files (e.g., .prj). How to access the data: Open with a geographic information system (GIS) software program (e.g., QGIS). A package of files and folders. The package can contain any number of different file types. How to access the data: Open with an unzipping software application (e.g., WinZIP, 7Zip). Note: If a ZIP file contains .shp, .shx, and .dbf file types, it is an ArcGIS ZIP: a package of shapefiles which provide information to create maps or perform geospatial analysis that can be opened with ArcGIS (a geographic information system software program). A file that provides information related to a geographic area (e.g., phone number, address, average rainfall, number of owl sightings in 2011 etc.) and its geospatial location (i.e., points/lines). How to access the data: Open using a GIS software application to create a map or do geospatial analysis. It can also be opened with a text editor to view raw information. 
Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. A text-based format for sharing data in a machine-readable way that can store data with more unconventional structures such as complex lists. How to access the data: Open with any text editor (e.g., Notepad) or access through a browser. Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. A text-based format to store and organize data in a machine-readable way that can store data with more unconventional structures (not just data organized in tables). How to access the data: Open with any text editor (e.g., Notepad). Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. A file that provides information related to an area (e.g., phone number, address, average rainfall, number of owl sightings in 2011 etc.) and its geospatial location (i.e., points/lines). How to access the data: Open with a geospatial software application that supports the KML format (e.g., Google Earth). Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. This format contains files with data from tables used for statistical analysis and data visualization of Statistics Canada census data. How to access the data: Open with the Beyond 20/20 application. A database which links and combines data from different files or applications (including HTML, XML, Excel, etc.). The database file can be converted to a CSV/TXT to make the data machine-readable, but human-readable formatting will be lost. 
How to access the data: Open with Microsoft Office Access (a database management system used to develop application software). A file that keeps the original layout and
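
    The description above says the catalogue's metadata is searchable through the CKAN API and that some data is available through the Datastore API. As a rough sketch of what such requests look like, the snippet below only builds the request URLs for CKAN's standard `package_search` and `datastore_search` actions. The base URL and the helper names are assumptions for illustration, not taken from the listing, and the resource ID shown is a placeholder.

```python
from urllib.parse import urlencode

# Assumed base URL of a CKAN instance; replace with the catalogue you are querying.
BASE_URL = "https://data.ontario.ca"


def package_search_url(query, rows=10, base_url=BASE_URL):
    """Build a URL for CKAN's package_search action (free-text metadata search)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def datastore_search_url(resource_id, limit=5, base_url=BASE_URL):
    """Build a URL for CKAN's datastore_search action (rows of one resource)."""
    params = urlencode({"resource_id": resource_id, "limit": limit})
    return f"{base_url}/api/3/action/datastore_search?{params}"


if __name__ == "__main__":
    print(package_search_url("hospital locations"))
    # "example-resource-id" is a hypothetical placeholder, not a real resource.
    print(datastore_search_url("example-resource-id"))
```

    Because the description notes that Datastore API requests must be made server-side, a URL like this would be fetched from a backend script rather than from browser code.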

  2. Data from: Offgrid DC electrical power generation (PV) and AC electrical...

    • researchportal.scu.edu.au
    • researchdata.edu.au
    csv, docx
    Updated Aug 3, 2021
    Cite
    Barry J Hill (2021). Offgrid DC electrical power generation (PV) and AC electrical supply performance data, gathered at installation of solar "Sunflower" prototype mobile solar generator at Splendour In the Grass Festival 2017 [Dataset]. https://researchportal.scu.edu.au/esploro/outputs/dataset/Offgrid-DC-electrical-power-generation-PV/991012820446102368
    Explore at:
    Available download formats: docx (113781 bytes), csv (3101109 bytes), docx (24983 bytes), docx (93902 bytes)
    Dataset updated
    Aug 3, 2021
    Dataset provided by
    Southern Cross University
    Authors
    Barry J Hill
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Time period covered
    2017
    Area covered
    Dataset funded by
    Southern Cross University Sustainability Fund
    Description

    The data collection has two purposes. The first is to create data for an examination of the efficiency and compatibility of battery PV generators with the power consumption requirements of outdoor festivals and events.

    The second is to create a data-sonification-based multimedia project to display this data in a multimedia art space.

    Is there a data dictionary or any schema associated with the data? If so what is it?

    See attached readme files in regards to interpreting the data.

    Data collected via remote automated computer monitoring of electrical equipment. Please see Excel data readme file for more details.

    Viewing instructions:

    Excel and Word

  3. Data underlying the Bachelor Thesis: Using Newsletters to Analyze Curated...

    • data.4tu.nl
    zip
    Updated Jun 21, 2023
    Cite
    Philip De Munck (2023). Data underlying the Bachelor Thesis: Using Newsletters to Analyze Curated Software Testing Content [Dataset]. http://doi.org/10.4121/9e59a43d-474b-46c2-9e29-c7c0b21bd6b4.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Philip De Munck
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The objective of this research was to analyze curated content published via newsletters and find out what software testing knowledge was present in those resources. Software testing newsletters were analyzed, and the AtlasTI software was used with a grounded theory approach to tag the resources mentioned in these newsletters. The resources were obtained by visiting multiple curated software testing-related newsletters and downloading articles as PDFs. After downloading, open/axial coding was used to code each file into categories. The attached Excel files provide a detailed view of which common software testing technologies, techniques, problems, and more are mentioned in newsletter resources. This data set is linked to a bachelor thesis completed at the EEMCS faculty at the TU Delft. A link will be added after publication.

  4. Selkie GIS Techno-Economic Tool input datasets

    • data.niaid.nih.gov
    Updated Nov 8, 2023
    Cite
    Cullinane, Margaret (2023). Selkie GIS Techno-Economic Tool input datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10083960
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset authored and provided by
    Cullinane, Margaret
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data was prepared as input for the Selkie GIS-TE tool. This GIS tool aids site selection, logistics optimization and financial analysis of wave or tidal farms in the Irish and Welsh maritime areas. Read more here: https://www.selkie-project.eu/selkie-tools-gis-technoeconomic-model/

    This research was funded by the Science Foundation Ireland (SFI) through MaREI, the SFI Research Centre for Energy, Climate and the Marine and by the Sustainable Energy Authority of Ireland (SEAI). Support was also received from the European Union's European Regional Development Fund through the Ireland Wales Cooperation Programme as part of the Selkie project.

    File Formats

    Results are presented in three file formats:

    • tif - Can be imported into GIS software (such as ArcGIS)
    • csv - Human-readable text format, which can also be opened in Excel
    • png - Image files that can be viewed in standard desktop software and give a spatial view of results

    Input Data

    All calculations use open-source data from the Copernicus store and the open-source software Python. The Python xarray library is used to read the data.

    Hourly Data from 2000 to 2019

    • Wind - Copernicus ERA5 dataset 17 by 27.5 km grid
      10m wind speed

    • Wave - Copernicus Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis dataset 3 by 5 km grid

    Accessibility

    The maximum limits for Hs and wind speed are applied when mapping the accessibility of a site.
    The Accessibility layer shows the percentage of time the Hs (Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis) and wind speed (ERA5) are below these limits for the month.

    Input data is 20 years of hourly wave and wind data from 2000 to 2019, partitioned by month. At each timestep, the accessibility of the site was determined by checking if
    the Hs and wind speed were below their respective limits. The percentage accessibility is the number of hours within limits divided by the total number of hours for the month.
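
    The accessibility calculation described above can be sketched in a few lines. This is an illustration only, not the Selkie tool's code: the function name, the plain-list inputs and the sample values are assumptions, and the limits of 2 m and 15 m/s are used purely as example thresholds.

```python
def monthly_accessibility(hs, wind, hs_limit, wind_limit):
    """Percentage of hourly timesteps where both Hs and wind speed are below their limits.

    hs and wind are equal-length sequences of hourly values for one month.
    """
    if len(hs) != len(wind):
        raise ValueError("hs and wind must have the same number of timesteps")
    if len(hs) == 0:
        raise ValueError("at least one timestep is required")
    # Count the hours in which both conditions are within limits.
    hours_within = sum(1 for h, w in zip(hs, wind) if h < hs_limit and w < wind_limit)
    return 100.0 * hours_within / len(hs)


if __name__ == "__main__":
    # Hypothetical sample: four hourly values of Hs (m) and wind speed (m/s).
    hs = [1.2, 2.5, 1.8, 0.9]
    wind = [10.0, 12.0, 16.0, 8.0]
    print(monthly_accessibility(hs, wind, hs_limit=2.0, wind_limit=15.0))  # 50.0
```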

    Environmental data is from the Copernicus data store (https://cds.climate.copernicus.eu/). Wave hourly data is from the 'Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis' dataset.
    Wind hourly data is from the ERA5 dataset.

    Availability

    A device's availability to produce electricity depends on the device's reliability and the time to repair any failures. The repair time depends on weather
    windows and other logistical factors (for example, the availability of repair vessels and personnel). A 2013 study by O'Connor et al. determined the
    relationship between the accessibility and availability of a wave energy device. The resulting graph (see Fig. 1 of their paper) shows the correlation between accessibility at Hs of 2m and wind speed of 15.0m/s and availability. This graph is used to calculate the availability layer from the accessibility layer.

    The input value, accessibility, measures how accessible a site is for installation or operation and maintenance activities. It is the percentage time the
    environmental conditions, i.e. the Hs (Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis) and wind speed (ERA5), are below operational limits.
    Input data is 20 years of hourly wave and wind data from 2000 to 2019, partitioned by month. At each timestep, the accessibility of the site was determined
    by checking if the Hs and wind speed were below their respective limits. The percentage accessibility is the number of hours within limits divided by the total
    number of hours for the month. Once the accessibility was known, the percentage availability was calculated using the O'Connor et al. graph of the relationship between the two. A mature technology reliability was assumed.

    Weather Window

    The weather window availability is the percentage of possible x-duration windows where weather conditions (Hs, wind speed) are below maximum limits for the
    given duration for the month.

    The resolution of the wave dataset (0.05° × 0.05°) is higher than that of the wind dataset
    (0.25° x 0.25°), so the nearest wind value is used for each wave data point. The weather window layer is at the resolution of the wave layer.

    The first step in calculating the weather window for a particular set of inputs (Hs, wind speed and duration) is to calculate the accessibility at each timestep.
    The accessibility is based on a simple boolean evaluation: are the wave and wind conditions within the required limits at the given timestep?

    Once the time series of accessibility is calculated, the next step is to look for periods of sustained favourable environmental conditions, i.e. the weather
    windows. Here all possible operating periods with a duration matching the required weather-window value are assessed to see if the weather conditions remain
    suitable for the entire period. The percentage availability of the weather window is calculated based on the percentage of x-duration windows with suitable
    weather conditions for their entire duration. The weather window availability can be considered as the probability of having the required weather window available
    at any given point in the month.
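
    As a sketch of the weather window step described above, the function below takes the boolean accessibility time series and counts how many of the possible x-duration windows are accessible for their whole length. The function name and the list-of-booleans input are assumptions for illustration; the original tool works on gridded data read with xarray, which is not reproduced here.

```python
def weather_window_availability(accessible, duration):
    """Percentage of all possible `duration`-hour windows that are fully accessible.

    accessible: sequence of booleans, one per hourly timestep in the month.
    duration:   required window length in hours.
    """
    n_windows = len(accessible) - duration + 1
    if duration < 1 or n_windows < 1:
        raise ValueError("duration must be between 1 and the series length")
    run = 0   # consecutive accessible hours ending at the current timestep
    good = 0  # windows whose every hour is accessible
    for ok in accessible:
        run = run + 1 if ok else 0
        # A window ending at this timestep is suitable if the run covers it.
        if run >= duration:
            good += 1
    return 100.0 * good / n_windows


if __name__ == "__main__":
    series = [True, True, True, False, True, True]
    print(weather_window_availability(series, duration=2))  # 60.0
```

    In the example, five 2-hour windows are possible and three of them contain only accessible hours, giving 60%.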

    Extreme Wind and Wave

    The Extreme wave layers show the highest significant wave height expected to occur during the given return period. The Extreme wind layers show the highest wind speed expected to occur during the given return period.

    To predict extreme values, we use Extreme Value Analysis (EVA). EVA focuses on the extreme part of the data and seeks to determine a model to fit this reduced
    portion accurately. EVA consists of three main stages. The first stage is the selection of extreme values from a time series. The next step is to fit a model
    that best approximates the selected extremes by determining the shape parameters for a suitable probability distribution. The model then predicts extreme values
    for the selected return period. All calculations use the Python pyextremes library. Two methods are used: Block Maxima and Peaks Over Threshold.

    The Block Maxima method selects the annual maxima and fits a Generalised Extreme Value distribution (GEVD).

    The peaks_over_threshold method has two variable calculation parameters. The first is the percentile that values must exceed to be selected as extreme (0.9 or 0.998). The second is the minimum time difference between extreme values for them to be considered independent (3 days). A Generalised Pareto Distribution is fitted to the selected
    extremes and used to calculate the extreme value for the selected return period.
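
    The first EVA stage, selecting the extremes, can be illustrated without any fitting library. The sketch below is not pyextremes code and does not fit a distribution: the function names and the sample series are hypothetical, the threshold is passed as a value rather than derived from a percentile, and the declustering rule (a new cluster starts when consecutive exceedances are more than the separation apart) is one common convention assumed here.

```python
from datetime import datetime, timedelta


def annual_maxima(series):
    """Block maxima with one-year blocks: return {year: largest value in that year}."""
    maxima = {}
    for t, v in series:
        if t.year not in maxima or v > maxima[t.year]:
            maxima[t.year] = v
    return maxima


def select_pot_peaks(series, threshold, min_separation=timedelta(days=3)):
    """Peaks over threshold: keep the largest exceedance of each cluster.

    Exceedances closer together than min_separation are treated as one event.
    """
    peaks = []
    last_time = None  # time of the most recent exceedance
    for t, v in sorted(series):
        if v <= threshold:
            continue
        if last_time is not None and t - last_time <= min_separation:
            # Same cluster: keep only its largest value.
            if v > peaks[-1][1]:
                peaks[-1] = (t, v)
        else:
            peaks.append((t, v))
        last_time = t
    return peaks


if __name__ == "__main__":
    # Hypothetical significant wave heights (m).
    data = [
        (datetime(2000, 1, 1, 0), 3.0),
        (datetime(2000, 1, 1, 12), 5.0),
        (datetime(2000, 1, 2, 0), 4.5),
        (datetime(2000, 1, 10, 0), 6.0),
        (datetime(2001, 3, 5, 0), 2.0),
    ]
    print(annual_maxima(data))
    print(select_pot_peaks(data, threshold=4.0))
```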

  5. Cyclistic Trip Data Analysis

    • kaggle.com
    Updated Jan 1, 2025
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 1, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Fatima Gulraiz
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    About Cyclistic

    Cyclistic is a bike-share program that features more than 5,800 bicycles and 600 docking stations. It aims to make bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system at any time.

    Problem Statement

    The marketing team's aim is to convert casual riders into annual members. To do so, we need to understand how annual members use the service differently from casual riders, and how often each group uses it.

    Solution

    For this project, our team agreed to use Excel to show our work. We started with the Ask phase, then prepared the data according to the client's request, then processed the data to make it clean, organized, and easy to access, and finally analyzed it to get the results.

    The client wanted to increase the number of annual members. To do that, they wanted to know: how do annual members and casual riders use Cyclistic bikes differently?

    With the company’s requirement in hand, it was time to prepare and process the data. For this analysis we were told to use only the previous 12 months of Cyclistic trip data. The data has been made available online by Motivate International Inc. We checked the integrity and credibility of the data by making sure that the online source through which it is available is safe and secure.

    To prepare the data, we downloaded the files to our machine, saved them, and unzipped them. Then we created subfolders for the .csv and .xls files. Before further analysis we cleaned the data, using the Filter option on the required columns to check for NULLs or any data that should not be there.

    While cleaning the data, we found that in some of the monthly files the started_at and ended_at columns had the custom format mm:ss.0. For consistency with the other spreadsheets, we changed the custom format to m/d/yy h:mm. We also found that some spreadsheets contained data from other months; on further analysis, these were rides that started in one month and ended in the next, so the data did belong in that worksheet.

    After cleaning the data, we created 2 new columns in each worksheet to perform our calculations:
    a) ride_length
    b) day_of_week

    To create the ride_length column, we used a subtraction formula on the started_at and ended_at columns, which gave us the length of each ride for every day of the month. To create day_of_week, we used the WEEKDAY function. After cleaning the data on a monthly basis, we merged all 12 months into a single spreadsheet. Before analyzing, our team checked once more that the data was properly organized and formatted and that there were no errors in it, so that the results would be correct. To get a better sense of the data, we ran a few calculations:
    a) mean of ride_length
    b) max of ride_length
    c) mode of day_of_week

    We used the AVERAGE formula to find the mean of ride_length, giving an overview of how long rides usually last. The MAX calculation gave us the longest ride length. Finally, with the MODE function we calculated the most frequent day of the week on which riders used the service.
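
    The spreadsheet steps above (ride_length by subtraction, day_of_week via WEEKDAY, then mean, max and mode) can be mirrored in a short script. This is a sketch rather than the authors' workflow, which used Excel: the function names and the three sample trips are made up for illustration, timestamps are assumed to follow the m/d/yy h:mm format mentioned earlier, and day numbering follows Excel's default WEEKDAY convention (1 = Sunday to 7 = Saturday).

```python
from datetime import datetime
from statistics import mean, mode

TIME_FORMAT = "%m/%d/%y %H:%M"  # corresponds to the m/d/yy h:mm spreadsheet format


def excel_weekday(moment):
    """Day number in Excel's default WEEKDAY convention: 1 = Sunday ... 7 = Saturday."""
    return (moment.weekday() + 1) % 7 + 1


def summarize(trips):
    """Compute mean and max ride_length (minutes) and the mode of day_of_week."""
    ride_lengths = []
    days_of_week = []
    for started_at, ended_at in trips:
        start = datetime.strptime(started_at, TIME_FORMAT)
        end = datetime.strptime(ended_at, TIME_FORMAT)
        ride_lengths.append((end - start).total_seconds() / 60)
        days_of_week.append(excel_weekday(start))
    return {
        "mean_ride_length": mean(ride_lengths),
        "max_ride_length": max(ride_lengths),
        "mode_day_of_week": mode(days_of_week),
    }


if __name__ == "__main__":
    # Hypothetical trips: (started_at, ended_at).
    sample = [
        ("1/6/24 8:00", "1/6/24 8:30"),
        ("1/6/24 9:15", "1/6/24 9:25"),
        ("1/8/24 17:00", "1/8/24 17:50"),
    ]
    print(summarize(sample))
```

    Here two of the three sample rides start on a Saturday, so the mode of day_of_week is 7.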

    To support the client's question and identify trends and relationships, we made a Pivot Table in Excel so that we could present our insights to the client in an easy way. The Pivot Table makes it clearer that annual members use this service more than casual riders, and it also gives a good picture of how often annual members use the service. Using the Pivot Table, we found that the total number of rides for annual members is higher than for casual riders. We also found that the average ride length is greater for casual riders than for annual members, meaning that casual riders ride for longer periods than annual members. But annual members are using more often than casual ri...

  6. Statistical Abstract of the United States, 2011

    • dataverse-staging.rdmc.unc.edu
    Updated Oct 28, 2011
    Cite
    UNC Dataverse (2011). Statistical Abstract of the United States, 2011 [Dataset]. https://dataverse-staging.rdmc.unc.edu/dataset.xhtml?persistentId=hdl:1902.29/CD-10849
    Explore at:
    Dataset updated
    Oct 28, 2011
    Dataset provided by
    UNC Dataverse
    License

    https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=hdl:1902.29/CD-10849

    Description

    "The Statistical Abstract of the United States, published since 1878, is the standard summary of statistics on the social, political, and economic organization of the United States. It is designed to serve as a convenient volume for statistical reference and as a guide to other statistical publications and sources. The latter function is served by the introductory text to each section, the source note appearing below each table, and Appendix I, which comprises the Guide to Sources of Statistics, the Guide to State Statistical Abstracts, and the Guide to Foreign Statistical Abstracts. The Statistical Abstract sections and tables are compiled into one Adobe PDF named StatAbstract2009.pdf. This PDF is bookmarked by section and by table and can be searched using the Acrobat Search feature. The Statistical Abstract on CD-ROM is best viewed using Adobe Acrobat 5, or any subsequent version of Acrobat or Acrobat Reader. The Statistical Abstract tables and the metropolitan areas tables from Appendix II are available as Excel (.xls or .xlw) spreadsheets. In most cases, these spreadsheet files offer the user direct access to more data than are shown either in the publication or Adobe Acrobat. These files usually contain more years of data, more geographic areas, and/or more categories of subjects than those shown in the Acrobat version. The extensive selection of statistics is provided for the United States, with selected data for regions, divisions, states, metropolitan areas, cities, and foreign countries from reports and records of government and private agencies. Software on the disc can be used to perform full-text searches, view official statistics, open tables as Lotus worksheets or Excel workbooks, and link directly to source agencies and organizations for supporting information. Except as indicated, figures are for the United States as presently constituted. 
Although emphasis in the Statistical Abstract is primarily given to national data, many tables present data for regions and individual states and a smaller number for metropolitan areas and cities. Statistics for the Commonwealth of Puerto Rico and for island areas of the United States are included in many state tables and are supplemented by information in Section 29. Additional information for states, cities, counties, metropolitan areas, and other small units, as well as more historical data are available in various supplements to the Abstract. Statistics in this edition are generally for the most recent year or period available by summer 2006. Each year over 1,400 tables and charts are reviewed and evaluated; new tables and charts of current interest are added, continuing series are updated, and less timely data are condensed or eliminated. Text notes and appendices are revised as appropriate. This year we have introduced 72 new tables covering a wide range of subject areas. These cover a variety of topics including: learning disability for children, people impacted by the hurricanes in the Gulf Coast area, employees with alternative work arrangements, adult computer and Internet users by selected characteristics, North America cruise industry, women- and minority-owned businesses, and the percentage of the adult population considered to be obese. 
Some of the annually surveyed topics are population; vital statistics; health and nutrition; education; law enforcement, courts and prison; geography and environment; elections; state and local government; federal government finances and employment; national defense and veterans affairs; social insurance and human services; labor force, employment, and earnings; income, expenditures, and wealth; prices; business enterprise; science and technology; agriculture; natural resources; energy; construction and housing; manufactures; domestic trade and services; transportation; information and communication; banking, finance, and insurance; arts, entertainment, and recreation; accommodation, food services, and other services; foreign commerce and aid; outlying areas; and comparative international statistics." Note to Users: This CD is part of a collection located in the Data Archive of the Odum Institute for Research in Social Science, at the University of North Carolina at Chapel Hill. The collection is located in Room 10, Manning Hall. Users may check the CDs out subscribing to the honor system. Items can be checked out for a period of two weeks. Loan forms are located adjacent to the collection.

  7. Extended 1.0 Dataset of "Concentration and Geospatial Modelling of Health...

    • zenodo.org
    bin, csv, pdf
    Updated Sep 23, 2024
    Cite
    Peter Domjan; Viola Angyal; Istvan Vingender (2024). Extended 1.0 Dataset of "Concentration and Geospatial Modelling of Health Development Offices' Accessibility for the Total and Elderly Populations in Hungary" [Dataset]. http://doi.org/10.5281/zenodo.13826993
    Explore at:
    Available download formats: bin, pdf, csv
    Dataset updated
    Sep 23, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Peter Domjan; Viola Angyal; Istvan Vingender
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Sep 23, 2024
    Area covered
    Hungary
    Description

    Introduction

    We are enclosing the database used in our research titled "Concentration and Geospatial Modelling of Health Development Offices' Accessibility for the Total and Elderly Populations in Hungary", along with our statistical calculations. For the sake of reproducibility, further information can be found in the files Short_Description_of_Data_Analysis.pdf and Statistical_formulas.pdf.

    The sharing of data is part of our aim to strengthen the base of our scientific research. As of March 7, 2024, the detailed submission and analysis of our research findings to a scientific journal has not yet been completed.

    The dataset was expanded on 23rd September 2024 to include SPSS statistical analysis data, a heatmap, and buffer zone analysis around the Health Development Offices (HDOs) created in QGIS software.

    Short Description of Data Analysis and Attached Files (datasets):

    Our research utilised data from 2022, serving as the basis for statistical standardisation. The 2022 Hungarian census provided an objective basis for our analysis, with age group data available at the county level from the Hungarian Central Statistical Office (KSH) website. The 2022 demographic data provided an accurate picture compared to the data available from the 2023 microcensus. The calculation used is based on our standardisation of the 2022 data. For xlsx files, we used MS Excel 2019 (version: 1808, build: 10406.20006) with the SOLVER add-in.

    Hungarian Central Statistical Office served as the data source for population by age group, county, and regions: https://www.ksh.hu/stadat_files/nep/hu/nep0035.html, (accessed 04 Jan. 2024.) with data recorded in MS Excel in the Data_of_demography.xlsx file.

    In 2022, 108 Health Development Offices (HDOs) were operational, and it's noteworthy that no developments have occurred in this area since 2022. The availability of these offices and the demographic data from the Central Statistical Office in Hungary are considered public interest data, freely usable for research purposes without requiring permission.

    The contact details for the Health Development Offices were sourced from the following page (Hungarian National Population Centre (NNK)): https://www.nnk.gov.hu/index.php/efi (n=107). The Semmelweis University Health Development Centre was not listed by NNK, hence it was separately recorded as the 108th HDO. More information about the office can be found here: https://semmelweis.hu/egeszsegfejlesztes/en/ (n=1). (accessed 05 Dec. 2023.)

    Geocoordinates were determined using Google Maps (N=108): https://www.google.com/maps. (accessed 02 Jan. 2024.) Recording of geocoordinates (latitude and longitude according to WGS 84 standard), address data (postal code, town name, street, and house number), and the name of each HDO was carried out in the: Geo_coordinates_and_names_of_Hungarian_Health_Development_Offices.csv file.

    The foundational software for geospatial modelling and display (QGIS 3.34), an open-source software, can be downloaded from:

    https://qgis.org/en/site/forusers/download.html. (accessed 04 Jan. 2024.)

    The HDOs_GeoCoordinates.gpkg QGIS project file contains Hungary's administrative map and the recorded addresses of the HDOs, imported from the Geo_coordinates_and_names_of_Hungarian_Health_Development_Offices.csv file.

    The OpenStreetMap tileset is directly accessible from www.openstreetmap.org in QGIS. (accessed 04 Jan. 2024.)

    The Hungarian county administrative boundaries were downloaded from the following website: https://data2.openstreetmap.hu/hatarok/index.php?admin=6 (accessed 04 Jan. 2024.)

    HDO_Buffers.gpkg is a QGIS project file that includes the administrative map of Hungary, the county boundaries, as well as the HDO offices and their corresponding buffer zones with a radius of 7.5 km.
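    The 7.5 km buffer zones can be approximated outside QGIS with a great-circle distance check on the WGS 84 coordinates. The following is a minimal sketch, not the authors' QGIS workflow: it assumes a spherical Earth with a mean radius of 6371 km (the haversine formula), and the function names and sample coordinates are hypothetical. Buffers computed in QGIS on a projected layer may differ slightly at the edges.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_buffer(point, office, radius_km=7.5):
    """True if `point` (lat, lon) lies within `radius_km` of `office` (lat, lon)."""
    return haversine_km(point[0], point[1], office[0], office[1]) <= radius_km


# Hypothetical coordinates, for illustration only.
office = (47.00, 19.00)
print(within_buffer((47.05, 19.00), office))  # about 5.6 km north of the office
print(within_buffer((48.00, 19.00), office))  # about 111 km north of the office
```

    A point-in-buffer test like this is enough for counting settlements or residents near an office; drawing the buffer polygons themselves is better left to the GIS software.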

    Heatmap.gpkg is a QGIS project file that includes the administrative map of Hungary, the county boundaries, as well as the HDO offices and their corresponding heatmap (Kernel Density Estimation).

    A brief description of the statistical formulas applied is included in the Statistical_formulas.pdf.

    Recording of our base data for statistical concentration and diversification measurement was done using MS Excel 2019 (version: 1808, build: 10406.20006) in .xlsx format.

    • Aggregated number of HDOs by county: Number_of_HDOs.xlsx
    • Standardised data (Number of HDOs per 100,000 residents): Standardized_data.xlsx
    • Calculation of the Lorenz curve: Lorenz_curve.xlsx
    • Calculation of the Gini index: Gini_Index.xlsx
    • Calculation of the LQ index: LQ_Index.xlsx
    • Calculation of the Herfindahl-Hirschman Index: Herfindahl_Hirschman_Index.xlsx
    • Calculation of the Entropy index: Entropy_Index.xlsx
    • Regression and correlation analysis calculation: Regression_correlation.xlsx
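
    The concentration measures listed above can be sketched in a few lines of code. This is an illustrative sketch using common textbook definitions; the exact formulas the authors applied are documented in Statistical_formulas.pdf and may differ (for example in normalisation or logarithm base). All function names and the sample figures are hypothetical.

```python
import math


def shares(values):
    """Convert raw counts to shares that sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def per_100k(offices, population):
    """Standardised rate: offices per 100,000 residents, per region."""
    return [100_000 * o / p for o, p in zip(offices, population)]


def herfindahl_hirschman(values):
    """HHI as the sum of squared shares (ranges from 1/n to 1)."""
    return sum(s * s for s in shares(values))


def shannon_entropy(values):
    """Entropy index with natural logarithm; zero shares are skipped."""
    return -sum(s * math.log(s) for s in shares(values) if s > 0)


def gini(values):
    """Gini index from rank-weighted sorted values."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * sum(xs)) - (n + 1) / n


def location_quotients(offices, population):
    """LQ per region: share of offices divided by share of population."""
    return [o / p for o, p in zip(shares(offices), shares(population))]


# Hypothetical figures for three regions, not taken from the dataset.
offices = [4, 6, 10]
population = [200_000, 300_000, 500_000]
print(per_100k(offices, population))
print(location_quotients(offices, population))
print(herfindahl_hirschman(offices), shannon_entropy(offices), gini(offices))
```

    In this made-up example the offices are distributed exactly in proportion to population, so every location quotient equals 1 and every standardised rate is the same.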

    Using the SPSS 29.0.1.0 program, we performed the following statistical calculations with the databases Data_HDOs_population_without_outliers.sav and Data_HDOs_population.sav:

    • Regression curve estimation with elderly population and number of HDOs, excluding outlier values (Types of analyzed equations: Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power, S, Growth, Exponential, Logistic, with summary and ANOVA analysis table): Curve_estimation_elderly_without_outlier.spv
    • Pearson correlation table between the total population, elderly population, and number of HDOs per county, excluding outlier values such as Budapest and Pest County: Pearson_Correlation_populations_HDOs_number_without_outliers.spv.
    • Dot diagram including total population and number of HDOs per county, excluding outlier values such as Budapest and Pest Counties: Dot_HDO_total_population_without_outliers.spv.
    • Dot diagram including elderly (64<) population and number of HDOs per county, excluding outlier values such as Budapest and Pest Counties: Dot_HDO_elderly_population_without_outliers.spv
    • Regression curve estimation with total population and number of HDOs, excluding outlier values (Types of analyzed equations: Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power, S, Growth, Exponential, Logistic, with summary and ANOVA analysis table): Curve_estimation_without_outlier.spv
    • Dot diagram including elderly (64<) population and number of HDOs per county: Dot_HDO_elderly_population.spv
    • Dot diagram including total population and number of HDOs per county: Dot_HDO_total_population.spv
    • Pearson correlation table between the total population, elderly population, and number of HDOs per county: Pearson_Correlation_populations_HDOs_number.spv
    • Regression curve estimation with total population and number of HDOs, (Types of analyzed equations: Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power, S, Growth, Exponential, Logistic, with summary and ANOVA analysis table): Curve_estimation_total_population.spv

    For easier readability, the files have been provided in both SPV and PDF formats.

    The translation of these supplementary files into English was completed on 23rd Sept. 2024.

    If you have any further questions regarding the dataset, please contact the corresponding author: domjan.peter@phd.semmelweis.hu

  8. Statistical Abstract of the United States 1998

    • dataverse-staging.rdmc.unc.edu
    Updated Nov 30, 2007
    + more versions
    Cite
    UNC Dataverse (2007). Statistical Abstract of the United States 1998 [Dataset]. https://dataverse-staging.rdmc.unc.edu/dataset.xhtml?persistentId=hdl:1902.29/CD-0013
    Explore at:
    Dataset updated
    Nov 30, 2007
    Dataset provided by
    UNC Dataverse
    License

    https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=hdl:1902.29/CD-0013

    Description

    The Statistical Abstract is the Nation's best known and most popular single source of statistics on the social, political, and economic organization of the country. The print version of this reference source has been published since 1878 while the compact disc version first appeared in 1993. This disc is designed to serve as a convenient, easy-to-use statistical reference source and guide to statistical publications and sources. The disc contains over 1,400 tables from over 250 different governmental, private, and international organizations. The 1998 Statistical Abstract on CD-ROM, like the book, is a statistical reference and guide to over 250 statistical publications and sources from government and private organizations. This compact disc (CD) has 1,500 tables and charts from over 250 sources. Text and tables can be viewed or searched with the software. Tables and charts cover these subjects in 31 sections and 2 appendices: Population, Vital Statistics, Health and Nutrition, Education, Law Enforcement, Courts and Prisons, Geography and Environment, Parks, Recreation and Travel, Elections, State and Local Government, Finances and Employment, Federal Government, Finances and Employment, National Defense and Veterans Affairs, Social Insurance and Human Services, Labor Force, Employment and Earnings, Income, Expenditure and Wealth, Prices, Banking, Finance and Insurance, Business Enterprise, Communications, Energy, Science, Transportation -- Land, Transportation -- Air and Water, Agriculture, Forests and Fisheries, Mining and Mineral Products, Construction and Housing, Manufactures, Domestic Trade and Services, Foreign Commerce and Aid, Outlying Areas, Comparative International Statistics, State Rankings, Population of MSAs, Congressional District Profiles. There are changes this year in both the content of the information on the disc and software used for accessing and installing the information. 
As usual, updates have been made to most of the more than 1,500 tables and charts that were on the previous disc with new or more recent data. The spreadsheet files which are available in both the Excel or Lotus formats for these tables will usually have more information than displayed in the book or Adobe Acrobat files. There are also 93 new tables on such subjects as family planning, women's health, persons with disabilities, health insurance coverage, ambulatory surgery, school violence, household use of public libraries, public library of the Internet, toxic chemical releases, leisure activity, NCAA sports and high school athletic programs, voter registration, licensed child care centers, foster care, home-based businesses, employee benefits, home equity debt, use of debit credit cards, alcohol-related fatal accidents, computer shipments, and foreign stock market indices. See Appendix V on the disc for a complete list of the new tables presented. In the software area, a new opening screen using the DemoShield software has been added. This provides better access to the electronic version of the booklet, which is available from the opening screen; to the new tutorial, which steps the user through the principal ways to search for information on this disc; and to other related helpful information. It will also facilitate the installation process for the Adobe Acrobat Reader, the new Microsoft Excel Viewer, and QuickTime for viewing movies. The Adobe Acrobat Reader and Search engine, version 3.01, is on the disc. The Acrobat Reader allows users to view, navigate, search, and print on demand any of the pages from the book. Note to Users: This CD is part of a collection located in the Data Archive of the Odum Institute for Research in Social Science, at the University of North Carolina at Chapel Hill. The collection is located in Room 10, Manning Hall. Users may check the CDs out subscribing to the honor system. Items can be checked out for a period of two weeks. 
Loan forms are located adjacent to the collection.

  9. Data from: Particle dynamics of nanoplastics suspended in water with soil...

    • data.niaid.nih.gov
    • datasetcatalog.nlm.nih.gov
    • +2more
    zip
    Updated Jan 6, 2025
    Cite
    Anton Astner; Sai Venkatesh Pingali; Hugh O'Neill; Barbara Evans; Volker Urban; Kenneth Littrell; Douglas Hayes (2025). Particle dynamics of nanoplastics suspended in water with soil microparticles: Insights from small angle neutron scattering (SANS) and ultra-SANS [Dataset]. http://doi.org/10.5061/dryad.2z34tmpws
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 6, 2025
    Dataset provided by
    Oak Ridge National Laboratory
    University of Tennessee Institute of Agriculture
    Authors
    Anton Astner; Sai Venkatesh Pingali; Hugh O'Neill; Barbara Evans; Volker Urban; Kenneth Littrell; Douglas Hayes
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Small-angle neutron scattering (SANS) and Ultra-SANS (USANS) were employed to understand the agglomeration behavior of nanoplastics (NPs) formed from a biodegradable mulch film, and microparticles of vermiculite (V), an artificial soil, suspended in water in the presence of low convective shear (ex-situ stirring) prior to measurements. Neutron contrast matching was employed to minimize the signal of V (by 100-fold) and thereby isolate the signal due to NPs in the neutron beam, as the contrast match point (CMP) for V (67 vol% deuteration in water) differed from that of NPs by more than 20%. The original NPs’ size distribution was bimodal: < 200 nm and 500-1200 nm, referred to as small and large NPs, i.e., SNPs and LNPs, respectively. In the absence of V, SNPs formed agglomerates at higher concentrations, with size decreasing slightly with stirring time to 40-50 nm, while the size of LNPs remained unchanged. The presence of V at 2-fold lower concentration than NPs did not change the size of SNPs but reduced the size of LNPs by nearly 2-fold as stirring time increased. Because the size of SNPs and LNPs did not differ substantially between solvents, both at CMP and 100% D2O, even with nanosized V particles contributing toward scattered intensity for the latter solvent, it is evident that SNPs and LNPs are mainly composed of NPs and not V. The results suggest that LNPs are susceptible to size reduction through collisions with soil microparticles via convection, yielding SNPs near soil-water interfaces within vadose zones.

    Methods

    Data for Fig 1 (nanoplastic recovery suspended in water and settling out) was collected in the laboratory and the results were recorded in a Microsoft Excel file. Other data was collected on the small-angle neutron scattering (SANS) and ultra-SANS instruments at Oak Ridge National Laboratory, specifically, the Bio-SANS (high-flux isotope reactor) and Beamline 1A (spallation neutron source), respectively (downloaded into Microsoft Excel files and displayed in Figs 2 and S1-S5). Data for Figs 3 and S6 include form factor-structure factor modeling of merged SANS and USANS data, after subtraction of a power law relationship. Modeling was done using Igor Pro-based software written by National Institute of Standards scientists and the model fits to the data and resultant parameters were downloaded to Microsoft Excel files. The models' parameters allowed for determination of box plots and histograms of nanoplastic size and size distribution under several different conditions, Figs 4 and S7, respectively. The latter two figures were generated using JMP software, and were subsequently downloaded to Microsoft Excel files.

  10. Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program...

    • openicpsr.org
    Updated Mar 29, 2018
    + more versions
    Cite
    Jacob Kaplan (2018). Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program Data: Arrests by Age, Sex, and Race, 1974-2018 [Dataset]. http://doi.org/10.3886/E102263V11
    Explore at:
    Dataset updated
    Mar 29, 2018
    Dataset provided by
    University of Pennsylvania
    Authors
    Jacob Kaplan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    1974 - 2018
    Area covered
    United States
    Description

    Version 11 release notes:
    Changes release notes description, does not change data.
    Version 10 release notes:
    The data now has the following age categories (which were previously aggregated into larger groups to reduce file size): under 10, 10-12, 13-14, 40-44, 45-49, 50-54, 55-59, 60-64, over 64. These categories are available for female, male, and total (female+male) arrests. The previous aggregated categories (under 15, 40-49, and over 49) have been removed from the data.
    Version 9 release notes:
    For each offense, adds a variable indicating the number of months that offense was reported - these variables are labeled as "num_months_[crime]" where [crime] is the offense name. These variables are generated by the number of times one or more arrests were reported per month for that crime. For example, if there was at least one arrest for assault in January, February, March, and August (and no other months), there would be four months reported for assault. Please note that this does not differentiate between an agency not reporting that month and actually having zero arrests. The variable "number_of_months_reported" is still in the data and is the number of months that any offense was reported. So if any agency reports murder arrests every month but no other crimes, the murder number of months variable and the "number_of_months_reported" variable will both be 12 while every other offense number of month variable will be 0. Adds data for 2017 and 2018.
    Version 8 release notes:
    Adds annual data in R format. Changes project name to avoid confusing this data for the ones done by NACJD. Fixes bug where bookmaking was excluded as an arrest category. Changed the number of categories to include more offenses per category to have fewer total files. Added a "total_race" file for each category - this file has total arrests by race for each crime and a breakdown of juvenile/adult by race.
    Version 7 release notes:
    Adds 1974-1979 data. Adds monthly data (only totals by sex and race, not by age-categories). All data now from FBI, not NACJD. Changes some column names so all columns are <=32 characters to be usable in Stata. Changes how number of months reported is calculated. Now it is the number of unique months with arrest data reported - months of data from the monthly header file (i.e. juvenile disposition data) are not considered in this calculation.
    Version 6 release notes:
    Fix bug where juvenile female columns had the same value as juvenile male columns.
    Version 5 release notes:
    Removes support for SPSS and Excel data. Changes the crimes that are stored in each file. There are more files now with fewer crimes per file. The files and their included crimes have been updated below. Adds in agencies that report 0 months of the year. Adds a column that indicates the number of months reported. This is generated by summing up the number of unique months an agency reports data for. Note that this indicates the number of months an agency reported arrests for ANY crime. They may not necessarily report every crime every month. Agencies that did not report a crime will have a value of NA for every arrest column for that crime. Removes data on runaways.
    Version 4 release notes:
    Changes column names from "poss_coke" and "sale_coke" to "poss_heroin_coke" and "sale_heroin_coke" to clearly indicate that these columns include the sale of heroin as well as similar opiates such as morphine, codeine, and opium. Also changes column names for the narcotic columns to indicate that they are only for synthetic narcotics.
    Version 3 release notes:
    Add data for 2016. Order rows by year (descending) and ORI.
    Version 2 release notes:
    Fix bug where Philadelphia Police Department had incorrect FIPS county code.

    The Arrests by Age, Sex, and Race (ASR) data is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. 
This data contains highly granular data on the number of people arrested for a variety of crimes (see below for a full list of included crimes). The data sets here combine data from the years 1974-2018 into a single file for each group of crimes. Each monthly file is only a single year as my laptop can't handle combining all the years together. These files are quite large and may take some time to load. Columns are crime-arrest category units. For example, If you choose the data set that includes murder, you would have rows for each age
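
    The months-reported variables described in the Version 9 release notes can be illustrated with a short sketch. This is not the author's code (the release notes say the data was prepared in R); it is a hypothetical Python illustration that assumes monthly arrest counts are available as a nested dictionary keyed by offense and month number.

```python
def months_reported(monthly_arrests):
    """Count reporting months per offense and across all offenses.

    monthly_arrests: {offense: {month_number: arrests}} (hypothetical layout).
    A month counts as reported for an offense when at least one arrest is
    recorded, so a true zero and a missing report are indistinguishable.
    """
    result = {}
    any_offense_months = set()
    for offense, by_month in monthly_arrests.items():
        months = {month for month, arrests in by_month.items() if arrests >= 1}
        result[f"num_months_{offense}"] = len(months)
        any_offense_months |= months  # union of months over every offense
    result["number_of_months_reported"] = len(any_offense_months)
    return result


# Assault arrests in January, February, March, and August only.
example = {"assault": {1: 3, 2: 1, 3: 2, 8: 5, 9: 0}, "murder": {}}
print(months_reported(example))
```

    Taking the union of months across offenses matches the stated meaning of "number_of_months_reported" as the number of months in which any offense was reported.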

  11. TMS daily traffic counts CSV

    • hub.arcgis.com
    Updated Aug 30, 2020
    Cite
    Waka Kotahi (2020). TMS daily traffic counts CSV [Dataset]. https://hub.arcgis.com/datasets/9cb86b342f2d4f228067a7437a7f7313
    Explore at:
    Dataset updated
    Aug 30, 2020
    Dataset authored and provided by
    Waka Kotahi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    You can also access an API version of this dataset.

    TMS (traffic monitoring system) daily-updated traffic counts API

    Important note: due to the size of this dataset, you won't be able to open it fully in Excel. Use notepad / R / any software package which can open more than a million rows.
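
    Because the file exceeds what Excel can open, one option is to stream it row by row instead of loading it whole. The sketch below uses only the Python standard library; the column names `site_id` and `traffic_count` are assumptions for illustration and should be replaced with the headers actually present in the CSV.

```python
import csv
import io
from collections import defaultdict


def total_by_site(lines, site_col="site_id", count_col="traffic_count"):
    """Sum a numeric column per site while reading one row at a time.

    `lines` is any iterable of text lines, such as an open file handle.
    The default column names are hypothetical placeholders.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row[site_col]] += float(row[count_col])
    return dict(totals)


# Tiny in-memory sample standing in for the real multi-million-row file.
sample = io.StringIO("site_id,traffic_count\nA,10\nB,5\nA,2.5\n")
print(total_by_site(sample))
```

    For the real file, pass an open handle instead, for example `with open(path, newline="") as f: total_by_site(f)`, so that memory use stays roughly constant regardless of row count.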

    Data reuse caveats: as per license.

    Data quality statement: please read the accompanying user manual, explaining:

    • how this data is collected
    • identification of count stations
    • traffic monitoring technology
    • monitoring hierarchy and conventions
    • typical survey specification
    • data calculation
    • TMS operation.

    Traffic monitoring for state highways: user manual [PDF 465 KB]

    The data is at daily granularity. However, the actual update frequency of the data depends on the contract the site falls within. For telemetry sites it's once a week on a Wednesday. Some regional sites are fortnightly, and some monthly or quarterly. Some are only 4 weeks a year, with timing depending on contractors’ programme of work.

    Data quality caveats: you must use this data in conjunction with the user manual and the following caveats.

    • The road sensors used in data collection are subject to both technical errors and environmental interference.
    • Data is compiled from a variety of sources. Accuracy may vary and the data should only be used as a guide.
    • As not all road sections are monitored, a direct calculation of Vehicle Kilometres Travelled (VKT) for a region is not possible.
    • Data is sourced from Waka Kotahi New Zealand Transport Agency TMS data.
    • For sites that use dual loops, classification is by length. Vehicles with a length of less than 5.5 m are classed as light vehicles. Vehicles over 11 m long are classed as heavy vehicles. Vehicles between 5.5 and 11 m are split 50:50 into light and heavy.
    • In September 2022, the National Telemetry contract was handed to a new contractor. During the handover process, due to some missing documents and aged technology, 40 of the 96 national telemetry traffic count sites went offline. The current contractor has continued to upload data from all active sites and has gradually worked to bring most offline sites back online. Please note and account for possible gaps in data from National Telemetry Sites.
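
    The length-based rule for dual-loop sites stated in the caveats can be written as a small function. This is an illustrative sketch, not Waka Kotahi's implementation: the function name is hypothetical, and treating vehicles of exactly 5.5 m or exactly 11 m as part of the 50:50 band is an assumption, since the caveat does not say how the boundaries are handled.

```python
def classify_by_length(lengths_m):
    """Return (light, heavy) vehicle counts from a list of lengths in metres.

    Under 5.5 m counts as light, over 11 m as heavy, and anything in
    between contributes half a vehicle to each class (the 50:50 split).
    Boundary values are assigned to the split band by assumption.
    """
    light = 0.0
    heavy = 0.0
    for length in lengths_m:
        if length < 5.5:
            light += 1.0
        elif length > 11.0:
            heavy += 1.0
        else:
            light += 0.5
            heavy += 0.5
    return light, heavy


# One car, two mid-length vehicles, one long truck (made-up lengths).
print(classify_by_length([4.2, 7.0, 9.0, 12.5]))
```

    Because mid-length vehicles are split, the light and heavy totals can be fractional even though the overall vehicle count is a whole number.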
    

    The NZTA Vehicle Classification Relationships diagram below shows the length classification (typically dual loops) and axle classification (typically pneumatic tube counts), and how these map to the Monetised benefits and costs manual, table A37, page 254.

    Monetised benefits and costs manual [PDF 9 MB]

    For the full TMS classification schema see Appendix A of the traffic counting manual vehicle classification scheme (NZTA 2011), below.

    Traffic monitoring for state highways: user manual [PDF 465 KB]

    State highway traffic monitoring (map)

    State highway traffic monitoring sites

  12. Uniform Crime Reporting (UCR) Program Data: Arrests by Age, Sex, and Race,...

    • search.datacite.org
    • openicpsr.org
    Updated 2018
    Cite
    Jacob Kaplan (2018). Uniform Crime Reporting (UCR) Program Data: Arrests by Age, Sex, and Race, 1980-2016 [Dataset]. http://doi.org/10.3886/e102263v5-10021
    Explore at:
    Dataset updated
    2018
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    DataCite (https://www.datacite.org/)
    Authors
    Jacob Kaplan
    Description

    Version 5 release notes:
    Removes support for SPSS and Excel data. Changes the crimes that are stored in each file. There are more files now with fewer crimes per file. The files and their included crimes have been updated below.
    Adds in agencies that report 0 months of the year. Adds a column that indicates the number of months reported. This is generated by summing up the number of unique months an agency reports data for. Note that this indicates the number of months an agency reported arrests for ANY crime. They may not necessarily report every crime every month. Agencies that did not report a crime will have a value of NA for every arrest column for that crime. Removes data on runaways.
    Version 4 release notes:
    Changes column names from "poss_coke" and "sale_coke" to "poss_heroin_coke" and "sale_heroin_coke" to clearly indicate that these columns include the sale of heroin as well as similar opiates such as morphine, codeine, and opium. Also changes column names for the narcotic columns to indicate that they are only for synthetic narcotics.
    Version 3 release notes:
    Add data for 2016. Order rows by year (descending) and ORI.
    Version 2 release notes:
    Fix bug where Philadelphia Police Department had incorrect FIPS county code.
    The Arrests by Age, Sex, and Race data is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. This data contains highly granular data on the number of people arrested for a variety of crimes (see below for a full list of included crimes). The data sets here combine data from the years 1980-2015 into a single file. These files are quite large and may take some time to load.
    All the data was downloaded from NACJD as ASCII+SPSS Setup files and read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R. For the R code used to clean this data, see here. https://github.com/jacobkap/crime_data. If you have any questions, comments, or suggestions please contact me at jkkaplan6@gmail.com.

    I did not make any changes to the data other than the following. When an arrest column has a value of "None/not reported", I change that value to zero. This makes the (possibly incorrect) assumption that these values represent zero crimes reported. The original data does not have a value when the agency reports zero arrests other than "None/not reported." In other words, this data does not differentiate between real zeros and missing values. Some agencies also incorrectly report the following numbers of arrests which I change to NA: 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 99999, 99998.
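
    The two recoding rules above can be expressed compactly. The author states that the cleaning was done in R; the following Python sketch only illustrates the stated rules, with a hypothetical function name and with `None` standing in for NA.

```python
# Values the description says are reported in error and recoded to NA.
IMPLAUSIBLE_COUNTS = {
    10000, 20000, 30000, 40000, 50000, 60000,
    70000, 80000, 90000, 100000, 99999, 99998,
}


def clean_arrest_value(raw):
    """Apply the two recoding rules to a single raw arrest value.

    "None/not reported" becomes 0 (so real zeros and missing reports are
    not distinguished), and listed implausible counts become None (NA).
    """
    if raw == "None/not reported":
        return 0
    value = int(raw)
    if value in IMPLAUSIBLE_COUNTS:
        return None
    return value


print([clean_arrest_value(v) for v in ["None/not reported", "12", "99999", 30000]])
```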

    To reduce file size and make the data more manageable, all of the data is aggregated yearly. All of the data is in agency-year units such that every row indicates an agency in a given year. Columns are crime-arrest category units. For example, if you choose the data set that includes murder, you would have rows for each agency-year and columns with the number of people arrested for murder. The ASR data breaks down arrests by age and gender (e.g. Male aged 15, Male aged 18). They also provide the number of adults or juveniles arrested by race. Because most agencies and years do not report the arrestee's ethnicity (Hispanic or not Hispanic) or juvenile outcomes (e.g. referred to adult court, referred to welfare agency), I do not include these columns.

    To make it easier to merge with other data, I merged this data with the Law Enforcement Agency Identifiers Crosswalk (LEAIC) data. The data from the LEAIC add FIPS (state, county, and place) and agency type/subtype. Please note that some of the FIPS codes have leading zeros and if you open it in Excel it will automatically delete those leading zeros.
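
    If leading zeros have already been lost, for example after a round trip through Excel, the codes can be padded back to their fixed widths. The sketch below is a hypothetical helper, not part of the dataset's tooling; it relies on the standard FIPS widths of 2 digits for states, 3 for counties, and 5 for places.

```python
# Standard FIPS code widths by geographic level.
FIPS_WIDTHS = {"state": 2, "county": 3, "place": 5}


def restore_fips(value, level):
    """Left-pad a FIPS code with zeros to the width expected for `level`."""
    return str(value).strip().zfill(FIPS_WIDTHS[level])


print(restore_fips(1, "state"))     # state code stored as the number 1
print(restore_fips("7", "county"))  # county code stored as the text "7"
```

    Reading the columns as text in the first place avoids the problem entirely; padding afterwards is only a repair step.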

    I created 9 arrest categories myself. The categories are:
    • Total Male Juvenile
    • Total Female Juvenile
    • Total Male Adult
    • Total Female Adult
    • Total Male
    • Total Female
    • Total Juvenile
    • Total Adult
    • Total Arrests

    All of these categories are based on the sums of the sex-age categories (e.g. Male under 10, Female aged 22) rather than using the provided age-race categories (e.g. adult Black, juvenile Asian). As not all agencies report the race data, my method is more accurate. These categories also make up the data in the "simple" version of the data. The "simple" file only includes the above 9 columns as the arrest data (all other columns in the data are just agency identifier columns). Because this "simple" data set needs fewer columns, I include all offenses.

    As the arrest data is very granular, and each category of arrest is its own column, there are dozens of columns per crime. To keep the data somewhat manageable, there are nine different files, eight which contain different crimes and the "simple" file. Each file contains the data for all years. The eight categories each have crimes belonging to a major crime category and do not overlap in crimes other than with the index offenses. Please note that the crime names provided below are not the same as the column names in the data. Due to Stata limiting column names to 32 characters maximum, I have abbreviated the crime names in the data. The files and their included crimes are:

    • Index Crimes: Murder, Rape, Robbery, Aggravated Assault, Burglary, Theft, Motor Vehicle Theft, Arson
    • Alcohol Crimes: DUI, Drunkenness, Liquor
    • Drug Crimes: Total Drug, Total Drug Sales, Total Drug Possession, Cannabis Possession, Cannabis Sales, Heroin or Cocaine Possession, Heroin or Cocaine Sales, Other Drug Possession, Other Drug Sales, Synthetic Narcotic Possession, Synthetic Narcotic Sales
    • Grey Collar and Property Crimes: Forgery, Fraud, Stolen Property
    • Financial Crimes: Embezzlement, Total Gambling, Other Gambling, Bookmaking, Numbers Lottery
    • Sex or Family Crimes: Offenses Against the Family and Children, Other Sex Offenses, Prostitution, Rape
    • Violent Crimes: Aggravated Assault, Murder, Negligent Manslaughter, Robbery, Weapon Offenses
    • Other Crimes: Curfew, Disorderly Conduct, Other Non-traffic, Suspicion, Vandalism, Vagrancy
    • Simple: This data set has every crime and only the arrest categories that I created (see above).
    If you have any questions, comments, or suggestions please contact me at jkkaplan6@gmail.com.

  13. Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program...

    • openicpsr.org
    Updated Mar 29, 2018
    Jacob Kaplan (2018). Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program Data: Arrests by Age, Sex, and Race, 1974-2020 [Dataset]. http://doi.org/10.3886/E102263V14
    Explore at:
    Dataset updated
    Mar 29, 2018
    Dataset provided by
    Princeton University
    Authors
    Jacob Kaplan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    1974 - 2020
    Area covered
    United States
    Description

    For a comprehensive guide to this data and other UCR data, please see my book at ucrbook.com

    Version 14 release notes:
    Adds 2020 data. Please note that the FBI retired UCR data after the 2020 data, so this will be the last Arrests by Age, Sex, and Race data they release.

    Version 13 release notes:
    Changes R files from .rda to .rds.
    Fixes bug where the number_of_months_reported variable was incorrectly set to the largest number of months reported for any single crime variable. For example, if theft was reported Jan-June and robbery was reported July-December in an agency, in total there were 12 months reported. But since each crime (and let's assume no other crime was reported more than 6 months of the year) was only reported 6 months, the number_of_months_reported variable was incorrectly set at 6 months. Now it is the total number of months in which any crime was reported, so it would be set to 12 months in this example. Thank you to Nick Eubank for alerting me to this issue.
    Adds rows even when an agency reported zero arrests that month; all arrest values are set to zero for these rows.

    Version 12 release notes:
    Adds 2019 data.

    Version 11 release notes:
    Changes release notes description, does not change data.

    Version 10 release notes:
    The data now has the following age categories (which were previously aggregated into larger groups to reduce file size): under 10, 10-12, 13-14, 40-44, 45-49, 50-54, 55-59, 60-64, over 64. These categories are available for female, male, and total (female+male) arrests. The previous aggregated categories (under 15, 40-49, and over 49) have been removed from the data.

    Version 9 release notes:
    For each offense, adds a variable indicating the number of months that offense was reported. These variables are labeled as "num_months_[crime]" where [crime] is the offense name. They are generated by counting the months in which one or more arrests were reported for that crime. For example, if there was at least one arrest for assault in January, February, March, and August (and no other months), there would be four months reported for assault. Please note that this does not differentiate between an agency not reporting that month and actually having zero arrests. The variable "number_of_months_reported" is still in the data and is the number of months that any offense was reported. So if an agency reports murder arrests every month but no other crimes, the murder number of months variable and the "number_of_months_reported" variable will both be 12 while every other offense number of months variable will be 0.
    Adds data for 2017 and 2018.

    Version 8 release notes:
    Adds annual data in R format.
    Changes project name to avoid confusing this data with the data released by NACJD.
    Fixes bug where bookmaking was excluded as an arrest category.
    Changes the number of categories to include more offenses per category, so there are fewer total files.
    Adds a "total_race" file for each category. This file has total arrests by race for each crime and a breakdown of juvenile/adult by race.

    Version 7 release notes:
    Adds 1974-1979 data.
    Adds monthly data (only totals by sex and race, not by age categories).
    All data now from FBI, not NACJD.
    Changes some column names so all columns are <=32 characters to be usable in Stata.
    Changes how number of months reported is calculated. Now it is the number of unique months with arrest data reported; months of data from the monthly header file (i.e. juvenile disposition data) are not considered in this calculation.

    Version 6 release notes:
    Fixes bug where juvenile female columns had the same value as juvenile male columns.

    Version 5 release notes:
    Removes support for SPSS and Excel data.
    Changes the crimes that are stored in each file. There are more files now with fewer crimes per file. The files and their included crimes have been updated below.
    Adds in agencies that report 0 months of the year.
    Adds a column that indicates the number of months reported. This is generated by summing up the number of unique months an agency reports data for. Note that this indicates the number of months an agency reported arrests for ANY crime. They may not necessarily report every crime every month. Agencies that did not report a crime will have a value of NA for every arrest column for that crime.
    Removes data on runaways.

    Version 4 release notes:
    Changes column names from "poss_coke" and "sale_coke" to "poss_heroi
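The Version 13 fix to number_of_months_reported described in the release notes above can be illustrated with a minimal Python sketch. It is not the author's code: the function names and the dictionary layout (crime name mapped to the set of month numbers with reported arrests) are assumptions made for illustration only.

```python
from typing import Dict, Set


def months_reported_any_crime(months_by_crime: Dict[str, Set[int]]) -> int:
    """Corrected logic: count unique months in which ANY crime was reported."""
    all_months: Set[int] = set()
    for months in months_by_crime.values():
        all_months |= months  # union of months across crimes
    return len(all_months)


def months_reported_largest_single_crime(months_by_crime: Dict[str, Set[int]]) -> int:
    """Old (buggy) logic: the largest month count of any single crime."""
    return max((len(months) for months in months_by_crime.values()), default=0)


# Hypothetical agency from the release-note example:
# theft reported Jan-June, robbery reported July-December.
agency = {
    "theft": {1, 2, 3, 4, 5, 6},
    "robbery": {7, 8, 9, 10, 11, 12},
}

print(months_reported_any_crime(agency))             # corrected value
print(months_reported_largest_single_crime(agency))  # value produced by the bug
```

Taking the union of months across crimes is what distinguishes the corrected variable from the per-crime "num_months_[crime]" variables, which count months for one offense at a time.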

  14. Community Level Interaction Program (CLIP) 2 Excel Data, Latnjajaure site,...

    • arcticdata.io
    • search.dataone.org
    Updated Sep 28, 2020
    Juha Alatalo (2020). Community Level Interaction Program (CLIP) 2 Excel Data, Latnjajaure site, Sweden, 1995-2000 [Dataset]. http://doi.org/10.18739/A2B56D55W
    Explore at:
    Dataset updated
    Sep 28, 2020
    Dataset provided by
    Arctic Data Center
    Authors
    Juha Alatalo
    Time period covered
    Jul 28, 1995 - Jul 26, 2000
    Area covered
    Sweden
    Description

    This dataset contains CLIP 2 Excel Community data from the Latnjajaure site, Sweden, in 1995, 1996, 1997, 1999 and 2000. The Community Level Interaction Program (CLIP) data comprises a block in a mesic meadow. This dataset is in Excel format and contains all the data for CLIP 2. For more information, please see the readme file.

  15. CompanyData.com (BoldData) - Technographic Data (IT data from 380M+...

    • datarade.ai
    Updated Oct 19, 2023
    CompanyData.com (BoldData) (2023). CompanyData.com (BoldData) - Technographic Data (IT data from 380M+ businesses, 150+ countries) [Dataset]. https://datarade.ai/data-products/technographic-data-it-data-from-340m-businesses-150-count-bolddata
    Explore at:
    Available download formats: .json, .csv, .xls, .txt
    Dataset updated
    Oct 19, 2023
    Dataset authored and provided by
    CompanyData.com (BoldData)
    Area covered
    Indonesia, Timor-Leste, Hong Kong, Malawi, Dominican Republic, Andorra, Samoa, Faroe Islands, Singapore, Belarus
    Description

    CompanyData.com (BoldData) provides in-depth technographic data on over 380 million verified businesses across 150+ countries. Our service combines official trade registry data with detailed technology usage insights—giving you a comprehensive view of a company’s IT stack, software tools, infrastructure, and digital maturity. Whether you're selling SaaS, targeting specific platforms, or building data models around tech adoption, our global technographic intelligence is built for precision.

    Each company profile is enriched with firmographic details (like industry, size, and location), contact information (including emails, direct dials, and key decision-makers), and layered with verified insights into the technologies they use—from CRM systems and cloud platforms to cybersecurity solutions and developer tools. All data is continuously updated, cross-checked with public records, and structured to support enterprise-grade analytics and outreach.

    Our technographic data is used across industries for a wide range of strategic initiatives: tech sales targeting, account-based marketing, competitive intelligence, CRM enrichment, AI model training, and more. Companies rely on our insights to segment audiences, align messaging with tech stacks, and identify the right moment to engage. Whether you're pursuing mid-market tech buyers or global enterprise leads, our data empowers smarter engagement.

    Data can be delivered however you need it—through tailored CSV or Excel files, real-time API connections, or our user-friendly self-service platform. With global coverage and decades of expertise in structured business intelligence, CompanyData.com (BoldData) enables companies to navigate the digital economy with sharper focus, deeper insights, and data they can trust.

  16. Population and Housing Census 2011 - Niue

    • microdata.pacificdata.org
    Updated Aug 18, 2013
    Niue Statistics (2013). Population and Housing Census 2011 - Niue [Dataset]. https://microdata.pacificdata.org/index.php/catalog/24
    Explore at:
    Dataset updated
    Aug 18, 2013
    Dataset authored and provided by
    Niue Statistics
    Time period covered
    2011
    Area covered
    Niue
    Description

    Abstract

    The main aim of the census is to provide benchmark statistics and a comprehensive profile of the population and households of Niue at a given time. The information obtained from the census is crucial for providing evidence for decision making and policy formulation by the Government, the business community, local communities or village councils, non-government organisations of Niue, and international communities that have an interest in Niue and its people.

    Geographic coverage

    National Coverage

    Analysis unit

    A Population and Household Census has the following units of analysis: households, individuals/persons, and members overseas.

    Universe

    All households in Niue and all persons in the household including those temporarily overseas and those absent for not more than 12 months.

    Kind of data

    Census/enumeration data [cen]

    Sampling procedure

    Not Applicable to a complete Enumeration Census.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire was published in English; a translated questionnaire was on hand when requested by the respondent.

    The questionnaire design differed slightly from the design of previous census questionnaires. As usual, government departments were asked to submit a list of questions on any specific topic they would like to add. Responses were not forthcoming in this census, although a few new questions were included.

    There were two types of questionnaires used in the census: the household questionnaire and the individual questionnaire. An enumerator manual was prepared to assist the enumerators in their duties.

    The questionnaire was pre-tested by the enumerators before they were to go out for field enumeration.

    Cleaning operations

    Census processing began as soon as questionnaires were checked and coded. Forms were checked, edited and coded before being entered into the computer database.

    Data processing was assisted by the Secretariat of the Pacific Community (SPC) using the computer software program CSPro for data entry and for generating tables. Tables were then exported to Excel for analysis.

    Occupation and Industry were coded using the United Nations International Standard Classification of Occupation and International Standard Industrial Classification.

    It is standard practice that as each area was completed the forms were first checked by the field supervisors for missing information and obvious inconsistencies. Omissions and errors identified at this stage were corrected by the enumerators.

    The next stage was for the field supervisors to go through the completed forms again in the office to check in more detail for omissions and logical inconsistencies. Where these were found, the supervisors were responsible for taking the necessary action.

    Once the questionnaires had been thoroughly checked and edited, they were then coded in preparation for data processing.

    Checking, editing and coding of the questionnaires in the office were done after normal working hours to ensure that the confidentiality of the survey was observed.

    Response rate

    Complete enumeration of all households

    Sampling error estimates

    Not Applicable

  17. CompanyData.com (BoldData) - BoldData - B2B Contact Data & Company Data in...

    • datarade.ai
    Updated Aug 5, 2025
    CompanyData.com (BoldData) (2025). CompanyData.com (BoldData) - BoldData - B2B Contact Data & Company Data in Transport and Logistics (1.3M contacts) [Dataset]. https://datarade.ai/data-products/transport-and-logistics-data-bolddata
    Explore at:
    Available download formats: .json, .csv, .xls, .txt
    Dataset updated
    Aug 5, 2025
    Dataset authored and provided by
    CompanyData.com (BoldData)
    Area covered
    Northern Mariana Islands, Saint Martin (French part), Jordan, Egypt, Suriname, Taiwan, Bosnia and Herzegovina, Palestine, Latvia, Argentina
    Description

    CompanyData.com (powered by BoldData) offers industry-specific B2B intelligence with a focus on accuracy, compliance, and global reach. Our transport and logistics database features 1.3 million verified contacts across logistics providers, freight carriers, warehousing companies, shipping firms, and supply chain operators. Sourced from official trade registers and public authorities, this dataset connects you to the core of one of the world’s most vital and fast-moving sectors.

    Each record includes firmographic data such as company size, location, and industry codes—along with verified contact details like emails, mobile numbers, direct dials, and decision-maker names. You also get access to company hierarchies and operational insights that allow for deeper segmentation and more effective outreach. The data is GDPR-compliant, constantly refreshed, and structured for seamless use in sales, analytics, and automation tools.

    Businesses use this data for a wide range of purposes: lead generation, CRM enrichment, marketing automation, AI modeling, supply chain research, KYC checks, and more. Whether you’re selling fleet management software or analyzing international freight networks, our data gives you the insight and accuracy to engage the right people at the right companies—at the right time.

    We offer flexible delivery options tailored to your workflow. Choose from customized Excel or CSV exports, access data in real-time via our robust API, or explore companies through our user-friendly self-service platform. With coverage spanning over 380 million companies across 190+ countries, CompanyData.com (BoldData) helps you unlock smarter strategies and measurable results in transport and logistics.

  18. RCS Data Austria

    • listtodata.com
    • jw.listtodata.com
    .csv, .xls, .txt
    Updated Jul 17, 2025
    List to Data (2025). RCS Data Austria [Dataset]. https://listtodata.com/rcs-data-austria
    Explore at:
    Available download formats: .csv, .xls, .txt
    Dataset updated
    Jul 17, 2025
    Authors
    List to Data
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2025 - Dec 31, 2025
    Area covered
    Austria
    Variables measured
    phone numbers, email address, full name, address, city, state, gender, age, income, IP address
    Description

    RCS Data Austria is a countrywide resource for direct marketing campaigns and a directory of B2C contacts. The database website is known worldwide for delivering contact numbers that are 95% accurate. Around 9.01 million people live in Austria, and we have a list of their active contacts. As a vendor, you can advertise your company details instantly, which increases productivity and can bring a large return on investment (ROI). RCS Data Austria can also be a tool for SMS marketing. The website provides many genuine sales leads at an affordable price, so the seller earns more from the business than it spends. The country's economy is growing day by day, so you can start any business from here. RCS Data Austria is helpful for marketing and business and can play a crucial role in direct sales. Austria RCS Data supplies many potential contacts for digital marketing. Our skilled unit collects these contact leads from genuine sites. It takes less time to reach many new clients, which creates more possibilities for the company to increase sales. We do not compromise on protection and we follow the rules of GDPR, so you can use the data with confidence. This Austria RCS Data is effective for business publicity through SMS: you can share your trade information by sending text messages to customers, who will learn about it instantly and give you feedback. After you buy this lead, we deliver it to you in CSV or Excel format, and it can be used in CRM software at any time. You can buy this Austria RCS Data from our site.

  19. Supplement to: Smart speed imaging in Digital Image Correlation: Application...

    • b2find.eudat.eu
    Updated Nov 29, 2022
    (2022). Supplement to: Smart speed imaging in Digital Image Correlation: Application to Seismotectonic Scale Modelling - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/2736910e-455d-5036-a308-bbfb8eee87d6
    Explore at:
    Dataset updated
    Nov 29, 2022
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The presented datasets and scripts were obtained while testing the performance of a trigger algorithm for use in combination with a ring shear tester ‘RST-01.pc’. Glass beads (fused quartz microbeads, 300-400 µm diameter) and Thai rice are sheared at varying velocity, stiffness and normal load. The data is provided as preprocessed mat-files ('*.mat') to be opened with Matlab R2015a and later. Several scripts are provided to reproduce the figures found in (Rudolf et al., submitted). A detailed list of files, together with the respective software needed to view and execute them, is available in 'List_of_Files_Rudolf-et-al-2018.pdf' (also available in MS Excel format). More information on the datasets and brief documentation of the scripts is given in 'Explanations_Rudolf-et-al-2018.pdf'. The complete data publication, including all descriptions, datasets, and evaluation scripts, is available as 'Dataset_Rudolf-et-al-2018.zip'.

  20. Asylum and resettlement - Historic datasets

    • gov.uk
    Updated Aug 24, 2023
    Home Office (2023). Asylum and resettlement - Historic datasets [Dataset]. https://www.gov.uk/government/statistical-data-sets/asylum-and-resettlement-datasets
    Explore at:
    Dataset updated
    Aug 24, 2023
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Home Office
    Description

    This page contains data for the immigration system statistics up to March 2023.

    For current immigration system data, visit ‘Immigration system statistics data tables’.

    Asylum applications, decisions and resettlement

    Asylum applications, initial decisions and resettlement (MS Excel Spreadsheet, 9.13 MB): https://assets.publishing.service.gov.uk/media/64625e6894f6df0010f5eaab/asylum-applications-datasets-mar-2023.xlsx
    Asy_D01: Asylum applications raised, by nationality, age, sex, UASC, applicant type, and location of application
    Asy_D02: Outcomes of asylum applications at initial decision, and refugees resettled in the UK, by nationality, age, sex, applicant type, and UASC
    This is not the latest data

    Asylum applications awaiting a decision (MS Excel Spreadsheet, 1.26 MB): https://assets.publishing.service.gov.uk/media/64625ec394f6df0010f5eaac/asylum-applications-awaiting-decision-datasets-mar-2023.xlsx
    Asy_D03: Asylum applications awaiting an initial decision or further review, by nationality and applicant type
    This is not the latest data

    Outcome analysis of asylum applications (MS Excel Spreadsheet, 410 KB): https://assets.publishing.service.gov.uk/media/62fa17698fa8f50b54374371/outcome-analysis-asylum-applications-datasets-jun-2022.xlsx
    Asy_D04: The initial decision and final outcome of all asylum applications raised in a period, by nationality
    This is not the latest data

    Age disputes

    Age disputes (MS Excel Spreadsheet, 178 KB): https://assets.publishing.service.gov.uk/media/64625ef1427e41000cb437cb/age-disputes-datasets-mar-2023.xlsx
    Asy_D05: Age disputes raised and outcomes of age disputes
    This is not the latest data

    Asylum appeals

    Asylum appeals lodged and determined (MS Excel Spreadsheet, 817 KB): https://assets.publishing.service.gov.uk/media/64625f0ca09dfc000c3c17cf/asylum-appeals-lodged-datasets-mar-2023.xlsx
    Asy_D06: Asylum appeals raised at the First-Tier Tribunal, by nationality and sex
    Asy_D07: Outcomes of asylum appeals raised at the First-Tier Tribunal, by nationality and sex
    This is not the latest data

    Asylum claims certified under Section 94 (MS Excel Spreadsheet, 150 KB): https://assets.publishing.service.gov.uk/media/64625f29427e41000cb437cd/asylum-claims-certified-section-94-datasets-mar-2023.xlsx
    Asy_D08: Initial decisions on asylum applications certified under Section 94, by nationality
    This is not the latest data

    Asylum support

    Asylum seekers in receipt of support (MS Excel Spreadsheet, 2.16 MB): https://assets.publishing.service.gov.uk/media/6463a618d3231e000c32da99/asylum-seekers-receipt-support-datasets-mar-2023.xlsx
    Asy_D09: Asylum seekers in receipt of support at end of period, by nationality, support type, accommodation type, and UK region
    This is not the latest data

    https://assets.publishing.service.gov.uk/media/63ecd7388fa8f5612a396c40/applications-section-95-support-datasets-dec-2022.xlsx">Applications for section 95 su

Niagara Open Data [Dataset]. https://catalog.civicdataecosystem.org/dataset/niagara-open-data

Niagara Open Data

Explore at:
3 scholarly articles cite this dataset (View in Google Scholar)
Description

The Ontario government generates and maintains thousands of datasets. Since 2012, we have shared data with Ontarians via a data catalogue. Open data is data that is shared with the public. Click here to learn more about open data and why Ontario releases it. Ontario’s Open Data Directive states that all data must be open, unless there is good reason for it to remain confidential. Ontario’s Chief Digital and Data Officer also has the authority to make certain datasets available publicly. Datasets listed in the catalogue that are not open will have one of the following labels:

If you want to use data you find in the catalogue, that data must have a licence – a set of rules that describes how you can use it. A licence: Most of the data available in the catalogue is released under Ontario’s Open Government Licence. However, each dataset may be shared with the public under other kinds of licences or no licence at all. If a dataset doesn’t have a licence, you don’t have the right to use the data. If you have questions about how you can use a specific dataset, please contact us.

The Ontario Data Catalogue endeavors to publish open data in a machine-readable format. For machine-readable datasets, you can simply retrieve the file you need using the file URL. The Ontario Data Catalogue is built on CKAN, which means the catalogue has the following features you can use when building applications. APIs (application programming interfaces) let software applications communicate directly with each other. If you are using the catalogue in a software application, you might want to extract data from the catalogue through the catalogue API. Note: All Datastore API requests to the Ontario Data Catalogue must be made server-side.

The catalogue's collection of dataset metadata (and dataset files) is searchable through the CKAN API. The Ontario Data Catalogue has more than just CKAN's documented search fields. You can also search these custom fields. You can also use the CKAN API to retrieve metadata about a particular dataset and check for updated files. Read the complete documentation for CKAN's API. Some of the open data in the Ontario Data Catalogue is available through the Datastore API. You can also search and access the machine-readable open data that is available in the catalogue. How to use the API feature: Read the complete documentation for CKAN's Datastore API.

The Ontario Data Catalogue contains a record for each dataset that the Government of Ontario possesses. Some of these datasets will be available to you as open data. Others will not be available to you. This is because the Government of Ontario is unable to share data that would break the law or put someone's safety at risk.

You can search for a dataset with a word that might describe a dataset or topic. Use words like “taxes” or “hospital locations” to discover what datasets the catalogue contains. You can search for a dataset from 3 spots on the catalogue: the homepage, the dataset search page, or the menu bar available across the catalogue. On the dataset search page, you can also filter your search results. You can select filters on the left-hand side of the page to limit your search to datasets with your favourite file format, datasets that are updated weekly, datasets released by a particular organization, or datasets that are released under a specific licence. Go to the dataset search page to see the filters that are available to make your search easier. You can also do a quick search by selecting one of the catalogue’s categories on the homepage. These categories can help you see the types of data we have on key topic areas.

When you find the dataset you are looking for, click on it to go to the dataset record. Each dataset record will tell you whether the data is available, and, if so, tell you about the data available. An open dataset might contain several data files. These files might represent different periods of time, different sub-sets of the dataset, different regions, language translations, or other breakdowns. You can select a file and either download it or preview it. Make sure to read the licence agreement to confirm you have permission to use it the way you want. Read more about previewing data. A non-open dataset may not be available for many reasons. Read more about non-open data. Read more about restricted data. Data that is non-open may still be subject to freedom of information requests.

The catalogue has tools that enable all users to visualize the data in the catalogue without leaving the catalogue – no additional software needed. Have a look at our walk-through of how to make a chart in the catalogue.

Get automatic notifications when datasets are updated. You can choose to get notifications for individual datasets, an organization’s datasets or the full catalogue. You don’t have to provide any personal information – just subscribe to our feeds using any feed reader you like using the corresponding notification web addresses. Copy those addresses and paste them into your reader. Your feed reader will let you know when the catalogue has been updated.

The catalogue provides open data in several file formats (e.g., spreadsheets, geospatial data, etc). Learn about each format and how you can access and use the data each file contains.

A file that has a list of items and values separated by commas without formatting (e.g. colours, italics, etc.) or extra visual features. This format provides just the data that you would display in a table. XLSX (Excel) files may be converted to CSV so they can be opened in a text editor. How to access the data: Open with any spreadsheet software application (e.g., Open Office Calc, Microsoft Excel) or text editor. Note: This format is considered machine-readable; it can be easily processed and used by a computer. Files that have visual formatting (e.g. bolded headers and colour-coded rows) can be hard for machines to understand; these elements make a file more human-readable and less machine-readable.

A file that provides information without formatted text or extra visual features that may not follow a pattern of separated values like a CSV. How to access the data: Open with any word processor or text editor available on your device (e.g., Microsoft Word, Notepad).

A spreadsheet file that may also include charts, graphs, and formatting. How to access the data: Open with a spreadsheet software application that supports this format (e.g., Open Office Calc, Microsoft Excel). Data can be converted to a CSV for a non-proprietary format of the same data without formatted text or extra visual features.

A shapefile provides geographic information that can be used to create a map or perform geospatial analysis based on location, points/lines and other data about the shape and features of the area. It includes required files (.shp, .shx, .dbf) and might include corresponding files (e.g., .prj). How to access the data: Open with a geographic information system (GIS) software program (e.g., QGIS).

A package of files and folders. The package can contain any number of different file types. How to access the data: Open with an unzipping software application (e.g., WinZIP, 7Zip). Note: If a ZIP file contains .shp, .shx, and .dbf file types, it is an ArcGIS ZIP: a package of shapefiles which provide information to create maps or perform geospatial analysis that can be opened with ArcGIS (a geographic information system software program).

A file that provides information related to a geographic area (e.g., phone number, address, average rainfall, number of owl sightings in 2011 etc.) and its geospatial location (i.e., points/lines). How to access the data: Open using a GIS software application to create a map or do geospatial analysis. It can also be opened with a text editor to view raw information. Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

A text-based format for sharing data in a machine-readable way that can store data with more unconventional structures such as complex lists. How to access the data: Open with any text editor (e.g., Notepad) or access through a browser. Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

A text-based format to store and organize data in a machine-readable way that can store data with more unconventional structures (not just data organized in tables). How to access the data: Open with any text editor (e.g., Notepad). Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

A file that provides information related to an area (e.g., phone number, address, average rainfall, number of owl sightings in 2011 etc.) and its geospatial location (i.e., points/lines). How to access the data: Open with a geospatial software application that supports the KML format (e.g., Google Earth). Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

This format contains files with data from tables used for statistical analysis and data visualization of Statistics Canada census data. How to access the data: Open with the Beyond 20/20 application.

A database which links and combines data from different files or applications (including HTML, XML, Excel, etc.). The database file can be converted to a CSV/TXT to make the data machine-readable, but human-readable formatting will be lost. How to access the data: Open with Microsoft Office Access (a database management system used to develop application software).

A file that keeps the original layout and
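The CKAN API and Datastore API access mentioned in the description above might be sketched as follows. This is a minimal illustration, not the catalogue's own guidance: the base URL and the resource ID are assumptions, while `package_search` and `datastore_search` are standard CKAN action API endpoints. The sketch only builds request URLs; the description notes that Datastore API requests must be made server-side.

```python
from urllib.parse import urlencode

# Assumed base URL for the catalogue; replace with the real CKAN host.
CKAN_BASE = "https://data.ontario.ca"


def package_search_url(query: str, rows: int = 10, base: str = CKAN_BASE) -> str:
    """Build a CKAN package_search URL to search dataset metadata."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base}/api/3/action/package_search?{params}"


def datastore_search_url(resource_id: str, limit: int = 5, base: str = CKAN_BASE) -> str:
    """Build a CKAN datastore_search URL to read rows of one resource."""
    params = urlencode({"resource_id": resource_id, "limit": limit})
    return f"{base}/api/3/action/datastore_search?{params}"


# Search for datasets about hospital locations (a search term from the text).
print(package_search_url("hospital locations"))

# "example-resource-id" is a hypothetical placeholder, not a real resource.
print(datastore_search_url("example-resource-id"))
```

A server-side script could pass these URLs to any HTTP client and parse the JSON response, whose `result` field holds the matching records.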
