24 datasets found
  1. Data Make False Dataset

    • universe.roboflow.com
    zip
    Updated Mar 20, 2025
    Cite
    Data Syngenta 2 (2025). Data Make False Dataset [Dataset]. https://universe.roboflow.com/data-syngenta-2/data-make-false
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 20, 2025
    Dataset provided by
    Syngenta
    Authors
    Data Syngenta 2
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Forklift Person Bounding Boxes
    Description

    Data Make False

    ## Overview
    
    Data Make False is a dataset for object detection tasks - it contains Forklift and Person annotations for 765 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
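    As a quick illustration of the download step, here is a minimal sketch using Roboflow's Python client. The workspace and project slugs are inferred from the citation URL above; the API key, version number, and export format are placeholders to check against the project page.
    
    ```python
    # Minimal sketch: download this dataset with the roboflow package.
    # The version number and export format are assumptions - verify them
    # on the project's Roboflow Universe page.
    from roboflow import Roboflow
    
    rf = Roboflow(api_key="YOUR_API_KEY")  # placeholder
    project = rf.workspace("data-syngenta-2").project("data-make-false")
    dataset = project.version(1).download("coco")  # assumed version/format
    print("Downloaded to:", dataset.location)
    ```
    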
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  2. Statewide Crop Mapping

    • data.cnra.ca.gov
    • data.ca.gov
    • +3 more
    data, gdb, html +3
    Updated Mar 3, 2025
    Cite
    California Department of Water Resources (2025). Statewide Crop Mapping [Dataset]. https://data.cnra.ca.gov/dataset/statewide-crop-mapping
    Explore at:
    Available download formats: rest service, zip(140021333), shp(126828193), zip(159870566), gdb(86886429), shp(126548912), shp(107610538), gdb(86655350), gdb(85891531), zip(144060723), data, html, zip(169400976), zip(189880202), zip(98690638), zip(179113742), zip(94630663), zip(88308707), gdb(76631083)
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    California Department of Water Resources (http://www.water.ca.gov/)
    Description

    NOTICE TO PROVISIONAL 2023 LAND USE DATA USERS: Please note that on December 6, 2024 the Department of Water Resources (DWR) published the Provisional 2023 Statewide Crop Mapping dataset. The link for the shapefile format of the data mistakenly linked to the wrong dataset. The link was updated with the appropriate data on January 27, 2025. If you downloaded the Provisional 2023 Statewide Crop Mapping dataset in shapefile format between December 6, 2024 and January 27, 2025, we encourage you to redownload the data. The Map Service and Geodatabase formats were correct as posted on December 6, 2024.

    Thank you for your interest in DWR land use datasets.

    The California Department of Water Resources (DWR) has been collecting land use data throughout the state and using it to develop agricultural water use estimates for statewide and regional planning purposes, including water use projections, water use efficiency evaluations, groundwater model development, climate change mitigation and adaptation, and water transfers. These data are essential for regional analysis and decision making, which has become increasingly important as DWR and other state agencies seek to address resource management issues, regulatory compliance, environmental impacts, ecosystem services, urban and economic development, and other issues. Increased availability of digital satellite imagery, aerial photography, and new analytical tools make remote sensing-based land use surveys possible at a field scale that is comparable to that of DWR’s historical on-the-ground field surveys. Current technologies allow accurate large-scale crop and land use identifications to be performed at desired time increments and make possible more frequent and comprehensive statewide land use information. Responding to this need, DWR sought expertise and support for identifying crop types and other land uses and quantifying crop acreages statewide using remotely sensed imagery and associated analytical techniques. Currently, Statewide Crop Maps are available for the Water Years 2014, 2016, 2018-2022 and PROVISIONALLY for 2023.

    Historic County Land Use Surveys spanning 1986 - 2015 may also be accessed using the CADWR Land Use Data Viewer: https://gis.water.ca.gov/app/CADWRLandUseViewer.

    For Regional Land Use Surveys follow: https://data.cnra.ca.gov/dataset/region-land-use-surveys.

    For County Land Use Surveys follow: https://data.cnra.ca.gov/dataset/county-land-use-surveys.

    For a collection of ArcGIS Web Applications that provide information on the DWR Land Use Program and our data products in various formats, visit the DWR Land Use Gallery: https://storymaps.arcgis.com/collections/dd14ceff7d754e85ab9c7ec84fb8790a.

    Recommended citation for DWR land use data: California Department of Water Resources. (Water Year for the data). Statewide Crop Mapping—California Natural Resources Agency Open Data. Retrieved “Month Day, YEAR,” from https://data.cnra.ca.gov/dataset/statewide-crop-mapping.

  3. Replication Data for: How to Make Causal Inferences with Time-Series...

    • dataverse.harvard.edu
    text/markdown, tsv +1
    Updated Aug 10, 2018
    Cite
    Harvard Dataverse (2018). Replication Data for: How to Make Causal Inferences with Time-Series Cross-Sectional Data under Selection on Observables [Dataset]. http://doi.org/10.7910/DVN/SFBX6Z
    Explore at:
    Available download formats: text/markdown(1156), type/x-r-syntax(5168), tsv(19127)
    Dataset updated
    Aug 10, 2018
    Dataset provided by
    Harvard Dataverse
    Description

    Data and code to replicate findings from "How to Make Causal Inferences with Time-Series Cross-Sectional Data under Selection on Observables" by Matthew Blackwell and Adam Glynn. Paper Abstract: Repeated measurements of the same countries, people, or groups over time are vital to many fields of political science. These measurements, sometimes called time-series cross-sectional (TSCS) data, allow researchers to estimate a broad set of causal quantities, including contemporaneous effects and direct effects of lagged treatments. Unfortunately, popular methods for TSCS data can only produce valid inferences for lagged effects under very strong assumptions. In this paper, we use potential outcomes to define causal quantities of interest in this setting and clarify how standard models like the autoregressive distributed lag model can produce biased estimates of these quantities due to post-treatment conditioning. We then describe two estimation strategies that avoid these post-treatment biases (inverse probability weighting and structural nested mean models) and show via simulations that they can outperform standard approaches in small sample settings. We illustrate these methods in a study of how welfare spending affects terrorism.
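    To make the first strategy concrete, here is a toy inverse-probability-weighting sketch in Python. It is a generic illustration, not the authors' replication code (which is in R), and every file and column name is hypothetical.
    
    ```python
    # Toy inverse probability weighting (IPW) for a binary treatment.
    # Illustrative only - not the paper's replication code.
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    
    df = pd.read_csv("tscs_panel.csv")                 # placeholder panel data
    X = df[["lagged_outcome", "gdp", "regime_type"]]   # observed confounders
    t = df["welfare_treatment"].to_numpy()             # binary treatment
    y = df["terrorism_count"].to_numpy()               # outcome
    
    # Propensity scores from a logistic regression, then stabilized weights.
    ps = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    w = np.where(t == 1, t.mean() / ps, (1 - t.mean()) / (1 - ps))
    
    # Weighted difference in means estimates the contemporaneous effect.
    ate = (np.average(y[t == 1], weights=w[t == 1])
           - np.average(y[t == 0], weights=w[t == 0]))
    print("IPW estimate:", ate)
    ```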

  4. Integrating Data in ArcGIS Pro

    • hub.arcgis.com
    Updated Mar 25, 2020
    Cite
    State of Delaware (2020). Integrating Data in ArcGIS Pro [Dataset]. https://hub.arcgis.com/documents/3a11f895a7dc4d28ad45cee9cc5ba6d8
    Explore at:
    Dataset updated
    Mar 25, 2020
    Dataset authored and provided by
    State of Delaware
    Description

    In this course, you will learn about some common types of data used for GIS mapping and analysis, and practice adding data to a file geodatabase to support a planned project. Goals: Create a file geodatabase. Add data to a file geodatabase. Create an empty geodatabase feature class.
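    The course goals map onto a few arcpy calls. A minimal sketch, assuming an ArcGIS Pro installation with arcpy licensed; all paths and names are illustrative placeholders:
    
    ```python
    # Minimal sketch of the course goals with arcpy (ArcGIS Pro's
    # Python site package). Paths and names are placeholders.
    import arcpy
    
    # 1. Create a file geodatabase.
    arcpy.management.CreateFileGDB(r"C:\gis\project", "planning.gdb")
    
    # 2. Add existing data (a shapefile) to the file geodatabase.
    arcpy.conversion.FeatureClassToGeodatabase(
        [r"C:\gis\data\parcels.shp"], r"C:\gis\project\planning.gdb"
    )
    
    # 3. Create an empty point feature class inside the geodatabase.
    arcpy.management.CreateFeatureclass(
        r"C:\gis\project\planning.gdb", "survey_points", "POINT",
        spatial_reference=arcpy.SpatialReference(4326),
    )
    ```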

  5. A standardized and reproducible method to measure decision-making in mice:...

    • figshare.com
    png
    Updated Feb 7, 2020
    Cite
    International Brain Laboratory (2020). A standardized and reproducible method to measure decision-making in mice: Data [Dataset]. http://doi.org/10.6084/m9.figshare.11636748.v7
    Explore at:
    Available download formats: png
    Dataset updated
    Feb 7, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    International Brain Laboratory
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Behavioral data associated with the IBL paper "A standardized and reproducible method to measure decision-making in mice". This data set contains 3 million choices from 101 mice across seven laboratories at six different research institutions in three countries, obtained during a perceptual decision-making task. When citing this data, please also cite the associated paper: https://doi.org/10.1101/2020.01.17.909838. This data can also be accessed using DataJoint and web browser tools at data.internationalbrainlab.org. Additionally, we provide a Binder-hosted interactive Jupyter notebook showing how to access the data via the Open Neurophysiology Environment (ONE) interface in Python: https://mybinder.org/v2/gh/int-brain-lab/paper-behavior-binder/master?filepath=one_example.ipynb. For more information about the International Brain Laboratory please see our website: www.internationalbrainlab.com. Beta disclaimer: please note that this is a beta version of the IBL dataset, which is still undergoing final quality checks. If you find any issues or inconsistencies in the data, please contact us at info+behavior@internationalbrainlab.org.
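    For orientation, a small sketch of the ONE interface mentioned above, assuming the ONE-api Python package and the public IBL server credentials published in IBL documentation; the lab name queried is a hypothetical example:
    
    ```python
    # Sketch: access IBL behavioral data via the ONE interface.
    # Server URL and public password follow IBL docs; the lab name
    # below is a hypothetical example.
    from one.api import ONE
    
    one = ONE(base_url="https://openalyx.internationalbrainlab.org",
              password="international", silent=True)
    
    eids = one.search(lab="churchlandlab")       # hypothetical lab name
    trials = one.load_object(eids[0], "trials")  # choices, contrasts, etc.
    print(len(trials["choice"]), "trials loaded")
    ```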

  6. Data to create and evaluate distribution models for invasive species for...

    • data.usgs.gov
    • s.cnmilf.com
    • +1 more
    + more versions
    Cite
    Catherine Jarnevich; Helen Sofaer; Pairsa Belamaric; Peder Engelstad, Data to create and evaluate distribution models for invasive species for different geographic extents [Dataset]. http://doi.org/10.5066/P90AL0PN
    Explore at:
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    Catherine Jarnevich; Helen Sofaer; Pairsa Belamaric; Peder Engelstad
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Time period covered
    1980 - 2021
    Description

    We developed habitat suitability models for invasive plant species selected by Department of Interior land management agencies. We applied the modeling workflow developed in Young et al. 2020 to species not included in the original case studies. Our methodology balanced trade-offs between developing highly customized models for a few species versus fitting non-specific and generic models for numerous species. We developed a national library of environmental variables known to physiologically limit plant distributions (Engelstad et al. 2022 Table S1: https://doi.org/10.1371/journal.pone.0263056) and relied on human input based on natural history knowledge to further narrow the variable set for each species before developing habitat suitability models. We developed models using five algorithms with VisTrails: Software for Assisted Habitat Modeling [SAHM 2.1.2]. We accounted for uncertainty related to sampling bias by using two alternative sources of background samples, and construct ...
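    As a generic illustration of the presence/background modeling idea described here (not the SAHM/VisTrails workflow actually used to build this dataset), a minimal scikit-learn sketch with hypothetical file and column names:
    
    ```python
    # Generic presence/background habitat-model sketch (illustrative;
    # not the SAHM/VisTrails workflow behind this dataset).
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split
    
    df = pd.read_csv("species_points.csv")   # 1 = presence, 0 = background
    X = df[["min_winter_temp", "growing_degree_days", "soil_ph"]]
    y = df["presence"]
    
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
    ```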

  7. National Hydrography Data - NHD and 3DHP

    • data.cnra.ca.gov
    • data.ca.gov
    • +3 more
    Updated Jul 1, 2025
    + more versions
    Cite
    California Department of Water Resources (2025). National Hydrography Data - NHD and 3DHP [Dataset]. https://data.cnra.ca.gov/dataset/national-hydrography-dataset-nhd
    Explore at:
    Available download formats: pdf, csv(12977), zip(73817620), pdf(3684753), website, zip(13901824), pdf(4856863), web videos, zip(578260992), pdf(1436424), zip(128966494), pdf(182651), zip(972664), zip(10029073), zip(1647291), pdf(1175775), zip(4657694), pdf(1634485), zip(15824984), zip(39288832), arcgis geoservices rest api, pdf(437025), pdf(9867020)
    Dataset updated
    Jul 1, 2025
    Dataset authored and provided by
    California Department of Water Resources
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Description

    The USGS National Hydrography Dataset (NHD) downloadable data collection from The National Map (TNM) is a comprehensive set of digital spatial data that encodes information about naturally occurring and constructed bodies of surface water (lakes, ponds, and reservoirs), paths through which water flows (canals, ditches, streams, and rivers), and related entities such as point features (springs, wells, stream gages, and dams). The information encoded about these features includes classification and other characteristics, delineation, geographic name, position and related measures, a "reach code" through which other information can be related to the NHD, and the direction of water flow. The network of reach codes delineating water and transported material flow allows users to trace movement in upstream and downstream directions. In addition to this geographic information, the dataset contains metadata that supports the exchange of future updates and improvements to the data. The NHD supports many applications, such as making maps, geocoding observations, flow modeling, data maintenance, and stewardship. For additional information on NHD, go to https://www.usgs.gov/core-science-systems/ngp/national-hydrography.
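    As a small sketch of working with an NHD download programmatically, assuming a file geodatabase from The National Map; the layer name follows the standard NHD schema, but the reach-code field spelling varies by release:
    
    ```python
    # Sketch: read NHD flowlines from a downloaded file geodatabase and
    # summarize them by reach code (geopandas). The path is a placeholder;
    # the field may be spelled "REACHCODE" or "ReachCode" by release.
    import geopandas as gpd
    
    flowlines = gpd.read_file("NHD_H_California.gdb", layer="NHDFlowline")
    print(flowlines.groupby("REACHCODE").size()
          .sort_values(ascending=False).head())
    ```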

    DWR was the steward for NHD and Watershed Boundary Dataset (WBD) in California. We worked with other organizations to edit and improve NHD and WBD, using the business rules for California. California's NHD improvements were sent to USGS for incorporation into the national database. The most up-to-date products are accessible from the USGS website. Please note that the California portion of the National Hydrography Dataset is appropriate for use at the 1:24,000 scale.

    For additional derivative products and resources, including the major features in geopackage format, please go to this page: https://data.cnra.ca.gov/dataset/nhd-major-features. Archives of previous statewide extracts of the NHD going back to 2018 may be found at https://data.cnra.ca.gov/dataset/nhd-archive.

    In September 2022, USGS officially notified DWR that the NHD would become static as USGS resources will be devoted to the transition to the new 3D Hydrography Program (3DHP). 3DHP will consist of LiDAR-derived hydrography at a higher resolution than NHD. Upon completion, 3DHP data will be easier to maintain, based on a modern data model and architecture, and better meet the requirements of users that were documented in the Hydrography Requirements and Benefits Study (2016). The initial releases of 3DHP include NHD data cross-walked into the 3DHP data model. It will take several years for the 3DHP to be built out for California. Please refer to the resources on this page for more information.

    The FINAL, STATIC version of the National Hydrography Dataset for California was published for download by USGS on December 27, 2023. This dataset can no longer be edited by the state stewards. The next generation of national hydrography data is the USGS 3D Hydrography Program (3DHP).

    Questions about the California stewardship of these datasets may be directed to nhd_stewardship@water.ca.gov.

  8. Replication Data for: Revisiting 'The Rise and Decline' in a Population of...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated May 5, 2020
    Cite
    Nathan TeBlunthuis; Aaron Shaw; Benjamin Mako Hill (2020). Replication Data for: Revisiting 'The Rise and Decline' in a Population of Peer Production Projects [Dataset]. http://doi.org/10.7910/DVN/SG3LP1
    Explore at:
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 5, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Nathan TeBlunthuis; Aaron Shaw; Benjamin Mako Hill
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/2.2/customlicense?persistentId=doi:10.7910/DVN/SG3LP1

    Description

    This archive contains code and data for reproducing the analysis for “Replication Data for Revisiting ‘The Rise and Decline’ in a Population of Peer Production Projects”. Depending on what you hope to do with the data you probably do not want to download all of the files. Depending on your computation resources you may not be able to run all stages of the analysis. The code for all stages of the analysis, including typesetting the manuscript and running the analysis, is in code.tar. If you only want to run the final analysis or to play with datasets used in the analysis of the paper, you want intermediate_data.7z or the uncompressed tab and csv files.
    
    The data files are created in a four-stage process. The first stage uses the program “wikiq” to parse mediawiki xml dumps and create tsv files that have edit data for each wiki. The second stage generates the all.edits.RDS file which combines these tsvs into a dataset of edits from all the wikis. This file is expensive to generate and, at 1.5GB, is pretty big. The third stage builds smaller intermediate files that contain the analytical variables from these tsv files. The fourth stage uses the intermediate files to generate smaller RDS files that contain the results. Finally, knitr and latex typeset the manuscript. A stage will only run if the outputs from the previous stages do not exist, so if the intermediate files exist they will not be regenerated and only the final analysis will run. The exception is that stage 4, fitting models and generating plots, always runs. If you only want to replicate from the second stage onward, you want wikiq_tsvs.7z. If you want to replicate everything, you want wikia_mediawiki_xml_dumps.7z.001, wikia_mediawiki_xml_dumps.7z.002, and wikia_mediawiki_xml_dumps.7z.003. These instructions work backwards from building the manuscript using knitr, loading the datasets, running the analysis, to building the intermediate datasets.
    
    Building the manuscript using knitr: This requires working latex, latexmk, and knitr installations. Depending on your operating system you might install these packages in different ways. On Debian Linux you can run apt install r-cran-knitr latexmk texlive-latex-extra. Alternatively, you can upload the necessary files to a project on Overleaf.com. Download code.tar; this has everything you need to typeset the manuscript. Unpack the tar archive (on a unix system, tar xf code.tar) and navigate to code/paper_source. Install the R dependencies: in R, run install.packages(c("data.table","scales","ggplot2","lubridate","texreg")). On a unix system you should be able to run make to build the manuscript generalizable_wiki.pdf. Otherwise you should try uploading all of the files (including the tables, figure, and knitr folders) to a new project on Overleaf.com.
    
    Loading intermediate datasets: The intermediate datasets are found in the intermediate_data.7z archive. They can be extracted on a unix system using the command 7z x intermediate_data.7z. The files are 95MB uncompressed. These are RDS (R data set) files and can be loaded in R using readRDS, for example newcomer.ds <- readRDS("newcomers.RDS"). If you wish to work with these datasets using a tool other than R, you might prefer to work with the .tab files.
    
    Running the analysis: Fitting the models may not work on machines with less than 32GB of RAM. If you have trouble, you may find the functions in lib-01-sample-datasets.R useful to create stratified samples of data for fitting models. See line 89 of 02_model_newcomer_survival.R for an example. Download code.tar and intermediate_data.7z to your working folder and extract both archives; on a unix system this can be done with the command tar xf code.tar && 7z x intermediate_data.7z. Install the R dependencies: install.packages(c("data.table","ggplot2","urltools","texreg","optimx","lme4","bootstrap","scales","effects","lubridate","devtools","roxygen2")). On a unix system you can simply run regen.all.sh to fit the models, build the plots and create the RDS files.
    
    Generating datasets, building the intermediate files: The intermediate files are generated from all.edits.RDS. This process requires about 20GB of memory. Download all.edits.RDS, userroles_data.7z, selected.wikis.csv, and code.tar. Unpack code.tar and userroles_data.7z; on a unix system this can be done using tar xf code.tar && 7z x userroles_data.7z. Install the R dependencies: in R run install.packages(c("data.table","ggplot2","urltools","texreg","optimx","lme4","bootstrap","scales","effects","lubridate","devtools","roxygen2")). Run 01_build_datasets.R.
    
    Generating datasets, building all.edits.RDS: The intermediate RDS files used in the analysis are created from all.edits.RDS. To replicate building all.edits.RDS, you only need to run 01_build_datasets.R when the intermediate RDS files and all.edits.RDS files do not exist in the working directory. all.edits.RDS is generated from the tsv files generated by wikiq. This may take several hours. By default building the dataset will...

  9. Niagara Open Data

    • catalog.civicdataecosystem.org
    Updated May 13, 2025
    Cite
    (2025). Niagara Open Data [Dataset]. https://catalog.civicdataecosystem.org/dataset/niagara-open-data
    Explore at:
    Dataset updated
    May 13, 2025
    Description

    The Ontario government generates and maintains thousands of datasets. Since 2012, we have shared data with Ontarians via a data catalogue. Open data is data that is shared with the public. Click here to learn more about open data and why Ontario releases it. Ontario’s Open Data Directive states that all data must be open, unless there is good reason for it to remain confidential. Ontario’s Chief Digital and Data Officer also has the authority to make certain datasets available publicly. Datasets listed in the catalogue that are not open will have one of the following labels: If you want to use data you find in the catalogue, that data must have a licence – a set of rules that describes how you can use it. A licence: Most of the data available in the catalogue is released under Ontario’s Open Government Licence. However, each dataset may be shared with the public under other kinds of licences or no licence at all. If a dataset doesn’t have a licence, you don’t have the right to use the data. If you have questions about how you can use a specific dataset, please contact us.
    
    The Ontario Data Catalogue endeavors to publish open data in a machine readable format. For machine readable datasets, you can simply retrieve the file you need using the file URL. The Ontario Data Catalogue is built on CKAN, which means the catalogue has the following features you can use when building applications. APIs (application programming interfaces) let software applications communicate directly with each other. If you are using the catalogue in a software application, you might want to extract data from the catalogue through the catalogue API. Note: all Datastore API requests to the Ontario Data Catalogue must be made server-side. The catalogue's collection of dataset metadata (and dataset files) is searchable through the CKAN API. The Ontario Data Catalogue has more than just CKAN's documented search fields; you can also search these custom fields. You can also use the CKAN API to retrieve metadata about a particular dataset and check for updated files. Read the complete documentation for CKAN's API. Some of the open data in the Ontario Data Catalogue is available through the Datastore API. You can also search and access the machine-readable open data that is available in the catalogue. How to use the API feature: read the complete documentation for CKAN's Datastore API.
    
    The Ontario Data Catalogue contains a record for each dataset that the Government of Ontario possesses. Some of these datasets will be available to you as open data. Others will not be available to you. This is because the Government of Ontario is unable to share data that would break the law or put someone's safety at risk. You can search for a dataset with a word that might describe a dataset or topic. Use words like “taxes” or “hospital locations” to discover what datasets the catalogue contains. You can search for a dataset from 3 spots on the catalogue: the homepage, the dataset search page, or the menu bar available across the catalogue. On the dataset search page, you can also filter your search results. You can select filters on the left hand side of the page to limit your search for datasets with your favourite file format, datasets that are updated weekly, datasets released by a particular organization, or datasets that are released under a specific licence. Go to the dataset search page to see the filters that are available to make your search easier. You can also do a quick search by selecting one of the catalogue’s categories on the homepage. These categories can help you see the types of data we have on key topic areas.
    
    When you find the dataset you are looking for, click on it to go to the dataset record. Each dataset record will tell you whether the data is available, and, if so, tell you about the data available. An open dataset might contain several data files. These files might represent different periods of time, different sub-sets of the dataset, different regions, language translations, or other breakdowns. You can select a file and either download it or preview it. Make sure to read the licence agreement to make sure you have permission to use it the way you want. Read more about previewing data. A non-open dataset may be not available for many reasons. Read more about non-open data. Read more about restricted data. Data that is non-open may still be subject to freedom of information requests. The catalogue has tools that enable all users to visualize the data in the catalogue without leaving the catalogue – no additional software needed. Have a look at our walk-through of how to make a chart in the catalogue. Get automatic notifications when datasets are updated. You can choose to get notifications for individual datasets, an organization’s datasets or the full catalogue. You don’t have to provide any personal information – just subscribe to our feeds using any feed reader you like using the corresponding notification web addresses. Copy those addresses and paste them into your reader. Your feed reader will let you know when the catalogue has been updated.
    
    The catalogue provides open data in several file formats (e.g., spreadsheets, geospatial data, etc). Learn about each format and how you can access and use the data each file contains.
    
    A file that has a list of items and values separated by commas without formatting (e.g. colours, italics, etc.) or extra visual features. This format provides just the data that you would display in a table. XLSX (Excel) files may be converted to CSV so they can be opened in a text editor. How to access the data: open with any spreadsheet software application (e.g., Open Office Calc, Microsoft Excel) or text editor. Note: this format is considered machine-readable; it can be easily processed and used by a computer. Files that have visual formatting (e.g. bolded headers and colour-coded rows) can be hard for machines to understand; these elements make a file more human-readable and less machine-readable.
    
    A file that provides information without formatted text or extra visual features that may not follow a pattern of separated values like a CSV. How to access the data: open with any word processor or text editor available on your device (e.g., Microsoft Word, Notepad).
    
    A spreadsheet file that may also include charts, graphs, and formatting. How to access the data: open with a spreadsheet software application that supports this format (e.g., Open Office Calc, Microsoft Excel). Data can be converted to a CSV for a non-proprietary format of the same data without formatted text or extra visual features.
    
    A shapefile provides geographic information that can be used to create a map or perform geospatial analysis based on location, points/lines and other data about the shape and features of the area. It includes required files (.shp, .shx, .dbf) and might include corresponding files (e.g., .prj). How to access the data: open with a geographic information system (GIS) software program (e.g., QGIS).
    
    A package of files and folders. The package can contain any number of different file types. How to access the data: open with an unzipping software application (e.g., WinZIP, 7Zip). Note: if a ZIP file contains .shp, .shx, and .dbf file types, it is an ArcGIS ZIP: a package of shapefiles which provide information to create maps or perform geospatial analysis that can be opened with ArcGIS (a geographic information system software program).
    
    A file that provides information related to a geographic area (e.g., phone number, address, average rainfall, number of owl sightings in 2011 etc.) and its geospatial location (i.e., points/lines). How to access the data: open using a GIS software application to create a map or do geospatial analysis. It can also be opened with a text editor to view raw information. Note: this format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.
    
    A text-based format for sharing data in a machine-readable way that can store data with more unconventional structures such as complex lists. How to access the data: open with any text editor (e.g., Notepad) or access through a browser. Note: this format is machine-readable, and it can be easily processed and used by a computer.
    
    A text-based format to store and organize data in a machine-readable way that can store data with more unconventional structures (not just data organized in tables). How to access the data: open with any text editor (e.g., Notepad). Note: this format is machine-readable, and it can be easily processed and used by a computer.
    
    A file that provides information related to an area (e.g., phone number, address, average rainfall, number of owl sightings in 2011 etc.) and its geospatial location (i.e., points/lines). How to access the data: open with a geospatial software application that supports the KML format (e.g., Google Earth). Note: this format is machine-readable, and it can be easily processed and used by a computer.
    
    This format contains files with data from tables used for statistical analysis and data visualization of Statistics Canada census data. How to access the data: open with the Beyond 20/20 application.
    
    A database which links and combines data from different files or applications (including HTML, XML, Excel, etc.). The database file can be converted to a CSV/TXT to make the data machine-readable, but human-readable formatting will be lost. How to access the data: open with Microsoft Office Access (a database management system used to develop application software).
    
    A file that keeps the original layout and
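    As an illustration of the CKAN API usage described above, a minimal sketch using CKAN's documented action API; the base URL assumes the Ontario catalogue at data.ontario.ca and the dataset id is a placeholder:
    
    ```python
    # Sketch: query a CKAN catalogue through its documented action API.
    # Base URL and dataset id are illustrative assumptions.
    import requests
    
    BASE = "https://data.ontario.ca/api/3/action"
    
    # Retrieve metadata for one dataset and check its resources for updates.
    meta = requests.get(f"{BASE}/package_show",
                        params={"id": "example-dataset"}).json()
    for res in meta["result"]["resources"]:
        print(res["name"], res["format"], res.get("last_modified"))
    
    # Query rows of a machine-readable resource through the Datastore API.
    rows = requests.get(f"{BASE}/datastore_search",
                        params={"resource_id": meta["result"]["resources"][0]["id"],
                                "limit": 5}).json()
    print(rows["result"]["records"])
    ```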

  10. Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program...

    • openicpsr.org
    Updated May 18, 2018
    + more versions
    Cite
    Jacob Kaplan (2018). Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program Data: Hate Crime Data 1991-2021 [Dataset]. http://doi.org/10.3886/E103500V9
    Explore at:
    Dataset updated
    May 18, 2018
    Dataset provided by
    Princeton University
    Authors
    Jacob Kaplan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    1991 - 2021
    Area covered
    United States
    Description

    !!!WARNING~~~This dataset has a large number of flaws and is unable to properly answer many questions that people generally use it to answer, such as whether national hate crimes are changing (or at least they use the data so improperly that they get the wrong answer). A large number of people using this data (academics, advocates, reporters, US Congress) do so inappropriately and get the wrong answer to their questions as a result. Indeed, many published papers using this data should be retracted. Before using this data I highly recommend that you thoroughly read my book on UCR data, particularly the chapter on hate crimes (https://ucrbook.com/hate-crimes.html) as well as the FBI's own manual on this data. The questions you could potentially answer well are relatively narrow and generally exclude any causal relationships.~~~WARNING!!!
    
    For a comprehensive guide to this data and other UCR data, please see my book at ucrbook.com.
    
    Version 9 release notes: Adds 2021 data.
    Version 8 release notes: Adds 2019 and 2020 data. Please note that the FBI has retired UCR data ending in 2020 data so this will be the last UCR hate crime data they release. Changes .rda file to .rds.
    Version 7 release notes: Changes release notes description, does not change data.
    Version 6 release notes: Adds 2018 data.
    Version 5 release notes: Adds data in the following formats: SPSS, SAS, and Excel. Changes project name to avoid confusing this data for the ones done by NACJD. Adds data for 1991. Fixes bug where bias motivation "anti-lesbian, gay, bisexual, or transgender, mixed group (lgbt)" was labeled "anti-homosexual (gay and lesbian)" prior to 2013, causing there to be two columns and zero values for years with the wrong label. All data is now directly from the FBI, not NACJD. The data initially comes as ASCII+SPSS Setup files and is read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R.
    Version 4 release notes: Adds data for 2017. Adds rows that submitted a zero-report (i.e. that agency reported no hate crimes in the year); this is for all years 1992-2017. Made changes to categorical variables (e.g. bias motivation columns) to make categories consistent over time; different years had slightly different names (e.g. 'anti-am indian' and 'anti-american indian') which I made consistent. Made the 'population' column which is the total population in that agency.
    Version 3 release notes: Adds data for 2016. Order rows by year (descending) and ORI.
    Version 2 release notes: Fix bug where Philadelphia Police Department had incorrect FIPS county code.
    
    The Hate Crime data is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. This data contains information about hate crimes reported in the United States. Please note that the files are quite large and may take some time to open. Each row indicates a hate crime incident for an agency in a given year. I have made a unique ID column ("unique_id") by combining the year, agency ORI9 (the 9 character Originating Identifier code), and incident number columns together. Each column is a variable related to that incident or to the reporting agency. Some of the important columns are the incident date, what crime occurred (up to 10 crimes), the number of victims for each of these crimes, the bias motivation for each of these crimes, and the location of each crime. It also includes the total number of victims, total number of offenders, and race of offenders (as a group). Finally, it has a number of columns indicating if the victim for each offense was a certain type of victim or not (e.g. individual victim, business victim, religious victim, etc.). The only changes I made to the data are the following: minor changes to column names to make all column names 32 characters or fewer (so it can be saved in a Stata format), made all character values lower case, and reordered columns. I also generated incident month, weekday, and month-day variables from the incident date variable included in the original data.
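    A small pandas sketch of the unique ID construction and date-derived variables described above; the file and column names are assumptions for illustration, not verified against the released files:
    
    ```python
    # Sketch: rebuild the described unique incident ID and the
    # date-derived variables. Column names are assumed.
    import pandas as pd
    
    df = pd.read_csv("hate_crimes.csv", dtype=str)  # placeholder file
    df["unique_id"] = df["year"] + "_" + df["ori9"] + "_" + df["incident_number"]
    
    dates = pd.to_datetime(df["incident_date"], errors="coerce")
    df["incident_month"] = dates.dt.month
    df["incident_weekday"] = dates.dt.day_name()
    ```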

  11. California County Boundaries and Identifiers with Coastal Buffers

    • hub.arcgis.com
    • data.ca.gov
    • +1 more
    Updated Oct 24, 2024
    Cite
    California Department of Technology (2024). California County Boundaries and Identifiers with Coastal Buffers [Dataset]. https://hub.arcgis.com/datasets/28c9f9dd8c3d4eb5a534cb30ddb3ce39
    Explore at:
    Dataset updated
    Oct 24, 2024
    Dataset authored and provided by
    California Department of Technology
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    Description

    WARNING: This is a pre-release dataset and its field names and data structures are subject to change. It should be considered pre-release until the end of March 2025. The schema changed in February 2025 - please see below. We will post a roadmap of upcoming changes, but service URLs and schema are now stable. For deployment status of new services in February 2025, see https://gis.data.ca.gov/pages/city-and-county-boundary-data-status. Additional roadmap and status links are at the bottom of this metadata. This dataset is continuously updated as the source data from CDTFA is updated, as often as many times a month. If you require unchanging point-in-time data, export a copy for your own use rather than using the service directly in your applications.
    
    Purpose
    County boundaries along with third party identifiers used to join in external data. Boundaries are from the California Department of Tax and Fee Administration (CDTFA). These boundaries are the best available statewide data source in that CDTFA receives changes in incorporation and boundary lines from the Board of Equalization, who receives them from local jurisdictions for tax purposes. Boundary accuracy is not guaranteed, and though CDTFA works to align boundaries based on historical records and local changes, errors will exist. If you require a legal assessment of boundary location, contact a licensed surveyor. This dataset joins in multiple attributes and identifiers from the US Census Bureau and Board on Geographic Names to facilitate adding additional third party data sources. In addition, we attach attributes of our own to ease and reduce common processing needs and questions. Finally, coastal buffers are separated into separate polygons, leaving the land-based portions of jurisdictions and coastal buffers in adjacent polygons. This feature layer is for public use.
    
    Related Layers
    This dataset is part of a grouping of many datasets:
    Cities: only the city boundaries and attributes, without any unincorporated areas (with and without coastal buffers).
    Counties: full county boundaries and attributes, including all cities within as a single polygon (with coastal buffers - this dataset - and without).
    Cities and Full Counties: a merge of the other two layers, so polygons overlap within city boundaries; some customers require this behavior, so we provide it as a separate service (with and without coastal buffers).
    City and County Abbreviations.
    Unincorporated Areas (coming soon).
    Census Designated Places.
    Cartographic Coastline: polygon, and line source (coming soon).
    
    Working with Coastal Buffers
    The dataset you are currently viewing includes the coastal buffers for cities and counties that have them in the source data from CDTFA. In the versions where they are included, they remain as a second polygon on cities or counties that have them, with all the same identifiers, and a value in the COASTAL field indicating if it's an ocean or a bay buffer. If you wish to have a single polygon per jurisdiction that includes the coastal buffers, you can run a Dissolve on the version that has the coastal buffers on all the fields except OFFSHORE and AREA_SQMI to get a version with the correct identifiers.
    
    Point of Contact
    California Department of Technology, Office of Digital Services, odsdataservices@state.ca.gov
    
    Field and Abbreviation Definitions
    CDTFA_COUNTY: CDTFA county name. For counties, this will be the name of the polygon itself. For cities, it is the name of the county the city polygon is within.
    CDTFA_COPRI: county number followed by the 3-digit city primary number used in the Board of Equalization's 6-digit tax rate area numbering system. The boundary data originate with CDTFA's teams managing tax rate information, so this field is preserved and flows into this dataset.
    CENSUS_GEOID: numeric geographic identifiers from the US Census Bureau.
    CENSUS_PLACE_TYPE: City, County, or Town, stripped off the census name for identification purposes.
    GNIS_PLACE_NAME: Board on Geographic Names authorized nomenclature for area names published in the Geographic Name Information System.
    GNIS_ID: the numeric identifier from the Board on Geographic Names that can be used to join these boundaries to other datasets utilizing this identifier.
    CDT_COUNTY_ABBR: abbreviations of county names, originally derived from CalTrans Division of Local Assistance and now managed by CDT. Abbreviations are 3 characters.
    CDT_NAME_SHORT: the name of the jurisdiction (city or county) with the word "City" or "County" stripped off the end. Some changes may come to how we process this value to make it more consistent.
    AREA_SQMI: the area of the administrative unit (city or county) in square miles, calculated in EPSG 3310 California Teale Albers.
    OFFSHORE: indicates if the polygon is a coastal buffer. Null for land polygons. Additional values include "ocean" and "bay".
    PRIMARY_DOMAIN: currently empty/null for all records. Placeholder field for the official URL of the city or county.
    CENSUS_POPULATION: currently null for all records. In the future, it will include the most recent US Census population estimate for the jurisdiction.
    GlobalID: while all of the layers we provide in this dataset include a GlobalID field with unique values, we do not recommend you make any use of it. The GlobalID field exists to support offline sync, but is not persistent, so data keyed to it will be orphaned at our next update. Use one of the other persistent identifiers, such as GNIS_ID or GEOID, instead.
    
    Boundary Accuracy
    County boundaries were originally derived from a 1:24,000 accuracy dataset, with improvements made in some places to boundary alignments based on research into historical records and boundary changes as CDTFA learns of them. City boundary data are derived from pre-GIS tax maps, digitized at BOE and CDTFA, with adjustments made directly in GIS for new annexations, detachments, and corrections. Boundary accuracy within the dataset varies. While CDTFA strives to correctly include or exclude parcels from jurisdictions for accurate tax assessment, this dataset does not guarantee that a parcel is placed in the correct jurisdiction. When a parcel is in the correct jurisdiction, this dataset cannot guarantee accurate placement of boundary lines within or between parcels or rights of way. This dataset also provides no information on parcel boundaries. For exact jurisdictional or parcel boundary locations, please consult the county assessor's office and a licensed surveyor. CDTFA's data is used as the best available source because BOE and CDTFA receive information about changes in jurisdictions which otherwise needs to be collected independently by an agency or company to compile into usable map boundaries. CDTFA maintains the best available statewide boundary information. CDTFA's source data notes the following about accuracy: City boundary changes and county boundary line adjustments filed with the Board of Equalization per Government Code 54900. This GIS layer contains the boundaries of the unincorporated county and incorporated cities within the state of California. The initial dataset was created in March of 2015 and was based on the State Board of Equalization tax rate area boundaries. As of April 1, 2024, the maintenance of this dataset is provided by the California Department of Tax and Fee Administration for the purpose of determining sales and use tax rates. The boundaries are continuously being revised to align with aerial imagery when areas of conflict are discovered between the original boundary provided by the California State Board of Equalization and the boundary made publicly available by local, state, and federal government. Some differences may occur between actual recorded boundaries and the boundaries used for sales and use tax purposes. The boundaries in this map are representations of taxing jurisdictions for the purpose of determining sales and use tax rates and should not be used to determine precise city or county boundary line locations.
    
    Boundary Processing
    These data make a structural change from the source data. While the full boundaries provided by CDTFA include coastal buffers of varying sizes, many users need boundaries to end at the shoreline of the ocean or a bay. As a result, after examining existing city and county boundary layers, these datasets provide a coastline cut generally along the ocean-facing coastline. For county boundaries in northern California, the cut runs near the Golden Gate Bridge, while for cities, we cut along the bay shoreline and into the edge of the Delta at the boundaries of Solano, Contra Costa, and Sacramento counties. In the services linked above, the versions that include the coastal buffers contain them as a second (or third) polygon for the city or county, with the value in the COASTAL field set to whether it's a bay or ocean polygon. These can be processed back into a single polygon by dissolving on all the fields you wish to keep, since the attributes, other than the COASTAL field and geometry attributes (like areas), remain the same between the polygons for this purpose.
    
    Slivers
    In cases where a city or county's boundary ends near a coastline, our coastline data may cross back and forth many times while roughly paralleling the jurisdiction's boundary, resulting in many polygon slivers. We post-process the data to remove these slivers using a city/county boundary priority algorithm. That is, when the data run parallel to each other, we discard the coastline cut and keep the CDTFA-provided boundary, even if it extends into the ocean a small amount. This processing supports consistent boundaries for Fort Bragg, Point Arena, San Francisco, Pacifica, Half Moon Bay, and Capitola, in addition to others. More information on this algorithm will be provided soon.
    
    Coastline Caveats
    Some cities have buffers extending into water bodies that we do not cut at the shoreline. These include
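    A minimal geopandas sketch of the Dissolve step described above; the file path is a placeholder and the field list follows the description:
    
    ```python
    # Sketch: merge coastal-buffer polygons back into single jurisdictions
    # by dissolving on every field except OFFSHORE and AREA_SQMI, per the
    # description above. The file path is a placeholder.
    import geopandas as gpd
    
    counties = gpd.read_file("county_boundaries_with_coastal_buffers.geojson")
    keep = [c for c in counties.columns
            if c not in ("OFFSHORE", "AREA_SQMI", "geometry")]
    
    # dropna=False keeps groups whose key fields are null (e.g. PRIMARY_DOMAIN).
    merged = counties.dissolve(by=keep, as_index=False, dropna=False)
    print(len(counties), "->", len(merged), "polygons after dissolve")
    ```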

  12. Data from: Environmental impact assessment for large carnivores: a...

    • data.niaid.nih.gov
    • search.dataone.org
    • +1 more
    zip
    Updated Apr 19, 2024
    Cite
    Gonçalo Ferrão da Costa; Miguel Mascarenhas; Carlos Fonseca; Chris Sutherland (2024). Environmental impact assessment for large carnivores: a methodological review of the wolf (Canis lupus) monitoring in Portugal [Dataset]. http://doi.org/10.5061/dryad.t1g1jwt87
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 19, 2024
    Dataset provided by
    University of St Andrews
    BE Bioinsight & Ecoa
    University of Aveiro
    Authors
    Gonçalo Ferrão da Costa; Miguel Mascarenhas; Carlos Fonseca; Chris Sutherland
    License

    CC0 1.0 (https://spdx.org/licenses/CC0-1.0.html)

    Area covered
    Portugal
    Description

    The continuous growth of the global human population results in increased use and change of landscapes, with infrastructures like transportation or energy facilities being a particular risk to large carnivores. Environmental Impact Assessments were established to identify the probable environmental consequences of any new proposed project, find ways to reduce impacts, and provide evidence to inform decision making and mitigation. Portugal has a wolf population of around 300 individuals, designated as an endangered species with full legal protection. They occupy the northern mountainous areas of the country, which have also been the focus of new human infrastructures over the last 20 years. Consequently, dozens of wolf monitoring programs have been established to evaluate wolf population status, to identify impacts, and to inform appropriate mitigation or compensation measures. We reviewed Portuguese wolf monitoring programs to answer four key questions: Do wolf programs examine adequate biological parameters to meet monitoring objectives? Is the study design suitable for measuring impacts? Are data collection methods and effort sufficient for the stated inference objectives? And do statistical analyses of the data lead to robust conclusions? Overall, we found a mismatch between the stated aims of wolf monitoring and the results reported, and often neither aligns with the existing national wolf monitoring guidelines. Despite the vast effort expended and the diversity of methods used, data analysis makes almost exclusive use of relative indices or summary statistics, with little consideration of the potential biases that arise through the (imperfect) observational process. This makes comparisons of impacts across space and time difficult and is therefore unlikely to contribute to a general understanding of wolf responses to infrastructure-related disturbance. We recommend the development of standardized monitoring protocols and advocate for the use of statistical methods that account for imperfect detection to guarantee accuracy, reproducibility, and efficacy of the programs.
    
    Methods: We reviewed all major wolf monitoring programs developed for environmental impact assessments in Portugal since 2002 (Table S1, Supplementary material). Given that the focus here is on the adequacy of targeted wolf monitoring for delivering conclusions about the effects of infrastructure development, we reviewed only monitoring programs that were specifically designed for wolves and not those concerned with general mammalian assessment. The starting point was a compilation from the 2019-2021 National Wolf Census (Pimenta et al., 2023), where every wolf monitoring program that occurred between 2014 and 2019 in Portugal was identified. The list was completed with projects that started before 2014 or after 2019 based on personal knowledge, inquiries to principal scientific teams, governmental agencies, and EIA consultants. Depending on duration, wolf monitoring programs can produce several, usually annual, reports that are not peer-reviewed and do not appear on standard search engines (e.g., Web of Science or Google Scholar) but are publicly available from the Portuguese Environmental Agency (APA – www.apambiente.pt). We conducted an online search on APA's search engine (https://siaia.apambiente.pt/) and identified a total of 30 projects. For each of these projects, we were interested in the first and the last report to identify any methodological changes. If the last report was not present, we reviewed the most recent one. If no report was present, we requested it from the team responsible.
    
    Our investigation centred on characterizing and quantifying four components of wolf monitoring programs that are interlinked and that should ideally be determined by the initial objectives: (1) biological parameters, i.e., what wolf parameters were studied to assess impacts; (2) study design, i.e., what sampling schemes were followed to collect and analyse data; (3) data collection, i.e., which sampling methodology and how much effort was used to collect data; and (4) data analysis, i.e., how data were analysed to estimate relevant parameters and assess impact. Biological parameters were identified and classified under two categories: occurrence and demography, which broadly correspond to the necessary inputs to assess impacts like exclusion effect and changes in reproductive patterns. Occurrence-related parameters refer to variables used to measure the presence or absence of wolves, whereas demographic parameters refer to variables that intend to measure population-level effects such as abundance, density, survival, or reproduction. We also recorded whether any effort was made to quantify prey population distribution or abundance as recommended in the guidelines. For study design, we reviewed the sampling design of the project, with specific focus on the spatial and temporal aspects of the study such as total area surveyed, the definition of a sampling site within this region (i.e., resolution), the duration of the study and the number of sampling seasons. The goal here was to determine whether the sampling scheme used was appropriate for assessing infrastructure impacts on wolf distribution or demography, depending on what the focus was. For data collection, we identified the main data collection methodologies used and the corresponding sampling effort. By far the most frequent method used is sign surveys, and specifically scat surveys, and for these studies we recorded whether genetic identification of species or individuals based on faecal DNA was attempted. We compared how sampling effort varies by the various inference objectives and, as above, assessed which, if any, project or data collection approach is most likely to produce evidence of impact. We divided the analysis component into two groups: single-year and multi-year analyses. For single-year analyses we identified how monitoring projects used data to make inferences about the state of the biological parameters of interest and discuss the associated strengths and weaknesses. For multi-year analyses, we recorded how differences or trends were quantified and associated with infrastructure impacts, commenting on the statistical robustness of the analyses used across the projects.

  13. Data: Seasonal wetlands make a relatively limited contribution to the...

    • hydroshare.org
    • beta.hydroshare.org
    zip
    Updated Feb 13, 2024
    Cite
    Vanessa Solano; Clement Duvert; Lindsay Hutley; Dioni I. Cendón; Damien T. Maher; Christian Birkel (2024). Data: Seasonal wetlands make a relatively limited contribution to the dissolved carbon pool of a lowland headwater tropical stream [Dataset]. https://www.hydroshare.org/resource/f8d30b3669894f248de4ca415935c285
    Explore at:
    Available download formats: zip(63.4 KB)
    Dataset updated
    Feb 13, 2024
    Dataset provided by
    HydroShare
    Authors
    Vanessa Solano; Clement Duvert; Lindsay Hutley; Dioni I. Cendón; Damien T. Maher; Christian Birkel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 21, 2017 - Dec 31, 2021
    Area covered
    Description

    This resource includes isotopic data, water quality data, and carbon concentrations from high-frequency monitoring stations and discrete sampling collected in Manton Creek, NT, Australia between 2017 and 2021. It also includes the initial and posterior parameters, performance parameters, and plots related to the application of the SAVTAM model in Manton Creek.

    These data are associated with the following manuscript:

    Solano, V., Duvert, C., Hutley, L. B., Cendón, D. I., Maher, D. T., & Birkel, C. (2024). Seasonal wetlands make a relatively limited contribution to the dissolved carbon pool of a lowland headwater tropical stream. Journal of Geophysical Research: Biogeosciences, 129, e2023JG007556. https://doi.org/10.1029/2023JG007556

  14. Football API | World Plan | SportMonks Sports data for 100 + leagues...

    • datarade.ai
    .json
    Updated Jun 9, 2021
    Cite
    Football API | World Plan | SportMonks Sports data for 100 + leagues worldwide [Dataset]. https://datarade.ai/data-products/football-api-world-plan-sportsdata-for-100-leagues-worldwide-sportmonks
    Explore at:
    .json
    Dataset updated
    Jun 9, 2021
    Dataset authored and provided by
    SportMonks
    Area covered
    United Arab Emirates, Poland, Ukraine, United States of America, Malta, United Kingdom, China, Romania, Switzerland, Iran (Islamic Republic of)
    Description

    Use our trusted SportMonks Football API to build your own sports application and be at the forefront of football data today.

    Our Football API is designed for iGaming, media, developers and football enthusiasts alike, ensuring you can create a football application that meets your needs.

    Over 20,000 sports fanatics make use of our data. We know what data works best for you, so we ensured that our Football API has all the necessary tools you need to create a successful football application.

    • Livescores and schedules: Our Football API features extremely fast livescores and up-to-date season schedules, so your app can be the first to notify its users about a goal scored. This also further improves the look and feel of your website.

    • Statistics and line-ups: We offer various kinds of football statistics, ranging from (live) player statistics to team, match, and season statistics. We also provide pre-match line-ups for all major leagues.

    • Coverage and historical data: Our Football API covers over 1,200 leagues, all managed by our in-house scouts and data platform, with up to 14 years of historical data available.

    • Bookmakers and odds: Build your football sportsbook, odds-comparison site, or betting portal with our pre-match and in-play odds, collated from all major bookmakers and markets.

    • TV stations and highlights: Show your customers where games are broadcast and provide video highlights of major match events.

    • Standings and top scorers: Enhance your football website with season and live standings, and let your customers see who the top scorers are.
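
    As an illustration of consuming such an API, here is a minimal sketch of polling livescores over HTTP. The endpoint path, the `data` response envelope, and the `YOUR_API_TOKEN` placeholder are assumptions based on typical v3-style REST APIs, not confirmed details of the SportMonks product:

    ```python
    import requests

    # Hypothetical endpoint and token; consult the SportMonks docs for real paths.
    BASE_URL = "https://api.sportmonks.com/v3/football"
    API_TOKEN = "YOUR_API_TOKEN"

    def fetch_livescores():
        """Fetch current livescores and return the parsed JSON payload."""
        resp = requests.get(
            f"{BASE_URL}/livescores",
            params={"api_token": API_TOKEN},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()

    if __name__ == "__main__":
        payload = fetch_livescores()
        # The 'data' key is an assumption about the response envelope.
        for fixture in payload.get("data", []):
            print(fixture)
    ```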

  15. ckanext-s3-exporter

    • catalog.civicdataecosystem.org
    Updated Jun 4, 2025
    Cite
    (2025). ckanext-s3-exporter [Dataset]. https://catalog.civicdataecosystem.org/dataset/ckanext-s3-exporter
    Explore at:
    Dataset updated
    Jun 4, 2025
    Description

    The s3-exporter extension for CKAN is designed to streamline the creation of dataset resources directly from files stored in an AWS S3 bucket. This extension facilitates the integration of S3-hosted data into CKAN, allowing for efficient data management and access. It is compatible with CKAN 2.10 and later, offering a way to leverage AWS S3 storage within the CKAN environment.

    Key Features:
    • S3 Data Integration: Allows CKAN to create dataset resources from files stored in AWS S3 buckets, simplifying data import processes.
    • Configurable AWS Connection: Uses configuration settings to connect to AWS S3, including access key, secret key, and bucket name.
    • Queue Management: Offers the ability to specify a queue for export jobs, with a default queue option if none is specified.
    • Cloud Storage Integration: Allows CKAN to efficiently make use of cloud resources for data serving and availability.

    Technical Integration: The extension requires configuration settings within the CKAN ini file to define the AWS credentials, bucket name, and, optionally, the queue for handling export jobs. CKAN's task queue mechanism is used for asynchronous handling of the import process, ensuring a better user experience.

    Benefits & Impact: By integrating data stored in AWS S3 directly into CKAN, this extension reduces the complexity of data management workflows, making it easier to register and utilize cloud-hosted resources within CKAN datasets. This can improve the overall efficiency of data catalogs that rely on AWS for storage.
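
    To make the workflow concrete, here is a minimal sketch of the kind of task the extension automates: listing objects in an S3 bucket with boto3 and registering each one as a resource on an existing CKAN dataset via ckanapi. The bucket name, dataset id, API key, and URL scheme are placeholders, and this is not the extension's own code:

    ```python
    import boto3
    from ckanapi import RemoteCKAN

    # Placeholder credentials and names, mirroring the kind of settings the
    # extension reads from the CKAN ini file (exact config keys are assumptions).
    BUCKET = "my-data-bucket"
    CKAN_URL = "https://demo.ckan.org"
    CKAN_API_KEY = "YOUR_API_KEY"
    PACKAGE_ID = "my-dataset"

    s3 = boto3.client("s3")
    ckan = RemoteCKAN(CKAN_URL, apikey=CKAN_API_KEY)

    # List every object in the bucket and register it as a CKAN resource.
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            ckan.action.resource_create(
                package_id=PACKAGE_ID,
                name=key,
                url=f"https://{BUCKET}.s3.amazonaws.com/{key}",
            )
    ```

    In the extension itself this registration runs through CKAN's task queue rather than inline, which keeps large bucket listings from blocking web requests.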

  16. Data from: Supervised Machine Learning for Understanding and Improving the...

    • figshare.com
    xlsx
    Updated Jun 4, 2023
    Cite
    Boeun Kim; Christos T. Maravelias (2023). Supervised Machine Learning for Understanding and Improving the Computational Performance of Chemical Production Scheduling MIP Models [Dataset]. http://doi.org/10.1021/acs.iecr.2c02734.s002
    Explore at:
    xlsx
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    ACS Publications
    Authors
    Boeun Kim; Christos T. Maravelias
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    We adopt a supervised learning approach to predict the runtimes of batch production scheduling mixed-integer programming (MIP) models, with the aim of understanding which instance features make a model computationally expensive. We introduce novel features that characterize instance difficulty according to problem type. Machine learning models trained on runtime data from a wide variety of instances show good predictive performance. We then discuss the most informative features and their effects on computational performance. Finally, based on the derived insights, we propose solution methods for improving the computational performance of batch scheduling MIP models.
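
    As a sketch of the general approach (not the authors' code), one can train a regressor on hand-crafted instance features to predict log solve time; the synthetic features and runtimes below are illustrative assumptions standing in for real scheduling instances:

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import r2_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Synthetic stand-in for instance features (e.g., number of batch tasks,
    # units, horizon length, constraint density); real features come from
    # parsing the MIP instances themselves.
    X = rng.random((500, 4))
    # Synthetic runtimes; log-transforming targets is a common choice because
    # MIP solve times are heavy-tailed.
    y = np.log1p(60 * X[:, 0] * X[:, 3] + rng.gamma(2.0, 5.0, 500))

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print("R^2 on held-out instances:", r2_score(y_te, model.predict(X_te)))
    # Feature importances hint at which instance properties drive difficulty.
    print("Importances:", model.feature_importances_)
    ```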

  17. The description of class labels of MFPT.

    • plos.figshare.com
    xls
    Updated Oct 5, 2023
    Cite
    Qianqian Zhang; Caiyun Hao; Zhongwei Lv; Qiuxia Fan (2023). The description of class labels of MFPT. [Dataset]. http://doi.org/10.1371/journal.pone.0292381.t004
    Explore at:
    xls
    Dataset updated
    Oct 5, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Qianqian Zhang; Caiyun Hao; Zhongwei Lv; Qiuxia Fan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Learning powerful discriminative features is key to machine fault diagnosis. Most existing methods based on convolutional neural networks (CNNs) have achieved promising results; however, they primarily focus on global features derived from sample signals and fail to explicitly mine relationships between signals. In contrast, graph convolutional networks (GCNs) can efficiently mine data relationships by taking graph-structured data as input, making them highly effective for feature representation in non-Euclidean spaces. In this article, to exploit the complementary advantages of CNNs and GCNs, we propose a graph attentional convolutional neural network (GACNN) for effective intelligent fault diagnosis. It comprises two subnetworks, a fully convolutional network and a GCN, to extract multilevel feature information, and uses an Efficient Channel Attention (ECA) mechanism to reduce information loss. Extensive experiments on three datasets show that our framework improves the representational ability of features and fault-diagnosis performance, achieving competitive accuracy against other approaches; the results also show that GACNN maintains superior performance even under strong background noise.
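
    For reference, a minimal PyTorch sketch of the ECA mechanism the abstract mentions: global average pooling followed by a cheap 1-D convolution across channels, producing per-channel attention weights. This follows the published ECA design, not the authors' GACNN code:

    ```python
    import torch
    import torch.nn as nn

    class ECA(nn.Module):
        """Efficient Channel Attention: per-channel reweighting via a 1-D conv."""

        def __init__(self, k_size: int = 3):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.conv = nn.Conv1d(1, 1, kernel_size=k_size,
                                  padding=(k_size - 1) // 2, bias=False)
            self.sigmoid = nn.Sigmoid()

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (B, C, H, W) -> channel descriptor of shape (B, C, 1, 1)
            y = self.pool(x)
            # Convolve over the channel dimension: (B, 1, C) -> (B, 1, C)
            y = self.conv(y.squeeze(-1).transpose(-1, -2))
            y = self.sigmoid(y.transpose(-1, -2).unsqueeze(-1))
            return x * y  # reweight channels

    feats = torch.randn(2, 64, 32, 32)   # e.g., CNN feature maps
    print(ECA()(feats).shape)            # torch.Size([2, 64, 32, 32])
    ```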

  18. Global import data of Button Making

    • volza.com
    csv
    Updated Sep 7, 2025
    Cite
    Volza FZ LLC (2025). Global import data of Button Making [Dataset]. https://www.volza.com/imports-taiwan/taiwan-import-data-of-button+making
    Explore at:
    csv
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    Volza
    Authors
    Volza FZ LLC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
    Description

    25 global import shipment records of Button Making, with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.

  19. Data from: Tracking the norms: A regression-based approach to trail making...

    • tandf.figshare.com
    docx
    Updated May 12, 2025
    Cite
    Tuğçe Taşkıran; Mehmet Can Tanfer; Derya Durusu Emek-Savaş (2025). Tracking the norms: A regression-based approach to trail making test performance in the Turkish population [Dataset]. http://doi.org/10.6084/m9.figshare.28776703.v1
    Explore at:
    docx
    Dataset updated
    May 12, 2025
    Dataset provided by
    Taylor & Francis
    Authors
    Tuğçe Taşkıran; Mehmet Can Tanfer; Derya Durusu Emek-Savaş
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Objective: The Trail Making Test (TMT) is a widely used neuropsychological tool for assessing executive functions. This study aimed to establish regression-based normative data for TMT performance in a Turkish population aged 18–80, accounting for the effects of age, education, and sex on both basic (TMT A and TMT B) and derived scores (TMT B-A and TMT B/A).

    Method: A total of 462 participants were recruited, with 409 included in the final analysis after applying exclusion criteria. Participants completed the international version of the TMT. Pearson correlation analyses and multiple linear regression models assessed relationships between TMT scores and demographic variables. Education was treated as a continuous variable, and regression-based norms were developed for all TMT scores.

    Results: Age and education were significant predictors of TMT performance. Age primarily affected TMT A scores, while education was the strongest predictor for TMT B, TMT B-A, and TMT B/A scores. The regression models explained 36–38% of the variance in basic scores and 6–24% in derived scores. Women performed better than men on the TMT B/A ratio score, but overall, sex had a less pronounced effect than age and education.

    Conclusions: This study provides the first regression-based normative data for the TMT in a Turkish population. These norms are crucial for improving the accuracy of neuropsychological assessments in Turkey and facilitating cross-cultural comparisons in cognitive research. The findings emphasize the importance of adjusting for demographic factors in clinical and research settings to ensure precise evaluations of cognitive functioning.
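
    To illustrate how regression-based norms are applied in practice, one predicts the expected score from demographics and standardizes the residual. The coefficients and residual SD below are made up for illustration; the paper's fitted models should be used for real scoring:

    ```python
    import numpy as np

    # Illustrative coefficients only (not the paper's).
    # Model: TMT-A completion time = b0 + b1*age + b2*education + b3*sex.
    B0, B_AGE, B_EDU, B_SEX = 15.0, 0.45, -1.2, 2.0
    RESIDUAL_SD = 9.0  # standard deviation of regression residuals (assumed)

    def tmt_a_zscore(observed: float, age: float, education: float, sex: int) -> float:
        """Standardized residual: how far the observed time is from the
        demographically expected time (sex coded 0 = female, 1 = male).
        Positive z means slower than expected for that demographic profile."""
        expected = B0 + B_AGE * age + B_EDU * education + B_SEX * sex
        return (observed - expected) / RESIDUAL_SD

    # A 70-year-old man with 8 years of education completing TMT-A in 60 s:
    print(round(tmt_a_zscore(60.0, age=70, education=8, sex=1), 2))
    ```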

  20. Great Smoky Mountains National Park Vital Signs Watersheds

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Jun 5, 2024
    Cite
    National Park Service (2024). Great Smoky Mountains National Park Vital Signs Watersheds [Dataset]. https://catalog.data.gov/dataset/great-smoky-mountains-national-park-vital-signs-watersheds
    Explore at:
    Dataset updated
    Jun 5, 2024
    Dataset provided by
    National Park Service (http://www.nps.gov/)
    Area covered
    Great Smoky Mountains
    Description

    The natural resource monitoring component of the park's Inventory & Monitoring (I&M) Program provides park managers, planners, and other key audiences with scientifically credible data and information on the status and trends of selected park resources. The information is used as a basis for making decisions and working with other agencies and the public for the long-term protection of park ecosystems. The park I&M monitoring program is designed to provide the site-specific information needed to identify and understand change in park ecosystems that are characterized by complexity, variability, and surprises. The information helps to determine whether observed changes are within natural levels of variability, or whether they may be the result of unwanted human influences. The broad-based, scientifically sound information obtained through this systems-based, long-term ecological monitoring program has multiple applications for management decision-making, research, education, and promoting public understanding of park resources. These data depict the primary and secondary watersheds within which Vital Signs data collection efforts have been tailored around aquatic ecosystems.

    Park goals for Vital Signs Monitoring are to:

    • Determine the status and trends in selected indicators of the condition of park ecosystems, to allow managers to make better-informed decisions and to work more effectively with other agencies and individuals for the benefit of park resources.
    • Provide early warning of abnormal conditions of selected resources to help develop effective mitigation measures and reduce the costs of management.
    • Provide data to better understand the dynamic nature and condition of park ecosystems and to provide reference points for comparisons with other, altered environments.
    • Provide data to meet certain legal and Congressional mandates related to natural resource protection and visitor enjoyment.
    • Provide a means of measuring progress towards performance goals.

    Vital signs monitoring tracks a subset of physical, chemical, and biological elements and processes of park ecosystems, selected to represent the overall health or condition of park resources, known or hypothesized effects of stressors, or elements that have important human values. Monitoring results are used by park managers, planners, interpreters, and partners to support management decision-making, park planning, research, education, and public understanding of park resources. More than 1,000 scientists, resource specialists, park managers, and data managers actively contributed to the design and implementation of this long-term program. This highly collaborative effort has resulted in an integrative, park-based program with a strong link between scientific and technical information and management needs, and an emphasis on helping to put science into the hands of park managers and planners in the National Park Service.
