Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present a ProgSnap2-based dataset containing anonymized logs of over 34,000 programming events produced by 81 students in Scratch, a visual programming environment, during the study described in the paper "Semi-Automatically Mining Students' Common Scratch Programming Behaviors." We also include a list of approximately 3,100 mined sequential patterns of programming processes. Each pattern is performed by at least 10% of the 62 novice programmers among the 81 students, and the patterns are the maximal patterns generated by the MG-FSM algorithm while allowing a gap of one programming event. The dataset comprises:
README.txt — overview of the dataset and its properties
mainTable.csv — main event table of the dataset, holding rows of programming events
codeState.csv — table holding XML representations of code snapshots at the time of each programming event
datasetMetadata.csv — describes features of the dataset
Scratch-SeqPatterns.txt — list of sequential patterns mined from the Main Event Table
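The one-event gap constraint can be illustrated with a small sketch (the event names and the function below are ours for illustration; this is not the MG-FSM implementation, which mines all frequent patterns rather than checking a single one):

```python
def matches_with_gap(pattern, sequence, max_gap=1):
    """Return True if `pattern` occurs in `sequence` in order, with at
    most `max_gap` non-matching events between consecutive matches."""
    def search(p, start):
        if p == len(pattern):
            return True  # every pattern item has been matched
        # the first pattern item may start anywhere in the sequence;
        # each later item must occur within max_gap events of the previous match
        stop = len(sequence) if p == 0 else min(start + max_gap + 1, len(sequence))
        return any(sequence[i] == pattern[p] and search(p + 1, i + 1)
                   for i in range(start, stop))
    return search(0, 0)

# Hypothetical Scratch event log for one student session
events = ["AddBlock", "RunProgram", "AddBlock", "DeleteBlock", "RunProgram"]
print(matches_with_gap(["AddBlock", "DeleteBlock"], events))   # True (gap of 0)
print(matches_with_gap(["RunProgram", "RunProgram"], events))  # False (gap of 2)
```

A pattern is then "performed by" a student if it matches at least one of that student's sessions under this constraint.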
A primary goal in designing smart homes is to provide automatic assistance that enables residents to live independently at home. Activity recognition is performed to achieve this goal, and to provide assistance we need three sorts of information: first, the resident's goal; second, the pattern the resident should follow to achieve that goal; and third, the deviations from previously known patterns. In the presented paper, spatiotemporal aspects of daily activities are surveyed to mine the activity patterns of smart home residents. The data needed to model these spatiotemporal aspects is provided by sensors embedded in the smart home. We believe that specific objects are used to accomplish daily activities, and that by analyzing the movement of objects and resident(s) we can obtain valuable information for modeling the daily activities of a smart home's residents.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data Analysis is the process that supports decision-making and informs arguments in empirical studies. Descriptive statistics, Exploratory Data Analysis (EDA), and Confirmatory Data Analysis (CDA) are the approaches that compose Data Analysis (Xia & Gong, 2014). An EDA comprises a set of statistical and data mining procedures to describe data. We ran an EDA to provide statistical facts and inform our conclusions. The mined facts yield arguments that inform the Systematic Literature Review (SLR) of DL4SE.
The SLR of DL4SE requires formal statistical modeling to refine the answers to the proposed research questions and to formulate new hypotheses to be addressed in the future. Hence, we introduce DL4SE-DA, a set of statistical processes and data mining pipelines that uncover hidden relationships in the Deep Learning literature reported in Software Engineering. These hidden relationships are collected and analyzed to illustrate the state of the art of DL techniques employed in the software engineering context.
Our DL4SE-DA is a simplified version of the classical Knowledge Discovery in Databases, or KDD, process (Fayyad et al., 1996). The KDD process extracts knowledge from a structured DL4SE database. This database was the product of multiple iterations of data gathering and collection from the inspected literature. The KDD process involves five stages:
1. Selection. This stage was led by the taxonomy process explained in section xx of the paper. After collecting all the papers and creating the taxonomies, we organized the data into the 35 features, or attributes, found in the repository. In fact, we manually engineered features from the DL4SE papers. Some of the features are venue, year published, type of paper, metrics, data-scale, type of tuning, learning algorithm, SE data, and so on.
2. Preprocessing. Preprocessing consisted of transforming the features into the correct type (nominal), removing outliers (papers that do not belong to DL4SE), and re-inspecting the papers to recover information lost in the normalization process. For instance, we normalized the feature “metrics” into “MRR”, “ROC or AUC”, “BLEU Score”, “Accuracy”, “Precision”, “Recall”, “F1 Measure”, and “Other Metrics”, where “Other Metrics” refers to unconventional metrics found during the extraction. The same normalization was applied to other features, such as “SE Data” and “Reproducibility Types”. This separation into more detailed classes contributes to a better understanding and classification of the papers by the data mining tasks or methods.
3. Transformation. In this stage, we did not apply any data transformation method except for the clustering analysis, for which we performed a Principal Component Analysis (PCA) to reduce the 35 features to 2 components for visualization purposes. PCA also allowed us to identify the number of clusters that exhibits the maximum reduction in variance; in other words, it helped us choose the number of clusters to use when tuning the explainable models.
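A minimal NumPy sketch of this PCA step (the feature matrix below is random stand-in data; the actual pipeline reduces the 35 extracted features and was run in RapidMiner, not this code):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in one-hot feature matrix: 60 "papers" x 8 binary attributes
X = rng.integers(0, 2, size=(60, 8)).astype(float)

# PCA via SVD of the mean-centered data
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X2 = Xc @ Vt[:2].T  # projection onto the first two principal components

# Share of total variance captured by the 2-D view
explained = (S ** 2) / (S ** 2).sum()
print(X2.shape, round(float(explained[:2].sum()), 3))
```

The per-component variance ratios also support the elbow-style choice of cluster count mentioned above: one looks for the component (or cluster count) after which the reduction in variance levels off.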
4. Data Mining. In this stage, we used three distinct data mining tasks: Correlation Analysis, Association Rule Learning, and Clustering. We decided that the goal of the KDD process should be to uncover hidden relationships among the extracted features (Correlations and Association Rules) and to categorize the DL4SE papers for a better segmentation of the state of the art (Clustering). A full explanation is provided in the subsection “Data Mining Tasks for the SLR of DL4SE”.
5. Interpretation/Evaluation. We used the knowledge discovery process to automatically find patterns in our papers that resemble “actionable knowledge”. This actionable knowledge was generated by a reasoning process over the data mining outcomes, which produces an argument support analysis (see this link).
We used RapidMiner as our software tool to conduct the data analysis. The procedures and pipelines were published in our repository.
Overview of the most meaningful Association Rules. Rectangles represent both Premises and Conclusions; an arrow connecting a Premise to a Conclusion indicates that, given the premise, the conclusion is associated with it. E.g., given that an author used Supervised Learning, we can conclude that their approach is irreproducible, with a certain Support and Confidence.
Support = the number of occurrences in which the statement is true, divided by the total number of statements.
Confidence = the support of the statement divided by the support of the premise.
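These two definitions can be made concrete with a toy sketch (the attribute values below are invented for illustration, not rows from the actual DL4SE table):

```python
# Each row is the set of attribute values extracted for one paper.
papers = [
    {"Supervised Learning", "Irreproducible"},
    {"Supervised Learning", "Irreproducible"},
    {"Supervised Learning", "Reproducible"},
    {"Unsupervised Learning", "Irreproducible"},
]

def support(itemset, rows):
    # fraction of rows in which every item of the statement holds
    return sum(itemset <= row for row in rows) / len(rows)

def confidence(premise, conclusion, rows):
    # support of premise-and-conclusion relative to the support of the premise
    return support(premise | conclusion, rows) / support(premise, rows)

rule = ({"Supervised Learning"}, {"Irreproducible"})
print(support(rule[0] | rule[1], papers))   # 0.5
print(round(confidence(*rule, papers), 3))  # 0.667
```

So the rule "Supervised Learning ⇒ Irreproducible" holds in half of all papers (support) and in two thirds of the papers that use supervised learning (confidence).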
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: The aim of this paper is the acquisition of geographic data from the Foursquare application, using data mining to perform exploratory and spatial analyses of the distribution of tourist attractions and their density in Rio de Janeiro city. In accordance with the Extraction, Transformation, and Load (ETL) methodology, three research algorithms were developed using a hierarchical tree structure to collect information for the categories Museums, Monuments and Landmarks, Historic Sites, Scenic Lookouts, and Trails from the Foursquare database. A quantitative analysis of check-ins per neighborhood of Rio de Janeiro city was performed, and kernel density (hot spot) maps were generated. The results show the need for the data filtering process - less than 50% of the mined data were used - and that a large part of the density of the Museums, Historic Sites, and Monuments and Landmarks categories is in the center of the city, while the Scenic Lookouts and Trails categories predominate in the south zone. This kind of analysis was shown to be a tool to support the city's tourist management in relation to the spatial localization of these categories, the tourists’ evaluations of the places, and the frequency of the target public.
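The hot spot idea behind those maps can be sketched with a plain Gaussian kernel density estimate (the coordinates below are synthetic stand-ins for check-in locations, not Foursquare data, and the bandwidth is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic check-in coordinates (lon, lat) clustered around one spot
pts = rng.normal(loc=[-43.2, -22.9], scale=0.05, size=(200, 2))

def kernel_density(grid, points, bandwidth=0.02):
    """Unnormalized Gaussian kernel density at each grid location."""
    d2 = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * bandwidth ** 2)).sum(axis=1)

# Evaluate on a coarse grid; the maximum marks the hot spot
xs = np.linspace(-43.4, -43.0, 41)
ys = np.linspace(-23.1, -22.7, 41)
grid = np.array([(x, y) for x in xs for y in ys])
dens = kernel_density(grid, pts)
print("hot spot near:", grid[dens.argmax()].round(2))
```

A GIS package would additionally project the coordinates and normalize the surface, but the ranking of hot and cold cells follows this same kernel-sum computation.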
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Ateeb Shamas
Released under CC0: Public Domain
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Given that the metals, minerals and energy resources extracted through mining are fundamental to human society, it follows that accurate data describing mine production are equally important. Although there are often national statistical sources, these typically include data only for metals (e.g., gold), minerals (e.g., iron ore) or energy resources (e.g., coal). No previous study has compiled a national mine production data set that includes basic mining data such as ore processed, grades, extracted products (e.g., metals, concentrates, saleable ore) and waste rock. These data are crucial for geological assessments of mineable resources, environmental impacts, and material flows (including losses during mining, smelting-refining, use and disposal or recycling), as well as for more quantitative assessments of critical mineral potential (including possible extraction from tailings and/or waste rock left by mining). This data set meets these needs for Australia, providing a world-first, comprehensive review of a national mining industry and an exemplar of what can be achieved for other countries with mining industry sectors.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supplementary material for the article "Mutation Testing in the Wild: Findings from GitHub" submitted to the Empirical Software Engineering Journal. It includes:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set covers global extraction and production of coal and metal ores at the individual mine level. It covers 1171 individual mines, reporting mine-level production for 80 different materials in the period 2000-2021. It also includes data on mine coordinates, ownership, mineral reserves, mining waste, transportation of mining products, and mineral processing capacities (smelters and mineral refineries) and production. The data was gathered manually from more than 1900 openly available sources, such as annual or sustainability reports of mining companies, and all data points are linked to their respective sources. After manual screening and entry of the data, automatic cleaning, harmonization and data checking were conducted. Geoinformation was obtained either from coordinates available in company reports or by retrieving the coordinates via the Google Maps API with subsequent manual checking. For mines where no coordinates could be found, other geospatial attributes such as province, region, district or municipality were recorded and linked to the GADM data set, available at www.gadm.org.
The data set consists of 12 tables. The table “facilities” contains descriptive and spatial information on mines and processing facilities, and is available as a GeoPackage (GPKG) file. All other tables are available in comma-separated values (CSV) format. A schematic depiction of the database is provided in PNG format in the file database_model.png.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Peptides are biologically ubiquitous and important molecules that self-assemble into diverse structures. While extensive research has explored the effects of chemical composition and environmental conditions on self-assembly, a systematic study consolidating this data to uncover global rules is lacking. In this work, we curate a peptide assembly database through a combination of manual processing by human experts and literature mining with a large language model. As a result, we collect more than 1,000 experimental data entries with information about peptide sequence, experimental conditions and the corresponding self-assembly phases. Utilizing the data, machine learning models are trained and evaluated, demonstrating excellent accuracy (> 80%) and efficiency in assembly phase classification. Moreover, we fine-tune our GPT model for peptide literature mining with the developed dataset, and it exhibits markedly superior performance in extracting information from academic publications relative to the pre-trained model. This workflow can improve efficiency when exploring potential self-assembling peptide candidates by guiding experimental work, while also deepening our understanding of the mechanisms governing peptide self-assembly.
--- phase_data_clean.csv stores 1000+ peptide self-assembly data entries under different experimental conditions.
--- mined_paper_list.csv stores the corresponding papers we used to collect data.
--- trainset.jsonl and testset.jsonl are the data we used for fine-tuning the LLM.
--- fine-tuning.ipynb: code used to fine-tune the ChatGPT model.
--- pretrain.ipynb: code used to test the pretrained ChatGPT model.
--- train_and_inference.ipynb: code to use the mined data to train and test an ML predictor for phase classification.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data on the impact on national parliaments resulting from the Covid-19 pandemic mined from country reports published by the Lex-Atlas: Covid-19 project and the Oxford University Press. For more information see https://lexatlas-c19.org
Air pollution directly affects human health endpoints including growth, respiratory processes, cardiovascular health, fertility, pregnancy outcomes, and cancer. Therefore, the distribution of air pollution is a topic that is relevant to all, and of direct interest to many students. Air quality varies across space and time, often disproportionally affecting minority communities and impoverished neighborhoods. Air pollution is usually higher in locations where pollution sources are concentrated, such as industrial production facilities, highways, and coal-fired power plants. The United States Environmental Protection Agency manages a national air quality-monitoring program to measure and report air-pollutant levels across the United States. These data cover multiple decades and are publicly available via a website interface. For this lesson, students learn how to mine data from this website. They work in pairs to develop their own questions about air quality or air pollution that span spatial and/or temporal scales, and then gather the data needed to answer their question. The students analyze their data and write a scientific paper describing their work. This laboratory experience requires the students to generate their own questions, gather and interpret data, and draw conclusions, allowing for creativity and instilling ownership and motivation for deeper learning gains.
Subscribers can look up the export and import data of 23 countries by HS code or product name. This demo is helpful for market analysis.
Background: Everolimus is an inhibitor of the mammalian target of rapamycin and is used to treat various tumors. The presented study aimed to evaluate Everolimus-associated adverse events (AEs) through data mining of the US Food and Drug Administration Adverse Event Reporting System (FAERS).
Methods: The AE records were selected by searching the FAERS database from the first quarter of 2009 to the first quarter of 2022. Potential adverse event signals were mined using disproportionality analysis, including the reporting odds ratio (ROR), the proportional reporting ratio (PRR), the Bayesian confidence propagation neural network (BCPNN) and the empirical Bayes geometric mean (EBGM), and MedDRA was used to systematically classify the results.
Results: A total of 24,575 AE reports for Everolimus were obtained from the FAERS database, and Everolimus-induced AEs involved 24 system organ classes (SOCs) after conforming to the four algorithms simultaneously. The common significant SOCs identified included benign, malignant and unspecified neoplasms, reproductive system and breast disorders, etc. The significant AEs were then mapped to preferred terms such as stomatitis, pneumonitis and impaired insulin secretion, which are usually reported in patients receiving Everolimus. Of note, unexpected significant AEs not covered in the label, including biliary ischaemia, angiofibroma, and tuberous sclerosis complex, were uncovered.
Conclusion: This study provides novel insights into the monitoring, surveillance, and management of adverse drug reactions associated with Everolimus. The serious adverse events and their corresponding detection signals, as well as the unexpected significant adverse event signals, are worthy of attention in order to improve clinical medication safety during treatment with Everolimus.
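As an illustration of the disproportionality idea, the reporting odds ratio for one drug-event pair can be computed from a 2x2 contingency table (the counts below are made up for the sketch, not figures from this study):

```python
import math

# a: target drug & target AE reports   b: target drug, other AEs
# c: other drugs, target AE            d: other drugs, other AEs
a, b, c, d = 120, 24_455, 3_000, 972_425  # hypothetical counts

ror = (a / b) / (c / d)
se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(ROR)
ci_low = math.exp(math.log(ror) - 1.96 * se_log)
ci_high = math.exp(math.log(ror) + 1.96 * se_log)
print(f"ROR = {ror:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

A signal is commonly flagged when the lower bound of the 95% CI exceeds 1; the PRR, BCPNN and EBGM criteria differ in their statistics but start from the same 2x2 layout.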
This dataset includes locations and associated information about mines and mining activity in the contiguous United States. The database was developed by combining publicly available national datasets of mineral mines, uranium mines, and minor and major coal mine activities. The database was developed in 2013, but the temporal range of the mine data varies depending on the source. Uranium mine information came from the TENORM Uranium Location Database produced by the US Environmental Protection Agency (U.S. EPA) in 2003. Major and minor coal mine information came from the USTRAT (stratigraphic data related to coal) database (2012), and the mineral mine data came from the USGS Mineral Resources Program.
This data release includes GIS datasets supporting the Colorado Legacy Mine Lands Watershed Delineation and Scoring tool (WaDeS), a web mapping application available at https://geonarrative.usgs.gov/colmlwades/. Water chemistry data were compiled from the U.S. Geological Survey (USGS) National Water Information System (NWIS), the U.S. Environmental Protection Agency (EPA) STORET database, and the USGS Central Colorado Assessment Project (CCAP) (Church and others, 2009). The CCAP study area was used for this application. Samples were summarized at each monitoring station, and hardness-dependent chronic and acute toxicity thresholds for aquatic life protections under Colorado Regulation No. 31 (CDPHE, 5 CCR 1002-31) were calculated for cadmium, copper, lead, and/or zinc. Samples were scored according to how metal concentrations compared with the acute and chronic toxicity thresholds. The results were used in combination with remote-sensing-derived hydrothermal alteration (Rockwell and Bonham, 2017) and mine-related features (Horton and San Juan, 2016) to identify potential mine remediation sites within the headwaters of the central Colorado mineral belt. Headwaters were defined by watersheds delineated from a 10-meter digital elevation model (DEM), ranging from 5 to 35 square kilometers in size. Python and R scripts used to derive these products are included with this data release as documentation of the processing steps and to enable users to adapt the methods for their own applications.
References: Church, S.E., San Juan, C.A., Fey, D.L., Schmidt, T.S., Klein, T.L., DeWitt, E.H., Wanty, R.B., Verplanck, P.L., Mitchell, K.A., Adams, M.G., Choate, L.M., Todorov, T.I., Rockwell, B.W., McEachron, Luke, and Anthony, M.W., 2012, Geospatial database for regional environmental assessment of central Colorado: U.S. Geological Survey Data Series 614, 76 p., https://doi.org/10.3133/ds614.
Colorado Department of Public Health and Environment (CDPHE), Water Quality Control Commission 5 CCR 1002-31. Regulation No. 31 The Basic Standards and Methodologies for Surface Water. Effective 12/31/2021, accessed on July 28, 2023 at https://cdphe.colorado.gov/water-quality-control-commission-regulations. Horton, J.D., and San Juan, C.A., 2022, Prospect- and mine-related features from U.S. Geological Survey 7.5- and 15-minute topographic quadrangle maps of the United States (ver. 8.0, September 2022): U.S. Geological Survey data release, https://doi.org/10.5066/F78W3CHG. Rockwell, B.W. and Bonham, L.C., 2017, Digital maps of hydrothermal alteration type, key mineral groups, and green vegetation of the western United States derived from automated analysis of ASTER satellite data: U.S. Geological Survey data release, https://doi.org/10.5066/F7CR5RK7.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Gross Domestic Product: Mining, Quarrying, and Oil and Gas Extraction (21) in Oklahoma (OKMINNGSP) from 1997 to 2024 about OK, mining, GSP, private industries, private, industry, GDP, and USA.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data on the use of emergency powers used to handle the Covid-19 pandemic mined from country reports published by the Lex-Atlas: Covid-19 project and the Oxford University Press. For more information see https://lexatlas-c19.org
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This work developed image dataset of underground longwall mining face (DsLMF+), which consists of 138004 images with annotation 6 categories of mine personnel, hydraulic support guard plate, large coal, towline, miners’ behaviour and mine safety helmet. All the labels of dataset are publicly available in YOLO format and COCO format.The dataset aims to support further research and advancement of the intelligent identification and classification of abnormal conditions for underground mining.
Global trade data of Mine under HS code 3811210010, covering trade data of Mine from 80+ countries.