33 datasets found
  1. Meta-Analysis and modeling of vegetated filter removal of sediment using...

    • catalog.data.gov
    Updated Nov 22, 2021
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2021). Meta-Analysis and modeling of vegetated filter removal of sediment using global dataset [Dataset]. https://catalog.data.gov/dataset/meta-analysis-and-modeling-of-vegetated-filter-removal-of-sediment-using-global-dataset
    Explore at:
    Dataset updated
    Nov 22, 2021
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Data on vegetated filter strips (VFS), sediment loading into and out of riparian corridors/buffers, removal efficiency of sediment, meta-analysis of removal efficiencies, dimensional analysis of predictor variables, and regression modeling of VFS removal efficiencies. This dataset is associated with the following publication: Ramesh, R., L. Kalin, M. Hantush, and A. Chaudhary. A secondary assessment of sediment trapping effectiveness by vegetated buffers. ECOLOGICAL ENGINEERING. Elsevier Science Ltd, New York, NY, USA, 159: 106094, (2021).

  2. record-test-will-delete-later-latterrrr

    • huggingface.co
    Updated Jun 26, 2025
    Cite
    Camila Feijoo (2025). record-test-will-delete-later-latterrrr [Dataset]. https://huggingface.co/datasets/camilasfeijoo/record-test-will-delete-later-latterrrr
    Explore at:
    Dataset updated
    Jun 26, 2025
    Authors
    Camila Feijoo
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset was created using LeRobot.

      Dataset Structure
    

    meta/info.json: { "codebase_version": "v2.1", "robot_type": "so101_follower", "total_episodes": 1, "total_frames": 530, "total_tasks": 1, "total_videos": 2, "total_chunks": 1, "chunks_size": 1000, "fps": 30, "splits": { "train": "0:1"}, "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet", "video_path":… See the full description on the dataset page: https://huggingface.co/datasets/camilasfeijoo/record-test-will-delete-later-latterrrr.
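
    The data_path field is a Python format template. A minimal sketch of how a consumer might resolve it to a concrete episode file, assuming a local copy of meta/info.json has been downloaded:

    import json

    # Load the metadata shown above (local path is an assumption).
    with open("meta/info.json") as f:
        info = json.load(f)

    episode_index = 0
    episode_chunk = episode_index // info["chunks_size"]  # 1000 episodes per chunk
    parquet_path = info["data_path"].format(
        episode_chunk=episode_chunk, episode_index=episode_index
    )
    print(parquet_path)  # data/chunk-000/episode_000000.parquet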

  3. US Restaurant POI dataset with metadata

    • datarade.ai
    .csv
    Updated Jul 30, 2022
    Cite
    Geolytica (2022). US Restaurant POI dataset with metadata [Dataset]. https://datarade.ai/data-products/us-restaurant-poi-dataset-with-metadata-geolytica
    Explore at:
    Available download formats: .csv
    Dataset updated
    Jul 30, 2022
    Dataset authored and provided by
    Geolytica
    Area covered
    United States of America
    Description

    A Point of Interest (POI) is defined as an entity (such as a business) at a ground location (a point) which may be of interest. We provide high-quality POI data that is fresh, consistent, customizable, easy to use and with high-density coverage for all countries of the world.

    This is our process flow:

    Our machine learning systems continuously crawl for new POI data
    Our geoparsing and geocoding calculates their geo locations
    Our categorization systems cleanup and standardize the datasets
    Our data pipeline API publishes the datasets on our data store
    

    A new POI comes into existence. It could be a bar, a stadium, a museum, a restaurant, a cinema, a store, etc. In today's interconnected world, its information will appear very quickly in social media, pictures, websites, and press releases. Soon after that, our systems will pick it up.

    POI data is in constant flux. Every minute, worldwide, over 200 businesses move, over 600 new businesses open their doors and over 400 businesses cease to exist. Over 94% of all businesses have a public online presence of some kind, and when a business changes, its website and social media presence change too. We track such changes, then extract and merge the new information, thus creating the most accurate and up-to-date business information dataset across the globe.

    We offer our customers perpetual data licenses for any dataset representing this ever-changing information, downloaded at any given point in time. This makes our company's licensing model unique in the current Data-as-a-Service (DaaS) industry. Our customers don't have to delete our data after the expiration of a certain "Term", regardless of whether the data was purchased as a one-time snapshot or via our data update pipeline.

    Customers requiring regularly updated datasets may subscribe to our annual subscription plans. Our data is continuously refreshed, so subscription plans are recommended for those who need the most up-to-date data. The main differentiators between us and the competition are our flexible licensing terms and our data freshness.

    Data samples may be downloaded at https://store.poidata.xyz/us

  4. record-remove-debris

    • huggingface.co
    Updated Jun 17, 2025
    Cite
    Rohan Chacko (2025). record-remove-debris [Dataset]. https://huggingface.co/datasets/rohanc007/record-remove-debris
    Explore at:
    Dataset updated
    Jun 17, 2025
    Authors
    Rohan Chacko
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset was created using LeRobot.

      Dataset Structure
    

    meta/info.json: { "codebase_version": "v2.1", "robot_type": "so101_follower", "total_episodes": 31, "total_frames": 11720, "total_tasks": 1, "total_videos": 62, "total_chunks": 1, "chunks_size": 1000, "fps": 30, "splits": { "train": "0:31" }, "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet", "video_path":… See the full description on the dataset page: https://huggingface.co/datasets/rohanc007/record-remove-debris.
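
    As a rough sketch (not an official LeRobot recipe), the raw files can be fetched with huggingface_hub and one episode read with pandas; the chunk/episode file name follows the data_path template in meta/info.json above:

    from huggingface_hub import snapshot_download
    import pandas as pd

    # Download the dataset snapshot (meta/, data/, videos/) locally.
    root = snapshot_download(repo_id="rohanc007/record-remove-debris",
                             repo_type="dataset")

    # Read the frames of the first episode; 31 episodes, 11720 frames in total.
    df = pd.read_parquet(f"{root}/data/chunk-000/episode_000000.parquet")
    print(df.columns.tolist(), len(df))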

  5. test to delete extra entities version 2

    • search.test.dataone.org
    Updated Apr 18, 2022
    Cite
    Jing Tao (2022). test to delete extra entities version 2 [Dataset]. https://search.test.dataone.org/view/urn%3Auuid%3Af5c3b39f-9c30-43b2-9e02-8c2cf8904e4e
    Explore at:
    Dataset updated
    Apr 18, 2022
    Dataset provided by
    urn:node:mnTestKNB
    Authors
    Jing Tao
    Time period covered
    Jan 1, 1111
    Area covered
    Variables measured
    test
    Description

    abstract. Visit https://dataone.org/datasets/urn%3Auuid%3Af5c3b39f-9c30-43b2-9e02-8c2cf8904e4e for complete metadata about this dataset.

  6. Data from: Metadata for: Pilot-scale H2S and swine odor removal system using...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +1more
    Updated Jun 5, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Metadata for: Pilot-scale H2S and swine odor removal system using commercially available biochar [Dataset]. https://catalog.data.gov/dataset/metadata-for-pilot-scale-h2s-and-swine-odor-removal-system-using-commercially-available-bi
    Explore at:
    Dataset updated
    Jun 5, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    This is digital research data corresponding to the published manuscript "Pilot-scale H2S and swine odor removal system using commercially available biochar", Agronomy 2021, 11, 1611. The dataset may be accessed via the included link at the Dryad data repository. Although biochars made in the laboratory seem to remove H2S and odorous compounds effectively, very few studies are available for commercial biochars. This study evaluated the efficacy of a commercial biochar (CBC) for removing H2S. Methods are described in the manuscript: https://www.mdpi.com/2073-4395/11/8/1611. Descriptions corresponding to each figure and table in the manuscript are placed on separate tabs to clarify abbreviations and summarize the data headings and units. The data file, Table1-4Fig3-5.xslx, is an Excel spreadsheet consisting of multiple sub-tabs associated with Tables 1-4 and Figures 3-5:

    Tab "Table1" – raw data for physico-chemical characteristics of the commercial pine biochar (Table 1).
    Tab "Table2" – raw data for laboratory absorption column variables (Table 2). For dry or humid conditions, "Dry" or "Humid" precedes each parameter name.
    Tab "Table3" – analytical results for odorous volatile organic compounds over 21 days of operation (Table 3). To avoid complexity, single values are not repeated in the data; the multiple raw influent and effluent concentrations of the organic compounds larger than the detection limits are presented in this worksheet.
    Tab "Table4" – raw data (RH, influent and effluent concentrations) for adsorption of H2S using the pilot biochar system (Table 4). All effluent concentrations were below the detection limit and are not listed.
    Tab "Fig 3" – raw data for ratios of pressure drops predicted by the Ergun and Classen equations to those observed, i.e. (Ergun)/(Obs) or (Classen)/(Obs), for various gas velocities (U = 0.41, 0.025, 0.164, and 0.370 m/s) in Figure 3.
    Tab "Fig4" – breakthrough sorption capacity data for two different inlet concentrations (25 and 100 ppm) used for Figure 4.
    Tab "Fig5" – raw data for the daily sum of influent and effluent SCOAVs used for Figure 5.
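
    For reference, the Ergun equation named above for the Fig 3 pressure-drop predictions has the standard packed-bed form (general background, not taken from the dataset itself):

    \[
    \frac{\Delta P}{L}
      = \frac{150\,\mu\,(1-\varepsilon)^2}{\varepsilon^3 d_p^2}\,U
      + \frac{1.75\,(1-\varepsilon)\,\rho}{\varepsilon^3 d_p}\,U^2
    \]

    where U is the superficial gas velocity, ε the bed porosity, d_p the particle diameter, μ the gas viscosity and ρ the gas density.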

  7. Data from: Using multiple imputation to estimate missing data in...

    • data.niaid.nih.gov
    • datadryad.org
    • +1more
    zip
    Updated Nov 25, 2015
    Cite
    E. Hance Ellington; Guillaume Bastille-Rousseau; Cayla Austin; Kristen N. Landolt; Bruce A. Pond; Erin E. Rees; Nicholas Robar; Dennis L. Murray (2015). Using multiple imputation to estimate missing data in meta-regression [Dataset]. http://doi.org/10.5061/dryad.m2v4m
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 25, 2015
    Dataset provided by
    University of Prince Edward Island
    Trent University
    Authors
    E. Hance Ellington; Guillaume Bastille-Rousseau; Cayla Austin; Kristen N. Landolt; Bruce A. Pond; Erin E. Rees; Nicholas Robar; Dennis L. Murray
    License

    CC0 1.0: https://spdx.org/licenses/CC0-1.0.html

    Description
    1. There is a growing need for scientific synthesis in ecology and evolution. In many cases, meta-analytic techniques can be used to complement such synthesis. However, missing data is a serious problem for any synthetic effort and can compromise the integrity of meta-analyses in these and other disciplines. Currently, the prevalence of missing data in meta-analytic datasets in ecology and the efficacy of different remedies for this problem have not been adequately quantified.

    2. We generated meta-analytic datasets based on literature reviews of experimental and observational data and found that missing data were prevalent in meta-analytic ecological datasets. We then tested the performance of complete case removal (a widely used method when data are missing) and multiple imputation (an alternative method for data recovery) and assessed model bias, precision, and multi-model rankings under a variety of simulated conditions using published meta-regression datasets.

    3. We found that complete case removal led to biased and imprecise coefficient estimates and yielded poorly specified models. In contrast, multiple imputation provided unbiased parameter estimates with only a small loss in precision. The performance of multiple imputation, however, was dependent on the type of data missing. It performed best when missing values were weighting variables, but performance was mixed when missing values were predictor variables. Multiple imputation performed poorly when imputing raw data which was then used to calculate effect size and the weighting variable.

    4. We conclude that complete case removal should not be used in meta-regression, and that multiple imputation has the potential to be an indispensable tool for meta-regression in ecology and evolution. However, we recommend that users assess the performance of multiple imputation by simulating missing data on a subset of their data before implementing it to recover actual missing data.
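
    A minimal sketch of the two strategies compared above, on a toy meta-regression table with a missing weighting variable (column names are hypothetical; proper multiple imputation would generate and pool several imputed datasets):

    import numpy as np
    import pandas as pd
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer

    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "effect_size": rng.normal(0.3, 0.1, 50),
        "sample_size": rng.integers(5, 100, 50).astype(float),  # weighting variable
        "latitude": rng.uniform(-60, 60, 50),                   # predictor
    })
    df.loc[rng.choice(50, 10, replace=False), "sample_size"] = np.nan

    complete_case = df.dropna()  # discards 20% of the studies outright
    one_imputation = pd.DataFrame(
        IterativeImputer(sample_posterior=True, random_state=0).fit_transform(df),
        columns=df.columns,
    )
    print(len(complete_case), len(one_imputation))  # 40 vs. 50 studies retained
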
  8. LSD4WSD : An Open Dataset for Wet Snow Detection with SAR Data and Physical...

    • zenodo.org
    bin, pdf +1
    Updated Jul 11, 2024
    Cite
    Matthieu Gallet; Matthieu Gallet; Abdourrahmane Atto; Abdourrahmane Atto; Fatima Karbou; Fatima Karbou; Emmanuel Trouvé; Emmanuel Trouvé (2024). LSD4WSD : An Open Dataset for Wet Snow Detection with SAR Data and Physical Labelling [Dataset]. http://doi.org/10.5281/zenodo.10046730
    Explore at:
    Available download formats: text/x-python, bin, pdf
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Matthieu Gallet; Matthieu Gallet; Abdourrahmane Atto; Abdourrahmane Atto; Fatima Karbou; Fatima Karbou; Emmanuel Trouvé; Emmanuel Trouvé
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LSD4WSD V2.0

    Learning SAR Dataset for Wet Snow Detection - Full Analysis Version.

    The aim of this dataset is to provide a basis for automatic learning to detect wet snow. It is based on Sentinel-1 SAR GRD satellite images acquired between August 2020 and August 2021 over the French Alps. The new version of this dataset is no longer simply restricted to a classification task, and provides a set of metadata for each sample.

    Modifications and improvements in version 2.0.0:

    • Massifs: added 7 new massifs to cover all the Sentinel-1 images (cf. info.pdf).
    • Acquisitions: added images from the descending pass in addition to the ascending-pass images originally used.
    • Samples: reduced the sample size to 15 by 15 to facilitate evaluation at the central pixel.
    • Samples: increased the density of extracted windows, with approximately 500 meters between window centers.
    • Samples: removed the pre-processing involving the use of logarithms.
    • Samples: removed the pre-processing involving normalisation.
    • Labels: new structure for the labels: a dictionary with the keys topography, metadata and physics.
    • Labels: physics: added direct information from the CROCUS model for 3 simulated quantities: liquid water content, snow height and minimum snowpack temperature.
    • Labels: topography: information on the slope, altitude and average orientation of the sample.
    • Labels: metadata: information on the date of the sample, the mountain massif and the run (ascending or descending).
    • Dataset: removed the train/test split*

    We leave it up to the user to apply group k-fold validation, using the alpine massif information as the grouping variable (see the sketch below).
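
    A minimal sketch of such massif-wise validation with scikit-learn's GroupKFold, assuming features, labels and massif names have already been extracted (e.g. via the provided dataset_load.py):

    import numpy as np
    from sklearn.model_selection import GroupKFold

    X = np.random.rand(1000, 15 * 15 * 9)    # placeholder flattened windows
    y = np.random.randint(0, 2, 1000)        # placeholder wet-snow labels
    massif = np.random.choice(["Mont-Blanc", "Vanoise", "Ecrins"], size=1000)

    for train_idx, test_idx in GroupKFold(n_splits=3).split(X, y, groups=massif):
        # No massif appears in both folds, avoiding spatial leakage.
        assert set(massif[train_idx]).isdisjoint(massif[test_idx])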

    Finally, the dataset consists of 2,467,516 samples of size 15 by 15 by 9. For each sample, the following 9 metadata values are provided, drawing in particular on the Crocus physical model:

    • topography:
      • elevation (meters) (average),
      • orientation (degrees) (average),
      • slope (degrees) (average),
    • metadata:
      • name of the alpine massif,
      • date of acquisition,
      • type of acquisition (ascending/descending),
    • physics
      • Liquid Water Content (kg/m²),
      • snow height (m),
      • minimum snowpack temperature (degrees Celsius).

    The 9 channels are in the following order:

    • Sentinel-1 polarimetric channels: VV, VH and the combination C: VV/VH in linear,
    • Topographical features: altitude, orientation, slope
    • Polarimetric ratio with a reference summer image: VV/VVref, VH/VHref, C/Cref

    * The reference image selected is that of August 9th, 2020, as a reference image without snow (cf. Nagler et al.)

    An overview of the distribution and a summary of the sample statistics can be found in the file info.pdf.

    The data is stored in .hdf5 format with gzip compression. We provide a Python script, dataset_load.py, to read and query the data. It is based on the h5py, numpy and pandas libraries and allows selecting part or all of the dataset via requests on the metadata. The script is documented and can be used as described in the README.md file.
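
    For quick inspection without the helper script, a generic h5py sketch (the internal group and dataset names are assumptions; dataset_load.py encapsulates the real layout):

    import h5py

    with h5py.File("lsd4wsd.hdf5", "r") as f:  # hypothetical local filename
        f.visit(print)  # print every stored group/dataset name
        # samples = f["samples"][:100]  # e.g. first 100 windows, if such a key exists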

    The processing chain is available at the following Github address.

    The authors would like to acknowledge the support from the National Centre for Space Studies (CNES) in providing computing facilities and access to SAR images via the PEPS platform.

    The authors would like to deeply thank Mathieu Fructus for running the Crocus simulations.

    Erratum :

    In the dataloader file, the name of the "aquisition" column must be added twice; see the correction below:

    dtst_ld = Dataset_loader(path_dataset, shuffle=False, descrp=["date", "massif", "aquisition", "aquisition", "elevation", "slope", "orientation", "tmin", "hsnow", "tel"])

    If you have any comments, questions or suggestions, please contact the authors:

    • matthieu.gallet@univ-smb.fr
    • fatima.karbou@meteo.fr
    • abdourrahmane.atto@univ-smb.fr
    • emmanuel.trouve@univ-smb.fr

  9. MeSH 2023 Update - Delete Report - 4at4-q6rg - Archive Repository

    • healthdata.gov
    application/rdfxml +5
    Updated Jun 27, 2025
    Cite
    (2025). MeSH 2023 Update - Delete Report - 4at4-q6rg - Archive Repository [Dataset]. https://healthdata.gov/dataset/MeSH-2023-Update-Delete-Report-4at4-q6rg-Archive-R/bjnp-cusd
    Explore at:
    Available download formats: csv, application/rdfxml, json, tsv, application/rssxml, xml
    Dataset updated
    Jun 27, 2025
    Description

    This dataset tracks the updates made on the dataset "MeSH 2023 Update - Delete Report" as a repository for previous versions of the data and metadata.

  10. BLM Natl WesternUS GRSG Sagebrush Focal Areas

    • catalog.data.gov
    • s.cnmilf.com
    • +1more
    Updated Nov 20, 2024
    + more versions
    Cite
    Bureau of Land Management (2024). BLM Natl WesternUS GRSG Sagebrush Focal Areas [Dataset]. https://catalog.data.gov/dataset/blm-natl-westernus-grsg-sagebrush-focal-areas
    Explore at:
    Dataset updated
    Nov 20, 2024
    Dataset provided by
    Bureau of Land Management (http://www.blm.gov/)
    Description

    This dataset is a modified version of the FWS developed data depicting “Highly Important Landscapes”, as outlined in Memorandum FWS/AES/058711 and provided to the Wildlife Habitat Spatial Analysis Lab on October 29th, 2014. Other names and acronyms used to refer to this dataset have included: Areas of Significance (AoSs - name of GIS data set provided by FWS), Strongholds (FWS), and Sagebrush Focal Areas (SFAs - BLM). The BLM will refer to these data as Sagebrush Focal Areas (SFAs). Data were provided as a series of ArcGIS map packages which, when extracted, contained several datasets each. Based on the recommendation of the FWS Geographer/Ecologist (email communication, see data originator for contact information), the dataset called “Outiline_AreasofSignificance” was utilized as the source for subsequent analysis and refinement. Metadata was not provided by the FWS for this dataset. For detailed information regarding the dataset’s creation refer to Memorandum FWS/AES/058711 or contact the FWS directly. Several operations and modifications were made to this source data, as outlined in the “Description” and “Process Step” sections of this metadata file. Generally: the source data was named by the Wildlife Habitat Spatial Analysis Lab to identify polygons as described (but not identified in the GIS) in the FWS memorandum. The Nevada/California EIS modified portions within their decision space in concert with local FWS personnel and provided the modified data back to the Wildlife Habitat Spatial Analysis Lab. Gaps around Nevada State borders, introduced by the NVCA edits, were then closed, as was a large gap between the southern Idaho & southeast Oregon SFAs present in the original dataset. Features with an area below 40 acres were then identified and, based on FWS guidance, either removed or retained. Finally, guidance from BLM WO resulted in the removal of additional areas, primarily non-habitat with BLM surface or subsurface management authority. Data were then provided to each EIS for use in FEIS development. Based on guidance from WO, SFAs were to be limited to BLM decision space (surface/sub-surface management areas) within PHMA. Each EIS was asked to provide the limited SFA dataset back to the National Operations Center to ensure consistent representation and analysis. Returned SFA data, modified by each individual EIS, was then consolidated at the BLM’s National Operations Center, retaining the three standardized fields contained in this dataset.

    Several modifications from the original FWS dataset have been made. Below is a summary of each modification.

    1. The data as received from FWS: 16,514,163 acres & 1 record.

    2. Edited to name SFAs by the Wildlife Habitat Spatial Analysis Lab: upon receipt of the “Outiline_AreasofSignificance” dataset from the FWS, a copy was made and the one existing & unnamed record was exploded in an edit session within ArcMap. A text field, “AoS_Name”, was added. Using the maps provided with Memorandum FWS/AES/058711, polygons were manually selected and the “AoS_Name” field was calculated to match the names as illustrated. Once all polygons in the exploded dataset were appropriately named, the dataset was dissolved, resulting in one record representing each of the seven SFAs identified in the memorandum.

    3. The NVCA EIS made modifications in concert with local FWS staff. Metadata and detailed change descriptions were not returned with the modified data. Contact Leisa Wesch, GIS Specialist, BLM Nevada State Office, 775-861-6421, lwesch@blm.gov, for details.

    4. Once the data was returned to the Wildlife Habitat Spatial Analysis Lab from the NVCA EIS, gaps surrounding the State of NV were closed. These gaps were introduced by the NVCA edits, exacerbated by them, or existed in the data as provided by the FWS. The gap closing was performed in an edit session by either extending each polygon towards the other or by creating a new polygon which covered the gap and merging it with the existing features. In addition to the gaps around state boundaries, a large area between the S. Idaho and S.E. Oregon SFAs was filled in. To accomplish this, ADPP habitat (current as of January 2015) and BLM GSSP SMA data were used to create a new polygon representing PHMA and BLM management that connected the two existing SFAs.

    5. In an effort to simplify the FWS dataset, features whose areas were less than 40 acres were identified and FWS was consulted for guidance on possible removal. To do so, features from step 4 above were exploded once again in an ArcMap edit session. Features whose areas were less than forty acres were selected and exported (770 total features). This dataset was provided to the FWS and then returned with specific guidance on inclusion/exclusion via email by Lara Juliusson (lara_juliusson@fws.gov). The specific guidance was: (a) remove all features whose area is less than 10 acres; (b) remove features identified as slivers (the thinness ratio was calculated and slivers identified by Lara Juliusson according to https://tereshenkov.wordpress.com/2014/04/08/fighting-sliver-polygons-in-arcgis-thinness-ratio/) and whose area was less than 20 acres; (c) remove features with areas less than 20 acres NOT identified as slivers and NOT adjacent to other features; (d) keep the remainder of features identified as less than 40 acres. To accomplish (a) and (b), a simple selection was applied to the dataset representing features less than 40 acres. The select by location tool was used, set to select identical, to select these features from the dataset created in step 4 above. The records count was confirmed as matching between the two datasets and then these features were deleted. To accomplish (c), a field (“AdjacentSH”, added by FWS but not calculated) was calculated to identify features touching or intersecting other features. A series of selections was used: first to select records < 20 acres that were not slivers, second to identify features intersecting other features, and finally another to identify features touching the boundary of other features. Once the select by location operations were applied, the field “AdjacentSH” was calculated to identify the features as touching, intersecting or not touching other features. Features identified as not touching or intersecting were selected, then the select by location tool was used, set to select identical, to select these features from the dataset created in step 4 above. The records count was confirmed as matching between the two datasets and then these features were deleted. 530 of the 770 features were removed in total.

    6. Based on direction from the BLM Washington Office, the portion of the Upper Missouri River Breaks National Monument (UMRBNM) that was included in the FWS SFA dataset was removed. The BLM NOC GSSP NLCS dataset was used to erase these areas from step 5 above. Resulting sliver polygons were also removed and geometry was repaired.

    7. In addition to removing UMRBNM, the BLM Washington Office also directed the removal of non-ADPP habitat within the SFAs, on BLM managed lands, falling outside of Designated Wilderness & Wilderness Study Areas. An exception was the retention of the Donkey Hills ACEC and adjacent BLM lands. The BLM NOC GSSP NLCS datasets were used in conjunction with a dataset containing all ADPP habitat, BLM SMA and BLM sub-surface management unioned into one file to identify and delete these areas.

    8. The resulting dataset, after the steps above were completed, was dissolved to the SFA name field, yielding this feature class with one record per SFA area.

    9. Data were provided to each EIS for use in FEIS allocation decision data development.

    10. Data were subset to BLM decision space (surface/sub-surface) within PHMA by each EIS and returned to the NOC.

    11. Due to variations in field names and values, three standardized fields were created and calculated by the NOC: (a) SFA Name – the name of the SFA; (b) Subsurface – binary “Yes” or “No” to indicate federal subsurface estate; (c) SMA – represents BLM, USFS, other federal and non-federal surface management.

    12. The consolidated data (with standardized field names and values) were dissolved on the three fields illustrated above and geometry was repaired, resulting in this dataset.
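
    The sliver test referenced in step 5 follows the linked post: the thinness ratio 4πA/P² is 1 for a circle and approaches 0 for long, thin polygons. A small sketch (the exact cutoff used by FWS is not stated; 0.3 below is illustrative):

    import math

    def thinness_ratio(area: float, perimeter: float) -> float:
        return 4 * math.pi * area / perimeter ** 2

    def is_sliver(area: float, perimeter: float, threshold: float = 0.3) -> bool:
        return thinness_ratio(area, perimeter) < threshold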

  11. UW Datasets (broken link, delete)

    • datadiscoverystudio.org
    Updated Apr 30, 2015
    Cite
    (2015). UW Datasets (broken link, delete) [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/aae3242d685a4a978110b44f70b523b4/html
    Explore at:
    Dataset updated
    Apr 30, 2015
    Area covered
    Description

    Multiple, but so mixed up

  12. Municipal Court Caseload Information **Dataset removal in 2 months on...

    • data.amerigeoss.org
    • datadiscoverystudio.org
    • +1more
    csv, json, rdf, xml
    Updated Jul 29, 2019
    Cite
    United States[old] (2019). Municipal Court Caseload Information **Dataset removal in 2 months on September 1, 2014 [Dataset]. https://data.amerigeoss.org/sr_Latn/dataset/showcases/municipal-court-caseload-information-dataset-removal-in-2-months-on-september-1-2014
    Explore at:
    Available download formats: json, rdf, csv, xml
    Dataset updated
    Jul 29, 2019
    Dataset provided by
    United States[old]
    Description

    This data is provided to help with analysis of various violations charged throughout the City of Austin. See the Fiscal Year datasets for the new format. Dataset removal in 2 months, on September 1, 2014.

  13. Groundwater Economic Elements Hunter NSW 20150520 PersRem v02

    • data.gov.au
    • researchdata.edu.au
    • +2more
    Updated Aug 9, 2023
    + more versions
    Cite
    Bioregional Assessment Program (2023). Groundwater Economic Elements Hunter NSW 20150520 PersRem v02 [Dataset]. https://data.gov.au/data/dataset/7174b44f-7146-4b70-a77f-b55c989a3278
    Explore at:
    Dataset updated
    Aug 9, 2023
    Dataset authored and provided by
    Bioregional Assessment Program
    Area covered
    Hunter Region, New South Wales
    Description

    Abstract

    The dataset was derived by the Bioregional Assessment Programme. This dataset was derived from multiple datasets. You can find a link to the parent datasets in the Lineage Field in this metadata statement. The History Field in this metadata statement describes how this dataset was derived.

    This dataset has been created to represent Groundwater Economic Elements (Groundwater entitlements) in the Hunter region.

    This dataset differs from "Groundwater Economic Assets Hunter NSW 20150331 PersRem" which is version 1 by the inclusion of two further spreadsheets from NSW Office of Water (see lineage) which updated values of unknown FTYPE (use) in the database.

    Purpose

    The purpose of this dataset is to represent the Economic elements (Groundwater entitlements) in the Hunter PAE.

    Dataset History

    The dataset was derived by the Bioregional Assessment Programme. This dataset was derived from multiple datasets. You can find a link to the parent datasets in the Lineage Field in this metadata statement. The History Field in this metadata statement describes how this dataset was derived.

    This data is a new version of the dataset "Groundwater Economic Assets Hunter NSW 20150331 PersRem" (ca620380-3d67-43bb-811a-3cafdf41056a), to be sent to ERIN for reprocessing into a new version of the Hunter Economic Asset Database. This dataset differs from the previous version in that data extracts were supplied by the NSW Office of Water to infill records of "Unknown" FTYPE (or use). As a result, many more records have been assigned an asset class.

    Fields were also added to capture what the NSW Office of Water terms 'High Security' assets. The resulting 'Security' field flags these records with the value 'High Security'; they are Town Water Supply bores.

    This new information was joined to the Bores in the previous version of the dataset, new records with values for FTYPE were selected and used to populate (update) the FTYPE in the base table.

    The basic processing steps are summarized below (an illustrative pandas rendering of the join-and-update logic follows the classification table):

    Join existing Hunter_Elements to the new table provided by the NSW Office of Water; 197 records joined

    Select where "NewSpreadsheetNSW.StateBoreID" IS NOT NULL

    Update [FTYPE] from NewSpreadsheetNSW

    Update [Status] from NewSpreadsheetNSW

    Remove join

    Now join with second spreadsheet supplied by NSW Office of Water containing High security water descriptions

    Join with "UnknownsExport"

    Update FTYPE from "UnknownsExport"

    Update "FileUpdate"

    Now add in high security descriptions

    131 total records supplied, only 66 matches in Hunter Bores

    AddField: PurposeDesc

    AddField: Security

    Remove join

    Add join MonitoringBom

    AddField: PurposeDesc + "NOW OWNED"

    Now delete all non-functioning bores (433 records):

    "Status" = 'ABN' OR "Status" = 'DCM' OR "Status" = 'NON' OR "Status" = 'RMV'

    However, keep bores that have an associated volume

    Use "Remove from Current selection" Volper Element > 0

    Leaves 389 records to be deleted (Note Back up copy created with undeleted records HunterElementsVol20150331_Merg2

    7377 records remain

    Now reclassify

    Note: mines added from GA were classified with FTYPE "MINE, MNAS"

    See Classification LUT

    Join

    Field calculate Asset_class

    The AssetClass was then populated by the following classification

    FTYPE | DEFINITION | CLASS

    COMS | Water supply for commercial activities | Water Access Right

    DRNG | Potable water supply, i.e. drinking water | Water Access Right

    EXPR | Exploration or research | NULL

    EXPR, MNAS | Exploration or research; water supply for mining activities | Water Access Right

    HUSE | Water supply for household needs, e.g. washing, toilet | Water Access Right

    INDS | Water supply for manufacturing and industry | Water Access Right

    INDS, MINE | Water supply for manufacturing and industry; mining | Water Access Right

    IRAG | Water supply for irrigated agriculture | Water Access Right

    MINE, MNAS | Mining; water supply for mining activities | Water Access Right

    MON | Monitoring of groundwater conditions | NULL

    RECN | Recreational purposes | Water Access Right

    STOK | Water supply for livestock | Basic Access Right

    UNK | Unknown, i.e. purpose of groundwater feature is unknown | Water Access Right

    WSUP | Water supply | Water Access Right

    WSUP, INDS | Water supply; water supply for manufacturing and industry | Water Access Right
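
    The join-and-update steps above were performed in ArcGIS; purely as an illustration, the same logic in pandas might look like this (file and column names mirror the text and are otherwise hypothetical):

    import pandas as pd

    elements = pd.read_csv("hunter_elements.csv")   # existing bore elements
    nsw = pd.read_csv("nsw_spreadsheet.csv")        # NSW Office of Water extract

    merged = elements.merge(nsw[["StateBoreID", "FTYPE", "Status"]],
                            on="StateBoreID", how="left", suffixes=("", "_new"))
    for col in ("FTYPE", "Status"):
        # Update only where the new spreadsheet supplied a value
        merged[col] = merged[col + "_new"].fillna(merged[col])
        merged = merged.drop(columns=col + "_new")

    # Classify via the FTYPE -> asset class lookup (abbreviated LUT)
    lut = {"COMS": "Water Access Right", "STOK": "Basic Access Right",
           "EXPR": None, "MON": None}
    merged["Asset_class"] = merged["FTYPE"].map(lut)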

    Dataset Citation

    Bioregional Assessment Programme (2015) Groundwater Economic Elements Hunter NSW 20150520 PersRem v02. Bioregional Assessment Derived Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/7174b44f-7146-4b70-a77f-b55c989a3278.

    Dataset Ancestors

  14. Heavy metal removal of intermittent acid mine drainage with an open...

    • pre.iepnb.es
    • iepnb.es
    Updated Nov 5, 2024
    Cite
    (2024). Heavy metal removal of intermittent acid mine drainage with an open limestone channel - Dataset - CKAN [Dataset]. https://pre.iepnb.es/catalogo/dataset/heavy-metal-removal-of-intermittent-acid-mine-drainage-with-an-open-limestone-channel
    Explore at:
    Dataset updated
    Nov 5, 2024
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This study is focused on the influence of a particular open limestone channel (OLC) on the quality of the surface water drained from an intermittent watercourse. The OLC was constructed along a creek surrounded by upstream tailings deposits, in an extensive, abandoned sulfide-mining site, which generates acidic and heavy metals-rich drainage water during the occasional precipitation that occurs. The overall length of the OLC is 1986 m; it has an average slope of 4.6% and consists of two main segments. The effectiveness of this channel was evaluated through different physico-chemical parameters: pH, electrical conductivity (EC), total solids (TS), and heavy metal concentrations (Al, Fe, Zn, Ni, Cu, As, Cd, and Pb), measured in surface water. A total of 47 water samples were collected in 12 rainfall events in the period 2005-2009. Moreover, for three different precipitation events, depletion curves of these parameters were constructed. The values of pH and Ca were increased downstream of the channel, related to the alkalinity and calcium release of the OLC and carbonates present in the watershed, whereas the EC, TS, K, Mg, SO4^2-, Al, Mn, Fe, Ni, Cu, Zn, As, Cd, and Sb decreased towards the mouth of the creek. The OLC reduced the input of heavy metals into the Mar Menor lagoon by one order of magnitude. According to the results, this kind of constructive solution is effective with regard to mitigating the effects of intermittent acid mine drainage in Mediterranean and semi-arid regions. (C) 2011 Elsevier Ltd. All rights reserved.

  15. Geoscience Australia, 3 second SRTM Digital Elevation Model (DEM) v01

    • researchdata.edu.au
    • data.gov.au
    • +1more
    Updated Mar 22, 2016
    Cite
    Bioregional Assessment Program (2016). Geoscience Australia, 3 second SRTM Digital Elevation Model (DEM) v01 [Dataset]. https://researchdata.edu.au/geoscience-australia-3-dem-v01
    Explore at:
    Dataset updated
    Mar 22, 2016
    Dataset provided by
    data.gov.au
    Authors
    Bioregional Assessment Program
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Area covered
    Description

    Abstract

    This dataset and its metadata statement were supplied to the Bioregional Assessment Programme by a third party and are presented here as originally supplied.

    The 3 second (~90m) Shuttle Radar Topographic Mission (SRTM) Digital Elevation Model (DEM) version 1.0 was derived from resampling the 1 arc second (~30m) gridded DEM (ANZCW0703013355). The DEM represents ground surface topography, and excludes vegetation features. The dataset was derived from the 1 second Digital Surface Model (DSM; ANZCW0703013336) by automatically removing vegetation offsets identified using several vegetation maps and directly from the DSM. The 1 second product provides substantial improvements in the quality and consistency of the data relative to the original SRTM data, but is not free from artefacts. Man-made structures such as urban areas and power line towers have not been treated. The removal of vegetation effects has produced satisfactory results over most of the continent and areas with defects are identified in the quality assessment layers distributed with the data and described in the User Guide (Geoscience Australia and CSIRO Land & Water, 2010). A full description of the methods is in progress (Read et al., in prep; Gallant et al., in prep). The 3 second DEM was produced for use by government and the public under Creative Commons attribution.

    The 3 second DSM and smoothed DEM are also available (DSM: ANZCW0703014216, DEM-S: ANZCW0703014217).

    Dataset History

    Source data

    1. SRTM 1 second Version 2 data (Slater et al., 2006), supplied by Defence Imagery and Geospatial Organisation (DIGO) as 813 1 x 1 degree tiles. Data was produced by NASA from radar data collected by the Shuttle Radar Topographic Mission in February 2000.

    2. GEODATA 9 second DEM Version 3 (Geoscience Australia, 2008) used to fill voids.

    3. SRTM Water Body Data (SWBD) shapefile accompanying the SRTM data (Slater et al., 2006). This defines the coastline and larger inland waterbodies for the DEM and DSM.

    4. Vegetation masks and water masks applied to the DEM to remove vegetation.

    5. 1 second DEM resampled to 3 second DEM.

    1 second DSM processing

    The 1 second SRTM-derived Digital Surface Model (DSM) was derived from the 1 second Shuttle Radar Topographic Mission data by removing stripes, filling voids and reflattening water bodies. Further details are provided in the DSM metadata (ANZCW0703013336).

    1 second DEM processing (vegetation offset removal)

    Vegetation offsets were identified using Landsat-based mapping of woody vegetation. The height offsets were estimated around the edges of vegetation patches then interpolated to a continuous surface of vegetation height offset that was subtracted from the DSM to produce a bare-earth DEM. Further details are provided in the 1 second DSM metadata (ANZCW0703013355).

    Void filling

    Voids (areas without data) occur in the data due to low radar reflectance (typically open water or dry sandy soils) or topographic shadowing in high relief areas. Delta Surface Fill Method (Grohman et al., 2006) was adapted for this task, using GEODATA 9 second DEM as infill data source. The 9 second data was refined to 1 second resolution using ANUDEM 5.2 without drainage enforcement. Delta Surface Fill Method calculates height differences between SRTM and infill data to create a "delta" surface with voids where the SRTM has no values, then interpolates across voids. The void is then replaced by infill DEM adjusted by the interpolated delta surface, resulting in an exact match of heights at the edges of each void. Two changes to the Delta Surface Fill Method were made: interpolation of the delta surface was achieved with natural neighbour interpolation (Sibson, 1981; implemented in ArcGIS 9.3) rather than inverse distance weighted interpolation; and a mean plane inside larger voids was not used.
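
    Conceptually, the adapted Delta Surface Fill step can be sketched in a few lines of numpy/scipy, assuming srtm (with NaN voids) and infill are aligned 2-D arrays; natural neighbour interpolation is approximated here by scipy's linear griddata:

    import numpy as np
    from scipy.interpolate import griddata

    def delta_surface_fill(srtm: np.ndarray, infill: np.ndarray) -> np.ndarray:
        delta = srtm - infill                   # defined only where SRTM is valid
        valid = ~np.isnan(delta)
        rows, cols = np.indices(delta.shape)
        delta_interp = griddata((rows[valid], cols[valid]), delta[valid],
                                (rows, cols), method="linear")
        out = srtm.copy()
        voids = np.isnan(srtm)
        out[voids] = infill[voids] + delta_interp[voids]  # heights match at void edges
        return out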

    Water bodies

    Water bodies defined from the SRTM Water Body Data as part of the DSM processing were set to the same elevations as in the DSM.

    Edit rules for land surrounding water bodies

    SRTM edit rules set all land adjacent to water at least 1 m above water level to ensure containment of water (Slater et al., 2006). Following vegetation removal, void filling and water flattening, the heights of all grid cells adjacent to water were set to at least 1 cm above the water surface. The smaller offset (1 cm rather than 1 m) could be used because the cleaned digital surface model is in floating point format rather than the integer format of the original SRTM.

    Some small islands within water bodies are represented as voids within the SRTM due to edit rules. These voids are filled as part of void filling process, and their elevations set to a minimum of 1 cm above surrounding water surface across the entire void fill.

    Overview of quality assessment

    The quality of vegetation offset removal was manually assessed on a 1/8 ×1/8 degree grid. Issues with the vegetation removal were identified and recorded in ancillary data layers. The assessment was based on visible artefacts rather than comparison with reference data so relies on the detection of artefacts by edges.

    The issues identified were:

    • vegetation offsets are still visible (not fully removed)
    • vegetation offset overestimated
    • linear vegetation offset not fully removed
    • incomplete removal of built infrastructure and other minor issues

    DEM Ancillary data layers

    The vegetation removal and assessment process produced two ancillary data layers:

    • A shapefile of 1/8 × 1/8 degree tiles indicating which tiles have been affected by vegetation removal and any issue noted with the vegetation offset removal
    • A difference surface showing the vegetation offset that has been removed; this shows the effect of vegetation on heights as observed by the SRTM radar instrument and is related to vegetation height, density and structure.

    The water and void fill masks for the 1 second DSM were also applied to the DEM. Further information is provided in the User Guide (Geoscience Australia and CSIRO Land & Water, 2010).

    Resampling to 3 seconds

    The 1 second SRTM-derived Digital Elevation Model (DEM) was resampled to 3 seconds of arc (90 m) in ArcGIS using the aggregation tool. This tool multiplies the cell resolution by an aggregation factor (in this case three) and assigns each output cell the mean of the input cells falling within its extent (i.e. the mean of the 3×3 input cells). The 3 second SRTM was converted to integer format for the national mosaic to make the file size more manageable; this does not affect the accuracy of the data at this resolution. Further information on the processing is provided in the User Guide (Geoscience Australia and CSIRO Land & Water, 2010).
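
    A minimal numpy equivalent of that 3×3 mean aggregation (assuming the grid dimensions are exact multiples of three):

    import numpy as np

    def aggregate_3x3(dem_1s: np.ndarray) -> np.ndarray:
        h, w = dem_1s.shape
        return dem_1s.reshape(h // 3, 3, w // 3, 3).mean(axis=(1, 3))

    dem = np.arange(36, dtype=float).reshape(6, 6)
    print(aggregate_3x3(dem).shape)  # (2, 2)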

    Further information can be found at http://www.ga.gov.au/metadata-gateway/metadata/record/gcat_aac46307-fce9-449d-e044-00144fdd4fa6/SRTM-derived+3+Second+Digital+Elevation+Models+Version+1.0

    Dataset Citation

    Geoscience Australia (2010) Geoscience Australia, 3 second SRTM Digital Elevation Model (DEM) v01. Bioregional Assessment Source Dataset. Viewed 11 December 2018, http://data.bioregionalassessments.gov.au/dataset/12e0731d-96dd-49cc-aa21-ebfd65a3f67a.

  16. Greater sage-grouse 2015 ARMPA status

    • western-watersheds-project-westernwater.hub.arcgis.com
    Updated Jan 30, 2015
    Cite
    wwpbighorn (2015). Greater sage-grouse 2015 ARMPA status [Dataset]. https://western-watersheds-project-westernwater.hub.arcgis.com/items/f5aed733fcbd47fb8b5ae27f1334f900
    Explore at:
    Dataset updated
    Jan 30, 2015
    Dataset authored and provided by
    wwpbighorn
    Area covered
    Description

    This dataset is a modified version of the FWS developed data depicting “Highly Important Landscapes”, as outlined in Memorandum FWS/AES/058711 and provided to the Wildlife Habitat Spatial Analysis Lab on October 29th, 2014. Other names and acronyms used to refer to this dataset have included: Areas of Significance (AoSs - name of GIS data set provided by FWS), Strongholds (FWS), and Sagebrush Focal Areas (SFAs - BLM). The BLM will refer to these data as Sagebrush Focal Areas (SFAs). Data were provided as a series of ArcGIS map packages which, when extracted, contained several datasets each. Based on the recommendation of the FWS Geographer/Ecologist (email communication, see data originator for contact information), the dataset called “Outiline_AreasofSignificance” was utilized as the source for subsequent analysis and refinement. Metadata was not provided by the FWS for this dataset. For detailed information regarding the dataset’s creation refer to Memorandum FWS/AES/058711 or contact the FWS directly. Several operations and modifications were made to this source data, as outlined in the “Description” and “Process Step” sections of this metadata file. Generally: the source data was named by the Wildlife Habitat Spatial Analysis Lab to identify polygons as described (but not identified in the GIS) in the FWS memorandum. The Nevada/California EIS modified portions within their decision space in concert with local FWS personnel and provided the modified data back to the Wildlife Habitat Spatial Analysis Lab. Gaps around Nevada State borders, introduced by the NVCA edits, were then closed, as was a large gap between the southern Idaho & southeast Oregon SFAs present in the original dataset. Features with an area below 40 acres were then identified and, based on FWS guidance, either removed or retained. Guidance from BLM WO resulted in the removal of additional areas including: non-habitat with BLM surface or subsurface management authority, all areas within the Lander EIS boundary, and areas outside of PHMA once EISs had updated PHMA designation.

    Several modifications from the original FWS dataset have been made. Below is a summary of each modification.

    1. The data as received from FWS.

    2. Edited to name SFAs by the Wildlife Habitat Spatial Analysis Lab: upon receipt of the “Outiline_AreasofSignificance” dataset from the FWS, a copy was made and the one existing & unnamed record was exploded in an edit session within ArcMap. A text field, “AoS_Name”, was added. Using the maps provided with Memorandum FWS/AES/058711, polygons were manually selected and the “AoS_Name” field was calculated to match the names as illustrated. Once all polygons in the exploded dataset were appropriately named, the dataset was dissolved, resulting in one record representing each of the seven SFAs identified in the memorandum.

    3. The NVCA EIS made modifications in concert with local FWS staff. Metadata and detailed change descriptions were not returned with the modified data. Contact Leisa Wesch, GIS Specialist, BLM Nevada State Office, 775-861-6421, lwesch@blm.gov, for details.

    4. Once the data was returned to the Wildlife Habitat Spatial Analysis Lab from the NVCA EIS, gaps surrounding the State of NV were closed. These gaps were introduced by the NVCA edits, exacerbated by them, or existed in the data as provided by the FWS. The gap closing was performed in an edit session by either extending each polygon towards the other or by creating a new polygon which covered the gap and merging it with the existing features. In addition to the gaps around state boundaries, a large area between the S. Idaho and S.E. Oregon SFAs was filled in. To accomplish this, ADPP habitat (current as of January 2015) and BLM GSSP SMA data were used to create a new polygon representing PHMA and BLM management that connected the two existing SFAs.

    5. In an effort to simplify the FWS dataset, features whose areas were less than 40 acres were identified and FWS was consulted for guidance on possible removal. To do so, features from step 4 above were exploded once again in an ArcMap edit session. Features whose areas were less than forty acres were selected and exported (770 total features). This dataset was provided to the FWS and then returned with specific guidance on inclusion/exclusion via email by Lara Juliusson (lara_juliusson@fws.gov). The specific guidance was: (a) remove all features whose area is less than 10 acres; (b) remove features identified as slivers (the thinness ratio was calculated and slivers identified by Lara Juliusson according to https://tereshenkov.wordpress.com/2014/04/08/fighting-sliver-polygons-in-arcgis-thinness-ratio/) and whose area was less than 20 acres; (c) remove features with areas less than 20 acres NOT identified as slivers and NOT adjacent to other features; (d) keep the remainder of features identified as less than 40 acres. To accomplish (a) and (b), a simple selection was applied to the dataset representing features less than 40 acres. The select by location tool was used, set to select identical, to select these features from the dataset created in step 4 above. The records count was confirmed as matching between the two datasets and then these features were deleted. To accomplish (c), a field (“AdjacentSH”, added by FWS but not calculated) was calculated to identify features touching or intersecting other features. A series of selections was used: first to select records < 20 acres that were not slivers, second to identify features intersecting other features, and finally another to identify features touching the boundary of other features. Once the select by location operations were applied, the field “AdjacentSH” was calculated to identify the features as touching, intersecting or not touching other features. Features identified as not touching or intersecting were selected, then the select by location tool was used, set to select identical, to select these features from the dataset created in step 4 above. The records count was confirmed as matching between the two datasets and then these features were deleted. 530 of the 770 features were removed in total.

    6. Based on direction from the BLM Washington Office, the portion of the Upper Missouri River Breaks National Monument (UMRBNM) that was included in the FWS SFA dataset was removed. The BLM NOC GSSP NLCS dataset was used to erase these areas from step 5 above. Resulting sliver polygons were also removed and geometry was repaired.

    7. In addition to removing UMRBNM, the BLM Washington Office also directed the removal of non-ADPP habitat within the SFAs, on BLM managed lands, falling outside of Designated Wilderness & Wilderness Study Areas. An exception was the retention of the Donkey Hills ACEC and adjacent BLM lands. The BLM NOC GSSP NLCS datasets were used in conjunction with a dataset containing all ADPP habitat, BLM SMA and BLM sub-surface management unioned into one file to identify and delete these areas.

    8. The resulting dataset, after the steps above were completed, was dissolved to the SFA name field, yielding this feature class with one record per SFA area.

    9. The "Acres" field was added and calculated.

    10. All areas within the Lander EIS were erased from the dataset (ArcGIS 'Erase' function) and resulting sliver geometries removed.

    11. Data were clipped to Proposed Plan PHMA.

    12. The "Acres" field was re-calculated.

  17. (No) Influence of Continuous Integration on the Development Activity in...

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Jan 24, 2020
    + more versions
    Cite
    Sebastian Baltes; Sebastian Baltes; Jascha Knack; Jascha Knack (2020). (No) Influence of Continuous Integration on the Development Activity in GitHub Projects — Dataset [Dataset]. http://doi.org/10.5281/zenodo.1291582
    Explore at:
    Available download formats: csv
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sebastian Baltes; Sebastian Baltes; Jascha Knack; Jascha Knack
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is based on the TravisTorrent dataset released 2017-01-11 (https://travistorrent.testroots.org), the Google BigQuery GHTorrent dataset accessed 2017-07-03, and the Git log history of all projects in the dataset, retrieved 2017-07-16 and 2017-07-17.

    We selected projects hosted on GitHub that employ the Continuous Integration (CI) system Travis CI. We identified the projects using the TravisTorrent data set and considered projects that:

    1. used GitHub from the beginning (first commit not more than seven days before project creation date according to GHTorrent),
    2. were active for at least one year (365 days) before the first build with Travis CI (before_ci),
    3. used Travis CI at least for one year (during_ci),
    4. had commit or merge activity on the default branch in both of these phases, and
    5. used the default branch to trigger builds.

    To derive the time frames, we employed the GHTorrent Big Query data set. The resulting sample contains 113 projects. Of these projects, 89 are Ruby projects and 24 are Java projects. For our analysis, we only consider the activity one year before and after the first build.

    We cloned the selected project repositories and extracted the version history for all branches (see https://github.com/sbaltes/git-log-parser). For each repo and branch, we created one log file with all regular commits and one log file with all merges. We only considered commits changing non-binary files and applied a file extension filter to only consider changes to Java or Ruby source code files. From the log files, we then extracted metadata about the commits and stored this data in CSV files (see https://github.com/sbaltes/git-log-parser).

    We also retrieved a random sample of GitHub projects to validate the effects we observed in the CI project sample. We only considered projects that:

    1. have Java or Ruby as their project language
    2. used GitHub from the beginning (first commit not more than seven days before project creation date according to GHTorrent)
    3. have commit activity for at least two years (730 days)
    4. are engineered software projects (at least 10 watchers)
    5. were not in the TravisTorrent dataset

    In total, 8,046 projects satisfied those constraints. We drew a random sample of 800 projects from this sampling frame and retrieved the commit and merge data in the same way as for the CI sample. We then split the development activity at the median development date, removed projects without commits or merges in either of the two resulting time spans, and then manually checked the remaining projects to remove the ones with CI configuration files. The final comparison sample contained 60 non-CI projects.

    This dataset contains the following files:

    tr_projects_sample_filtered_2.csv
    A CSV file with information about the 113 selected projects.

    tr_sample_commits_default_branch_before_ci.csv
    tr_sample_commits_default_branch_during_ci.csv

    One CSV file with information about all commits to the default branch before and after the first CI build. Only commits modifying, adding, or deleting Java or Ruby source code files were considered. Those CSV files have the following columns:

    project: GitHub project name ("/" replaced by "_").
    branch: The branch to which the commit was made.
    hash_value: The SHA1 hash value of the commit.
    author_name: The author name.
    author_email: The author email address.
    author_date: The authoring timestamp.
    commit_name: The committer name.
    commit_email: The committer email address.
    commit_date: The commit timestamp.
    log_message_length: The length of the git commit messages (in characters).
    file_count: Files changed with this commit.
    lines_added: Lines added to all files changed with this commit.
    lines_deleted: Lines deleted in all files changed with this commit.
    file_extensions: Distinct file extensions of files changed with this commit.
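    As an illustration of how these two tables could be combined for a before/after comparison, here is a minimal sketch assuming pandas is available; the aggregation shown is an example, not part of the dataset:

    ```python
    import pandas as pd

    before = pd.read_csv("tr_sample_commits_default_branch_before_ci.csv")
    during = pd.read_csv("tr_sample_commits_default_branch_during_ci.csv")
    before["phase"] = "before_ci"
    during["phase"] = "during_ci"
    commits = pd.concat([before, during], ignore_index=True)

    # Example aggregation (illustrative): median commit size per project and phase.
    summary = (commits.groupby(["project", "phase"])["lines_added"]
                      .median()
                      .unstack("phase"))
    print(summary.head())
    ```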

    tr_sample_merges_default_branch_before_ci.csv
    tr_sample_merges_default_branch_during_ci.csv

    One CSV file each with information about all merges into the default branch before and after the first CI build, respectively. Only merges modifying, adding, or deleting Java or Ruby source code files were considered. These CSV files have the following columns:

    project: GitHub project name ("/" replaced by "_").
    branch: The destination branch of the merge.
    hash_value: The SHA1 hash value of the merge commit.
    merged_commits: Unique hash value prefixes of the commits merged with this commit.
    author_name: The author name.
    author_email: The author email address.
    author_date: The authoring timestamp.
    commit_name: The committer name.
    commit_email: The committer email address.
    commit_date: The commit timestamp.
    log_message_length: The length of the git commit messages (in characters).
    file_count: Files changed with this commit.
    lines_added: Lines added to all files changed with this commit.
    lines_deleted: Lines deleted in all files changed with this commit.
    file_extensions: Distinct file extensions of files changed with this commit.
    pull_request_id: ID of the GitHub pull request that has been merged with this commit (extracted from log message).
    source_user: GitHub login name of the user who initiated the pull request (extracted from log message).
    source_branch: Source branch of the pull request (extracted from log message).
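    The last three columns are recovered from the merge commit message. The sketch below shows one way to do that, assuming the default GitHub wording "Merge pull request #N from user/branch"; the regex is an illustration, not the exact parser used:

    ```python
    import re

    # Default GitHub wording: "Merge pull request #42 from octocat/fix-typo"
    MERGE_RE = re.compile(
        r"Merge pull request #(?P<id>\d+) from (?P<user>[^/\s]+)/(?P<branch>\S+)"
    )

    def parse_merge_message(message):
        m = MERGE_RE.search(message)
        if m is None:
            return None  # not a pull request merge, or a non-standard message
        return m.group("id"), m.group("user"), m.group("branch")

    print(parse_merge_message("Merge pull request #42 from octocat/fix-typo"))
    # ('42', 'octocat', 'fix-typo')
    ```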

    comparison_project_sample_800.csv
    A CSV file with information about the 800 projects in the comparison sample.

    commits_default_branch_before_mid.csv
    commits_default_branch_after_mid.csv

    One CSV file each with information about all commits to the default branch before and after the median date of the commit history, respectively. Only commits modifying, adding, or deleting Java or Ruby source code files were considered. These CSV files have the same columns as the commit tables described above.

    merges_default_branch_before_mid.csv
    merges_default_branch_after_mid.csv

    One CSV file each with information about all merges into the default branch before and after the median date of the commit history, respectively. Only merges modifying, adding, or deleting Java or Ruby source code files were considered. These CSV files have the same columns as the merge tables described above.

  18. c

    ckanext-ark

    • catalog.civicdataecosystem.org
    Updated Jun 4, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). ckanext-ark [Dataset]. https://catalog.civicdataecosystem.org/dataset/ckanext-ark
    Explore at:
    Dataset updated
    Jun 4, 2025
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The ARK extension for CKAN provides the capability to mint and resolve ARK (Archival Resource Key) identifiers for datasets. Inspired by the ckanext-doi extension, it facilitates persistent identification of data resources within a CKAN instance, ensuring greater stability and longevity of data references. The extension is compatible with CKAN versions 2.9 and 2.10 and supports Python versions 3.8, 3.9, and 3.10.

    Key features:

    ARK identifier minting: generates unique ARK identifiers for CKAN datasets, helping to guarantee long-term access and citation stability.
    ARK identifier resolving: enables resolution of ARK identifiers to the associated CKAN resources.
    Configurable ARK generation: allows customization of ARK generation through configurable templates and shoulder assignments, providing flexibility in identifier structure.
    ERC metadata mapping: supports mapping of dataset fields to ERC (Electronic Resource Citation) metadata elements, using configurable mappings to extract data for persistent identification.
    Command-line interface (CLI) tools: includes CLI commands to create, update, and delete ARK identifiers for existing datasets.
    Customizable NMA (Name Mapping Authority) URL: supports setting a custom NMA URL, making it possible to point the resolver at a data source other than ckan.siteurl.

    Technical integration: the extension integrates with CKAN through plugins and modifies the read_base.html template (either in a custom extension or directly) to display the ARK identifier in the dataset view. Configuration settings, such as the ARK NAAN (Name Assigning Authority Number) and other parameters, are defined in the CKAN configuration file (ckan.ini). Database initialization is required after installation.

    Benefits and impact: implementing the ARK extension enhances CKAN's data management capabilities by providing persistent identifiers for datasets, ensuring that data resources can be reliably cited and accessed over time. By enabling the association of ERC metadata with ARKs, the extension promotes better description and discoverability of data. It can be beneficial for institutions that require long-term preservation and persistent identification of their data resources.
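    The description does not show the extension's actual API, so the following is only an illustrative sketch of the identifier anatomy that the template and shoulder configuration control; the NAAN, shoulder, and name generator are made up:

    ```python
    import uuid

    def mint_ark(naan: str, shoulder: str) -> str:
        """Compose an ARK of the form ark:/<NAAN>/<shoulder><name>."""
        name = uuid.uuid4().hex[:8]  # placeholder name generator
        return f"ark:/{naan}/{shoulder}{name}"

    print(mint_ark("12345", "x9"))  # e.g. ark:/12345/x91f3a02bc
    ```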

  19. Z

    Datasets of the DIMet manuscript

    • data.niaid.nih.gov
    Updated May 1, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Karkar, Slim (2024). Datasets of the DIMet manuscript [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8378886
    Explore at:
    Dataset updated
    May 1, 2024
    Dataset provided by
    Specque, Florian
    Nikolski, Macha
    Karkar, Slim
    Daubon, Thomas
    Dartigues, Benjamin
    Hecht, Helge
    Soueidan, Hayssam
    Guyon, Joris
    Galvis, Johanna
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets for reproducing the results of the manuscript "DIMet: An open-source tool for Differential analysis of targeted Isotope-labeled Metabolomics data". The DIMet tool is available here, and its documentation is accessible on the DIMet wiki page and on its Galaxy site.

    Users of the Galaxy version of DIMet:

    Download and decompress (unzip) the .zip file.

    Within 'datasets_manuscript_DIMet/' there is a sub-folder data/; preserve it.

    Within 'datasets_manuscript_DIMet/' there is also a sub-folder config/; it can be deleted, as it is not used in the Galaxy version.

    Use the .csv files provided in data/. The specific .csv files to be given as input are explained in each 'dimet_' module in Galaxy.

    Check the metadata_endo_ldh.csv and metadata_timeseries.csv files: if the content is wrapped in quotes (") to delimit strings, edit each file in a plain text editor (e.g. Notepad, Gedit) and delete those quotes (replace all " with nothing); a minimal script for this cleanup is sketched after this list. Such quotes in the samples metadata are tolerated by the command-line version but not allowed in the Galaxy version.
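    A minimal sketch of that cleanup, assuming the two metadata files live in the data/ sub-folder (the path is an assumption based on the description):

    ```python
    from pathlib import Path

    DATA_DIR = Path("datasets_manuscript_DIMet/data")  # assumed location

    for name in ["metadata_endo_ldh.csv", "metadata_timeseries.csv"]:
        path = DATA_DIR / name
        text = path.read_text(encoding="utf-8")
        path.write_text(text.replace('"', ""), encoding="utf-8")  # drop all quotes
    ```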

    Users of the command-line version of DIMet:

    Download and decompress the .zip file, then follow the instructions in the documentation on the DIMet wiki page.

  20. c

    Welcome to the open data portal of the Open State Foundation! We promote...

    • catalog.civicdataecosystem.org
    Updated Apr 22, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). Welcome to the open data portal of the Open State Foundation! We promote... [Dataset]. https://catalog.civicdataecosystem.org/dataset/open-state-data-portal
    Explore at:
    Dataset updated
    Apr 22, 2025
    Description

    Welcome to the open data portal of the Open State Foundation! We promote digital transparency by encouraging open (government) data and stimulating its reuse. On this data portal, you will find cool data in easy-to-use file formats. We often place datasets here that we have created ourselves by scraping or requesting data. The datasets are cleaned and are open data, so everyone is allowed to reuse them. They are provided with descriptions, previews, and links to documentation and metadata. Our aim is to remove as many barriers as possible so that you can effectively work on cool apps, visualizations, and analyses! Come and have a look at our public Hack de Overheid-Slack where you can ask questions or share your experiences and ideas with the community. Do you also want a digitally more transparent Netherlands? Donate! Or sign up for our monthly newsletter. (Translated from the Dutch original.)
