The joiner is a component often used in workflows to merge or join data from different sources or intermediate steps into a single output. In the context of Common Workflow Language (CWL), the joiner can be implemented as a step that combines multiple inputs into a cohesive dataset or output. This might involve concatenating files, merging data frames, or aggregating results from different computations.
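As a purely illustrative sketch (not taken from any of the datasets described below), the command wrapped by such a joiner step can be as simple as a script that concatenates its input files; the script name and argument order here are hypothetical:

```python
# joiner.py - hypothetical merge script that a CWL CommandLineTool could wrap.
import sys

def join_files(input_paths, output_path):
    """Concatenate the given input files into a single output file."""
    with open(output_path, "w", encoding="utf-8") as out:
        for path in input_paths:
            with open(path, "r", encoding="utf-8") as src:
                out.write(src.read())

if __name__ == "__main__":
    # Usage: python joiner.py merged.txt part1.txt part2.txt ...
    join_files(sys.argv[2:], sys.argv[1])
```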
Metabolomics encounters challenges in cross-study comparisons due to diverse metabolite nomenclature and reporting practices. To bridge this gap, we introduce the Metabolites Merging Strategy (MMS), offering a systematic framework to harmonize multiple metabolite datasets for enhanced interstudy comparability. MMS has three steps. Step 1: translation and merging of the different datasets, employing InChIKeys for data integration and including the translation of metabolite names where needed. Step 2: retrieval of attributes from the InChIKey, including name descriptors (the title name from PubChem and the RefMet name from Metabolomics Workbench), chemical properties (molecular weight and molecular formula), both systematic (InChI, InChIKey, SMILES) and non-systematic identifiers (PubChem, ChEBI, HMDB, KEGG, LipidMaps, DrugBank, Bin ID and CAS number), and their ontology. Step 3: a meticulous three-part curation process to rectify disparities for conjugated base/acid compounds (optional step), missing attributes, and synonyms (duplicated information). The MMS procedure is exemplified through a case study of urinary asthma metabolites, where MMS facilitated the identification of significant pathways that remained hidden when no dataset merging strategy was followed. This study highlights the need for standardized and unified metabolite datasets to enhance the reproducibility and comparability of metabolomics studies.
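As a loose illustration of Step 1 (not the authors' published code), merging several studies' tables on InChIKey could look like the pandas sketch below; the 'InChIKey' column name and the DataFrame inputs are assumptions.

```python
import pandas as pd

def merge_on_inchikey(datasets):
    """Outer-merge several metabolite tables on their InChIKey column.

    Each input DataFrame is assumed to carry an 'InChIKey' column; all other
    columns are kept so that attributes reported by every study are retained.
    """
    merged = None
    for i, df in enumerate(datasets):
        df = df.drop_duplicates(subset="InChIKey")
        if merged is None:
            merged = df
        else:
            merged = merged.merge(df, on="InChIKey", how="outer",
                                  suffixes=("", f"_study{i}"))
    return merged
```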
Im4Sketch is a large-scale dataset with a shape-oriented set of classes for image-to-sketch generalization. It consists of a collection of natural images from 874 categories for training and validation, and sketches from 393 categories (a subset of the natural image categories) for testing.
The images and sketches are collected from existing popular computer vision datasets. The categories are selected with shape similarity in mind, so that objects with the same shape belong to the same category.
The natural-image part of the dataset is based on the ILSVRC2012 version of ImageNet. The original ImageNet categories are first merged according to the shape criterion: categories whose objects have the same shape, i.e. would be drawn the same way by a human, are merged. For this step, the semantic similarity of categories, obtained through WordNet and the category names, is used to obtain candidate categories for merging. Based on visual inspection of these candidates, the decision to merge the original ImageNet classes is made by a human. For instance, "Indian Elephant" and "African Elephant", or "Laptop" and "Notebook", are merged. An extreme case of merging is the new class "dog", which is a union of 121 original ImageNet classes of dog breeds.
In the second step, classes from datasets containing sketches are used, in particular DomainNet, Sketchy, PACS, and TU-Berlin. Merging is not necessary for classes in these datasets, because they are designed for sketches and therefore already satisfy the shape criterion. In this step, a correspondence between the merged ImageNet categories and the categories of the other datasets is found. As in the merging step, semantic similarity is used to guide the correspondence search. Sketch categories that are not present in the merged ImageNet are added to the overall category set, while training natural images for those categories are collected from either DomainNet or Sketchy. In the end, ImageNet is used for 690 classes, DomainNet for 183 classes, and Sketchy for 1 class.
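A rough sketch of how WordNet-based semantic similarity might be used to propose candidate merges (assuming NLTK with the WordNet corpus installed; the similarity measure, threshold and first-synset lookup are simplifications, and candidates would still need human visual inspection):

```python
from itertools import combinations
from nltk.corpus import wordnet as wn

def candidate_merges(category_names, threshold=0.8):
    """Propose pairs of categories whose WordNet synsets are highly similar."""
    synsets = {}
    for name in category_names:
        hits = wn.synsets(name.replace(" ", "_"), pos=wn.NOUN)
        if hits:
            synsets[name] = hits[0]  # take the most common sense

    candidates = []
    for a, b in combinations(synsets, 2):
        sim = synsets[a].wup_similarity(synsets[b])
        if sim is not None and sim >= threshold:
            candidates.append((a, b, sim))
    return sorted(candidates, key=lambda t: -t[2])

# e.g. candidate_merges(["laptop", "notebook", "tiger", "zebra"])
```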
Pre-processed mission statements and additional data from 1023-EZ approvals for 2018 and 2019. For additional information on cleaning steps, please go to the project's replication GitHub page.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
RSR1.5 of ICP and CICP algorithms in two steps on US-MERGE and US-SNAP datasets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Physically based numerical weather prediction and climate models provide useful information for a large number of end users, such as flood forecasters, water resource managers, and farmers. However, due to model uncertainties arising from, e.g., initial value and model errors, the simulation results do not match in situ or remotely sensed observations to arbitrary accuracy. Merging model-based data with observations yields promising results that benefit simultaneously from the information content of the model results and the observations. Machine learning (ML) and deep learning (DL) methods have been shown to be useful tools for closing the gap between models and observations owing to their capacity to represent the non-linear space–time correlation structure. This study focused on using UNet encoder–decoder convolutional neural networks (CNNs) to extract spatiotemporal features from model simulations and predict the actual mismatches (errors) between the simulation results and a reference data set. Here, climate simulations over Europe from the Terrestrial Systems Modeling Platform (TSMP) were used as input to the CNN. The COSMO-REA6 reanalysis data were used as a reference. The proposed merging framework was applied to mismatches in precipitation and surface pressure, representing more and less chaotic variables, respectively. The merged data show a strong average improvement in mean error (~47%), correlation coefficient (~37%), and root mean square error (~22%). To highlight the performance of the DL-based method, the results were compared with those obtained by a baseline method, quantile mapping. The proposed DL-based merging methodology can be used either during the simulation to correct model forecast output online or in a post-processing step for downstream impact applications, such as flood forecasting, water resources management, and agriculture.
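For orientation, the quantile-mapping baseline mentioned above can be sketched with a simple empirical implementation in NumPy; this is a generic version, not the study's actual setup:

```python
import numpy as np

def quantile_map(model_values, obs_reference, model_reference, n_quantiles=100):
    """Empirical quantile mapping: adjust model values so their distribution
    matches a reference observation distribution.

    obs_reference / model_reference: historical samples used to build the mapping.
    model_values: new model output to be corrected.
    """
    q = np.linspace(0.0, 1.0, n_quantiles)
    model_q = np.quantile(model_reference, q)
    obs_q = np.quantile(obs_reference, q)
    # Map each model value from the model quantiles onto the observed quantiles.
    return np.interp(model_values, model_q, obs_q)
```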
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.
This dataset contains shapefiles showing landscape classification, including all natural and human ecosystems, for the Galilee preliminary assessment extent.
It is constructed from source data (see Lineage) to show the landscape classification systematically and define geographical areas into classes based on similarity in physical and/or biological and hydrological character. The landscape classification includes all natural and human ecosystems in the Galilee preliminary assessment extent.
A landscape classification was developed to characterise the nature of water dependency among these assets.
The aim of the landscape classification is to systematically define geographical areas into classes based on similarity in physical and/or biological and hydrological character.
The landscape classification was carried out on data layers consisting of polygons (e.g. remnant vegetation, wetlands), lines (stream network) and points (springs and spring complexes).
The layers created to contribute to the landscape classes were: GAL_Landform_LC layer; GAL_GW_SRCE_LC; GAL_FLD_r; GAL Streams; LC_WaterType_GAL; LC_WaterAvail_GAL.
The description of how these layers were created is below:
A. To make the GAL_Landform_LC layer (a GeoPandas sketch of the buffer, erase and merge steps A9-A11 follows this list):
1. Merge Queensland wetlands (QLD_WETLAND_SYSTEM_100K_A) with South Australian wetlands (Wetlands_GDE_Classification (SA))
2. Select wetlands from step A1 that intersect the Galilee_SW_PAE_v02.
3. Add a new field to the wetlands data for landform class called "Landform_LC".
4. From the merged wetland data (step A1) select Queensland wetlands ("Wetlandsys" field is not blank) and update Landform_LC to the first letter of the "Wetlandsys" value
5. From the merged wetland data (step A1) select South Australian wetlands ("WETCLASS" field is not blank) and update Landform_LC to "wetclass"
6. Compare to Landclass_Draft1 (Don Butler's data) to check that areas between the wetlands created in step A1 have the "Landform" value of "-" and cover about the same percent of the area (~63%). This is true and therefore these data match the Don Butler data for wetlands.
7. Select all wetlands as defined in steps A4-6 and eliminate errors or slivers created by slight overlaps when the data were merged
8. Select all wetlands as defined in steps A4-6 and delete all overlaps
9. Select all streams (AHGHMappedStream) within Galilee_SW_PAE_v02 and buffer to 1 m total width. This makes the area of the stream numerically equal to its length
10. Overlay wetland areas with the buffered streams (created in step A9) and erase any wetlands inside buffered stream areas. This ensures there are no overlapping polygons when wetlands and streams are merged
11. Merge the buffered streams created in step A9 and the wetlands created in step A10 to create GAL_LANDFORM_LC
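Purely as an illustration (the original processing was done in ArcGIS, and the file names below are placeholders), steps A9-A11 could be expressed with GeoPandas roughly as follows:

```python
import geopandas as gpd
import pandas as pd

# Placeholder inputs; a projected CRS in metres is assumed.
wetlands = gpd.read_file("wetlands_merged.shp")    # result of steps A1-A8
streams = gpd.read_file("AHGHMappedStream.shp")

# Step A9: buffer streams to 1 m total width (0.5 m each side).
streams_buf = streams.copy()
streams_buf["geometry"] = streams.geometry.buffer(0.5)

# Step A10: erase wetland area that falls inside the buffered streams.
wetlands_clipped = gpd.overlay(wetlands, streams_buf, how="difference")

# Step A11: merge buffered streams and clipped wetlands into one layer.
landform = pd.concat([wetlands_clipped, streams_buf], ignore_index=True)
```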
B. To make the GAL_GW_SRCE_LC (GAL Ground Water SOURCE Land Class):
Select "Aquifers assocated with springs that form saline scolds" and "Sandstone aquifers with fresh permanent groundwater connectivity regime associated with discharge springs") from GDE_Terr_Area_v01_3 that are within Galilee_SW_PAE_v02, add a new field called "GW_SRCE_LC" and update values to "Artesian"
Select NRM_Regions_2014_v01 within South Australia
Select subsurface GDEs (GDEsub) and surface GDEs (GDEsur) from GM_PED_AssetList_poly that are within South Australia (ie they intersect NRM regions in South Australia (those selected in step B2), add a new field called "GW_SRCE_LC" and update values to "Artesian"
Select springs from Topo250KSeries3_gdb (Springs) that are within South Australia (ie they intersect NRM regions in South Australia (those selected in step B2) and buffer to 20m radius, add a new field called "GW_SRCE_LC" and update values to "Artesian"
Select where LEB_Non-GAB_Springs.shp intersect NRM regions in South Australia (those selected in step B2) - there were none.
Erase GDEs from step B3 that overlap with springs from step B4 and merge remaining GDEs with springs (step B4) and aquifers (step B1) to make GAL_GW_SRCE_LC
C. To make the Topography landclasses = GAL_FLD_r:
1. Select landzone 3 from DP_Preclear_RE_DCDB_A
2. Select floodplains from QLD_Wetland_System
3. Combine LandSubjectToInundation, MarineSwamp, SalineCoastalFlat and Swamps from the GA 250K topographic flats data (GA_250K_topo(Flats))
4. Select floodplains from Don Butler's landscape classes (Landclass_Draft1)
5. Merge landzone 3 (from step C1) with the wetland floodplains (from step C2), the flats from GA (step C3) and Don Butler's floodplains (step C4), then add a new field called "LC_Code" and update values to "10,000"
6. Select floodplains (created in step C5) that are within the Galilee_SW_PAE_v02
7. Convert to raster and cut into 2 degree tiles to create GAL_FLD_1..75
D. To make the GAL Streams (no overlaps between buffered streams):
1. Select streams from AHGHMappedStream within GAL_PAE_v02
2. Buffer streams to 1 m
3. Select the first 2 buffered streams, erase the first from the area of the second, then merge the 2 together
4. Select the 3rd stream, erase from the 3rd the areas that overlap the first and second (results of D3), then merge
5. Continue for every stream until all are done
6. Check for overlaps and remove any to create GAL Streams
7. Add a new field called "LC_Code" and update values to "3" for riverine
E. To make the LC_WaterType_GAL:
1. Select Queensland terrestrial GDEs (QLD_GDETerr) within GAL_PAE_v02 where salinity of groundwater >= 3000 mg/L TDS
2. Select Queensland surface GDEs (QLD_GDETerr) within GAL_PAE_v02 where salinity of groundwater >= 3000 mg/L TDS
3. Select Queensland wetlands (QLD_Wetlands) within GAL_PAE_v02 where "SALIMOD" = "S2", "S3" or "T1"
4. Merge results from E1, E2 and E3 to create LC_WaterType_GAL
5. Add a new field called "LC_Code" and update values to 100 for fresh water and 200 for saline
F. To make the LC_WaterAvail_GAL:
1. Select Queensland terrestrial GDEs (QLD_GDETerr) within GAL_PAE_v02 where water regime (WTRRegime) = "WR0", update LC_Code to 30 (intermittent)
2. Select Queensland surface GDEs (QLD_GDETerr) within GAL_PAE_v02 where water regime (WTRRegime) = "WR0", "T1", "WT1" or "WR2", update LC_Code to 30 (intermittent)
3. Select Queensland surface GDEs (QLD_GDETerr) within GAL_PAE_v02 where water regime (WTRRegime) = "WT3" or "WR3", update LC_Code to 20 (near permanent)
4. Select Queensland wetlands (QLD_Wetlands) within GAL_PAE_v02 where water regime (WTRRegime) = "WR0", "T1", "WT1" or "WR2", update LC_Code to 30 (intermittent)
5. Select Queensland wetlands (QLD_Wetlands) within GAL_PAE_v02 where water regime (WTRRegime) = "WT3" or "WR3", update LC_Code to 20 (near permanent)
6. Combine results from F1..F5 to create LC_WaterAvail_GAL
G. To make the REMVeg:
1. Select vegetation classes [1-23, 26, 29-32] from NVIS - Australian Major Vegetation Subgroups that are within GAL_PAE_v02
2. Convert to vector data, add a new field called "LC_Code" and update values to "100,000" (remnant vegetation) to create REMVeg
H. To make the Landscape_Tile1..4:
1 = P = palustrine
2 = L = lacustrine
3 = R = riverine
4 = E = estuarine
10 = permanent water
20 = near permanent water (water there between 70 and 100% of the time)
30 = intermittent water (water there less than 70% of the time)
100 = fresh water
200 = saline water
10,000 = floodplain
100,000 = remnant vegetation
Bioregional Assessment Programme (2015) Landscape classification of the Galilee preliminary assessment extent. Bioregional Assessment Derived Dataset. Viewed 12 December 2018, http://data.bioregionalassessments.gov.au/dataset/80e7b80a-23e4-4aa1-a56c-27febe34d7db.
Derived From Queensland wetland data version 3 - wetland areas.
Derived From Geofabric Surface Cartography - V2.1
Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)
Derived From Queensland groundwater dependent ecosystems
Derived From GEODATA TOPO 250K Series 3
Derived From Multi-resolution Valley Bottom Flatness MrVBF at three second resolution CSIRO 20000211
Derived From Biodiversity status of pre-clearing and remnant regional ecosystems - South East Qld
This dataset contains all Wintertime Investigation of Transport, Emission, and Reactivity (WINTER) C-130 observations merged at the rate of the SAGA data. In addition to the observations, the dataset also contains results from the GEOS-Chem near-realtime simulation sampled along the flight track. Refer to the instruments dataset for instrument description. The GEOS-Chem model description can be found at www.geos-chem.org. Missing values are indicated by -99999. When generating fine time resolution data from a coarser resolution, the reported value at the original (coarse) time step is applied uniformly to all intermediate (fine) time steps - no interpolation is performed. When generating coarse time resolution data from a finer resolution, the time weighted average of the values at the intermediate (fine) time steps is used as the value at the coarser time step. Please follow the WINTER data policy. These data were updated April 7, 2016. The revised set uses the final data uploaded to the WINTER archive as of 4th April 2016.
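The resampling rules described above (no interpolation when going to a finer time step, time-weighted averaging when going to a coarser one) correspond roughly to the following pandas sketch; this is illustrative only, not the actual merge code:

```python
import pandas as pd

def to_finer(coarse: pd.Series, fine_index: pd.DatetimeIndex) -> pd.Series:
    """Coarse -> fine: repeat each coarse value over its interval (no interpolation)."""
    return coarse.reindex(fine_index, method="ffill")

def to_coarser(fine: pd.Series, coarse_step: str) -> pd.Series:
    """Fine -> coarse: average over each coarse interval.

    With equally spaced fine samples this plain mean equals the time-weighted average.
    """
    return fine.resample(coarse_step).mean()
```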
The data include a full year of logbook forms for vessels 60-124 feet in length (the partial coverage fleet) that had participated in the trawl flatfish fishery of 2005 in the Gulf of Alaska. The digitized hauls were not restricted exclusively to the population of trips to the Gulf of Alaska (GOA), since some vessels also participated in BSAI trawl fisheries. A total of 55 unique vessels' daily fishing logbooks (9 catcher-processors and 46 catcher vessels) were digitized into the Vessel Log System database. The daily production section for catcher-processors was not digitized; they were therefore excluded from the data entry procedure, and we focus on the remaining catcher vessels. These logbook records are then combined with observer and fish ticket data for the same vessels to create a more complete accounting of each vessel's activity in 2005. In order to examine the utility, uniqueness, and congruence of data contained in the logbooks with other sources, we collated vessel records from logbook data with Alaska Commercial Fisheries Entry Commission (CFEC) fish tickets (retrieved from the Alaska Fisheries Information Network (AKFIN)) and the North Pacific Groundfish Observer Program observer records. Merging of datasets was a multiple-step process. The first merge of data was between the quality-controlled observer and fish ticket data. Prior to 2007, the observer program did not track trip-level information such as the date of departure from and return to port, or the landing date. Consequently, to combine the 2005 haul-level observer data with the trip-level data from the fish tickets for a given vessel, each observer haul was merged with a fish ticket record if the haul retrieval date from the observer data was contained within the modified start and end dates derived from the fish ticket data (see above). Since the starting date on the fish ticket record represents the date fishing began, rather than the date a vessel left port, all observer haul records should be within the time frame of the fish ticket start and end dates. The observer hauls were therefore given the same trip number as determined by the fish tickets' trip numbering algorithm. The same process was then repeated to merge each logbook haul onto the combined fish ticket and observer data. Trip targets were then assigned from the North Pacific Fishery Management Council comprehensive observer database (Council.Comprehensive_obs) for observed trips, and statistical areas denoted on the fish tickets were mapped to Fishery Management Plan (FMP) areas. After quality control, the dataset was considered complete, and is referred to as the combined dataset.
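The haul-to-trip merge described above is essentially a date-containment join; a hypothetical pandas sketch (column names invented for illustration) might look like this:

```python
import pandas as pd

def assign_trip_numbers(hauls: pd.DataFrame, trips: pd.DataFrame) -> pd.DataFrame:
    """Attach a fish-ticket trip number to each observer haul when the haul
    retrieval date falls within the trip's start/end dates for the same vessel.

    hauls: columns ['vessel_id', 'retrieval_date', ...]
    trips: columns ['vessel_id', 'trip_number', 'start_date', 'end_date', ...]
    """
    merged = hauls.merge(trips, on="vessel_id", how="left")
    in_window = (merged["retrieval_date"] >= merged["start_date"]) & (
        merged["retrieval_date"] <= merged["end_date"]
    )
    return merged[in_window]
```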
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A1MetEventsTable.txt: Reported metassembly events (i.e. modifications to the primary assembly such as gaps closed, number of scaffold links, etc) for all Assemblathon1 metassemblies at each merging step. (TXT 24 kb)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was derived by the Bioregional Assessment Programme. The parent datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.
This dataset merges the groundwater recharge estimate grids for each hydrogeological formation in the Galilee Basin (GUID: d42a8497-9d67-42ad-9e7d-70a8d519875f) into a single grid for input into the Galilee groundwater model.
A second groundwater recharge estimate has been produced which reduces the estimated rate for all formations. All values of the merged grid were halved, and areas which underlie unconsolidated clay and basalt surface geology features (GUID: 3c8e66e7-6a15-47ce-853b-bbe38435d28f) are given a recharge value of 0.125 mm/year for clay and 0.25 mm/year for basalt, unless the original cell value was less.
This dataset provides a single input recharge estimate grid for the Galilee groundwater model.
All formation recharge estimate grids from the input recharge dataset were merged into a single raster layer using the raster calculator statement:
Con(IsNull("%a%"),Con(IsNull("%b%"),Con(IsNull("%c%"),Con(IsNull("%d%"),Con(IsNull("%e%"),Con(IsNull("%f%"),Con(IsNull("%g%"),Con(IsNull("%h%"),Con(IsNull("%i%"),"%j%","%i%"),"%h%"),"%g%"),"%f%"),"%e%"),"%d%"),"%c%"),"%b%"),"%a%"). Where a, b, c, d... are recharge estimates for individual formations.
Then, 'No data' gaps were filled in using the raster calculator statement: Con(IsNull("%MergeALL%"),Con(IsNull(BlockStatistics("%MergeALL%",NbrRectangle(2,2,"CELL"),"MAXIMUM")),(BlockStatistics("%MergeALL%",NbrRectangle(4,4,"CELL"),"MAXIMUM")),(BlockStatistics("%MergeALL%",NbrRectangle(2,2,"CELL"),"MAXIMUM"))),"%MergeALL%"). Where MergeALL is the output raster of the previous step.
To create the raster "Recharge_mergeAll_AlluviumCenozoicFeatures", the output of the previous step was multiplied by 0.5; then cells contained within the clay features shapefile were given a value of 0.125 and cells contained within the basalt features shapefile were given a value of 0.25 (original cell values less than the new values were retained).
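Read outside ArcGIS, the nested Con(IsNull(...)) expressions implement a priority merge (take the first formation grid that has data in each cell) and the BlockStatistics calls fill the remaining gaps from a neighbourhood maximum; a rough NumPy analogue, not the original toolchain, is:

```python
import numpy as np

def priority_merge(grids):
    """Merge rasters: for each cell, take the value from the first grid in the
    list that is not NaN (mirrors the nested Con(IsNull(...)) calls)."""
    merged = np.full_like(grids[0], np.nan, dtype=float)
    for g in grids:
        merged = np.where(np.isnan(merged), g, merged)
    return merged

def fill_gaps_with_block_max(merged, size=2):
    """Fill NaN cells with the maximum of a square neighbourhood, a rough
    analogue of BlockStatistics(..., 'MAXIMUM')."""
    filled = merged.copy()
    for r, c in zip(*np.where(np.isnan(merged))):
        window = merged[max(0, r - size):r + size + 1, max(0, c - size):c + size + 1]
        if not np.all(np.isnan(window)):
            filled[r, c] = np.nanmax(window)
    return filled
```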
Bioregional Assessment Programme (2015) Merged Galilee model recharge estimates chloride mass balance v02. Bioregional Assessment Derived Dataset. Viewed 07 December 2018, http://data.bioregionalassessments.gov.au/dataset/b892b063-4df9-4199-80c8-2ed4a1077d5b.
Derived From Galilee model recharge estimates: chloride mass balance v02
Derived From Australian 0.05º gridded chloride deposition v2
Derived From Galilee Recharge Cenozoic Alluvium Regions v01
Derived From GAL Aquifer Formation Extents v01
Derived From GAL Aquifer Formation Extents v02
Derived From Surface Geology of Australia, 1:1 000 000 scale, 2012 edition
Derived From Natural Resource Management (NRM) Regions 2010
Derived From Galilee Groundwater Model, hydrogeological formation recharge (Outcrop) extents v01
Derived From Galilee - Alluvium and Cenozoic 1M surface Geology
Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)
Derived From GEODATA TOPO 250K Series 3
Derived From NSW Catchment Management Authority Boundaries 20130917
Derived From Geological Provinces - Full Extent
Derived From Phanerozoic OZ SEEBASE v2 GIS
Derived From Galilee Hydrochemistry: Quality control for Chloride model recharge v02
Derived From Bioregional Assessment areas v03
Derived From QLD Geological Digital Data - QLD Geology, Structural Framework, November 2012
Derived From Galilee Groundwater Model, Hydrogeological Formation Extents v01
Derived From Queensland petroleum exploration data - QPED
Derived From Three-dimensional visualisation of the Great Artesian Basin - GABWRA
Derived From QLD Department of Natural Resources and Mines Groundwater Database Extract 20142808
Derived From Bioregional Assessment areas v01
Derived From Bioregional Assessment areas v02
Derived From Queensland Geological Digital Data - Detailed state extent, regional. November 2012
Origin Datasets: HuggingFaceTB/smoltalk
Dataset Sampling for Merge-Up SLM Training: to prepare a dataset of 100,000 samples for Merge-Up SLM training, the following steps were taken:
1. Filtering for English Only: we used a regular expression to filter the dataset, retaining only the samples that contain English alphabets exclusively (a rough sketch of such a filter is shown below).
2. Proportional Sampling by Token Length: starting from 4,000 tokens, we counted the number of samples in increments of 200 tokens. Based on the resulting distribution… See the full description on the dataset page: https://huggingface.co/datasets/aeolian83/HuggingFaceTB_smoltalk_filtered_10k_sampled.
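As a rough illustration of the English-only filter (the exact regular expression used by the dataset authors is not given, so the ASCII-only pattern below is an assumption):

```python
import re

# Assumed filter: treat a sample as "English only" if it contains just ASCII
# characters (a simplification of the dataset card's description).
ASCII_ONLY = re.compile(r"^[\x00-\x7F]+$")

def is_english_only(text: str) -> bool:
    return bool(ASCII_ONLY.match(text))

samples = ["Hello, world!", "¿Qué tal?"]
print([s for s in samples if is_english_only(s)])  # -> ['Hello, world!']
```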
This dataset is an 8-year (2011-2018) global spatiotemporally consistent surface soil moisture dataset with a 25 km spatial grid resolution and a daily temporal step, in units of cm3/cm3. The dataset was developed by applying a linear weight fusion algorithm based on Triple Collocation Analysis (TCA) to merge five soil moisture data products, i.e., SMOS, ASCAT, FY3B, CCI and SMAP, in two steps. The first step fuses the SMOS, ASCAT and FY3B soil moisture products from 2011 to 2018. The second step fuses the merged soil moisture product from the first step with the CCI and SMAP products from 2015 to 2018 to obtain the final merged soil moisture product from 2011 to 2018. In addition, measured soil moisture data from seven ground observation networks around the world are used to evaluate and analyze the merged soil moisture product. The fused soil moisture product has a global spatial coverage ratio of more than 80%, with a minimum RMSE (root mean square error) of 0.036 cm3/cm3.
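One possible reading of the TCA-based linear weight fusion, sketched with the classical covariance form of triple collocation and inverse-error-variance weights (illustrative only, not the authors' implementation):

```python
import numpy as np

def tc_error_variances(x, y, z):
    """Estimate error variances of three collocated products with the
    covariance notation of triple collocation analysis."""
    c = np.cov(np.vstack([x, y, z]))
    var_x = c[0, 0] - c[0, 1] * c[0, 2] / c[1, 2]
    var_y = c[1, 1] - c[0, 1] * c[1, 2] / c[0, 2]
    var_z = c[2, 2] - c[0, 2] * c[1, 2] / c[0, 1]
    return np.array([var_x, var_y, var_z])

def merge_by_inverse_error(products):
    """Linear fusion with weights inversely proportional to estimated error variance."""
    err_var = tc_error_variances(*products)
    weights = (1.0 / err_var) / np.sum(1.0 / err_var)
    return np.average(np.vstack(products), axis=0, weights=weights)
```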
This dataset contains all Wintertime Investigation of Transport, Emission, and Reactivity (WINTER) C-130 observations merged at the 1 second time steps. In addition to the observations, the dataset also contains results from the GEOS-Chem near-realtime simulation sampled along the flight track. Refer to the instrument's dataset for instrument description. The GEOS-Chem model description can be found at www.geos-chem.org. Missing values are indicated by -99999. When generating fine time resolution data from a coarser resolution, the reported value at the original (coarse) time step is applied uniformly to all intermediate (fine) time steps - no interpolation is performed. When generating coarse time resolution data from a finer resolution, the time weighted average of the values at the intermediate (fine) time steps is used as the value at the coarser time step. Please follow the WINTER data policy. These data were updated April 7, 2016. The revised set uses the final data uploaded to the WINTER archive as of 4th April 2016.
Origin Datasets: allenai/llama-3.1-tulu-3-405b-preference-mixture
Dataset Sampling for Merge-Up SLM Training: to prepare a dataset of 100,000 samples for Merge-Up SLM training, the following steps were taken:
1. Filtering for English Only: we used a regular expression to filter the dataset, retaining only the samples that contain English alphabets exclusively.
2. Proportional Sampling by Token Length: starting from 4,000 tokens, we counted the number of samples in increments of 200 tokens. Based on… See the full description on the dataset page: https://huggingface.co/datasets/aeolian83/allenai_llama_3.1_tulu_3_405b_preference_mixture_filtered_10k_sampled.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These datasets are results from merging three FengYun passive microwave soil moisture observations at a 15 km x 15 km spatial resolution from 2011 to 2020, with continuous extension as data become available. Here, we rely on a merging technique that minimizes the mean square error (MSE) using the signal-to-noise ratio (SNRopt) of the input parent products to first merge sub-daily soil moisture products into daily averages. These daily averages are then gap-filled using a Data INterpolating Convolutional Auto-Encoder, DINCAE (FY3_Reoconstructed_*). The advantage of this method is that it provides error variances (FY3_ErVar_*) for each pixel and time step, which are useful for several applications.
Noncommunicable diseases are the top cause of deaths. In 2008, more than 36 million people worldwide died of such diseases; ninety per cent of those lived in low-income and middle-income countries (WHO Maps Noncommunicable Disease Trends in All Countries). The STEPS Noncommunicable Disease Risk Factor Survey, part of the STEPwise approach to surveillance (STEPS) Adult Risk Factor Surveillance project by the World Health Organization (WHO), is a survey methodology to help countries begin to develop their own surveillance systems to monitor and fight noncommunicable diseases. The methodology prescribes three steps: questionnaire, physical measurements, and biochemical measurements. The steps consist of core items, core variables, and optional modules. Core topics covered by most surveys are demographics, health status, and health behaviors. These provide data on socioeconomic risk factors and metabolic, nutritional, and lifestyle risk factors. Details may differ from country to country and from year to year.
The general objective of the Zimbabwe NCD STEPS survey was to assess the risk factors of selected NCDs in the adult population of Zimbabwe using the WHO STEPwise approach to non-communicable diseases surveillance. The specific objectives were:
- To assess the distribution of life-style factors (physical activity, tobacco and alcohol use), and anthropometric measurements (body mass index and central obesity) which may impact on diabetes and cardiovascular risk factors.
- To identify dietary practices that are risk factors for selected NCDs.
- To determine the prevalence and determinants of hypertension.
- To determine the prevalence and determinants of diabetes.
- To determine the prevalence and determinants of serum lipid profile.
Mashonaland Central, Midlands and Matebeleland South Provinces.
Household Individual
The survey comprised individuals aged 25 years and over.
Sample survey data [ssd]
A multistage sampling strategy with 3 stages consisting of province, district and health centre was employed. The World Health Organization STEPwise Approach (STEPS) was used as the design basis for the survey. The 3 randomly selected provinces for the survey were Mashonaland Central, Midlands and Matebeleland South. In each province four districts were chosen and four health centres were surveyed per district. The survey comprised individuals aged 25 years and over. The survey was carried out on 3,081 respondents consisting of 1,189 from Midlands, 944 from Mashonaland Central and 948 from Matebeleland South. A detailed description of the sampling process is provided in sections 3.8-3.9 of the survey report provided under the related materials tab.
Designing a community-based survey such as this one is fraught with difficulties in ensuring representativeness of the sample chosen. In this survey there was a preponderance of female respondents because of the pattern of employment of males and females which also influences urban rural migration.
The response rate in Midlands was lower than the other two provinces in both STEP 2 and 3. This notable difference was due to the fact that Midlands had more respondents sampled from the urban communities. A higher proportion of urban respondents was formally employed and therefore did not complete STEP 2 and 3 due to conflict with work schedules.
Face-to-face [f2f]
In this survey all the core and selected expanded and optional variables were collected. In addition, a food frequency questionnaire and a UNICEF-developed questionnaire, the Fortification Rapid Assessment Tool (FRAT), were administered to elicit relevant dietary information.
Data entry for Step 1 and Step 2 data was carried out as soon as data became available to the data management team. Step 3 data became available in October and data entry was carried out when data quality checks were completed in November. Report writing started in September and a preliminary report became available in December 2005.
Training of data entry clerks: Five data entry clerks were recruited and trained for one week. The selection of data entry clerks was based on their performance during previous research carried out by the MOH&CW. The training of the data entry clerks involved the following:
- Familiarization with the NCD, FRAT and FFQ questionnaires.
- Familiarization with the data entry template.
- Development of codes for open-ended questions.
- Statistical package (EPI Info 6).
- Development of a data entry template using EPI6.
- Development of check files for each template.
- Trial runs (mock runs) to check whether the template was complete and user friendly for data entry.
- Double entry (what it involves and how to do it and why it should be done).
- Pre-primary data cleaning (check whether denominators are tallying) of the data entry template was done.
Data Entry for NCD, FRAT and FFQ questionnaires: The questionnaires were sequentially numbered and then divided among the five data entry clerks. Each of the data entry clerks had a unique identifier for quality control purposes. Hence, the data were entered into five separate files using the statistical package EPI Info version 6.0. The data entry clerks interchanged their files for double entry and validation of the data. Preliminary data cleaning was done for each of the five files. The five files were then merged to give a single file. The merged file was then transferred to STATA Version 7.0 using Stat Transfer version 5.0.
Data Cleaning: A data-cleaning workshop was held with the core research team members. The objectives of the workshop were:
1. To check all data entry errors.
2. To assess any inconsistencies in data filling.
3. To assess any inconsistencies in data entry.
4. To assess completeness of the data entered.
Data Merging: There were two datasets (the NCD questionnaire dataset and the laboratory dataset) after the data entry process. The two files were merged by joining corresponding observations from the NCD questionnaire dataset with those from the laboratory dataset into single observations using a unique identifier. The ID number was chosen as the unique identifier since it appeared in both data sets. The main aim of merging was to combine the two datasets containing information on behaviour of individuals and the NCD laboratory parameters. When the two data sets were merged, a new merge variable was created, taking the values 1, 2 and 3 (a pandas sketch of this merge follows below):
- Merge variable==1: the observation appeared in the NCD questionnaire data set but a corresponding observation was not in the laboratory data set.
- Merge variable==2: the observation appeared in the laboratory data set but a corresponding observation did not appear in the questionnaire data set.
- Merge variable==3: the observation appeared in both data sets, reflecting a complete merge of the two data sets.
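The merge variable described above behaves like the indicator column produced by a pandas outer merge; a minimal sketch, assuming the identifier column is simply named 'ID':

```python
import pandas as pd

def merge_with_status(questionnaire: pd.DataFrame, laboratory: pd.DataFrame) -> pd.DataFrame:
    """Outer-merge the two datasets on the ID number and add a merge-status
    variable: 1 = questionnaire only, 2 = laboratory only, 3 = both."""
    merged = questionnaire.merge(laboratory, on="ID", how="outer", indicator=True)
    merged["merge"] = merged["_merge"].map(
        {"left_only": 1, "right_only": 2, "both": 3}
    )
    return merged.drop(columns="_merge")
```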
Data Cleaning After Merging: Data cleaning involved identifying the observations where the merge variable values were either 1 or 2. The merge status for each observation was also changed after effecting any corrections. The other unique variables used in the cleaning were province, district and health centre, since they also appeared in both data sets.
Objectives of cleaning: 1. Match common variables in both data sets and identify inconsistencies in other matching variables e.g. province, district and health centre. 2. To check for any data entry errors.
A total of 3,081 respondents were included in the survey against an estimated sample size of 3,000. The response rate for Step 1 was 80%, and for Step 2 it was 70%, taking Step 1 accrual as 100%.
This dataset contains the partial pressure of carbon dioxide (pCO2) climatology that was created by merging 2 published and publicly available pCO2 datasets covering the open ocean (Landschützer et al. 2016) and the coastal ocean (Laruelle et al. 2017). Both fields were initially created using a 2-step neural network technique. In a first step, the global ocean is divided into 16 biogeochemical provinces using a self-organizing map. In a second step, the non-linear relationship between variables known to drive the surface ocean carbon system and gridded observations from the SOCAT open and coastal ocean datasets (Bakker et al. 2016) is reconstructed using a feed-forward neural network within each province separately. The final product is then produced by projecting driving variables, e.g., surface temperature, chlorophyll, mixed layer depth, and atmospheric CO2 onto oceanic pCO2 using these non-linear relationships (see Landschützer et al. 2016 and Laruelle et al. 2017 for more detail). This results in monthly open ocean pCO2 fields at 1°x1° resolution and coastal ocean pCO2 fields at 0.25°x0.25° resolution. To merge the products, we divided each 1°x1° open ocean bin into 16 equal 0.25°x0.25° bins without any interpolation. The common overlap area of the products has been merged by scaling the respective products by their mismatch compared to observations from the SOCAT datasets (see Landschützer et al. 2020).
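Splitting each 1°x1° open-ocean bin into sixteen 0.25°x0.25° bins without interpolation amounts to block repetition, e.g. in this illustrative NumPy sketch:

```python
import numpy as np

def split_to_quarter_degree(open_ocean_pco2: np.ndarray) -> np.ndarray:
    """Replicate each 1-degree cell into a 4x4 block of 0.25-degree cells,
    with no interpolation (values are simply repeated)."""
    return np.kron(open_ocean_pco2, np.ones((4, 4)))

# A 180x360 (1-degree) field becomes 720x1440 (0.25-degree).
```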
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
summary-map-reduce-v1
A dataset for training text-to-text models to consolidate multiple summaries from a chunked long document in the "reduce" step of map-reduce summarization
About
Each example contains chunked summaries from a long document, concatenated into a single string with a delimiter (input_summaries), and their synthetically generated consolidated/improved version (final_summary). The consolidation step focuses on merging redundant information while… See the full description on the dataset page: https://huggingface.co/datasets/pszemraj/summary-map-reduce-v1.
Version 2 of the dataset has been superseded by a newer version. Users should not use version 2 except in rare cases (e.g., when reproducing previous studies that used version 2). The International Best Track Archive for Climate Stewardship (IBTrACS) dataset was developed by the NOAA National Climatic Data Center, which took the initial step of synthesizing and merging best track data from all official Tropical Cyclone Warning Centers (TCWCs) and the WMO Regional Specialized Meteorological Centers (RSMCs) responsible for developing and archiving best track data worldwide. Recognizing the deficiency in global tropical cyclone data and the lack of a publicly available dataset, the IBTrACS dataset was produced, which, for the first time, combines existing best track data from over 10 international forecast centers. The dataset contains the position, maximum sustained winds, minimum central pressure, and storm nature for every tropical cyclone globally at 6-hr intervals in UTC. Statistics from the merge are also provided (such as the number of centers tracking the storm, range in pressure, median wind speed, etc.). The dataset period is from 1848 to the present, with dataset updates performed semi-annually: in the boreal spring following the completion of the Northern Hemisphere TC season, and in the boreal autumn following the completion of the Southern Hemisphere TC season. The dataset is archived as netCDF files but can be accessed via a variety of user-friendly formats to facilitate data analysis, including netCDF and CSV formatted files. Version 2 changes include source data updates, bug fixes, adjustments and corrections as well as additional source datasets.