Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mock community raw data (fastq.gz files) for the long-read ONT 16S with ONT primers dataset
Overview

The primary purpose of the WFIP2 Model Development Team is to improve existing numerical weather prediction models in a manner that leads to improved wind forecasts in regions of complex terrain. Improvements in the models will come through a better understanding of the physics of the wind flow in and around the wind plant across a range of temporal and spatial scales, which will be gained through WFIP2's observational field study and analysis.

Data Details

Initial conditions, lateral-boundary conditions, WRF namelists, and output graphics were archived from three real-time modeling frameworks:

1) RAP-ESRL: the experimental RAP (run hourly)
2) HRRR-ESRL: the experimental HRRR (run hourly)
3) HRRR-WFIP2: the experimental, WFIP2-provisional version of the HRRR, run twice daily at 0600 and 1800 UTC. The real-time HRRR-WFIP2 also ran with a concurrent 750-m nest (i.e., the HRRR-WFIP2 nest) that was initialized 1 h into the HRRR forecast (i.e., 0700 and 1900 UTC).

Each of these frameworks should be considered experimental, subject to intermittent production outages (sometimes persistent), data-assimilation outages, and changes to data-assimilation procedures and physical parameterizations.

The archive of real-time data from these modeling frameworks consists of the following two zip-file aggregations:

1) Files containing initial conditions, lateral-boundary conditions, and WRF namelists. For RAP-ESRL and HRRR-ESRL runs, three files are compressed in a single zip file:
i) wrfinput_d01: initial conditions (netCDF)
ii) wrfbdy_d01: lateral-boundary conditions (netCDF)
iii) namelist.input: the WRF-ARW namelist (plain text)
The HRRR-WFIP2 archive also includes these files, with the addition of "wrfinput_d02", the nested-domain initial conditions (netCDF). Note that while the archived HRRR-WFIP2 namelist specifies a 15-h forecast, lateral-boundary conditions for most runs are available for a 24-h forecast.

2) Files containing output graphics (png). Given the large number of graphics files produced, a detailed description of the zip-file contents is not given here.
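As a usage illustration (not part of the archive), here is a minimal Python sketch for inspecting the extracted initial- and boundary-condition files, assuming the netCDF4 package is installed and the file names listed above; variable availability (e.g., "T") depends on the WRF configuration:

# Minimal sketch: inspect the archived WRF files after unzipping one
# aggregation. Variable names depend on the WRF configuration.
from netCDF4 import Dataset

with Dataset("wrfinput_d01") as ic:      # initial conditions (netCDF)
    print(list(ic.variables))            # fields available in the file
    print(ic.variables["T"].shape)       # e.g., perturbation potential temperature

with Dataset("wrfbdy_d01") as bc:        # lateral-boundary conditions (netCDF)
    print(len(bc.dimensions["Time"]))    # number of boundary times archived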
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This deposit contains data associated with a feasibility study evaluating the use of individualized report cards to improve trial transparency at the Charité - Universitätsmedizin Berlin. It primarily includes large raw data files and other files compiled by, or used in, the project code repository: https://github.com/quest-bih/tv-ct-transparency/. These data are deposited for documentation and computational reproducibility; they do not reflect the most current/accurate data available from each source.
The deposit contains:
Survey data (survey-data.csv): Participant responses from an anonymous survey conducted to assess the usefulness of the report cards and infosheet. The survey was administered in LimeSurvey and hosted on a server at the QUEST Center for Responsible Research at the Berlin Institute of Health at Charité – Universitätsmedizin Berlin. Any information that could potentially identify participants, such as IP addresses and free-text fields (e.g., corrections, comments), was removed. This file serves as input for the analysis of the survey data.
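As an illustration (not part of the deposit), a minimal Python sketch for loading the survey responses, assuming pandas is installed; column names are inspected rather than hard-coded, since the variable layout is documented in the project repository:

# Minimal sketch: load the anonymized survey responses for analysis.
import pandas as pd

survey = pd.read_csv("survey-data.csv")
print(survey.shape)            # responses x questions
print(list(survey.columns))    # question identifiers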
Summary
One ultimate goal of visual neuroscience is to understand how the brain processes visual stimuli encountered in the natural environment. Achieving this goal requires recordings of brain responses to massive amounts of naturalistic stimuli. Although the scientific community has invested substantial effort in collecting large-scale functional magnetic resonance imaging (fMRI) data under naturalistic stimuli, more naturalistic fMRI datasets are still urgently needed. We present here the Natural Object Dataset (NOD), a large-scale fMRI dataset containing responses to 57,120 naturalistic images from 30 participants. NOD strives for a balance between sampling variation across individuals and sampling variation across stimuli. This enables NOD to be used not only to determine whether an observation generalizes across many individuals, but also to test whether a response pattern generalizes to a variety of naturalistic stimuli. We anticipate that NOD, together with existing naturalistic neuroimaging datasets, will serve as a new impetus for our understanding of the visual processing of naturalistic stimuli.
Data record
The data were organized according to the Brain-Imaging-Data-Structure (BIDS) Specification version 1.7.0 and can be accessed from the OpenNeuro public repository (accession number: ds004496). In short, raw data of each subject were stored in “sub-
Stimulus images The stimulus images for different fMRI experiments are deposited in separate folders: “stimuli/imagenet”, “stimuli/coco”, “stimuli/prf”, and “stimuli/floc”. Each experiment folder contains corresponding stimulus images, and the auxiliary files can be found within the “info” subfolder.
Raw MRI data Each participant folder consists of several session folders: anat, coco, imagenet, prf, floc. Each session folder in turn includes “anat”, “func”, or “fmap” folders for corresponding modality data. The scan information for each session is provided in a TSV file.
Preprocessed volume data from fMRIprep The preprocessed volume-based fMRI data are in subject's native space, saved as “sub-
Preprocessed surface-based data from ciftify The preprocessed surface-based data are in standard fsLR space, saved as “sub-
Brain activation data from surface-based GLM analyses The brain activation data are derived from GLM analyses on the standard fsLR space, saved as “sub-
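As a usage illustration (not part of the data record), a minimal Python sketch for walking the BIDS tree after downloading ds004496 from OpenNeuro; the local path is a placeholder, and session folder names are assumed to carry the BIDS "ses-" prefix:

# Minimal sketch: enumerate subjects and sessions in the BIDS layout.
from pathlib import Path

bids_root = Path("ds004496")                  # assumed download location
for sub in sorted(bids_root.glob("sub-*")):
    sessions = sorted(p.name for p in sub.glob("ses-*"))
    print(sub.name, sessions)                 # e.g., ses-imagenet, ses-coco, ...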
Vegetation Condition Benchmarks describe the reference state to which sites are compared to score their site-scale biodiversity values or set goals for management or restoration. This file contains some of the raw data used to create the most current vegetation condition benchmarks. Refer to the 'Dataset relationship' section, below, to access all the raw data files used in creating the Vegetation Condition Benchmarks V1.2.
The ‘Vegetation Condition Benchmarks Stems raw data V1.2’ file contains aggregated stems data from 2302 plots that were used to create the 'number of large trees' benchmarks. Refer to the info worksheet for further details of column headings.
For further details see Capararo S, Watson CJ, Somerville M, Travers SK, McNellie MJ, Dorrough J and Oliver I (2019) Function Attribute Benchmarks for the Biodiversity Assessment Method: Data audit, compilation and analysis. Department of Planning, Industry and Environment.
https://doi.org/10.4121/resource:terms_of_use
This data set is part of the MSc thesis 'High-Resolution Atmospheric Modelling and the Effects on the Prediction of Wave Characteristics'. It provides the files used to run the WRF and SWAN simulations for the most important runs. Raw data files were not included due to their large size.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The evolution of a software system can be studied in terms of how various properties, as reflected by software metrics, change over time. Current models of software evolution have allowed for inferences to be drawn about certain attributes of the software system, for instance, regarding the architecture, complexity, and its impact on the development effort. However, an inherent limitation of these models is that they do not provide any direct insight into where growth takes place. In particular, we cannot assess the impact of evolution on the underlying distribution of size and complexity among the various classes. Such an analysis is needed in order to answer questions such as 'do developers tend to evenly distribute complexity as systems get bigger?', and 'do large and complex classes get bigger over time?'. These are questions of more than passing interest since, by understanding what typical and successful software evolution looks like, we can identify anomalous situations and take action earlier than might otherwise be possible. Information gained from an analysis of the distribution of growth will also show if there are consistent boundaries within which a software design structure exists.

The specific research questions that we address in Chapter 5 (Growth Dynamics) of the thesis this data accompanies are: What is the nature of the distribution of software size and complexity measures? How do the profile and shape of this distribution change as software systems evolve? Is the rate and nature of change erratic? Do large and complex classes become bigger and more complex as software systems evolve?

In our study of metric distributions, we focused on 10 different measures that span a range of size and complexity measures. In order to assess assigned responsibilities we use the two metrics Load Instruction Count and Store Instruction Count. Both metrics provide a measure of the frequency of state changes in data containers within a system. Number of Branches, on the other hand, records all branch instructions and is used to measure structural complexity at the class level. This measure is equivalent to Weighted Method Count (WMC) as proposed by Chidamber and Kemerer (1994) if a weight of 1 is applied to all methods and the complexity measure used is cyclomatic complexity. We use the measures of Fan-Out Count and Type Construction Count to obtain insight into the dynamics of the software systems. The former offers a means to document the degree of delegation, whereas the latter can be used to count the frequency of object instantiations. The remaining metrics provide structural size and complexity measures. In-Degree Count and Out-Degree Count reveal the coupling of classes within a system. These measures are extracted from the type dependency graph that we construct for each analyzed system. The vertices in this graph are classes, whereas the edges are directed links between classes. We associate popularity (i.e., the number of incoming links) with In-Degree Count and usage or delegation (i.e., the number of outgoing links) with Out-Degree Count. Number of Methods, Public Method Count, and Number of Attributes define typical object-oriented size measures and provide insights into the extent of data and functionality encapsulation.

The raw metric data (4 .txt files and 1 .log file in a .zip file, ~0.5 MB in total) is provided in comma-separated values (CSV) format, with the first line containing the header.
A detailed output of the statistical analysis undertaken is provided as log files generated directly from Stata (statistical analysis software).
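As an illustration (not part of the archive), a minimal Python sketch for summarizing how one size measure is distributed across classes, assuming pandas and NumPy are installed; the file and column names are placeholders, so check the CSV header first:

# Minimal sketch: load one raw metric file and compute the Gini
# coefficient of a size measure as a summary of distributional evenness.
import numpy as np
import pandas as pd

metrics = pd.read_csv("metrics.txt")                    # first line is the header
x = np.sort(metrics["number_of_methods"].to_numpy())    # placeholder column name

n = len(x)
gini = (2 * np.arange(1, n + 1) - n - 1).dot(x) / (n * x.sum())
print(f"Gini of Number of Methods: {gini:.3f}")         # 0 = even, 1 = concentrated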
The State of Alaska Division of Geological & Geophysical Surveys (DGGS) produced airborne lidar-derived elevation data for the Pilgrim Hot Springs area, western Alaska. Both aerial lidar and ground control data were collected by DGGS. These data were produced in support of active fault detection and geothermal hydrology research in the area. This data collection is being released as a Raw Data File with an open end-user license. All files can be downloaded free of charge from the Alaska Division of Geological & Geophysical Surveys website (http://doi.org/10.14509/30659).
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Sample data for exercises in Further Adventures in Data Cleaning.
The Alaska Division of Geological & Geophysical Surveys (DGGS) used aerial lidar to produce a classified point cloud and high-resolution digital terrain model (DTM), digital surface model (DSM), and intensity model of the Barry Arm landslide, northwest Prince William Sound, Alaska, during near snow-free ground conditions on June 26, 2020. The survey's goal is to provide high quality and high resolution (0.10 m) elevation data to assess potential landslide movement. Aerial lidar and ground control data were collected on June 26, 2020, and subsequently processed in Terrasolid and ArcGIS. This data collection is released as a Raw Data File with an open end-user license. All files can be downloaded free of charge from the Alaska Division of Geological & Geophysical Surveys website (http://doi.org/10.14509/30593).
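As a usage illustration (not part of the data collection), a minimal Python sketch for reading the DTM with rasterio, assuming the product is delivered as a GeoTIFF; the file name is a placeholder:

# Minimal sketch: read the 0.10 m DTM and report basic properties.
import rasterio

with rasterio.open("barry_arm_dtm.tif") as dtm:    # placeholder file name
    elev = dtm.read(1, masked=True)                # first band, nodata masked
    print(dtm.res, dtm.crs)                        # resolution and projection
    print(float(elev.min()), float(elev.max()))    # elevation range (m)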
No description is available. Visit https://dataone.org/datasets/6add447b9cbe6fdfec7bd30cba174581 for complete metadata about this dataset.
The Alaska Division of Geological & Geophysical Surveys (DGGS) used aerial lidar to produce a digital terrain model (DTM), surface model (DSM), and intensity model for the area surrounding the community of Kotlik, Alaska. Detailed bare earth elevation data for the Kotlik area support and inform potential infrastructure development and provide critical information required to assess geomorphic activity. Airborne data were collected on August 17, 2019, and subsequently processed in Terrasolid and ArcGIS. Ground control was collected between August 20-22, 2019, by the Alaska Division of Mining, Land, and Water. This data collection is released as a Raw Data File with an open end-user license. All files can be downloaded free of charge from the Alaska Division of Geological & Geophysical Surveys website (http://doi.org/10.14509/30561).
THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 100% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE DEPARTMENT OF STATISTICS OF THE HASHEMITE KINGDOM OF JORDAN.
The Department of Statistics (DOS) carried out four rounds of the 2007 Employment and Unemployment Survey (EUS) during February, May, August, and November 2007. The survey rounds covered a total sample of about fifty-three thousand households nationwide. The sampled households were selected using a stratified multi-stage cluster sampling design. It is noteworthy that the sample represents the national level (Kingdom), the governorates, the three regions (Central, North, and South), and urban/rural areas.
The importance of this survey lies in providing a comprehensive database on employment and unemployment that serves decision makers and researchers, as well as other parties concerned with policies related to the organization of the Jordanian labor market.
The raw survey data provided by the Statistical Agency were cleaned and harmonized by the Economic Research Forum, in the context of a major project that started in 2009, during which extensive efforts were exerted to acquire, clean, harmonize, preserve, and disseminate microdata from existing labor force surveys in several Arab countries.
The sample is representative at the national level (Kingdom), the governorates, the three regions (Central, North, and South), and urban/rural areas.
1- Household/family. 2- Individual/person.
The survey covered a national sample of households and all individuals permanently residing in surveyed households.
Sample survey data [ssd]
The sample of this survey is based on the frame provided by the 2004 Population and Housing Census. The Kingdom was divided into strata, where each city with a population of 100,000 persons or more was considered a large city; there are 6 such cities. Each governorate (excluding the 6 large cities) was divided into rural and urban areas. The remaining urban areas in each governorate were considered an independent stratum, and the same was applied to the rural areas. The total number of strata was 30.
In view of the significant variation in socio-economic characteristics in large cities in particular, and in urban areas in general, each large-city and urban stratum was divided into four sub-strata according to the socio-economic characteristics provided by the Population and Housing Census, with the purpose of providing homogeneous strata.
The frame excludes collective dwellings. However, it is worth noting that the collective households identified in the harmonized data, through a variable indicating household type, are those reported without heads in the raw data, and in which the relationship of all household members to the head was reported as "other".
The sample is also not representative of the non-Jordanian population.
The sample of this survey was designed using the two-stage stratified cluster sampling method, based on the data of the 2004 Population and Housing Census for carrying out household surveys. The sample is representative at the Kingdom, rural/urban, and governorate levels. The total sample size for each round was 1,336 Primary Sampling Units (PSUs, or clusters). These units were distributed over the urban and rural regions of the governorates, in addition to the large cities in each governorate, according to the weight of persons and households and the variance within each stratum. Slight modifications were made to the number of these units so that it would be a multiple of 8; the total number of clusters for the four rounds was 5,344.
The main sample consists of 40 replicates, each comprising 167 PSUs. For each round, eight replicates of the main sample were used. The PSUs were ordered within each stratum according to geographic and then socio-economic characteristics in order to ensure a good spread of the sample. The sample was then selected in two stages. In the first stage, the PSUs were selected using probability-proportionate-to-size (PPS) systematic selection, with the number of households in each PSU serving as its weight or size. In the second stage, the blocks of the PSUs (clusters) selected in the first stage were updated, and a constant number of households (10) was selected from each PSU using the systematic random sampling method.
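As an illustration of the first-stage procedure (not part of the survey documentation), a minimal Python sketch of probability-proportionate-to-size systematic selection, with household counts as the size measure; the data are synthetic:

# Minimal sketch: PPS systematic selection of PSUs.
import random

def pps_systematic(psu_sizes, n_select):
    """Select n_select PSU indices with probability proportional to size."""
    total = sum(psu_sizes)
    interval = total / n_select
    start = random.uniform(0, interval)
    points = [start + i * interval for i in range(n_select)]
    chosen, cum = [], 0.0
    it = iter(enumerate(psu_sizes))
    idx, size = next(it)
    for p in points:                    # points are in increasing order
        while cum + size < p:           # advance to the PSU containing p
            cum += size
            idx, size = next(it)
        chosen.append(idx)
    return chosen

sizes = [random.randint(80, 400) for _ in range(1000)]  # synthetic household counts
print(pps_systematic(sizes, 167))                       # one replicate of 167 PSUs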
It is noteworthy that the sample of the present survey does not represent the non-Jordanian population, because it is based on households living in conventional dwellings; in other words, it does not cover collective households living in collective dwellings. Therefore, the non-Jordanian households covered in the present survey are either private households or collective households living in conventional dwellings.
Face-to-face [f2f]
The tabulation plan for the survey results was guided by previous Employment and Unemployment Surveys, which had already been prepared and tested. The final survey report was then prepared to include all detailed tabulations as well as the survey methodology.
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de448416
Abstract (en): The Home Mortgage Disclosure Act (HMDA): Loan Application Register (LAR) and Transmittal Sheet (TS) Raw Data, 2007 contains information collected in calendar year 2006. The HMDA, enacted by Congress in 1975, requires most mortgage lenders located in metropolitan areas to report data about their housing-related lending activity. The HMDA data were collected from 8,886 lending institutions and cover approximately 34.1 million home purchase and home improvement loans and refinancings, including loan originations, loan purchases, and applications that were denied, incomplete, or withdrawn. The Private Mortgage Insurance Companies (PMIC) data refer to applications for mortgage insurance to insure home purchase mortgages and mortgages to refinance existing obligations.

Part 1, HMDA Transmittal Sheet (TS), and Part 4, PMIC Transmittal Sheet (TS), include information submitted by reporting institutions with the Loan Application Register (LAR), such as the reporting institution's name, address, and Tax ID. Part 2, HMDA Reporter Panel, and Part 5, PMIC Reporter Panel, contain information on all institutions that reported data in activity year 2006. Part 3, HMDA MSA Offices, and Part 6, PMIC MSA Offices, contain information on all metropolitan statistical areas in the data. Parts 7 through 789 contain HMDA and PMIC Loan Application Register (LAR) files at the national level, at the agency level, and by MSA/MD.

With some exceptions, for each transaction the institution reported data about: the loan (or application), such as the type and amount of the loan made (or applied for) and, in limited circumstances, its price; the disposition of the application, such as whether it was denied or resulted in an origination; the property to which the loan relates, such as its type (single-family versus multi-family) and location (including the census tract); the sale of the loan, if it was sold; and the applicant's and co-applicant's ethnicity, race, sex, and income. The data are not weighted and do not contain any weight variables.

ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats, as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: created variable labels and/or value labels; created an online analysis version with question text.

The collection covers home purchase and home improvement loans and refinancings (or applications) made or insured by financial institutions in the United States that were required to report HMDA data in 2007. Smallest geographic unit: city. HMDA data were collected from 8,886 depository and nondepository institutions that were required to report HMDA data if they met the law's criteria for coverage. Generally, whether a lender is covered by HMDA depended on the lender's asset size, its location, and whether it is in the business of residential mortgage lending. PMIC data were collected from eight mortgage insurance companies that insured home purchase mortgages and mortgages to refinance existing obligations. For more information about how respondents reported, please refer to A Guide to HMDA Reporting.

2016-12-12: The study title and collection dates have been revised to reflect the 2006 activity year, with data reported in 2007. Filesets 1 through 6 and the multi-part setup files will also be replaced to correct the study year. Variable descriptions for parts 1 through 6 have been incorporated into the ICPSR codebooks; "Frequencies" documents included in previous releases have been retired with this update. SDA was removed from this study, as the original SDA pages were processed without using hermes and the SDA title could not be updated to reflect the correct reporting year. For datasets 7 through 789, ICPSR is releasing the original deposited data files in the condition they were received, along with SPSS, Stata, and SAS setup files.

The data file for Part 7, HMDA Loan Application Register (LAR): National File, contains over 34 million records. Due to its large size, users are encouraged to open this dataset in SAS. All census tract and county definitions and population counts were based on the 2000 Census of Population and Housing. Value labels for the variable STATE_...
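As a usage illustration (not part of the collection notes), a minimal Python sketch for processing the national LAR file in chunks instead of loading all 34+ million records at once; the file and column names are placeholders, so consult the ICPSR codebook for the actual layout:

# Minimal sketch: tally application dispositions chunk by chunk.
import pandas as pd

totals = {}
for chunk in pd.read_csv("hmda_2006_lar_national.csv",
                         chunksize=1_000_000, dtype=str):
    for action, count in chunk["action_type"].value_counts().items():
        totals[action] = totals.get(action, 0) + count
print(totals)    # originated, denied, withdrawn, ...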
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains data and models used in the following paper.
Swanson, K., Walther, P., Leitz, J., Mukherjee, S., Wu, J. C., Shivnaraine, R. V., & Zou, J. ADMET-AI: A machine learning ADMET platform for evaluation of large-scale chemical libraries. In review.
The data and models are meant to be used with the ADMET-AI code, which runs the ADMET-AI web server at admet.ai.greenstonebio.com.
The data.zip file has the following structure.
data
drugbank: Contains files with drugs from the DrugBank that have received regulatory approval. drugbank_approved.csv contains the full set of approved drugs along with ADMET-AI predictions, while the other files contain subsets of these molecules used for testing the speed of ADMET prediction tools.
tdc_admet_all: Contains the data (.csv files) and RDKit features (.npz files) for all 41 single-task ADMET datasets from the Therapeutics Data Commons (TDC).
tdc_admet_multitask: Contains the data (.csv files) and RDKit features (.npz files) for the two multi-task datasets (one regression and one classification) constructed by combining the tdc_admet_all datasets.
tdc_admet_all.csv: A CSV file containing all 41 ADMET datasets from tdc_admet_all. This can be used to easily look up all ADMET properties for a given molecule in the TDC (see the sketch after this list).
tdc_admet_group: Contains the data (.csv files) and RDKit features (.npz files) for the 22 TDC ADMET Benchmark Group datasets with five splits per dataset.
tdc_admet_group_raw: Contains the raw data (.csv files) used to construct the five splits per dataset in tdc_admet_group.
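As a usage illustration (not part of the repository contents), a minimal Python sketch for the lookup described above, assuming pandas is installed and that the combined CSV has a SMILES column (the column name here is a guess; check the file header):

# Minimal sketch: look up all TDC ADMET properties for one molecule.
import pandas as pd

admet = pd.read_csv("data/tdc_admet_all.csv")
caffeine = "CN1C=NC2=C1C(=O)N(C)C(=O)N2C"       # example SMILES
row = admet[admet["smiles"] == caffeine]        # assumed column name
print(row.dropna(axis=1).T if not row.empty else "not in the TDC sets")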
The models.zip file has the following structure. Note that the ADMET-AI website and Python package use the multi-task Chemprop-RDKit models below.
models
tdc_admet_all: Contains Chemprop and Chemprop-RDKit models trained on all 41 single-task TDC ADMET datasets.
tdc_admet_all_multitask: Contains Chemprop and Chemprop-RDKit models trained on the two multi-task TDC ADMET datasets (one regression and one classification).
tdc_admet_group: Contains Chemprop and Chemprop-RDKit models trained on the 22 TDC ADMET Benchmark Group datasets.
These two shapefiles represent New Mexico NHD High Resolution stream segments and waterbodies, merged and clipped to the state boundary. Raw NHD High Resolution data, including additional layer files, are available from: https://viewer.nationalmap.gov/basic/
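As an illustration of the merge-and-clip step (not part of the dataset), a minimal Python sketch using geopandas; the input file names are placeholders for the downloaded NHD layers and a New Mexico boundary polygon:

# Minimal sketch: merge NHD flowline layers and clip to the state boundary.
import geopandas as gpd
import pandas as pd

parts = [gpd.read_file(f) for f in ["nhd_flowline_a.shp", "nhd_flowline_b.shp"]]
merged = gpd.GeoDataFrame(pd.concat(parts, ignore_index=True), crs=parts[0].crs)
boundary = gpd.read_file("new_mexico_boundary.shp").to_crs(merged.crs)
gpd.clip(merged, boundary).to_file("nm_nhd_streams.shp")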
Original data files from LAS measurements.
Glider deployed as part of a larger program called Ecology and Oceanography of Harmful Algal Blooms in Florida (EcoHAB:Florida) to survey the physical oceanography, biological oceanography and circulation patterns for shelf scale modeling for predicting the occurrence and transport of Karenia brevis red tides. The glider was deployed to survey an area of the West Florida Shelf, in the Gulf of Mexico, and measure light attenuation, light absorption, colored dissolved organic matter (CDOM), temperature and salinity. This dataset includes raw measurements of these properties. This dataset was produced from the high resolution data files retrieved from the glider after the glider was recovered.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file contains two subfolders, which archive the data necessary for reconstructing the holographic images of stained tissue from mouse tails and unstained tissue from mouse brains, respectively. Both subfolders have the same file components, including the MATLAB data and the raw data collected from the data acquisition card that are necessary for holographic imaging reconstruction.
Here, 'Supplementary Data 1' is prepared for the holographic reconstruction of stained tissue from mouse tails, while 'Supplementary Data 2' is provided for unstained tissue from mouse brains.
*) biological_sample.mat: The raw data from imaging a slice of rat tail, converted from .tdms to .mat format.
*) background_curvature.mat: The raw data used to correct for phase contamination from system aberrations, converted from .tdms to .mat format.
*) biological_sample_rawdata.tdms: The raw data from imaging a slice of rat tail, collected through the data acquisition card (DAC) in TDMS format.
*) background_curvature_rawdata.tdms: The raw data used to correct for phase contamination from system aberrations, collected through the DAC in TDMS format.
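As an illustration of the stated .tdms-to-.mat conversion (not the authors' actual script), a minimal Python sketch using the nptdms and scipy packages; group and channel names are read from the file rather than assumed:

# Minimal sketch: convert a TDMS acquisition file to a .mat file.
from nptdms import TdmsFile
from scipy.io import savemat

tdms = TdmsFile.read("biological_sample_rawdata.tdms")
arrays = {}
for group in tdms.groups():
    for channel in group.channels():
        key = f"{group.name}_{channel.name}".replace(" ", "_")
        arrays[key] = channel[:]      # channel data as a NumPy array
savemat("biological_sample.mat", arrays)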