https://www.nist.gov/open/license
This dataset includes the position data of a two-dimensional gantry system experiment in which the G-code commands for the gantry were transmitted through a wireless communications link. The testbed is composed of four main components related to the operation of the gantry system. These components are the gantry system, the Wi-Fi network, the RF channel emulator, and the supervisory computer. In the experimental study, we run a scenario in which the gantry tool moves sequentially between four positions and has a preset dwell at each of the positions. The wireless channel impact is produced through the RF channel emulator. First, we consider the benchmark channel with free-space log-distance path loss and ideal channel impulse response (CIR) which has no multi-path. Second, we consider a measured delay profile of an industrial environment where the CIR is experimentally measured and processed to be deployed using the channel emulator and to reflect the industrial environment impact. Moreover, time-varying log-normal shadowing is introduced due to the fluctuations in the signal level because of obstructions. The variance of zero-mean log-normal shadowing is set through the emulator. In order to collect the position information of the gantry system tool, we used a vision tracking system. In this dataset, we attached a meta_data.csv file to map various files to their corresponding parameters. A README.doc file is included to describe the measurement apparatus.
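As a rough illustration of the channel model described above, the following sketch computes a log-distance path loss with zero-mean log-normal shadowing (a Gaussian term in dB). The function names, the 1 m reference distance, the path-loss exponent, and the example frequency are illustrative assumptions, not values taken from the dataset or the emulator configuration. Note that the sketch takes the shadowing standard deviation in dB, whereas the description above refers to the variance.

```python
import math
import random

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def free_space_path_loss_db(distance_m, freq_hz):
    """Friis free-space path loss in dB between isotropic antennas."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)

def log_distance_path_loss_db(distance_m, freq_hz, d0_m=1.0, exponent=2.0,
                              sigma_db=0.0, rng=None):
    """Log-distance path loss with optional log-normal shadowing.

    d0_m, exponent and sigma_db are assumed example parameters.
    With exponent=2 and sigma_db=0 this reduces to free-space loss.
    """
    loss = free_space_path_loss_db(d0_m, freq_hz)
    loss += 10 * exponent * math.log10(distance_m / d0_m)
    if sigma_db > 0:
        rng = rng or random.Random()
        loss += rng.gauss(0.0, sigma_db)  # zero-mean shadowing term in dB
    return loss
```

For example, `log_distance_path_loss_db(10, 2.4e9, sigma_db=3.0, rng=random.Random(0))` draws one shadowed loss value at 10 m for an assumed 2.4 GHz carrier.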
The data set description provides a detailed account of the type of data that is used within the peer-reviewed literature. The data involves special instrumentation, such as hyperspectral imaging cameras to develop thousands of pixels, which form images, like on a television screen. Other data is used to develop absorbance spectra from infrared spectrometers and is compared to reference data to confirm the presence of a desired, tested chemical. This dataset is associated with the following publication: Baseley, D., L. Wunderlich, G. Phillips, K. Gross, G. Perram, S. Willison, M. Magnuson, S. Lee, R. Phillips, and W. Harper Jr.. Hyperspectral Analysis for Standoff Detection of Dimethyl Methylphosphonate on Building Materials [HS7.52.01]. JOURNAL OF ENVIRONMENTAL MANAGEMENT. Elsevier Science Ltd, New York, NY, USA, 135-142, (2016).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset globally (excluding frigid/polar zones) quantifies the different facets of variability in surface soil (0 – 30 cm) salinity and sodicity for the period between 1980 and 2018. This is realised by developing 4-D predictive models of Electrical Conductivity of saturated soil Extract (ECe) and soil Exchangeable Sodium Percentage (ESP) as indicators of soil salinity and sodicity. These machine learning-based models make predictions for ECe and ESP at different times, locations, and depths, and by extracting meaningful statistics from those predictions, different facets of variability in the surface soil salinity and sodicity are quantified. The dataset includes 10 maps documenting different aspects of soil salinity and sodicity variations, and auxiliary data required for generation of those maps. Users are referred to the corresponding "READ_ME" file for more information about this dataset.
MY NASA DATA (MND) is a tool that allows anyone to make use of satellite data that was previously unavailable. Through the use of MND’s Live Access Server (LAS) a multitude of charts, plots and graphs can be generated using a wide variety of constraints. This site provides a large number of lesson plans with a wide variety of topics, all with the students in mind. Not only can you use our lesson plans, you can use the LAS to improve the ones that you are currently implementing in your classroom.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Notes: As of June 2020 this dataset has been static for several years. Recent versions of NHD High Res may be more detailed than this dataset for some areas, while this dataset may still be more detailed than NHD High Res in other areas. This dataset is considered authoritative as used by CDFW for particular tracking purposes but may not be current or comprehensive for all streams in the state.
National Hydrography Dataset (NHD) high resolution NHDFlowline features for California were originally dissolved on common GNIS_ID or StreamLevel* attributes and routed from mouth to headwater in meters. The results are measured polyline features representing entire streams. Routes on these streams are measured upstream, i.e., the measure at the mouth of a stream is zero and at the upstream end the measure matches the total length of the stream feature. Using GIS tools, a user of this dataset can retrieve the distance in meters upstream from the mouth at any point along a stream feature.** CA_Streams_v3 Update Notes: This version includes over 200 stream modifications and additions resulting from requests for updating from CDFW staff and others***. New locator fields from the USGS Watershed Boundary Dataset (WBD) have been added for v3 to enhance user's ability to search for or extract subsets of California Streams by hydrologic area. *See the Source Citation section of this metadata for further information on NHD, WBD, NHDFlowline, GNIS_ID and StreamLevel. **See the Data Quality section of this metadata for further explanation of stream feature development. ***Some current NHD data has not yet been included in CA_Streams. The effort to synchronize CA_Streams with NHD is ongoing.
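The upstream measure described above can be sketched without GIS software. The example below is a minimal illustration, assuming a route is given as a list of (x, y) vertices in a projected coordinate system with units of meters, ordered from mouth to headwater; the function name and the sample coordinates are hypothetical and are not part of the dataset or any GIS tool.

```python
import math

def measure_along_route(vertices, point):
    """Return (measure_m, offset_m) for the location on the route closest to point.

    vertices: [(x, y), ...] ordered from mouth (measure 0) to headwater.
    measure_m: distance upstream from the mouth along the route.
    offset_m: straight-line distance from point to the route.
    """
    px, py = point
    best_measure, best_offset = 0.0, math.inf
    travelled = 0.0  # length of all segments already passed
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip duplicate vertices
        # Project the point onto the segment and clamp to its endpoints.
        t = ((px - x1) * dx + (py - y1) * dy) / (seg_len ** 2)
        t = max(0.0, min(1.0, t))
        cx, cy = x1 + t * dx, y1 + t * dy
        offset = math.hypot(px - cx, py - cy)
        if offset < best_offset:
            best_offset = offset
            best_measure = travelled + t * seg_len
        travelled += seg_len
    return best_measure, best_offset
```

With a hypothetical route `[(0, 0), (100, 0), (100, 50)]`, the point `(100, 20)` lies on the second segment, 120 m upstream of the mouth.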
Note: Find data at source; data is continuously updated. PG&E provides non-confidential, aggregated usage data that are available to the public and updated on a quarterly basis. These public datasets consist of monthly consumption aggregated by ZIP code and by customer segment: Residential, Commercial, Industrial and Agricultural. The public datasets must meet the standards for aggregating and anonymizing customer data pursuant to CPUC Decision 14-05-016, as follows: a minimum of 100 Residential customers; a minimum of 15 Non-Residential customers, with no single Non-Residential customer in each sector accounting for more than 15% of the total consumption. If the aggregation standard is not met, the consumption will be combined with a neighboring ZIP code until the aggregation requirements are met.
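The aggregation thresholds quoted above can be expressed as a small check. This is only a sketch of the stated rule, not PG&E's actual procedure: the function name and the input layout (one consumption value per customer within a ZIP code and segment) are assumptions made for illustration.

```python
def meets_aggregation_standard(segment, customer_usages):
    """Check the 100 / 15 / 15% rule for one ZIP code and customer segment.

    segment: "Residential", "Commercial", "Industrial" or "Agricultural".
    customer_usages: per-customer consumption values (hypothetical input).
    """
    count = len(customer_usages)
    if segment == "Residential":
        return count >= 100
    # Non-residential: at least 15 customers, none above 15% of the total.
    if count < 15:
        return False
    total = sum(customer_usages)
    return max(customer_usages) <= 0.15 * total
```

A ZIP code and segment that fails this check would, per the description above, be combined with a neighboring ZIP code and checked again.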
The King County Groundwater Protection Program maintains a database of groundwater wells, water quality and water level sampling data. Users may search the database using Quick or Advanced Search OR use King County Groundwater iMap map set. The viewer provides a searchable map interface for locating groundwater well data.
Hydrographic and Impairment Statistics (HIS) is a National Park Service (NPS) Water Resources Division (WRD) project established to track certain goals created in response to the Government Performance and Results Act of 1993 (GPRA). One water resources management goal established by the Department of the Interior under GPRA requires NPS to track the percent of its managed surface waters that are meeting Clean Water Act (CWA) water quality standards. This goal requires an accurate inventory that spatially quantifies the surface water hydrography that each bureau manages and a procedure to determine and track which waterbodies are or are not meeting water quality standards as outlined by Section 303(d) of the CWA. This project helps meet this DOI GPRA goal by inventorying and monitoring in a geographic information system for the NPS: (1) CWA 303(d) quality impaired waters and causes; and (2) hydrographic statistics based on the United States Geological Survey (USGS) National Hydrography Dataset (NHD). Hydrographic and 303(d) impairment statistics were evaluated based on a combination of 1:24,000 (NHD) and finer scale data (frequently provided by state GIS layers).
The Ancillary Data component of the Indicators of Coastal Water Quality Collection includes a 5 arc-minute (approximately 9 x 9 km at the equator) sequence grid, grid cell centroids that relate to the grid cells in the tabular "Indicators of Coastal Water Quality: Change in Chlorophyll-a Concentration 1998-2007" data set, and a country buffer data set that is divided by exclusive economic zones (EEZ). The data are produced by the Columbia University Center for International Earth Science Information Network (CIESIN).
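The "approximately 9 x 9 km at the equator" figure follows from simple arithmetic, sketched below. The calculation assumes a spherical Earth with the WGS84 equatorial radius, which is an approximation chosen for this illustration; the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6378.137  # WGS84 equatorial radius, used as a sphere radius

def cell_size_km(lat_deg, cell_arcmin=5.0):
    """Return (east_west_km, north_south_km) of a grid cell at a latitude."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360.0
    north_south = km_per_degree * cell_arcmin / 60.0
    # Meridians converge toward the poles, so east-west width shrinks.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return east_west, north_south
```

At the equator this gives roughly 9.3 km per side; at 60 degrees latitude the east-west width is about half of that, while the north-south extent is unchanged under this approximation.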
This data set shows 311 service requests in the City of Pittsburgh. This data is collected from the request intake software used by the 311 Response Center in the Department of Innovation & Performance. Requests are collected from phone calls, tweets, emails, a form on the City website, and through the 311 mobile application. For more information, see the 311 Data User Guide. If you are unable to download the 311 Data table due to a 504 Gateway Timeout error, use this link instead: https://tools.wprdc.org/downstream/76fda9d0-69be-4dd5-8108-0de7907fc5a4 NOTE: The data feed for this dataset is broken as of December 21st, 2022. We're working on restoring it.
## Overview
Head Data Set 2 is a dataset for object detection tasks - it contains Heads QiDz annotations for 2,342 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
https://data.4tu.nl/info/fileadmin/user_upload/Documenten/4TU.ResearchData_Restricted_Data_2022.pdf
This file contains raw data for cameras and wearables of the ConfLab dataset.
./cameras
contains the overhead video recordings for 9 cameras (cam2-10) in MP4 files.
These cameras cover the whole interaction floor, with camera 2 capturing the
bottom of the scene layout, and camera 10 capturing top of the scene layout.
Note that cam5 ran out of battery before the other cameras and thus the recordings
are cut short. However, cam4 and 6 contain significant overlap with cam 5, to
reconstruct any information needed.
Note that the annotations are made and provided in 2 minute segments.
The annotated portions of the video include the last 3min38sec of x2xxx.MP4
video files, and the first 12 min of x3xxx.MP4 files for cameras (2,4,6,8,10),
with "x" being the placeholder character in the mp4 file names. If one wishes
to separate the video into 2 min segments as we did, the "video-splitting.sh"
script is provided.
./camera-calibration contains the camera intrinsic files obtained from
https://github.com/idiap/multicamera-calibration. Camera extrinsic parameters can
be calculated using the existing intrinsic parameters and the instructions in the
multicamera-calibration repo. The coordinates in the image are provided by the
crosses marked on the floor, which are visible in the video recordings.
The crosses are 1m apart (=100cm).
./wearables
subdirectory includes the IMU, proximity and audio data from each
participant at the Conflab event (48 in total). In the directory numbered
by participant ID, the following data are included:
1. raw audio file
2. proximity (bluetooth) pings (RSSI) file (raw and csv) and a visualization
3. Tri-axial accelerometer data (raw and csv) and a visualization
4. Tri-axial gyroscope data (raw and csv) and a visualization
5. Tri-axial magnetometer data (raw and csv) and a visualization
6. Game rotation vector (raw and csv), recorded in quaternions.
All files are timestamped.
The sampling frequencies are:
- audio: 1250 Hz
- rest: around 50Hz. However, the sample rate is not fixed
and instead the timestamps should be used.
For rotation, the game rotation vector's output frequency is limited by the
actual sampling frequency of the magnetometer. For more information, please refer to
https://invensense.tdk.com/wp-content/uploads/2016/06/DS-000189-ICM-20948-v1.3.pdf
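Because the sensor sample rate is not fixed, one common way to work with the
data is to interpolate it onto a uniform time grid using the timestamps. The
sketch below shows linear interpolation in plain Python. It assumes timestamps
in seconds that are strictly increasing and a single value per sample; the
function name and the 50 Hz target rate are illustrative choices, not part of
the dataset's tooling.

```python
from bisect import bisect_right

def resample_uniform(timestamps, values, rate_hz=50.0):
    """Linearly interpolate irregular samples onto a uniform grid.

    timestamps: strictly increasing times in seconds (assumed unit).
    values: one value per timestamp (e.g. a single accelerometer axis).
    Returns (grid_times, grid_values) starting at the first timestamp.
    """
    if len(timestamps) != len(values) or len(timestamps) < 2:
        raise ValueError("need at least two samples of equal length")
    step = 1.0 / rate_hz
    start, end = timestamps[0], timestamps[-1]
    count = int((end - start) / step + 1e-9) + 1
    grid_times, grid_values = [], []
    for k in range(count):
        t = start + k * step
        # Index of the sample at or just before t, kept inside the data.
        i = min(bisect_right(timestamps, t) - 1, len(timestamps) - 2)
        t_a, t_b = timestamps[i], timestamps[i + 1]
        w = (t - t_a) / (t_b - t_a)
        grid_times.append(t)
        grid_values.append(values[i] + w * (values[i + 1] - values[i]))
    return grid_times, grid_values
```

For multi-axis data the same function can be applied to each axis separately
with the shared timestamp column.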
Audio files in this folder are in raw binary form. The following can be used to convert
them to WAV files (1250Hz):
ffmpeg -f s16le -ar 1250 -ac 1 -i /path/to/audio/file /path/to/output.wav
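If ffmpeg is not available, the same conversion can be sketched with Python's
standard library. The example assumes, as the ffmpeg options above indicate,
that the raw files hold signed 16-bit little-endian mono samples at 1250 Hz;
the function name and paths are placeholders.

```python
import wave

def raw_pcm_to_wav(raw_path, wav_path, sample_rate=1250, channels=1,
                   sample_width=2):
    """Wrap headerless PCM samples in a WAV container.

    WAV stores 16-bit PCM as signed little-endian, so the bytes are
    written unchanged. Returns the number of frames written.
    """
    with open(raw_path, "rb") as raw_file:
        pcm = raw_file.read()
    frame_size = channels * sample_width
    usable = len(pcm) - (len(pcm) % frame_size)  # drop a trailing partial frame
    with wave.open(wav_path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm[:usable])
    return usable // frame_size
```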
Synchronization of cameras and wearables data
Raw videos contain timecode information which matches the timestamps of the data in
the "wearables" folder. The starting timecode of a video can be read as:
ffprobe -hide_banner -show_streams -i /path/to/video
./audio
./sync: contains wav files for each subject
./sync_files: auxiliary csv files used to sync the audio. Can be used to improve the synchronization.
The code used for syncing the audio can be found here:
https://github.com/TUDelft-SPC-Lab/conflab/tree/master/preprocessing/audio
This database, compiled by Matthews and Fung (1987), provides information on the distribution and environmental characteristics of natural wetlands. The database was developed to evaluate the role of wetlands in the annual emission of methane from terrestrial sources. The original data consists of five global 1-degree latitude by 1-degree longitude arrays. This subset, for the study area of the Large Scale Biosphere-Atmosphere Experiment in Amazonia (LBA) in South America, retains all five arrays at the 1-degree resolution but only for the area of interest (i.e., longitude 85 deg to 30 deg W, latitude 25 deg S to 10 deg N). The arrays are (1) wetland data source, (2) wetland type, (3) fractional inundation, (4) vegetation type, and (5) soil type. The data subsets are in both ASCII GRID and binary image file formats. The data base is the result of the integration of three independent digital sources: (1) vegetation classified according to the United Nations Educational Scientific and Cultural Organization (UNESCO) system (Matthews, 1983), (2) soil properties from the Food and Agriculture Organization (FAO) soil maps (Zobler, 1986), and (3) fractional inundation in each 1-degree cell compiled from a global map survey of Operational Navigation Charts (ONC). With vegetation, soil, and inundation characteristics of each wetland site identified, the data base has been used for a coherent and systematic estimate of methane emissions from wetlands and for an analysis of the causes for uncertainties in the emission estimate. The complete global data base is available from NASA/GISS [http://www.giss.nasa.gov] and NCAR data set ds765.5 [http://www.ncar.ucar.edu]; the global vegetation types data are available from ORNL DAAC [http://www.daac.ornl.gov].
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Location of Fire Stations in the Dún Laoghaire-Rathdown Administrative area.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The complete dataset used in the analysis comprises 36 samples, each described by 11 numeric features and 1 target. The attributes considered were caspase 3/7 activity, Mitotracker red CMXRos area and intensity (3 h and 24 h incubations with both compounds), Mitosox oxidation (3 h incubation with the referred compounds) and oxidation rate, DCFDA fluorescence (3 h and 24 h incubations with either compound) and oxidation rate, and DQ BSA hydrolysis. The target of each instance corresponds to one of the 9 possible classes (4 samples per class): Control, 6.25, 12.5, 25 and 50 µM for 6-OHDA and 0.03, 0.06, 0.125 and 0.25 µM for rotenone. The dataset is balanced, contains no missing values, and was standardized across features. The small number of samples prevented a full and strong statistical analysis of the results. Nevertheless, it allowed the identification of relevant hidden patterns and trends.
Exploratory data analysis, information gain, hierarchical clustering, and supervised predictive modeling were performed using Orange Data Mining version 3.25.1 [41]. Hierarchical clustering was performed using the Euclidean distance metric and weighted linkage. Cluster maps were plotted to relate the features with higher mutual information (in rows) with instances (in columns), with the color of each cell representing the normalized level of a particular feature in a specific instance. The information is grouped both in rows and in columns by a two-way hierarchical clustering method using the Euclidean distances and average linkage. Stratified cross-validation was used to train the supervised decision tree. A set of preliminary empirical experiments were performed to choose the best parameters for each algorithm, and we verified that, within moderate variations, there were no significant changes in the outcome. The following settings were adopted for the decision tree algorithm: minimum number of samples in leaves: 2; minimum number of samples required to split an internal node: 5; stop splitting when majority reaches: 95%; criterion: gain ratio. The performance of the supervised model was assessed using accuracy, precision, recall, F-measure and area under the ROC curve (AUC) metrics.
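For readers unfamiliar with the gain ratio criterion named in the decision tree settings, the sketch below shows the standard definition for a categorical split: information gain divided by the split information. It is a generic illustration with hypothetical function names, not the implementation used by Orange Data Mining, and it ignores how continuous features are thresholded.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy in bits of a sequence of discrete labels."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(items).values())

def gain_ratio(feature_values, labels):
    """Information gain of a categorical split divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Entropy remaining after splitting on the feature, weighted by group size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0
```

A feature that separates the classes perfectly into equally sized groups scores 1.0 here, while a feature unrelated to the class scores 0.0.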
Dataset of all the data supplied by each local authority and imputed figures used for national estimates.
This file is no longer being updated to include any late revisions local authorities may have reported to the department. Please use instead the Local authority housing statistics open data file for the latest data.
MS Excel Spreadsheet, 1.26 MB
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wiki-Reliability: Machine Learning datasets for measuring content reliability on Wikipedia
Consists of metadata features and content text datasets, with the formats:
- {template_name}_features.csv
- {template_name}_difftxt.csv.gz
- {template_name}_fulltxt.csv.gz
For more details on the project, dataset schema, and links to data usage and benchmarking: https://meta.wikimedia.org/wiki/Research:Wiki-Reliability:_A_Large_Scale_Dataset_for_Content_Reliability_on_Wikipedia
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Journey9ni/VLM-3R-DATA dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
600noisy Data is a dataset for object detection tasks - it contains Tumor annotations for 600 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
The BOREAS TF-10 team collected tower flux and meteorological data at two sites, a fen and a young jack pine forest, near Thompson, Manitoba, Canada, as part of BOREAS. A preliminary data set was assembled in August 1993 while field testing the instrument packages, and at both sites data were collected from 15-Aug to 31-Aug. The main experimental period was in 1994, when continuous data were collected from 08-Apr to 23-Sept at the fen site. A very limited experiment was run in the spring/summer of 1995, when the fen site tower was operated from 08-Apr to 14-Jun in support of a hydrology experiment in an adjoining, feeder basin. Upon examination of the 1994 data set, it became clear that the behavior of the heat, water, and carbon dioxide fluxes throughout the whole growing season was an important scientific question, and that the 1994 data record was not sufficiently long to capture the character of the seasonal behavior of the fluxes. Thus, the fen site was operated in 1996 in order to collect data from spring melt to autumn freeze-up. Data were collected from 29-Apr to 05-Nov at the fen site. All variables are presented as 30-minute averages.