Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Micro-climate sensors collect telemetry at set intervals throughout the day. Sensors are located at various locations in the City of Canning, Western Australia, and each sensor has a unique ID. Contact us at opendata@canning.wa.gov.au for a larger data set (the data supplied here covers 30 days of sensor readings). The sensor locations are: 18zua9muwbb at Wharf Street Basin - Pavilion; 2hq3byfebne at the City’s Civic and Administration Building; uu90853psl at Wharf Street Basin - Leila Street entrance; xd2su7w05m at Wharf Street Basin - Nature Play Area.
https://research.csiro.au/dap/licences/csiro-data-licence/
A csv file containing the tidal frequencies used for statistical analyses in the paper "Estimating Freshwater Flows From Tidally-Affected Hydrographic Data" by Dan Pagendam and Don Percival.
Three datasets are available, each consisting of 15 csv files. Each file contains the voxelised shower information obtained from single particles produced at the front of the calorimeter in the |η| range 0.2-0.25, simulated in the ATLAS detector. Two datasets contain photon events with different statistics; the larger sample has about 10 times the number of events of the other. The third dataset contains pions. The pion dataset and the lower-statistics photon dataset were used to train the corresponding two GANs presented in the AtlFast3 paper SIMU-2018-04.
The information in each file is a table; the rows correspond to the events and the columns to the voxels. The voxelisation procedure is described in the AtlFast3 paper linked above and in the dedicated PUB note ATL-SOFT-PUB-2020-006. In summary, the detailed energy deposits produced by ATLAS were converted from x,y,z coordinates to local cylindrical coordinates defined around the particle 3-momentum at the entrance of the calorimeter. The energy deposits in each layer were then grouped in voxels and for each voxel the energy was stored in the csv file. For each particle, there are 15 files corresponding to the 15 energy points used to train the GAN. The name of the csv file defines both the particle and the energy of the sample used to create the file.
The size of the voxels is described in the binning.xml file. Software tools to read the XML file and manipulate the spatial information of voxels are provided in the FastCaloGAN repository.
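As an illustration of how the csv files can be inspected in Python (the file name below is a placeholder, not one of the actual file names in this record), a minimal sketch:

import pandas as pd

# rows correspond to events, columns to voxels, values are the voxel energies
voxels = pd.read_csv("photon_sample_E65536.csv")   # placeholder file name
n_events, n_voxels = voxels.shape
total_energy_per_event = voxels.sum(axis=1)
print(f"{n_events} events, {n_voxels} voxels")
print(total_energy_per_event.describe())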
Updated on February 10th 2022. A new dataset photons_samples_highStat.tgz was added to this record and the binning.xml file was updated accordingly.
Updated on April 18th 2023. A new dataset pions_samples_highStat.tgz was added to this record.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A large data set of go-arounds, also referred to as missed approaches. The data set supports the paper presented at the OpenSky Symposium on November 10th.
If you use this data for a scientific publication, please consider citing our paper.
The data set contains landings from 176 (mostly) large airports from 44 different countries. The landings are labelled as performing a go-around (GA) or not. In total, the data set contains almost 9 million landings with more than 33000 GAs. The data was collected from OpenSky Network's historical data base for the year 2019. The published data set contains multiple files:
go_arounds_minimal.csv.gz
Compressed CSV containing the minimal data set. It contains a row for each landing and a minimal amount of information about the landing, and if it was a GA. The data is structured in the following way:
Column name | Type | Description |
---|---|---|
time | date time | UTC time of landing or first GA attempt |
icao24 | string | Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned |
callsign | string | Aircraft identifier in air-ground communications |
airport | string | ICAO airport code where the aircraft is landing |
runway | string | Runway designator on which the aircraft landed |
has_ga | string | "True" if at least one GA was performed, otherwise "False" |
n_approaches | integer | Number of approaches identified for this flight |
n_rwy_approached | integer | Number of unique runways approached by this flight |
The last two columns, n_approaches and n_rwy_approached, are useful for filtering out training and calibration flights. These usually have a large number of approaches, so an easy way to exclude them is to filter by n_approaches > 2, as in the sketch below.
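A minimal pandas sketch of that filter (has_ga may be parsed as a boolean or as the strings "True"/"False", depending on your pandas version):

import pandas as pd

df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)
df = df[df["n_approaches"] <= 2]      # drop likely training/calibration flights
ga_only = df[df["has_ga"] == True]    # use == "True" if the column is read as strings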
go_arounds_augmented.csv.gz
Compressed CSV containing the augmented data set. It contains a row for each landing and additional information about the landing, and if it was a GA. The data is structured in the following way:
Column name | Type | Description |
---|---|---|
time | date time | UTC time of landing or first GA attempt |
icao24 | string | Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned |
callsign | string | Aircraft identifier in air-ground communications |
airport | string | ICAO airport code where the aircraft is landing |
runway | string | Runway designator on which the aircraft landed |
has_ga | string | "True" if at least one GA was performed, otherwise "False" |
n_approaches | integer | Number of approaches identified for this flight |
n_rwy_approached | integer | Number of unique runways approached by this flight |
registration | string | Aircraft registration |
typecode | string | Aircraft ICAO typecode |
icaoaircrafttype | string | ICAO aircraft type |
wtc | string | ICAO wake turbulence category |
glide_slope_angle | float | Angle of the ILS glide slope in degrees |
has_intersection | string | Boolean that is true if the runway has another runway intersecting it, otherwise false |
rwy_length | float | Length of the runway in kilometres |
airport_country | string | ISO Alpha-3 country code of the airport |
airport_region | string | Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania) |
operator_country | string | ISO Alpha-3 country code of the operator |
operator_region | string | Geographical region of the operator of the aircraft (either Europe, North America, South America, Asia, Africa, or Oceania) |
wind_speed_knts | integer | METAR, surface wind speed in knots |
wind_dir_deg | integer | METAR, surface wind direction in degrees |
wind_gust_knts | integer | METAR, surface wind gust speed in knots |
visibility_m | float | METAR, visibility in m |
temperature_deg | integer | METAR, temperature in degrees Celsius |
press_sea_level_p | float | METAR, sea level pressure in hPa |
press_p | float | METAR, QNH in hPa |
weather_intensity | list | METAR, list of present weather codes: qualifier - intensity |
weather_precipitation | list | METAR, list of present weather codes: weather phenomena - precipitation |
weather_desc | list | METAR, list of present weather codes: qualifier - descriptor |
weather_obscuration | list | METAR, list of present weather codes: weather phenomena - obscuration |
weather_other | list | METAR, list of present weather codes: weather phenomena - other |
This data set is augmented with data from various public data sources. Aircraft-related data is mostly from the OpenSky Network's aircraft database, the METAR information is from Iowa State University, and the rest is mostly scraped from different websites. If you need help with the METAR information, you can consult the WMO's Aerodrome Reports and Forecasts handbook.
go_arounds_agg.csv.gz
Compressed CSV containing the aggregated data set. It contains a row for each airport-runway, i.e. every runway at every airport for which data is available. The data is structured in the following way:
Column name | Type | Description |
---|---|---|
airport | string | ICAO airport code where the aircraft is landing |
runway | string | Runway designator on which the aircraft landed |
n_landings | integer | Total number of landings observed on this runway in 2019 |
ga_rate | float | Go-around rate, per 1000 landings |
glide_slope_angle | float | Angle of the ILS glide slope in degrees |
has_intersection | string | Boolean that is true if the runway has another runway intersecting it, otherwise false |
rwy_length | float | Length of the runway in kilometres |
airport_country | string | ISO Alpha-3 country code of the airport |
airport_region | string | Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania) |
This aggregated data set is used in the paper for the generalized linear regression model.
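The exact model specification is given in the paper; purely as an illustration of how such a model could be set up on this file (a sketch using statsmodels, with go-around counts recovered from ga_rate and n_landings used as exposure; the covariate choice is an assumption):

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

agg = pd.read_csv("go_arounds_agg.csv.gz")
agg = agg.dropna(subset=["ga_rate", "n_landings", "glide_slope_angle", "rwy_length", "has_intersection"])
agg["n_ga"] = (agg["ga_rate"] * agg["n_landings"] / 1000.0).round()  # recover counts from the per-1000 rate

# Poisson GLM with the number of landings as exposure (illustrative only)
model = smf.glm(
    "n_ga ~ glide_slope_angle + rwy_length + has_intersection",
    data=agg,
    family=sm.families.Poisson(),
    exposure=agg["n_landings"],
)
print(model.fit().summary())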
Downloading the trajectories
Users of this data set with access to the OpenSky Network's Impala shell can download the historical trajectories from the historical database with a few lines of Python code. For example, suppose you want to get all the go-arounds on 4 January 2019 at London City Airport (EGLC). You can use the Traffic library for easy access to the database:
import datetime
from tqdm.auto import tqdm
import pandas as pd
from traffic.data import opensky
from traffic.core import Traffic
# load minimal data set
df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)
df["time"] = pd.to_datetime(df["time"])
# select London City Airport, go-arounds, and 2019-01-04
airport = "EGLC"
start = datetime.datetime(year=2019, month=1, day=4).replace(
tzinfo=datetime.timezone.utc
)
stop = datetime.datetime(year=2019, month=1, day=5).replace(
tzinfo=datetime.timezone.utc
)
df_selection = df.query("airport==@airport & has_ga & (@start <= time) & (time <= @stop)")
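# The remainder of the example is not included in this record; the following
# continuation is a sketch only (the 10-minute window and the opensky.history
# options are assumptions, not the authors' exact code).
flights = []
delta = pd.Timedelta(minutes=10)  # assumed window around each landing / go-around

for _, row in tqdm(df_selection.iterrows(), total=len(df_selection)):
    flight = opensky.history(
        start=(row["time"] - delta).strftime("%Y-%m-%d %H:%M:%S"),
        stop=(row["time"] + delta).strftime("%Y-%m-%d %H:%M:%S"),
        callsign=row["callsign"],
        return_flight=True,
    )
    if flight is not None:
        flights.append(flight)

# combine the downloaded flights into a single Traffic object
trajectories = Traffic.from_flights(flights)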
https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Warning: Large file size (over 1GB). Each monthly data set is large (over 4 million rows), but can be viewed in standard software such as Microsoft WordPad (save by right-clicking on the file name and selecting 'Save Target As', or equivalent on Mac OS X). It is then possible to select the required rows of data and copy and paste the information into another software application, such as a spreadsheet. Alternatively, add-ons to existing software that handle larger data sets, such as the Microsoft PowerPivot add-on for Excel, can be used. The Microsoft PowerPivot add-on for Excel is available from the Microsoft Download Center, using the link in the 'Related Links' section below. Once PowerPivot has been installed, follow the instructions below to load the large files. Note that it may take at least 20 to 30 minutes to load one monthly file.
1. Start Excel as normal.
2. Click on the PowerPivot tab.
3. Click on the PowerPivot Window icon (top left).
4. In the PowerPivot Window, click on the "From Other Sources" icon.
5. In the Table Import Wizard, scroll to the bottom and select Text File.
6. Browse to the file you want to open and choose the file extension you require, e.g. CSV.
Once the data has been imported you can view it in a spreadsheet.
What does the data cover? General practice prescribing data is a list of all medicines, dressings and appliances that are prescribed and dispensed each month. A record will only be produced when this has occurred; there is no record for a zero total. For each practice in England, the following information is presented at presentation level for each medicine, dressing and appliance (by presentation name): the total number of items prescribed and dispensed, the total net ingredient cost, the total actual cost, and the total quantity. The data covers NHS prescriptions written in England and dispensed in the community in the UK. Prescriptions written in England but dispensed outside England are included. The data includes prescriptions written by GPs and other non-medical prescribers (such as nurses and pharmacists) who are attached to GP practices. GP practices are identified only by their national code, so an additional data file - linked to the first by the practice code - provides further detail in relation to the practice. Presentations are identified only by their BNF code, so an additional data file - linked to the first by the BNF code - provides the chemical name for that presentation.
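Since practice details and chemical names sit in separate files keyed by practice code and BNF code, a typical first step is to join them onto the prescribing data. A hedged sketch (file and column names below are placeholders; use the headers of the files you actually download):

import pandas as pd

rx = pd.read_csv("prescribing_monthly.csv")       # presentation-level prescribing data
practices = pd.read_csv("practice_details.csv")   # practice details, keyed by practice code
chemicals = pd.read_csv("bnf_chemicals.csv")      # chemical names, keyed by BNF code

rx = rx.merge(practices, on="PRACTICE_CODE", how="left")
rx = rx.merge(chemicals, on="BNF_CODE", how="left")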
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LifeSnaps Dataset Documentation
Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in-the-wild large-scale physical activity patterns, sleep, stress, and overall health on the one hand, and behavioral patterns and psychological measurements on the other, due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset containing a plethora of anthropological data, collected unobtrusively over the course of more than 4 months by n=71 participants under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types, from second-level to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data openly available to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.
The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.
Data Import: Reading CSV
For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.
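For example (the file name below is a placeholder for one of the daily-granularity CSV files shipped with the dataset):

import pandas as pd

daily = pd.read_csv("fitbit_daily.csv")   # placeholder name; use a CSV from the release
print(daily.head())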
Data Import: Setting up a MongoDB (Recommended)
To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.
To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have the MongoDB Database Tools installed.
For the Fitbit data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c fitbit
For the SEMA data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c sema
For surveys data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c surveys
If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.
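Once restored, the collections can be queried from Python with pymongo, for example (add username/password arguments to MongoClient if access control is enabled):

from pymongo import MongoClient

client = MongoClient("localhost", 27017)
db = client["rais_anonymized"]

print(db.list_collection_names())   # expected: fitbit, sema, surveys
print(db["fitbit"].find_one())      # inspect one document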
Data Availability
The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain related information to these collections. Each document in any collection follows the format shown below:
{
_id:
A Digital Terrain Model (DTM) is a digital file consisting of a grid of regularly spaced points of known height which, when used with other digital data such as maps or orthophotographs, can provide a 3D image of the land surface. 10m and 50m DTMs are available. This is a large dataset and will take some time to download. Please be patient. By downloading or using this dataset you agree to abide by the LPS Open Government Data Licence.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset serves to estimate the status, in particular the size, of a crowd given the impact on radio frequency communication links within a wireless sensor network. To quantify this relation, signal strengths across sub-GHz communication links are collected at the premises of the Tomorrowland music festival. The communication links are formed between the network nodes of wireless sensor networks deployed in three of the festival's stage environments.
The table below lists the eighteen dataset files. They were collected at the music festival's 2017 and 2018 editions. There are three environments, labeled ‘Freedom Stage 2017’, ‘Freedom Stage 2018’, and ‘Main Comfort 2018’. Each environment has both 433 MHz and 868 MHz data. The measurements at each environment were collected over a period of three festival days. The dataset files are formatted as Comma-Separated Values (CSV).
Dataset file | Reference file | Number of messages |
---|---|---|
free17_433_fri.csv | None | 393 852 |
free17_868_fri.csv | None | 472 202 |
free17_433_sat.csv | free17_transactions.csv | 996 033 |
free17_868_sat.csv | free17_transactions.csv | 1 023 059 |
free17_433_sun.csv | free17_transactions.csv | 1 007 066 |
free17_868_sun.csv | free17_transactions.csv | 1 036 456 |
free18_433_fri.csv | None | 765 024 |
free18_868_fri.csv | None | 757 657 |
free18_433_sat.csv | free18_transactions.csv | 711 438 |
free18_868_sat.csv | free18_transactions.csv | 714 390 |
free18_433_sun.csv | free18_transactions.csv | 648 329 |
free18_868_sun.csv | free18_transactions.csv | 656 290 |
main18_433_fri.csv | None | 791 462 |
main18_868_fri.csv | None | 908 407 |
main18_433_sat.csv | main18_counts.csv | 863 666 |
main18_868_sat.csv | main18_counts.csv | 884 682 |
main18_433_sun.csv | main18_counts.csv | 903 862 |
main18_868_sun.csv | main18_counts.csv | 894 496 |
In addition to the datasets and reference files, a software example is provided to illustrate the use of the data and to visualise the initial findings on the relation between crowd size and its impact on network signal strength.
In order to use the software, please retain the following file structure:
.
├── data
├── data_reference
├── graphs
└── software
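As a rough illustration of the kind of analysis the software example performs (the column names "timestamp" and "rssi" below are placeholders; consult the data descriptor for the actual schema):

import pandas as pd

msgs = pd.read_csv("data/free17_433_sat.csv")
msgs["timestamp"] = pd.to_datetime(msgs["timestamp"])

# mean received signal strength per 10-minute window as a crude view of crowd impact
rssi_trend = msgs.set_index("timestamp")["rssi"].resample("10min").mean()
print(rssi_trend.head())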
The peer-reviewed data descriptor for this dataset has now been published in MDPI Data, an open access journal aimed at enhancing data transparency and reusability, and can be accessed here: https://doi.org/10.3390/data5020052. Please cite this article when using the dataset.
https://www.futurebeeai.com/data-license-agreement
The English Open-Ended Question Answering Dataset is a meticulously curated collection of comprehensive Question-Answer pairs. It serves as a valuable resource for training Large Language Models (LLMs) and Question-answering models in the English language, advancing the field of artificial intelligence.
Dataset Content: This QA dataset comprises a diverse set of open-ended questions paired with corresponding answers in English. There is no context paragraph to choose an answer from; each question is answered without any predefined context content. The questions cover a broad range of topics, including science, history, technology, geography, literature, current affairs, and more. Each question is accompanied by an answer, providing valuable information and insights to enhance the language model training process. Both the questions and answers were manually curated by native English speakers, and references were taken from diverse sources like books, news articles, websites, and other reliable references.
This question-answer prompt completion dataset contains different types of prompts, including instruction type, continuation type, and in-context learning (zero-shot, few-shot) type. The dataset also contains questions and answers with different types of rich text, including tables, code, JSON, etc., with proper markdown.
Question Diversity: To ensure diversity, this Q&A dataset includes questions with varying complexity levels, ranging from easy to medium and hard. Different types of questions, such as multiple-choice, direct, and true/false, are included. Additionally, questions are further classified into fact-based and opinion-based categories, creating a comprehensive variety. The QA dataset also contains questions with constraints and persona restrictions, which makes it even more useful for LLM training.
Answer Formats: To accommodate varied learning experiences, the dataset incorporates different types of answer formats, including single-word, short-phrase, single-sentence, and paragraph answers. The answers contain text strings, numerical values, and date and time formats as well. Such diversity strengthens the language model's ability to generate coherent and contextually appropriate answers.
Data Format and Annotation Details: This fully labeled English Open-Ended Question Answer Dataset is available in JSON and CSV formats. It includes annotation details such as id, language, domain, question_length, prompt_type, question_category, question_type, complexity, answer_type, and rich_text.
Quality and Accuracy: The dataset upholds the highest standards of quality and accuracy. Each question undergoes careful validation, and the corresponding answers are thoroughly verified. To prioritize inclusivity, the dataset incorporates questions and answers representing diverse perspectives and writing styles, ensuring it remains unbiased and avoids perpetuating discrimination. Both the questions and answers in English are grammatically accurate, without word or grammatical errors. No copyrighted, toxic, or harmful content is used while building this dataset.
Continuous Updates and Customization: The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Continuous efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to collect custom question-answer data tailored to specific needs, providing flexibility and customization options.
License: The dataset, created by FutureBeeAI, is now ready for commercial use. Researchers, data scientists, and developers can utilize this fully labeled and ready-to-deploy English Open-Ended Question Answer Dataset to enhance the language understanding capabilities of their generative AI models, improve response generation, and explore new approaches to NLP question-answering tasks.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description: Dataset of melt pool geometry variability data in Powder Bed Fusion - Laser Beam of Ti-6Al-4V. This work was conducted on an EOS M290.
Contents:
MTMeasurements.csv: A csv file with the multi track measurements including cap heights, remelt depths, and widths by orientation and velocity
STMeasurements.csv: A csv file with the single track measurements including cap heights, remelt depths, and widths by orientation and velocity
Note: These measurements were not used in the manuscript.
StWidths.csv: A csv file containing the widths as a function of lengths with the beginning and end of each track removed. These are labeled by location along the length, the measured width, velocity, and orientation.
WARNING: StWidths.csv is too large to open in Excel. Saving it in Excel will cause data loss.
figures.ipynb: A Jupyter notebook that will generate all of the figures published with the article.
Additionally, all of the individual figure files are labeled as they occur in the manuscript and are generated by the code.
Citation: Please use the following reference if you find this dataset useful.
@article{Miner2024,
  author = "Justin Miner and Sneha Prabha Narra",
  title  = "{Dataset of Melt Pool Variability Measurements for Powder Bed Fusion - Laser Beam of Ti-6Al-4V}",
  year   = "2024",
  month  = "5",
  url    = "https://kilthub.cmu.edu/articles/dataset/Dataset_of_Melt_Pool_Variability_Measurements_for_Powder_Bed_Fusion_-_Laser_Beam_of_Ti-6Al-4V/25696293",
  doi    = "10.1184/R1/25696293.v1"
}
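Because StWidths.csv is too large for Excel, a pandas sketch for loading it (the column names "width", "velocity", and "orientation" are assumptions based on the file description above):

import pandas as pd

widths = pd.read_csv("StWidths.csv")
print(widths.groupby(["velocity", "orientation"])["width"].agg(["mean", "std"]))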
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This is a database snapshot of the iCite web service (provided here as a single zipped CSV file, or compressed, tarred JSON files). In addition, citation links in the NIH Open Citation Collection are provided as a two-column CSV table in open_citation_collection.zip. iCite provides bibliometrics and metadata on publications indexed in PubMed, organized into three modules:
Influence: Delivers metrics of scientific influence, field-adjusted and benchmarked to NIH publications as the baseline.
Translation: Measures how Human, Animal, or Molecular/Cellular Biology-oriented each paper is; tracks and predicts citation by clinical articles
Open Cites: Disseminates link-level, public-domain citation data from the NIH Open Citation Collection
Definitions for individual data fields:
pmid: PubMed Identifier, an article ID as assigned in PubMed by the National Library of Medicine
doi: Digital Object Identifier, if available
year: Year the article was published
title: Title of the article
authors: List of author names
journal: Journal name (ISO abbreviation)
is_research_article: Flag indicating whether the Publication Type tags for this article are consistent with that of a primary research article
relative_citation_ratio: Relative Citation Ratio (RCR): OPA's metric of scientific influence. Field-adjusted, time-adjusted, and benchmarked against NIH-funded papers. The median RCR for NIH-funded papers in any field is 1.0. An RCR of 2.0 means a paper is receiving twice as many citations per year as the median NIH-funded paper in its field and year, while an RCR of 0.5 means that it is receiving half as many. Calculation details are documented in Hutchins et al., PLoS Biol. 2016;14(9):e1002541.
provisional: RCRs for papers published in the previous two years are flagged as "provisional", to reflect that citation metrics for newer articles are not necessarily as stable as they are for older articles. Provisional RCRs are provided for papers published in the previous year if they have received 5 citations or more, despite being, in many cases, less than a year old. All papers published the year before the previous year receive provisional RCRs. The current year is considered to be the NIH Fiscal Year, which starts in October. For example, in July 2019 (NIH Fiscal Year 2019), papers from 2018 receive provisional RCRs if they have 5 citations or more, and all papers from 2017 receive provisional RCRs. In October 2019, at the start of NIH Fiscal Year 2020, papers from 2019 receive provisional RCRs if they have 5 citations or more, and all papers from 2018 receive provisional RCRs.
citation_count: Number of unique articles that have cited this one
citations_per_year: Citations per year that this article has received since its publication. If this appeared as a preprint and a published article, the year from the published version is used as the primary publication date. This is the numerator for the Relative Citation Ratio.
field_citation_rate: Measure of the intrinsic citation rate of this paper's field, estimated using its co-citation network.
expected_citations_per_year: Citations per year that NIH-funded articles, with the same Field Citation Rate and published in the same year as this paper, receive. This is the denominator for the Relative Citation Ratio.
nih_percentile: Percentile rank of this paper's RCR compared to all NIH publications. For example, 95% indicates that this paper's RCR is higher than 95% of all NIH funded publications.
human: Fraction of MeSH terms that are in the Human category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories)
animal: Fraction of MeSH terms that are in the Animal category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories)
molecular_cellular: Fraction of MeSH terms that are in the Molecular/Cellular Biology category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories)
x_coord: X coordinate of the article on the Triangle of Biomedicine
y_coord: Y Coordinate of the article on the Triangle of Biomedicine
is_clinical: Flag indicating that this paper meets the definition of a clinical article.
cited_by_clin: PMIDs of clinical articles that this article has been cited by.
apt: Approximate Potential to Translate is a machine learning-based estimate of the likelihood that this publication will be cited in later clinical trials or guidelines. Calculation details are documented in Hutchins et al., PLoS Biol. 2019;17(10):e3000416.
cited_by: PMIDs of articles that have cited this one.
references: PMIDs of articles in this article's reference list.
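A minimal pandas sketch for working with the CSV snapshot, using the field names defined above (the snapshot file name is a placeholder, and the encoding of the is_research_article flag may differ, e.g. boolean vs. Yes/No):

import pandas as pd

icite = pd.read_csv("icite_metadata.csv", low_memory=False)   # placeholder file name

influential = icite[(icite["is_research_article"] == True) & (icite["relative_citation_ratio"] > 1.0)]
print(influential[["pmid", "year", "relative_citation_ratio", "nih_percentile"]].head())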
Large CSV files are zipped using zip version 4.5, which is more recent than the default unzip command line utility in some common Linux distributions. These files can be unzipped with tools that support version 4.5 or later such as 7zip.
Comments and questions can be addressed to iCite@mail.nih.gov
This dataset contains all current and active business licenses issued by the Department of Business Affairs and Consumer Protection. This dataset contains a large number of records/rows of data and may not be viewable in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Notepad or WordPad, to view and search.
Data fields requiring description are detailed below.
APPLICATION TYPE: 'ISSUE' is the record associated with the initial license application. 'RENEW' is a subsequent renewal record. All renewal records are created with a term start date and term expiration date. 'C_LOC' is a change of location record; it means the business moved. 'C_CAPA' is a change of capacity record; only a few license types may file this type of application. 'C_EXPA' only applies to businesses that have liquor licenses; it means the business location expanded.
LICENSE STATUS: 'AAI' means the license was issued.
Business license owners may be accessed at: http://data.cityofchicago.org/Community-Economic-Development/Business-Owners/ezma-pppn To identify the owner of a business, you will need the account number or legal name.
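A short pandas sketch using the fields described above (the exported file name is a placeholder), for example to keep only initial issuances of licenses in 'AAI' status:

import pandas as pd

licenses = pd.read_csv("Business_Licenses.csv", low_memory=False)   # placeholder file name
issued = licenses[(licenses["APPLICATION TYPE"] == "ISSUE") & (licenses["LICENSE STATUS"] == "AAI")]
print(len(issued))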
Data Owner: Business Affairs and Consumer Protection
Time Period: Current
Frequency: Data is updated daily
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A point-in-time ‘snapshot’ of all vehicles currently registered in New Zealand. The data relates to currently registered vehicles as recorded on the Motor Vehicle Register (MVR). We update it monthly, so it's accurate up to the end of the previous month. Motor Vehicle Register data can also be accessed via API. Registration is the process where we add a vehicle’s details to the MVR and issue its number plates. It is not the same thing as vehicle licensing, also called ‘rego’. To give you a quick overview of the data, see the charts in the ‘Attributes’ section below. These will give you information about each of the attributes (variables) in the dataset. Each chart is specific to a variable, and shows all data (without any filters applied). See the Motor Vehicle Register data field descriptions for details.
Data reuse caveats: as per licence. We’ve taken reasonable care in compiling this information, and provide it on an ‘as is, where is’ basis. We are not liable for any action taken on the basis of the information. For further information see the Waka Kotahi website, as well as the terms of the CC BY 4.0 International licence under which we publish this data. Variables in the dataset are formatted for analytical use. This can result in attribute charts that may not appear meaningful, and are not suitable for broader analysis or use. In addition, some variables are not mutually exclusive and should not be considered in isolation. As such, these charts should not be taken and used directly as analysis of the overall data.
Data quality statement: this data relates to vehicles, not people. We have included some information about where vehicle registered owners live. This is based on the most recent information we have about their physical address. To make sure it isn’t possible to identify a person in the data, we have provided this at Territorial Authority (TA) level. A TA is a broad geographical area defined under the Local Government Act 2002 as a city council or district council. There are 67 TAs, consisting of 12 city councils, 53 districts, Auckland Council and Chatham Islands Council. We haven’t included vehicles that belong to people with a confidential listing. We have restricted the Vehicle Identification Number (VIN) to the first 11 characters - these are generic and don’t identify specific vehicles.
Data quality caveats: many of the fields in the MVR are free-text fields, which means there may be spelling mistakes and other human errors. We have algorithmically cleaned the data to correct identified errors (particularly with respect to a vehicle’s make and model). However, due to the large number of vehicles on the Register we may not have corrected some information. Additionally, some variables may be subject to differences in how people have recorded details - for example, manufacturers release a variety of sub-models and these may not be referred to, or put into the system, in the same way. We have made our cleaning code open source: see the vehicle make and model cleansing code on GitHub.
https://spdx.org/licenses/CC0-1.0.html
Pathogen diversity resulting in quasispecies can enable persistence and adaptation to host defenses and therapies. However, accurate quasispecies characterization can be impeded by errors introduced during sample handling and sequencing which can require extensive optimizations to overcome. We present complete laboratory and bioinformatics workflows to overcome many of these hurdles. The Pacific Biosciences single molecule real-time platform was used to sequence PCR amplicons derived from cDNA templates tagged with universal molecular identifiers (SMRT-UMI). Optimized laboratory protocols were developed through extensive testing of different sample preparation conditions to minimize between-template recombination during PCR and the use of UMI allowed accurate template quantitation as well as removal of point mutations introduced during PCR and sequencing to produce a highly accurate consensus sequence from each template. Handling of the large datasets produced from SMRT-UMI sequencing was facilitated by a novel bioinformatic pipeline, Probabilistic Offspring Resolver for Primer IDs (PORPIDpipeline), that automatically filters and parses reads by sample, identifies and discards reads with UMIs likely created from PCR and sequencing errors, generates consensus sequences, checks for contamination within the dataset, and removes any sequence with evidence of PCR recombination or early cycle PCR errors, resulting in highly accurate sequence datasets. The optimized SMRT-UMI sequencing method presented here represents a highly adaptable and established starting point for accurate sequencing of diverse pathogens. These methods are illustrated through characterization of human immunodeficiency virus (HIV) quasispecies.
Methods
This serves as an overview of the analysis performed on PacBio sequence data that is summarized in Analysis Flowchart.pdf and was used as primary data for the paper by Westfall et al. "Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies"
Five different PacBio sequencing datasets were used for this analysis: M027, M2199, M1567, M004, and M005
For the datasets which were indexed (M027, M2199), CCS reads from PacBio sequencing files and the chunked_demux_config files were used as input for the chunked_demux pipeline. Each config file lists the different Index primers added during PCR to each sample. The pipeline produces one fastq file for each Index primer combination in the config. For example, in dataset M027 there were 3–4 samples using each Index combination. The fastq files from each demultiplexed read set were moved to the sUMI_dUMI_comparison pipeline fastq folder for further demultiplexing by sample and consensus generation with that pipeline. More information about the chunked_demux pipeline can be found in the README.md file on GitHub.
The demultiplexed read collections from the chunked_demux pipeline or CCS read files from datasets which were not indexed (M1567, M004, M005) were each used as input for the sUMI_dUMI_comparison pipeline along with each dataset's config file. Each config file contains the primer sequences for each sample (including the sample ID block in the cDNA primer) and further demultiplexes the reads to prepare data tables summarizing all of the UMI sequences and counts for each family (tagged.tar.gz) as well as consensus sequences from each sUMI and rank 1 dUMI family (consensus.tar.gz). More information about the sUMI_dUMI_comparison pipeline can be found in the paper and the README.md file on GitHub.
The consensus.tar.gz and tagged.tar.gz files were moved from the sUMI_dUMI_comparison pipeline directory on the server to the Pipeline_Outputs folder in this analysis directory for each dataset and appended with the dataset name (e.g. consensus_M027.tar.gz). Also in this analysis directory is a Sample_Info_Table.csv containing information about how each of the samples was prepared, such as purification methods and number of PCRs. There are also three other folders: Sequence_Analysis, Indentifying_Recombinant_Reads, and Figures. Each has an .Rmd file with the same name inside, which is used to collect, summarize, and analyze the data. All of these collections of code were written and executed in RStudio to track notes and summarize results.
Sequence_Analysis.Rmd has instructions to decompress all of the consensus.tar.gz files, combine them, and create two fasta files, one with all sUMI and one with all dUMI sequences. Using these as input, two data tables were created that summarize all sequences and read counts for each sample that pass various criteria. These are used to help create Table 2 and as input for Indentifying_Recombinant_Reads.Rmd and Figures.Rmd. Next, two fasta files containing all of the rank 1 dUMI sequences and the matching sUMI sequences were created. These were used as input for the Python script compare_seqs.py, which identifies any matched sequences that differ between the sUMI and dUMI read collections. This information was also used to help create Table 2. Finally, to populate the table with the number of sequences and bases in each sequence subset of interest, different sequence collections were saved and viewed in the Geneious program.
To investigate the cause of sequences where the sUMI and dUMI sequences do not match, tagged.tar.gz was decompressed and, for each family with discordant sUMI and dUMI sequences, the reads from the UMI1_keeping directory were aligned using Geneious. Reads from dUMI families failing the 0.7 filter were also aligned in Geneious. The uncompressed tagged folder was then removed to save space. These read collections contain all of the reads in a UMI1 family and still include the UMI2 sequence. By examining the alignment and specifically the UMI2 sequences, the site of the discordance and its cause were identified for each family as described in the paper. These alignments were saved as "Sequence Alignments.geneious". The counts of how many families were the result of PCR recombination were used in the body of the paper.
Using Identifying_Recombinant_Reads.Rmd, the dUMI_ranked.csv file from each sample was extracted from all of the tagged.tar.gz files, combined, and used as input to create a single dataset containing all UMI information from all samples. This file, dUMI_df.csv, was used as input for Figures.Rmd.
Figures.Rmd used dUMI_df.csv, sequence_counts.csv, and read_counts.csv as input to create draft figures and then individual datasets for each figure. These were copied into Prism software to create the final figures for the paper.
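The original analysis was carried out in the RStudio notebooks named above; purely as an illustration of the first decompression step described for Sequence_Analysis.Rmd, a Python sketch (the archive location is an assumption):

import glob
import tarfile

# unpack every per-dataset consensus archive into a working directory
for archive in glob.glob("Pipeline_Outputs/consensus_*.tar.gz"):
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(path="Pipeline_Outputs/extracted")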
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the raw experimental data and supplementary materials for the "Asymmetry Effects in Virtual Reality Rod and Frame Test". The materials included are:
• Raw Experimental Data: older.csv and young.csv
• Mathematica Notebooks: a collection of Mathematica notebooks used for data analysis and visualization. These notebooks provide scripts for processing the experimental data, performing statistical analyses, and generating the figures used in the project.
• Unity Package: a Unity package featuring a sample scene related to the project. The scene was built using Unity’s Universal Render Pipeline (URP). To utilize this package, ensure that URP is enabled in your Unity project. Instructions for enabling URP can be found in the Unity URP Documentation.
Requirements:
• For Data Files: software capable of opening CSV files (e.g., Microsoft Excel, Google Sheets, or any programming language that can read CSV formats).
• For Mathematica Notebooks: Wolfram Mathematica software to run and modify the notebooks.
• For Unity Package: Unity Editor version compatible with URP (2019.3 or later recommended). URP must be installed and enabled in your Unity project.
Usage Notes:
• The dataset facilitates comparative studies between different age groups based on the collected variables.
• Users can modify the Mathematica notebooks to perform additional analyses.
• The Unity scene serves as a reference to the project setup and can be expanded or integrated into larger projects.
Citation: Please cite this dataset when using it in your research or publications.
These datasets are a subset of the CMS Open Data with 2012 data-taking conditions, intended for education purposes. In this version, the data and simulation files are compressed into one big file for easy access. They are stored in two different formats (CSV and PKL) with the same content, so just use one of them. Once unzipped:
- Data files, starting with output_data_CMS_Run2012B, correspond to 4429.37 /pb of data collected by the CMS Experiment. They are a subset of the dataset in reference [1].
- Simulation files, starting with output_sim_CMS_MonteCarlo2012, are a subset of the dataset referenced in [2]. The number of generated events in this case is 30458871, and the cross section is 3503.71.
All the files were processed with a modified version of the AOD2NanoAODOutreachTool [3]. The small modifications are related to the number of triggers stored, and some objects like taus were removed.
[1] CMS collaboration (2017). DoubleMuParked primary dataset in AOD format from Run of 2012 (/DoubleMuParked/Run2012B-22Jan2013-v1/AOD). CERN Open Data Portal. DOI:10.7483/OPENDATA.CMS.YLIC.86ZZ
[2] Wunsch, Stefan (2019). DYJetsToLL dataset in reduced NanoAOD format for education and outreach. CERN Open Data Portal. DOI:10.7483/OPENDATA.CMS.SRRA.2GON
[3] https://github.com/cms-opendata-analyses/AOD2NanoAODOutreachTool
For the CSV files you might need to open them using pandas as: pandas.read_csv('output_data.csv', index_col=['entry','subentry']). For the pickle files, you might need to use Python 3.
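For example (file names are placeholders for the unzipped data and simulation files):

import pandas as pd

data = pd.read_csv("output_data.csv", index_col=["entry", "subentry"])
sim = pd.read_pickle("output_sim.pkl")   # the PKL variant loads directly into a DataFrame
print(data.head())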
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
pNEUMA is an open large-scale dataset of naturalistic trajectories of half a million vehicles, collected through a one-of-a-kind experiment by a swarm of drones over the congested downtown area of Athens, Greece. A unique observatory of traffic congestion, at a scale an order of magnitude higher than what was available until now, that researchers from different disciplines around the globe can use to develop and test their own models.
How are the .csv files organized?
For more details about the pNEUMA dataset, please check our website at https://open-traffic.epfl.ch
Abstract copyright UK Data Service and data collection copyright owner.
The heat pump monitoring datasets are a key output of the Electrification of Heat Demonstration (EoH) project, a government-funded heat pump trial assessing the feasibility of heat pumps across the UK’s diverse housing stock. These datasets are provided in both cleansed and raw form and allow analysis of the initial performance of the heat pumps installed in the trial. From the datasets, insights such as heat pump seasonal performance factor (a measure of the heat pump's efficiency), heat pump performance during the coldest day of the year, and half-hourly performance to inform peak demand can be gleaned.
For the second edition (December 2024), the data were updated to include performance data collected between November 2020 and September 2023. The only documentation currently available with the study is the Excel data dictionary. Reports and other contextual information can be found on the Energy Systems Catapult website.
The EoH project was funded by the Department for Business, Energy and Industrial Strategy (BEIS). From 2023, it is covered by the new Department for Energy Security and Net Zero.
Data availability
This study comprises the raw data from the EoH project, which is only available to registered UKDS users. Only the summary data file is available via standard UKDS EUL download, due to the large size of the full raw data files. To obtain the full set of raw data, registered UKDS users should:
When unzipped, the raw data available via FTP consists of 742 CSV files. Most of the individual CSV files are too large to open in Excel. Before requesting FTP, users should ensure they have sufficient computing facilities to analyse the data.
The UKDS also holds an accompanying open-access study, SN 9050 Electrification of Heat Demonstration Project: Heat Pump Performance Cleansed Data, 2020-2023. This contains the cleansed data from the EoH project, which does not require UKDS registration to access. However, since the data are similar in size to this study, only the summary dataset is available to download; an order must be placed for FTP delivery of the remaining cleansed data. Other studies in the set include SN 9209, which comprises 30-minute interval heat pump performance data, and SN 9210, which includes daily heat pump performance data.
The Python code used to cleanse the raw data and then perform the analysis is accessible via the Energy Systems Catapult GitHub.
Heat Pump Performance across the BEIS funded heat pump trial, The Electrification of Heat (EoH) Demonstration Project. See the documentation for data contents.
The U.S. Geological Survey (USGS) Water Resources Mission Area (WMA) is working to address a need to understand where the Nation is experiencing water shortages or surpluses relative to the demand by delivering routine assessments of water supply and demand. A key part of these national assessments is identifying long-term trends in water availability, including groundwater and surface water quantity, quality, and use. This data release contains Mann-Kendall monotonic trend analyses for annual groundwater metrics at 54,932 wells located in the conterminous United States, Alaska, Hawaii, and Puerto Rico. The groundwater metrics include annual mean, maximum, and minimum water level and the timing of the annual maximum and minimum groundwater level. These metrics are computed from groundwater water levels from publicly available data from the National Water Information System (NWIS), the National Groundwater Monitoring Network (NGWMN) and the California Open Data Portal. Trend analyses are computed using annual groundwater metrics through the water year, which is defined as the 12-month period from October 1 of any given year through September 30 of the following year (for example, October 2019 through September 2020). Trends at each well are available for up to four different periods: (i) the longest possible period that meets completeness criteria at each well, (ii) 1980-2020, (iii) 1990-2020, (iv) 2000-2020. Annual mean, maximum, and minimum water-level metrics for wells screened in unconfined aquifers were determined only when a well's water-level time series was at least 70 percent complete. Additionally, each of these time series must have at least 70 percent complete records in the first and last decade. All longest-possible-period time series for wells in unconfined aquifers must be at least 10 years long and have annual metric values calculated for at least 70 percent of the years of the record. Annual mean, maximum, and minimum water-level metrics for wells screened in confined aquifers were determined only when a well's water-level time series was at least 50 percent complete. Additionally, each of these time series must have at least 50 percent complete records in the first and last decade. All longest-possible-period time series for wells in confined aquifers must be at least 10 years long and have annual metric values calculated for at least 50 percent of the years in the last 10 years of the record. Caution must be exercised when utilizing monotonic trend analyses conducted over periods of up to several decades (and in some places longer ones) due to the potential for confounding deterministic gradual trends with multi-decadal climatic fluctuations.
This data release contains:
Six input files:
NGWMN_gwl_meta_v2.0.csv, the metadata from the National Groundwater Monitoring Network
NGWMN_gwl_data_v2.0.csv, the groundwater water level data from the National Groundwater Monitoring Network
NWIS_gwl_meta_v2.0.csv, the metadata from the National Water Information System
NWIS_gwl_data_v2.0.csv, the groundwater water level data from the National Water Information System
CA_measurements_v2.0.csv, the groundwater level data from the California Open Data Portal
CA_stations_v2.0.csv, the groundwater metadata from the California Open Data Portal
Two output files:
GW_trendsout_v2.0.csv, the groundwater water level trend data from both the National Groundwater Monitoring Network and the National Water Information System
GW_confband_out_v2.0.csv, the confidence bands associated with the groundwater water level trend data from both the National Groundwater Monitoring Network and the National Water Information System
A .zip file containing all of the code used to compute these trends, along with a README file with information on using the code.
First posted: Feb 27, 2024 (available from author). Revised: Jan 30, 2025 (version 2.0).
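Purely as an illustration of a Mann-Kendall-style check on one well's annual mean water level (the released trends were computed with the code in the accompanying .zip file, not with this sketch, and the column names below are assumptions):

import pandas as pd
from scipy.stats import kendalltau

levels = pd.read_csv("NWIS_gwl_data_v2.0.csv")
well = levels[levels["site_no"] == levels["site_no"].iloc[0]]      # pick one well (column name assumed)
annual_mean = well.groupby("water_year")["water_level"].mean()     # annual mean metric (column names assumed)

tau, p_value = kendalltau(annual_mean.index, annual_mean.values)
print(f"Kendall tau = {tau:.3f}, p = {p_value:.3f}")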
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Note: This is a large dataset. To download, go to the ArcGIS Open Data Set, click the download button, and under additional resources select the geodatabase option. Data layer depicting periodical cicada distribution and expected year of emergence by cicada brood and county. The periodical cicada emerges in massive groups once every 13 or 17 years and is completely unique to North America. There are 15 of these mass groups, called broods, of periodical cicadas in the United States. This county-based data, compiled by the USFS Northern Research Station, depicts where and when the different broods of periodical cicadas are likely to emerge in the US through 2037. The data was compiled for the 2011 publication entitled "Avian predators are less abundant during periodical cicada emergences, but why?" (Koenig et al., https://dx.doi.org/10.1890/10-1583.1) using data from the periodical cicada publications listed below. 1) Marlatt, C. L. 1907. "The periodical cicada". Bulletin of the USDA Bureau of Entomology 71:1-181. 2) Simon, C. 1988. "Evolution of 13- and 17-year periodical cicadas (Homoptera: Cicadidae)". Bulletin of the Entomological Society of America 34:163-176. 3) Liebhold, A. M., Bohne, M. J., and R. L. Lilja. 2013. "Active Periodical Cicada Broods of the United States". USDA Forest Service Northern Research Station, Northeastern Area State and Private Forestry.
Metadata and Downloads: This record was taken from the USDA Enterprise Data Inventory that feeds into the https://data.gov catalog. Data for this record includes the following resources: ISO-19139 metadata, ArcGIS Hub Dataset, ArcGIS GeoService, OGC WMS, CSV, Shapefile, GeoJSON, KML. For complete information, please visit https://data.gov.