75 datasets found
  1. Microclimate Sensor - csv - Datasets - data.wa.gov.au

    • catalogue.data.wa.gov.au
    Updated May 6, 2020
    + more versions
    Cite
    (2020). Microclimate Sensor - csv - Datasets - data.wa.gov.au [Dataset]. https://catalogue.data.wa.gov.au/dataset/canning-microclimate-sensor-csv
    Explore at:
    Dataset updated
    May 6, 2020
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Western Australia
    Description

    Micro-climate sensors collect telemetry at set intervals throughout the day. Sensors are located at various locations in the City of Canning, Western Australia, and each sensor has a unique ID. Contact us at opendata@canning.wa.gov.au for a larger data set (the data supplied is the sensor readings for 30 days). The sensors are located as follows:

    • 18zua9muwbb: Wharf Street Basin - Pavilion
    • 2hq3byfebne: The City’s Civic and Administration Building
    • uu90853psl: Wharf Street Basin - Leila Street entrance
    • xd2su7w05m: Wharf Street Basin - Nature Play Area

  2. CSV file used in statistical analyses

    • data.csiro.au
    • researchdata.edu.au
    • +1more
    Updated Oct 13, 2014
    + more versions
    Cite
    CSIRO (2014). CSV file used in statistical analyses [Dataset]. http://doi.org/10.4225/08/543B4B4CA92E6
    Explore at:
    Dataset updated
    Oct 13, 2014
    Dataset authored and provided by
    CSIRO: http://www.csiro.au/
    License

    https://research.csiro.au/dap/licences/csiro-data-licence/

    Time period covered
    Mar 14, 2008 - Jun 9, 2009
    Dataset funded by
    CSIRO: http://www.csiro.au/
    Description

    A csv file containing the tidal frequencies used for statistical analyses in the paper "Estimating Freshwater Flows From Tidally-Affected Hydrographic Data" by Dan Pagendam and Don Percival.

  3. Data from: Datasets used to train the Generative Adversarial Networks used...

    • opendata.cern.ch
    Updated 2021
    Cite
    ATLAS collaboration (2021). Datasets used to train the Generative Adversarial Networks used in ATLFast3 [Dataset]. http://doi.org/10.7483/OPENDATA.ATLAS.UXKX.TXBN
    Explore at:
    Dataset updated
    2021
    Dataset provided by
    CERN Open Data Portal
    Authors
    ATLAS collaboration
    Description

    Three datasets are available, each consisting of 15 CSV files. Each file contains the voxelised shower information obtained from single particles produced at the front of the calorimeter in the |η| range (0.2-0.25), simulated in the ATLAS detector. Two datasets contain photon events with different statistics; the larger sample has about 10 times the number of events of the other. The third dataset contains pions. The pion dataset and the lower-statistics photon dataset were used to train the two corresponding GANs presented in the AtlFast3 paper SIMU-2018-04.

    The information in each file is a table; the rows correspond to the events and the columns to the voxels. The voxelisation procedure is described in the AtlFast3 paper linked above and in the dedicated PUB note ATL-SOFT-PUB-2020-006. In summary, the detailed energy deposits produced by ATLAS were converted from x,y,z coordinates to local cylindrical coordinates defined around the particle 3-momentum at the entrance of the calorimeter. The energy deposits in each layer were then grouped in voxels and for each voxel the energy was stored in the csv file. For each particle, there are 15 files corresponding to the 15 energy points used to train the GAN. The name of the csv file defines both the particle and the energy of the sample used to create the file.
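The table layout described above (one row per event, one column per voxel) can be read directly with pandas; summing across columns then gives a per-event total energy. The inline table below is a made-up stand-in for one of the 15 CSV files, not real shower data:

```python
# Read a voxelised-shower table (rows = events, columns = voxels) and sum
# the per-voxel energies into a total energy per event. The inline CSV is
# illustrative only; real files come from this record's archives.
import io

import pandas as pd

sample_csv = io.StringIO(
    "voxel_0,voxel_1,voxel_2\n"   # one column per voxel
    "1.0,2.0,3.0\n"               # event 0
    "0.5,0.0,1.5\n"               # event 1
)
showers = pd.read_csv(sample_csv)
total_energy = showers.sum(axis=1)  # sum over voxels, one value per event
print(total_energy.tolist())        # [6.0, 2.0]
```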

    The size of the voxels is described in the binning.xml file. Software tools to read the XML file and manipulate the spatial information of voxels are provided in the FastCaloGAN repository.

    Updated on February 10th 2022. A new dataset photons_samples_highStat.tgz was added to this record and the binning.xml file was updated accordingly.

    Updated on April 18th 2023. A new dataset pions_samples_highStat.tgz was added to this record.

  4. Data from: Large Landing Trajectory Data Set for Go-Around Analysis

    • zenodo.org
    • explore.openaire.eu
    • +1more
    application/gzip, bin +1
    Updated Dec 16, 2022
    + more versions
    Cite
    Raphael Monstein; Benoit Figuet; Timothé Krauth; Manuel Waltert; Marcel Dettling (2022). Large Landing Trajectory Data Set for Go-Around Analysis [Dataset]. http://doi.org/10.5281/zenodo.7148117
    Explore at:
    Available download formats: application/gzip, bin, zip
    Dataset updated
    Dec 16, 2022
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Raphael Monstein; Benoit Figuet; Timothé Krauth; Manuel Waltert; Marcel Dettling
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A large data set of go-arounds (also referred to as missed approaches). The data set supports the paper presented at the OpenSky Symposium on November 10th.

    If you use this data for a scientific publication, please consider citing our paper.

    The data set contains landings from 176 (mostly) large airports in 44 different countries. The landings are labelled as performing a go-around (GA) or not. In total, the data set contains almost 9 million landings with more than 33,000 GAs. The data was collected from the OpenSky Network's historical database for the year 2019. The published data set contains multiple files:

    go_arounds_minimal.csv.gz

    Compressed CSV containing the minimal data set. It contains a row for each landing and a minimal amount of information about the landing, and if it was a GA. The data is structured in the following way:

    Column name | Type | Description
    time | date time | UTC time of landing or first GA attempt
    icao24 | string | Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    callsign | string | Aircraft identifier in air-ground communications
    airport | string | ICAO airport code where the aircraft is landing
    runway | string | Runway designator on which the aircraft landed
    has_ga | string | "True" if at least one GA was performed, otherwise "False"
    n_approaches | integer | Number of approaches identified for this flight
    n_rwy_approached | integer | Number of unique runways approached by this flight

    The last two columns, n_approaches and n_rwy_approached, are useful for filtering out training and calibration flights. These usually have a large number of approaches, so an easy way to exclude them is to drop rows with n_approaches > 2.
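As a sketch (with invented rows, using the column names from the table above), the suggested filter looks like this in pandas:

```python
# Exclude likely training/calibration flights by dropping rows with more
# than two approaches, as suggested in the description. The frame below is
# an invented stand-in for go_arounds_minimal.csv.gz.
import pandas as pd

df = pd.DataFrame({
    "callsign": ["BAW123", "TRAIN1", "EZY456"],
    "has_ga": ["False", "True", "True"],
    "n_approaches": [1, 7, 2],
})
landings = df[df["n_approaches"] <= 2]  # keep ordinary landings only
print(landings["callsign"].tolist())    # ['BAW123', 'EZY456']
```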

    go_arounds_augmented.csv.gz

    Compressed CSV containing the augmented data set. It contains a row for each landing and additional information about the landing, and if it was a GA. The data is structured in the following way:

    Column name | Type | Description
    time | date time | UTC time of landing or first GA attempt
    icao24 | string | Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    callsign | string | Aircraft identifier in air-ground communications
    airport | string | ICAO airport code where the aircraft is landing
    runway | string | Runway designator on which the aircraft landed
    has_ga | string | "True" if at least one GA was performed, otherwise "False"
    n_approaches | integer | Number of approaches identified for this flight
    n_rwy_approached | integer | Number of unique runways approached by this flight
    registration | string | Aircraft registration
    typecode | string | Aircraft ICAO typecode
    icaoaircrafttype | string | ICAO aircraft type
    wtc | string | ICAO wake turbulence category
    glide_slope_angle | float | Angle of the ILS glide slope in degrees
    has_intersection | string | Boolean that is true if the runway has another runway intersecting it, otherwise false
    rwy_length | float | Length of the runway in kilometres
    airport_country | string | ISO Alpha-3 country code of the airport
    airport_region | string | Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)
    operator_country | string | ISO Alpha-3 country code of the operator
    operator_region | string | Geographical region of the operator of the aircraft (either Europe, North America, South America, Asia, Africa, or Oceania)
    wind_speed_knts | integer | METAR, surface wind speed in knots
    wind_dir_deg | integer | METAR, surface wind direction in degrees
    wind_gust_knts | integer | METAR, surface wind gust speed in knots
    visibility_m | float | METAR, visibility in metres
    temperature_deg | integer | METAR, temperature in degrees Celsius
    press_sea_level_p | float | METAR, sea level pressure in hPa
    press_p | float | METAR, QNH in hPa
    weather_intensity | list | METAR, list of present weather codes: qualifier - intensity
    weather_precipitation | list | METAR, list of present weather codes: weather phenomena - precipitation
    weather_desc | list | METAR, list of present weather codes: qualifier - descriptor
    weather_obscuration | list | METAR, list of present weather codes: weather phenomena - obscuration
    weather_other | list | METAR, list of present weather codes: weather phenomena - other

    This data set is augmented with data from various public data sources. Aircraft-related data is mostly from the OpenSky Network's aircraft database, the METAR information is from Iowa State University, and the rest is mostly scraped from different websites. If you need help with the METAR information, you can consult the WMO's Aerodrome Reports and Forecasts handbook.

    go_arounds_agg.csv.gz

    Compressed CSV containing the aggregated data set. It contains a row for each airport-runway, i.e. every runway at every airport for which data is available. The data is structured in the following way:

    Column name | Type | Description
    airport | string | ICAO airport code where the aircraft is landing
    runway | string | Runway designator on which the aircraft landed
    n_landings | integer | Total number of landings observed on this runway in 2019
    ga_rate | float | Go-around rate, per 1000 landings
    glide_slope_angle | float | Angle of the ILS glide slope in degrees
    has_intersection | string | Boolean that is true if the runway has another runway intersecting it, otherwise false
    rwy_length | float | Length of the runway in kilometres
    airport_country | string | ISO Alpha-3 country code of the airport
    airport_region | string | Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)

    This aggregated data set is used in the paper for the generalized linear regression model.
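As a worked example of the ga_rate unit (go-arounds per 1000 landings), with counts invented for illustration:

```python
# ga_rate is the go-around rate per 1000 landings; the counts here are
# invented, not taken from the data set.
n_landings = 52_000  # landings observed on one runway in 2019
n_gas = 187          # go-arounds observed on that runway
ga_rate = 1000 * n_gas / n_landings
print(round(ga_rate, 2))  # 3.6
```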

    Downloading the trajectories

    Users of this data set with access to the OpenSky Network's Impala shell can download the historical trajectories from the historical database with a few lines of Python code. Suppose, for example, you want to get all the go-arounds of the 4th of January 2019 at London City Airport (EGLC). You can use the traffic library for easy access to the database:

    import datetime
    from tqdm.auto import tqdm
    import pandas as pd
    from traffic.data import opensky
    from traffic.core import Traffic

    # load minimal data set
    df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)
    df["time"] = pd.to_datetime(df["time"])

    # select London City Airport, go-arounds, and 2019-01-04
    airport = "EGLC"
    start = datetime.datetime(year=2019, month=1, day=4).replace(
        tzinfo=datetime.timezone.utc
    )
    stop = datetime.datetime(year=2019, month=1, day=5).replace(
        tzinfo=datetime.timezone.utc
    )

    df_selection = df.query("airport==@airport & has_ga

  5. GP Practice Prescribing Presentation-level Data - August 2014

    • digital.nhs.uk
    csv, zip
    Updated Nov 28, 2014
    + more versions
    Cite
    (2014). GP Practice Prescribing Presentation-level Data - August 2014 [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/practice-level-prescribing-data
    Explore at:
    Available download formats: csv (1.7 MB), csv (276.0 kB), csv (1.4 GB), zip (248.4 MB)
    Dataset updated
    Nov 28, 2014
    License

    https://digital.nhs.uk/about-nhs-digital/terms-and-conditions

    Time period covered
    Aug 1, 2014 - Aug 31, 2014
    Area covered
    United Kingdom
    Description

    Warning: Large file size (over 1 GB). Each monthly data set is large (over 4 million rows), but can be viewed in standard software such as Microsoft WordPad (save by right-clicking on the file name and selecting 'Save Target As', or the equivalent on Mac OS X). It is then possible to select the required rows of data and copy and paste the information into another software application, such as a spreadsheet. Alternatively, add-ons to existing software that handle larger data sets, such as the Microsoft PowerPivot add-on for Excel, can be used. The Microsoft PowerPivot add-on for Excel is available from the Microsoft Download Center, using the link in the 'Related Links' section below. Once PowerPivot has been installed, load a large file as follows (note that it may take at least 20 to 30 minutes to load one monthly file):

    1. Start Excel as normal.
    2. Click on the PowerPivot tab.
    3. Click on the PowerPivot Window icon (top left).
    4. In the PowerPivot Window, click on the "From Other Sources" icon.
    5. In the Table Import Wizard, scroll to the bottom and select Text File.
    6. Browse to the file you want to open and choose the file extension you require, e.g. CSV.

    Once the data has been imported, you can view it in a spreadsheet.

    What does the data cover? General practice prescribing data is a list of all medicines, dressings and appliances that are prescribed and dispensed each month. A record is only produced when this has occurred; there is no record for a zero total. For each practice in England, the following information is presented at presentation level for each medicine, dressing and appliance (by presentation name):

    • the total number of items prescribed and dispensed
    • the total net ingredient cost
    • the total actual cost
    • the total quantity

    The data covers NHS prescriptions written in England and dispensed in the community in the UK. Prescriptions written in England but dispensed outside England are included. The data includes prescriptions written by GPs and other non-medical prescribers (such as nurses and pharmacists) who are attached to GP practices. GP practices are identified only by their national code, so an additional data file - linked to the first by the practice code - provides further detail about the practice. Presentations are identified only by their BNF code, so an additional data file - linked to the first by the BNF code - provides the chemical name for that presentation.
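The two look-ups described above can be sketched as pandas merges: prescribing rows joined to the practice-detail file on the practice code and to the chemical-name file on the BNF code. All frames and column names here are illustrative, not the official NHS field names:

```python
# Join prescribing rows to practice details (practice code) and chemical
# names (BNF code). All values and column names are illustrative.
import pandas as pd

prescribing = pd.DataFrame({
    "practice_code": ["A81001", "A81002"],
    "bnf_code": ["0101010G0", "0212000B0"],
    "items": [12, 30],
})
practices = pd.DataFrame({
    "practice_code": ["A81001", "A81002"],
    "practice_name": ["Practice One", "Practice Two"],
})
chemicals = pd.DataFrame({
    "bnf_code": ["0101010G0", "0212000B0"],
    "chemical_name": ["Chemical A", "Chemical B"],
})
full = (prescribing
        .merge(practices, on="practice_code")
        .merge(chemicals, on="bnf_code"))
print(full[["practice_name", "chemical_name", "items"]].to_string(index=False))
```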

  6. Data from: LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive...

    • zenodo.org
    • explore.openaire.eu
    zip
    Updated Oct 20, 2022
    + more versions
    Cite
    Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Athena Vakali; Joao Palotti; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas (2022). LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive snapshots of our lives in the wild [Dataset]. http://doi.org/10.5281/zenodo.6832242
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 20, 2022
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Athena Vakali; Joao Palotti; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LifeSnaps Dataset Documentation

    Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in the wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data, will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.

    The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.

    Data Import: Reading CSV

    For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.
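For instance, a daily-granularity export could be loaded like this; the file content is an inline stand-in with placeholder column names, not the real LifeSnaps schema:

```python
# Read a daily-granularity CSV with pandas; the inline sample uses
# placeholder columns (date, steps), not the actual LifeSnaps fields.
import io

import pandas as pd

sample = io.StringIO("date,steps\n2021-05-24,8012\n2021-05-25,10455\n")
daily = pd.read_csv(sample, parse_dates=["date"])
print(daily["steps"].mean())  # 9233.5
```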

    Data Import: Setting up a MongoDB (Recommended)

    To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.

    To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have the MongoDB Database Tools installed.

    For the Fitbit data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c fitbit 

    For the SEMA data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c sema 

    For surveys data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c surveys 

    If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.

    Data Availability

    The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain related information to these collections. Each document in any collection follows the format shown below:

    {
      _id: 
  7. OSNI Open Data 50m Digital Terrain Model CSV

    • data.europa.eu
    • data.wu.ac.at
    Updated Oct 11, 2021
    + more versions
    Cite
    OpenDataNI (2021). OSNI Open Data 50m Digital Terrain Model CSV [Dataset]. https://data.europa.eu/data/datasets/osni-open-data-50m-digital-terrain-model-csv?locale=en
    Explore at:
    Dataset updated
    Oct 11, 2021
    Dataset authored and provided by
    OpenDataNI
    Description

    A Digital Terrain Model (DTM) is a digital file consisting of a grid of regularly spaced points of known height which, when used with other digital data such as maps or orthophotographs, can provide a 3D image of the land surface. 10 m and 50 m DTMs are available. This is a large dataset and will take some time to download. Please be patient. By downloading or using this dataset you agree to abide by the LPS Open Government Data Licence.
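As a sketch of how such a grid of regularly spaced height points can be turned into a 2-D surface, assuming columns named x, y, and height (check the downloaded file's actual header):

```python
# Pivot (x, y, height) point records into a 2-D height grid. The column
# names x/y/height and the 50 m spacing below are assumptions.
import io

import pandas as pd

sample = io.StringIO(
    "x,y,height\n"
    "0,0,10.0\n"
    "0,50,12.5\n"
    "50,0,11.0\n"
    "50,50,13.0\n"
)
points = pd.read_csv(sample)
grid = points.pivot(index="y", columns="x", values="height")
print(grid.shape)  # (2, 2)
```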

  8. Data from: Large-Scale Dataset for Radio Frequency based Device-Free Crowd...

    • data.niaid.nih.gov
    • repository.uantwerpen.be
    • +1more
    Updated Apr 28, 2022
    + more versions
    Cite
    Kaya, Abdil (2022). Large-Scale Dataset for Radio Frequency based Device-Free Crowd Estimation [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_3813449
    Explore at:
    Dataset updated
    Apr 28, 2022
    Dataset provided by
    Denis, Stijn
    Kaya, Abdil
    Bellekens, Ben
    Weyn, Maarten
    Berkvens, Rafael
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset serves to estimate the status, in particular the size, of a crowd given the impact on radio frequency communication links within a wireless sensor network. To quantify this relation, signal strengths across sub-GHz communication links are collected at the premises of the Tomorrowland music festival. The communication links are formed between the network nodes of wireless sensor networks deployed in three of the festival's stage environments.

    The table below lists the eighteen dataset files. They are collected at the music festival's 2017 and 2018 editions. There are three environments, labeled: ‘Freedom Stage 2017’, ‘Freedom Stage 2018’, and ‘Main Comfort 2018’. Each environment has both 433 MHz and 868 MHz data. The measurements at each environment were collected over a period of three festival days. The dataset files are formatted as Comma-Separated Values (CSV).

    Dataset file | Reference file | Number of messages
    free17_433_fri.csv | None | 393 852
    free17_868_fri.csv | None | 472 202
    free17_433_sat.csv | free17_transactions.csv | 996 033
    free17_868_sat.csv | free17_transactions.csv | 1 023 059
    free17_433_sun.csv | free17_transactions.csv | 1 007 066
    free17_868_sun.csv | free17_transactions.csv | 1 036 456
    free18_433_fri.csv | None | 765 024
    free18_868_fri.csv | None | 757 657
    free18_433_sat.csv | free18_transactions.csv | 711 438
    free18_868_sat.csv | free18_transactions.csv | 714 390
    free18_433_sun.csv | free18_transactions.csv | 648 329
    free18_868_sun.csv | free18_transactions.csv | 656 290
    main18_433_fri.csv | None | 791 462
    main18_868_fri.csv | None | 908 407
    main18_433_sat.csv | main18_counts.csv | 863 666
    main18_868_sat.csv | main18_counts.csv | 884 682
    main18_433_sun.csv | main18_counts.csv | 903 862
    main18_868_sun.csv | main18_counts.csv | 894 496
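The file names above follow a regular pattern (environment, two-digit year, frequency band, festival day), so that metadata can be recovered programmatically; the parser below is a sketch based only on the names listed in the table:

```python
# Parse environment, year, frequency band, and festival day out of the
# dataset file names listed above.
import re

def parse_name(filename: str) -> dict:
    m = re.fullmatch(r"(free|main)(\d{2})_(433|868)_(fri|sat|sun)\.csv", filename)
    if m is None:
        raise ValueError(f"unexpected file name: {filename}")
    env, year, freq, day = m.groups()
    return {
        "environment": env,   # 'free' (Freedom Stage) or 'main' (Main Comfort)
        "year": 2000 + int(year),
        "frequency_mhz": int(freq),
        "day": day,
    }

print(parse_name("free17_868_sat.csv"))
```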

    In addition to the datasets and reference files, a software example is provided to illustrate the data use and visualise the initial findings and relation between crowd size and network signal strength impact.

    In order to use the software, please retain the following file structure:

    .
    ├── data
    ├── data_reference
    ├── graphs
    └── software

    The peer-reviewed data descriptor for this dataset has now been published in MDPI Data - an open access journal aimed at enhancing data transparency and reusability - and can be accessed here: https://doi.org/10.3390/data5020052. Please cite this when using the dataset.

  9. Open Ended Question Answer Text Dataset in English

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Cite
    FutureBee AI (2022). Open Ended Question Answer Text Dataset in English [Dataset]. https://www.futurebeeai.com/dataset/prompt-response-dataset/english-open-ended-question-answer-text-dataset
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    What’s Included

    The English Open-Ended Question Answering Dataset is a meticulously curated collection of comprehensive Question-Answer pairs. It serves as a valuable resource for training Large Language Models (LLMs) and Question-answering models in the English language, advancing the field of artificial intelligence.

    Dataset Content: This QA dataset comprises a diverse set of open-ended questions paired with corresponding answers in English. There is no context paragraph given to choose an answer from, and each question is answered without any predefined context content. The questions cover a broad range of topics, including science, history, technology, geography, literature, current affairs, and more.

    Each question is accompanied by an answer, providing valuable information and insights to enhance the language model training process. Both the questions and answers were manually curated by native English Speaking people, and references were taken from diverse sources like books, news articles, websites, and other reliable references.

    This question-answer prompt completion dataset contains different types of prompts, including instruction type, continuation type, and in-context learning (zero-shot, few-shot) type. The dataset also contains questions and answers with different types of rich text, including tables, code, JSON, etc., with proper markdown.

    Question Diversity: To ensure diversity, this Q&A dataset includes questions with varying complexity levels, ranging from easy to medium and hard. Different types of questions, such as multiple-choice, direct, and true/false, are included. Additionally, questions are further classified into fact-based and opinion-based categories, creating a comprehensive variety. The QA dataset also contains questions with constraints and persona restrictions, which makes it even more useful for LLM training.

    Answer Formats: To accommodate varied learning experiences, the dataset incorporates different types of answer formats, including single-word, short-phrase, single-sentence, and paragraph answers. Answers contain text strings, numerical values, and date and time formats as well. Such diversity strengthens the language model's ability to generate coherent and contextually appropriate answers.

    Data Format and Annotation Details: This fully labeled English Open Ended Question Answer Dataset is available in JSON and CSV formats. It includes annotation details such as id, language, domain, question_length, prompt_type, question_category, question_type, complexity, answer_type, and rich_text.

    Quality and Accuracy: The dataset upholds the highest standards of quality and accuracy. Each question undergoes careful validation, and the corresponding answers are thoroughly verified. To prioritize inclusivity, the dataset incorporates questions and answers representing diverse perspectives and writing styles, ensuring it remains unbiased and avoids perpetuating discrimination.
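As a sketch of working with the annotation fields listed above, the records below are made-up examples in the documented schema (id, complexity, question_type, answer_type), not real dataset rows:

```python
# Filter made-up QA records by the documented 'complexity' annotation.
records = [
    {"id": 1, "language": "en", "complexity": "easy",
     "question_type": "direct", "answer_type": "single-word"},
    {"id": 2, "language": "en", "complexity": "hard",
     "question_type": "multiple-choice", "answer_type": "paragraph"},
]
hard_ids = [r["id"] for r in records if r["complexity"] == "hard"]
print(hard_ids)  # [2]
```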

    Both the question and answers in English are grammatically accurate without any word or grammatical errors. No copyrighted, toxic, or harmful content is used while building this dataset.

    Continuous Updates and Customization: The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Continuous efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to collect custom question-answer data tailored to specific needs, providing flexibility and customization options.

    License: The dataset, created by FutureBeeAI, is now ready for commercial use. Researchers, data scientists, and developers can utilize this fully labeled and ready-to-deploy English Open Ended Question Answer Dataset to enhance the language understanding capabilities of their generative AI models, improve response generation, and explore new approaches to NLP question-answering tasks.

  10. Dataset of Melt Pool Variability Measurements for Powder Bed Fusion - Laser...

    • kilthub.cmu.edu
    txt
    Updated Oct 30, 2024
    Cite
    Justin Miner; Sneha Prabha Narra (2024). Dataset of Melt Pool Variability Measurements for Powder Bed Fusion - Laser Beam of Ti-6Al-4V [Dataset]. http://doi.org/10.1184/R1/25696293.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Oct 30, 2024
    Dataset provided by
    Carnegie Mellon University
    Authors
    Justin Miner; Sneha Prabha Narra
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset of melt pool geometry variability data in Powder Bed Fusion - Laser Beam of Ti-6Al-4V. This work was conducted on an EOS M290.

    Contents

    MTMeasurements.csv: A CSV file with the multi-track measurements, including cap heights, remelt depths, and widths by orientation and velocity.

    STMeasurements.csv: A CSV file with the single-track measurements, including cap heights, remelt depths, and widths by orientation and velocity.

    Note: These measurements were not used in the manuscript.

    StWidths.csv: A CSV file containing the widths as a function of length, with the beginning and end of each track removed. Entries are labeled by location along the length, measured width, velocity, and orientation.

    WARNING: StWidths.csv is too large to open in Excel. Saving it in Excel will cause data loss.
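One way around the Excel limit, sketched here with an inline stand-in for StWidths.csv (real column names may differ), is to stream the file in chunks with pandas rather than loading it whole:

```python
# Stream a large CSV in fixed-size chunks instead of loading it at once.
# The inline sample stands in for StWidths.csv; columns are assumptions.
import io

import pandas as pd

sample = io.StringIO(
    "location,width\n0.1,0.09\n0.2,0.10\n0.3,0.11\n0.4,0.12\n"
)
n_rows = 0
for chunk in pd.read_csv(sample, chunksize=2):  # two rows per chunk
    n_rows += len(chunk)
print(n_rows)  # 4
```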

    figures.ipynb: A Jupyter notebook that generates all of the figures published with the article.

    Additionally, all of the individual figure files are labeled as they occur in the manuscript and are generated by the code.

    Citation

    Please use the following reference if you find this dataset useful.

    @article{Miner2024,
      author = "Justin Miner and Sneha Prabha Narra",
      title = "{Dataset of Melt Pool Variability Measurements for Powder Bed Fusion - Laser Beam of Ti-6Al-4V}",
      year = "2024",
      month = "5",
      url = "https://kilthub.cmu.edu/articles/dataset/Dataset_of_Melt_Pool_Variability_Measurements_for_Powder_Bed_Fusion_-_Laser_Beam_of_Ti-6Al-4V/25696293",
      doi = "10.1184/R1/25696293.v1"
    }

  11. iCite Database Snapshot 2022-07

    • nih.figshare.com
    bin
    Updated Jun 1, 2023
    + more versions
    Cite
    iCite; B. Ian Hutchins; George Santangelo; Ehsanul Haque (2023). iCite Database Snapshot 2022-07 [Dataset]. http://doi.org/10.35092/yhjc.20439960.v1
    Explore at:
    bin (available download formats)
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    The NIH Figshare Archive
    Authors
    iCite; B. Ian Hutchins; George Santangelo; Ehsanul Haque
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This is a database snapshot of the iCite web service (provided here as a single zipped CSV file, or compressed, tarred JSON files). In addition, citation links in the NIH Open Citation Collection are provided as a two-column CSV table in open_citation_collection.zip. iCite provides bibliometrics and metadata on publications indexed in PubMed, organized into three modules:

    Influence: Delivers metrics of scientific influence, field-adjusted and benchmarked to NIH publications as the baseline.

    Translation: Measures how Human, Animal, or Molecular/Cellular Biology-oriented each paper is; tracks and predicts citation by clinical articles.

    Open Cites: Disseminates link-level, public-domain citation data from the NIH Open Citation Collection.

    Definitions for individual data fields:

    pmid: PubMed Identifier, an article ID as assigned in PubMed by the National Library of Medicine

    doi: Digital Object Identifier, if available

    year: Year the article was published

    title: Title of the article

    authors: List of author names

    journal: Journal name (ISO abbreviation)

    is_research_article: Flag indicating whether the Publication Type tags for this article are consistent with that of a primary research article

    relative_citation_ratio: Relative Citation Ratio (RCR)--OPA's metric of scientific influence. Field-adjusted, time-adjusted and benchmarked against NIH-funded papers. The median RCR for NIH funded papers in any field is 1.0. An RCR of 2.0 means a paper is receiving twice as many citations per year than the median NIH funded paper in its field and year, while an RCR of 0.5 means that it is receiving half as many citations per year. Calculation details are documented in Hutchins et al., PLoS Biol. 2016;14(9):e1002541.

    provisional: RCRs for papers published in the previous two years are flagged as "provisional", to reflect that citation metrics for newer articles are not necessarily as stable as they are for older articles. Provisional RCRs are provided for papers published in the previous year if they have received 5 citations or more, despite being, in many cases, less than a year old. All papers published the year before the previous year receive provisional RCRs. The current year is considered to be the NIH Fiscal Year, which starts in October. For example, in July 2019 (NIH Fiscal Year 2019), papers from 2018 receive provisional RCRs if they have 5 citations or more, and all papers from 2017 receive provisional RCRs. In October 2019, at the start of NIH Fiscal Year 2020, papers from 2019 receive provisional RCRs if they have 5 citations or more and all papers from 2018 receive provisional RCRs.

    citation_count: Number of unique articles that have cited this one

    citations_per_year: Citations per year that this article has received since its publication. If this appeared as a preprint and a published article, the year from the published version is used as the primary publication date. This is the numerator for the Relative Citation Ratio.

    field_citation_rate: Measure of the intrinsic citation rate of this paper's field, estimated using its co-citation network.

    expected_citations_per_year: Citations per year that NIH-funded articles, with the same Field Citation Rate and published in the same year as this paper, receive. This is the denominator for the Relative Citation Ratio.

    nih_percentile: Percentile rank of this paper's RCR compared to all NIH publications. For example, 95% indicates that this paper's RCR is higher than 95% of all NIH funded publications.

    human: Fraction of MeSH terms that are in the Human category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories)

    animal: Fraction of MeSH terms that are in the Animal category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories)

    molecular_cellular: Fraction of MeSH terms that are in the Molecular/Cellular Biology category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories)

    x_coord: X coordinate of the article on the Triangle of Biomedicine

    y_coord: Y Coordinate of the article on the Triangle of Biomedicine

    is_clinical: Flag indicating that this paper meets the definition of a clinical article.

    cited_by_clin: PMIDs of clinical articles that this article has been cited by.

    apt: Approximate Potential to Translate is a machine learning-based estimate of the likelihood that this publication will be cited in later clinical trials or guidelines. Calculation details are documented in Hutchins et al., PLoS Biol. 2019;17(10):e3000416.

    cited_by: PMIDs of articles that have cited this one.

    references: PMIDs of articles in this article's reference list.
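    As a concrete reading of the field definitions above, a paper's RCR is its citations_per_year divided by its expected_citations_per_year. A minimal sketch (the two-row CSV below is invented for illustration; the column names are from the field list):

```python
import csv
import io

# Hypothetical two-row slice of the iCite snapshot CSV, invented for
# illustration; real files use the field names defined above.
sample = """pmid,citations_per_year,expected_citations_per_year
100001,4.0,2.0
100002,1.5,3.0
"""

# Per the field descriptions, citations_per_year is the numerator and
# expected_citations_per_year the denominator of the RCR.
rcrs = {}
for row in csv.DictReader(io.StringIO(sample)):
    rcrs[row["pmid"]] = (
        float(row["citations_per_year"])
        / float(row["expected_citations_per_year"])
    )

print(rcrs)  # {'100001': 2.0, '100002': 0.5}
```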

    Large CSV files are zipped using zip version 4.5, which is more recent than the default unzip command line utility in some common Linux distributions. These files can be unzipped with tools that support version 4.5 or later such as 7zip.

    Comments and questions can be addressed to iCite@mail.nih.gov

  12. hotel in the center of the city

    • data.cityofchicago.org
    Updated Mar 22, 2025
    Cite
    City of Chicago (2025). hotel in the center of the city [Dataset]. https://data.cityofchicago.org/Community-Economic-Development/hotel-in-the-center-of-the-city/vcf9-ubdz
    Explore at:
    kml, application/geo+json, application/rssxml, csv, xml, tsv, application/rdfxml, kmz (available download formats)
    Dataset updated
    Mar 22, 2025
    Authors
    City of Chicago
    Description

    This dataset contains all current and active business licenses issued by the Department of Business Affairs and Consumer Protection. The dataset contains a large number of records/rows and may not be viewable in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Notepad or WordPad, to view and search.
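    As an alternative to a text editor, a file too large for Excel can be streamed row by row. A minimal sketch (the three-row sample and its values are made up; the column names follow the field descriptions below):

```python
import csv
import io

# Invented three-row stand-in for the licenses export; the column names
# follow the dataset's field descriptions, the values are made up.
sample = (
    "LICENSE STATUS,APPLICATION TYPE\n"
    "AAI,ISSUE\n"
    "AAI,RENEW\n"
    "REV,ISSUE\n"
)

def count_rows(lines, status="AAI"):
    """Stream rows one at a time and count those matching `status`."""
    return sum(1 for row in csv.DictReader(lines)
               if row["LICENSE STATUS"] == status)

print(count_rows(io.StringIO(sample)))  # 2 issued licenses
```

    The same function works unchanged on a real export opened with `open(path, newline="")`, since `csv.DictReader` never loads the whole file into memory.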

    Data fields requiring description are detailed below.

    APPLICATION TYPE: 'ISSUE' is the record associated with the initial license application. 'RENEW' is a subsequent renewal record. All renewal records are created with a term start date and term expiration date. 'C_LOC' is a change-of-location record: it means the business moved. 'C_CAPA' is a change-of-capacity record; only a few license types may file this type of application. 'C_EXPA' only applies to businesses that have liquor licenses; it means the business location expanded.

    LICENSE STATUS: 'AAI' means the license was issued.

    Business license owners may be accessed at: http://data.cityofchicago.org/Community-Economic-Development/Business-Owners/ezma-pppn To identify the owner of a business, you will need the account number or legal name.

    Data Owner: Business Affairs and Consumer Protection

    Time Period: Current

    Frequency: Data is updated daily

  13. Motor Vehicle Register CSV downloads - Dataset - data.govt.nz - discover and...

    • portal.zero.govt.nz
    Updated Apr 16, 2020
    Cite
    (2020). Motor Vehicle Register CSV downloads - Dataset - data.govt.nz - discover and use data [Dataset]. https://portal.zero.govt.nz/77d6ef04507c10508fcfc67a7c24be32/dataset/motor-vehicle-register-csv-downloads2
    Explore at:
    Dataset updated
    Apr 16, 2020
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A point-in-time ‘snapshot’ of all vehicles currently registered in New Zealand. The data relates to currently registered vehicles as recorded on the Motor Vehicle Register (MVR). We update it monthly, so it is accurate up to the end of the previous month. Access Motor Vehicle Register data via API.

    Registration is the process where we add a vehicle’s details to the MVR and issue its number plates. It is not the same thing as vehicle licensing, also called ‘rego’. To give you a quick overview of the data, see the charts in the ‘Attributes’ section below. These will give you information about each of the attributes (variables) in the dataset. Each chart is specific to a variable and shows all data (without any filters applied). Motor Vehicle Register data - field descriptions.

    Data reuse caveats: as per licence. We’ve taken reasonable care in compiling this information, and provide it on an ‘as is, where is’ basis. We are not liable for any action taken on the basis of the information. For further information see the Waka Kotahi website, as well as the terms of the CC-BY 4.0 International licence under which we publish this data. CC-BY 4.0 International licence details.

    Variables in the dataset are formatted for analytical use. This can result in attribute charts that may not appear meaningful and are not suitable for broader analysis or use. In addition, some variables are not mutually exclusive and should not be considered in isolation. As such, these charts should not be taken and used directly as analysis of the overall data.

    Data quality statement: this data relates to vehicles, not people. We have included some information about where vehicle registered owners live. This is based on the most recent information we have about their physical address. To make sure it isn’t possible to identify a person in the data, we have provided this at Territorial Authority (TA) level. A TA is a broad geographical area defined under the Local Government Act 2002 as a city council or district council. There are 67 TAs, consisting of 12 city councils, 53 districts, Auckland Council and Chatham Islands Council. We haven’t included vehicles that belong to people with a confidential listing. We have restricted the Vehicle Identification Number (VIN) to the first 11 characters – these are generic and don’t identify specific vehicles.

    Data quality caveats: many of the fields in the MVR are free-text fields, which means there may be spelling mistakes and other human errors. We have algorithmically cleaned the data to correct identified errors (particularly with respect to a vehicle’s make and model). However, due to the large number of vehicles on the Register, we may not have corrected some information. Additionally, some variables may be subject to differences in how people have recorded details – for example, manufacturers release a variety of sub-models and these may not be referred to, or put into the system, in the same way. We have made our cleaning code open source: Vehicle make and model cleansing code (GitHub).

  14. Data from: Optimized SMRT-UMI protocol produces highly accurate sequence...

    • data.niaid.nih.gov
    • zenodo.org
    • +1more
    Updated Dec 7, 2023
    Cite
    Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies [Dataset]. https://data.niaid.nih.gov/resources?id=dryad_w3r2280w0
    Explore at:
    zip (available download formats)
    Dataset updated
    Dec 7, 2023
    Dataset provided by
    HIV Vaccine Trials Network (http://www.hvtn.org/)
    HIV Prevention Trials Network (http://www.hptn.org/)
    National Institute of Allergy and Infectious Diseases (http://www.niaid.nih.gov/)
    PEPFAR
    Authors
    Dylan Westfall; Mullins James
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Pathogen diversity resulting in quasispecies can enable persistence and adaptation to host defenses and therapies. However, accurate quasispecies characterization can be impeded by errors introduced during sample handling and sequencing, which can require extensive optimizations to overcome. We present complete laboratory and bioinformatics workflows to overcome many of these hurdles. The Pacific Biosciences single molecule real-time platform was used to sequence PCR amplicons derived from cDNA templates tagged with universal molecular identifiers (SMRT-UMI). Optimized laboratory protocols were developed through extensive testing of different sample preparation conditions to minimize between-template recombination during PCR, and the use of UMI allowed accurate template quantitation as well as removal of point mutations introduced during PCR and sequencing to produce a highly accurate consensus sequence from each template. Handling of the large datasets produced from SMRT-UMI sequencing was facilitated by a novel bioinformatic pipeline, Probabilistic Offspring Resolver for Primer IDs (PORPIDpipeline), that automatically filters and parses reads by sample, identifies and discards reads with UMIs likely created from PCR and sequencing errors, generates consensus sequences, checks for contamination within the dataset, and removes any sequence with evidence of PCR recombination or early cycle PCR errors, resulting in highly accurate sequence datasets. The optimized SMRT-UMI sequencing method presented here represents a highly adaptable and established starting point for accurate sequencing of diverse pathogens. These methods are illustrated through characterization of human immunodeficiency virus (HIV) quasispecies.

    Methods

    This serves as an overview of the analysis performed on the PacBio sequence data that is summarized in Analysis Flowchart.pdf and was used as primary data for the paper by Westfall et al., "Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies".

    Five different PacBio sequencing datasets were used for this analysis: M027, M2199, M1567, M004, and M005. For the datasets which were indexed (M027, M2199), CCS reads from PacBio sequencing files and the chunked_demux_config files were used as input for the chunked_demux pipeline. Each config file lists the different Index primers added during PCR to each sample. The pipeline produces one fastq file for each Index primer combination in the config. For example, in dataset M027 there were 3–4 samples using each Index combination. The fastq files from each demultiplexed read set were moved to the sUMI_dUMI_comparison pipeline fastq folder for further demultiplexing by sample and consensus generation with that pipeline. More information about the chunked_demux pipeline can be found in the README.md file on GitHub.

    The demultiplexed read collections from the chunked_demux pipeline, or CCS read files from the datasets which were not indexed (M1567, M004, M005), were each used as input for the sUMI_dUMI_comparison pipeline along with each dataset's config file. Each config file contains the primer sequences for each sample (including the sample ID block in the cDNA primer) and further demultiplexes the reads to prepare data tables summarizing all of the UMI sequences and counts for each family (tagged.tar.gz) as well as consensus sequences from each sUMI and rank 1 dUMI family (consensus.tar.gz). More information about the sUMI_dUMI_comparison pipeline can be found in the paper and the README.md file on GitHub. The consensus.tar.gz and tagged.tar.gz files were moved from the sUMI_dUMI_comparison pipeline directory on the server to the Pipeline_Outputs folder in this analysis directory for each dataset and appended with the dataset name (e.g. consensus_M027.tar.gz).

    Also in this analysis directory is a Sample_Info_Table.csv containing information about how each of the samples was prepared, such as purification methods and number of PCRs. There are also three other folders: Sequence_Analysis, Indentifying_Recombinant_Reads, and Figures. Each has an .Rmd file with the same name inside, which is used to collect, summarize, and analyze the data. All of these collections of code were written and executed in RStudio to track notes and summarize results.

    Sequence_Analysis.Rmd has instructions to decompress all of the consensus.tar.gz files, combine them, and create two fasta files, one with all sUMI and one with all dUMI sequences. Using these as input, two data tables were created that summarize all sequences and read counts for each sample that pass various criteria. These are used to help create Table 2 and as input for Indentifying_Recombinant_Reads.Rmd and Figures.Rmd. Next, two fasta files containing all of the rank 1 dUMI sequences and the matching sUMI sequences were created. These were used as input for the python script compare_seqs.py, which identifies any matched sequences that are different between the sUMI and dUMI read collections. This information was also used to help create Table 2. Finally, to populate the table with the number of sequences and bases in each sequence subset of interest, different sequence collections were saved and viewed in the Geneious program.

    To investigate the cause of sequences where the sUMI and dUMI sequences do not match, tagged.tar.gz was decompressed, and for each family with discordant sUMI and dUMI sequences the reads from the UMI1_keeping directory were aligned using Geneious. Reads from dUMI families failing the 0.7 filter were also aligned in Geneious. The uncompressed tagged folder was then removed to save space. These read collections contain all of the reads in a UMI1 family and still include the UMI2 sequence. By examining the alignment, and specifically the UMI2 sequences, the site of the discordance and its cause were identified for each family as described in the paper. These alignments were saved as "Sequence Alignments.geneious". The counts of how many families were the result of PCR recombination were used in the body of the paper.

    Using Identifying_Recombinant_Reads.Rmd, the dUMI_ranked.csv file from each sample was extracted from all of the tagged.tar.gz files, combined, and used as input to create a single dataset containing all UMI information from all samples. This file, dUMI_df.csv, was used as input for Figures.Rmd. Figures.Rmd used dUMI_df.csv, sequence_counts.csv, and read_counts.csv as input to create draft figures and then individual datasets for each figure. These were copied into Prism software to create the final figures for the paper.

  15. Data for "Direct and indirect Rod and Frame effect: A virtual reality study"...

    • data.mendeley.com
    Updated Feb 12, 2025
    Cite
    Michał Adamski (2025). Data for "Direct and indirect Rod and Frame effect: A virtual reality study" [Dataset]. http://doi.org/10.17632/pcf2n8b4rd.1
    Explore at:
    Dataset updated
    Feb 12, 2025
    Authors
    Michał Adamski
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the raw experimental data and supplementary materials for the "Asymmetry Effects in Virtual Reality Rod and Frame Test" study. The materials included are:

    •  Raw Experimental Data: older.csv and young.csv
    •  Mathematica Notebooks: a collection of Mathematica notebooks used for data analysis and visualization. These notebooks provide scripts for processing the experimental data, performing statistical analyses, and generating the figures used in the project.
    •  Unity Package: a Unity package featuring a sample scene related to the project. The scene was built using Unity’s Universal Render Pipeline (URP). To utilize this package, ensure that URP is enabled in your Unity project. Instructions for enabling URP can be found in the Unity URP documentation.
    

    Requirements:

    •  For Data Files: software capable of opening CSV files (e.g., Microsoft Excel, Google Sheets, or any programming language that can read CSV formats).
    •  For Mathematica Notebooks: Wolfram Mathematica software to run and modify the notebooks.
    •  For Unity Package: Unity Editor version compatible with URP (2019.3 or later recommended). URP must be installed and enabled in your Unity project.
    

    Usage Notes:

    •  The dataset facilitates comparative studies between different age groups based on the collected variables.
    •  Users can modify the Mathematica notebooks to perform additional analyses.
    •  The Unity scene serves as a reference to the project setup and can be expanded or integrated into larger projects.
    

    Citation: Please cite this dataset when using it in your research or publications.

  16. CMS Open Data 2012 datasets for dimuon exercises

    • explore.openaire.eu
    • zenodo.org
    Updated Aug 31, 2021
    Cite
    Alejandro Gomez Espinosa (2021). CMS Open Data 2012 datasets for dimuon exercises [Dataset]. http://doi.org/10.5281/zenodo.5343105
    Explore at:
    Dataset updated
    Aug 31, 2021
    Authors
    Alejandro Gomez Espinosa
    Description

    These datasets are a subset of the CMS Open Data with 2012 data-taking conditions, prepared for education purposes. In this version, the data and simulation files are compressed into one big file for easy access. They are stored in two different formats (CSV and PKL) with the same content, so just use one of them. Once unzipped:

    - Data files, starting with output_data_CMS_Run2012B, correspond to 4429.37 /pb of data collected by the CMS Experiment. They are a subset of the dataset in reference [1].
    - Simulation files, starting with output_sim_CMS_MonteCarlo2012, are a subset of the dataset referenced in [2]. The number of generated events in this case is 30458871, and the cross section is 3503.71.

    All the files were processed with a modified version of the AOD2NanoAODOutreachTool [3]. The small modifications are related to the number of triggers stored, and some objects like taus were removed.

    [1] CMS collaboration (2017). DoubleMuParked primary dataset in AOD format from Run of 2012 (/DoubleMuParked/Run2012B-22Jan2013-v1/AOD). CERN Open Data Portal. DOI:10.7483/OPENDATA.CMS.YLIC.86ZZ
    [2] Wunsch, Stefan (2019). DYJetsToLL dataset in reduced NanoAOD format for education and outreach. CERN Open Data Portal. DOI:10.7483/OPENDATA.CMS.SRRA.2GON
    [3] https://github.com/cms-opendata-analyses/AOD2NanoAODOutreachTool

    For the CSV files you might need to open them using pandas as: pandas.read_csv('output_data.csv', index_col=['entry','subentry']). For the pickle files, you might need to use Python 3.
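    The pandas hint above can be exercised on a toy file. The Muon_pt column is made up for this sketch, but the (entry, subentry) two-level index comes from the read_csv call quoted in the description:

```python
import io

import pandas as pd

# Toy stand-in for one of the CSV files; the Muon_pt column name is
# invented, the (entry, subentry) index is as quoted in the description.
sample = io.StringIO(
    "entry,subentry,Muon_pt\n"
    "0,0,45.2\n"
    "0,1,27.9\n"
    "1,0,33.1\n"
)
df = pd.read_csv(sample, index_col=["entry", "subentry"])
print(df.loc[0])  # all rows (e.g. muons) belonging to the first event
```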

  17. pNEUMA dataset

    • zenodo.org
    Updated Jan 16, 2024
    Cite
    Emmanouil Barmpounakis; Emmanouil Barmpounakis; Nikolas Geroliminis; Nikolas Geroliminis (2024). pNEUMA dataset [Dataset]. http://doi.org/10.5281/zenodo.10491409
    Explore at:
    zip, html (available download formats)
    Dataset updated
    Jan 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Emmanouil Barmpounakis; Emmanouil Barmpounakis; Nikolas Geroliminis; Nikolas Geroliminis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    pNEUMA is an open large-scale dataset of naturalistic trajectories of half a million vehicles, collected in a one-of-a-kind experiment by a swarm of drones over the congested downtown area of Athens, Greece. It is a unique observatory of traffic congestion, at a scale an order of magnitude higher than what was available until now, that researchers from different disciplines around the globe can use to develop and test their own models.

    How are the .csv files organized?

    For each .csv file the following apply:
    • each row represents the data of a single vehicle
    • the first 10 columns in the 1st row include the columns’ names
    • the first 4 columns include information about the trajectory like the unique trackID, the type of vehicle, the distance traveled in meters and the average speed of the vehicle in km/h
    • the last 6 columns are then repeated every 6 columns based on the time frequency. For example, column_5 contains the latitude of the vehicle at time column_10, and column_11 contains the latitude of the vehicle at time column_16.
    • Speed is in km/h, Longitudinal and Lateral Acceleration in m/sec2 and time in seconds.
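    The column layout above can be sketched as a small parser. The within-group order (lat, lon, speed, lon_acc, lat_acc, time) is an assumption consistent with "column_5 contains the latitude ... at time column_10"; verify it against the dataset documentation before use, and note the sample row below is made up:

```python
# Sketch of parsing one pNEUMA row under the layout described above:
# 4 leading trajectory columns, then repeating 6-column groups ending in
# the timestamp. The within-group order is an assumption (see lead-in).
def parse_row(fields):
    track_id, vtype, traveled_d, avg_speed = fields[:4]
    rest = fields[4:]
    points = []
    for i in range(0, len(rest) - len(rest) % 6, 6):
        lat, lon, speed, lon_acc, lat_acc, t = map(float, rest[i:i + 6])
        points.append({"lat": lat, "lon": lon, "speed_kmh": speed,
                       "lon_acc": lon_acc, "lat_acc": lat_acc, "time_s": t})
    return {"track_id": track_id.strip(), "type": vtype.strip(),
            "traveled_m": float(traveled_d), "avg_speed_kmh": float(avg_speed),
            "trajectory": points}

# One made-up row with a single time step:
row = "1; Car; 52.3; 14.7; 37.97764; 23.73551; 12.2; 0.1; -0.02; 0.0".split(";")
traj = parse_row(row)
print(traj["type"], len(traj["trajectory"]))  # Car 1
```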

    For more details about the pNEUMA dataset, please check our website at https://open-traffic.epfl.ch

  18. Electrification of Heat Demonstration Project: Heat Pump Performance Raw...

    • datacatalogue.cessda.eu
    • beta.ukdataservice.ac.uk
    Updated Dec 20, 2024
    Cite
    Energy Systems Catapult (2024). Electrification of Heat Demonstration Project: Heat Pump Performance Raw Data, 2020-2023 [Dataset]. http://doi.org/10.5255/UKDA-SN-9049-2
    Explore at:
    Dataset updated
    Dec 20, 2024
    Authors
    Energy Systems Catapult
    Time period covered
    Nov 1, 2020 - Sep 29, 2023
    Area covered
    United Kingdom
    Variables measured
    Families/households, Subnational
    Measurement technique
    Measurements and tests
    Description

    Abstract copyright UK Data Service and data collection copyright owner.


    The heat pump monitoring datasets are a key output of the Electrification of Heat Demonstration (EoH) project, a government-funded heat pump trial assessing the feasibility of heat pumps across the UK’s diverse housing stock. These datasets are provided in both cleansed and raw form and allow analysis of the initial performance of the heat pumps installed in the trial. From the datasets, insights such as heat pump seasonal performance factor (a measure of the heat pump's efficiency), heat pump performance during the coldest day of the year, and half-hourly performance to inform peak demand can be gleaned.

    For the second edition (December 2024), the data were updated to include performance data collected between November 2020 and September 2023. The only documentation currently available with the study is the Excel data dictionary. Reports and other contextual information can be found on the Energy Systems Catapult website.

    The EoH project was funded by the Department of Business, Energy and Industrial Strategy. From 2023, it is covered by the new Department for Energy Security and Net Zero.

    Data availability

    This study comprises the raw data from the EoH project, which is only available to registered UKDS users. Only the summary data file is available via standard UKDS EUL download, due to the large size of the full raw data files. To obtain the full set of raw data, registered UKDS users should:

    1. Download the summary dataset, and then
    2. Contact the UKDS HelpDesk, quoting study number 9049, to arrange FTP access for the raw data.

    When unzipped, the raw data available via FTP consists of 742 CSV files. Most of the individual CSV files are too large to open in Excel. Before requesting FTP, users should ensure they have sufficient computing facilities to analyse the data.

    The UKDS also holds an accompanying open-access study, SN 9050 Electrification of Heat Demonstration Project: Heat Pump Performance Cleansed Data, 2020-2023. This contains the cleansed data from the EoH project, which does not require UKDS registration to access. However, since the data are similar in size to this study, only the summary dataset is available to download; an order must be placed for FTP delivery of the remaining cleansed data. Other studies in the set include SN 9209, which comprises 30-minute interval heat pump performance data, and SN 9210, which includes daily heat pump performance data.

    The Python code used to cleanse the raw data and then perform the analysis is accessible via the Energy Systems Catapult GitHub.


    Main Topics:

    Heat Pump Performance across the BEIS funded heat pump trial, The Electrification of Heat (EoH) Demonstration Project. See the documentation for data contents.

  19. Long-term monotonic trends in annual groundwater level metrics in the United...

    • catalog.data.gov
    • data.usgs.gov
    Updated Feb 21, 2025
    Cite
    U.S. Geological Survey (2025). Long-term monotonic trends in annual groundwater level metrics in the United States through 2020 (ver.2.0, January 2025) [Dataset]. https://catalog.data.gov/dataset/long-term-monotonic-trends-in-annual-groundwater-metrics-in-the-united-states-through-2020
    Explore at:
    Dataset updated
    Feb 21, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    United States
    Description

    The U.S. Geological Survey (USGS) Water Resources Mission Area (WMA) is working to address a need to understand where the Nation is experiencing water shortages or surpluses relative to the demand by delivering routine assessments of water supply and demand. A key part of these national assessments is identifying long-term trends in water availability, including groundwater and surface water quantity, quality, and use. This data release contains Mann-Kendall monotonic trend analyses for annual groundwater metrics at 54,932 wells located in the conterminous United States, Alaska, Hawaii, and Puerto Rico. The groundwater metrics include annual mean, maximum, and minimum water level and the timing of the annual maximum and minimum groundwater level. These metrics are computed from groundwater water levels from publicly available data from the National Water Information System (NWIS), the National Groundwater Monitoring Network (NGWMN) and the California Open Data Portal. Trend analyses are computed using annual groundwater metrics through the water year, which is defined as the 12-month period October 1, for any given year through September 30 of the following year (for example, October 2019 through September 2020). Trends at each well are available for up to four different periods: i) the longest possible period that meets completeness criteria at each well, (ii) 1980-2020, (iii) 1990-2020, (iv) 2000-2020. Annual mean, maximum, and minimum water-level metrics for wells screened in unconfined aquifers were determined only when a well's water-level time series was at least 70 percent complete. Additionally, each of these time series must have at least 70 percent complete records in the first and last decade. All longest possible period time series for wells in unconfined aquifer must be at least 10 years long and have annual metric values calculated for at least 70% of the years of the record. 
Annual mean, maximum, and minimum water-level metrics for wells screened in confined aquifers were determined only when a well's water-level time series was at least 50 percent complete. Additionally, each of these time series must have at least 50 percent complete records in the first and last decade. All longest-possible-period time series for wells in confined aquifers must be at least 10 years long and have annual metric values calculated for at least 50 percent of the years in the last 10 years of record. Caution must be exercised when utilizing monotonic trend analyses conducted over periods of up to several decades (and in some places longer) due to the potential for confounding deterministic gradual trends with multi-decadal climatic fluctuations. This data release contains six input files:
NGWMN_gwl_meta_v2.0.csv, the metadata from the National Groundwater Monitoring Network
NGWMN_gwl_data_v2.0.csv, the groundwater water level data from the National Groundwater Monitoring Network
NWIS_gwl_meta_v2.0.csv, the metadata from the National Water Information System
NWIS_gwl_data_v2.0.csv, the groundwater water level data from the National Water Information System
CA_measurements_v2.0.csv, the groundwater level data from the California Open Data Portal
CA_stations_v2.0.csv, the groundwater metadata from the California Open Data Portal
Two output files:
GW_trendsout_v2.0.csv, the groundwater water level trend data from both the National Groundwater Monitoring Network and the National Water Information System
GW_confband_out_v2.0.csv, the confidence bands associated with the groundwater water level trend data from both the National Groundwater Monitoring Network and the National Water Information System
A .zip file containing all of the code used to compute these trends, along with a README file with information on using the code.
First posted: Feb 27, 2024 (available from author)
Revised: Jan 30, 2025 (version 2.0)
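At its core, a Mann-Kendall monotonic trend test counts concordant versus discordant pairs in a time series. A minimal sketch of the S statistic only (this is not the USGS release's code, which ships in the .zip file described above, and a full test would also need the variance of S and tie corrections):

```python
from itertools import combinations

def mann_kendall_s(values):
    """Kendall's S statistic: sum of sign(x_j - x_i) over all pairs i < j.

    A strongly positive S suggests an upward monotonic trend,
    a strongly negative S a downward one.
    """
    return sum((xj > xi) - (xj < xi) for xi, xj in combinations(values, 2))

# A strictly increasing series of n values gives the maximum S = n*(n-1)/2.
assert mann_kendall_s([1, 2, 3, 4]) == 6
assert mann_kendall_s([4, 3, 2, 1]) == -6
```

In practice S is normalized by its variance to obtain a significance level, which is where the confidence bands in GW_confband_out_v2.0.csv come into play.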

  20. Periodical Cicada Broods (Feature Layer)

    • agdatacommons.nal.usda.gov
    • s.cnmilf.com
    • +5more
    bin
    Updated Nov 23, 2024
    Cite
    U.S. Forest Service (2024). Periodical Cicada Broods (Feature Layer) [Dataset]. https://agdatacommons.nal.usda.gov/articles/dataset/Periodical_Cicada_Broods_Feature_Layer_/25972900
    Explore at:
    bin
    Available download formats
    Dataset updated
    Nov 23, 2024
    Dataset provided by
    U.S. Department of Agriculture Forest Service (http://fs.fed.us/)
    Authors
    U.S. Forest Service
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Note: This is a large dataset. To download, go to the ArcGIS Open Data Set, click the download button, and under additional resources select the geodatabase option. Data layer depicting periodical cicada distribution and expected year of emergence by cicada brood and county. The periodical cicada emerges in massive groups once every 13 or 17 years and is completely unique to North America. There are 15 of these mass groups, called broods, of periodical cicadas in the United States. This county-based data, compiled by the USFS Northern Research Station, depicts where and when the different broods of periodical cicadas are likely to emerge in the US through 2037. The data was compiled for the 2011 publication "Avian predators are less abundant during periodical cicada emergences, but why?" (Koenig et al., https://dx.doi.org/10.1890/10-1583.1) using data from the periodical cicada publications listed below. 1) Marlatt, C. L. 1907. "The periodical cicada". Bulletin of the USDA Bureau of Entomology 71:1-181. 2) Simon, C. 1988. "Evolution of 13- and 17-year periodical cicadas (Homoptera: Cicadidae)". Bulletin of the Entomological Society of America 34:163-176. 3) Liebhold, A. M., Bohne, M. J., and R. L. Lilja. 2013. "Active Periodical Cicada Broods of the United States". USDA Forest Service Northern Research Station, Northeastern Area State and Private Forestry. Metadata and Downloads: This record was taken from the USDA Enterprise Data Inventory that feeds into the https://data.gov catalog. Data for this record includes the following resources: ISO-19139 metadata, ArcGIS Hub Dataset, ArcGIS GeoService, OGC WMS, CSV, Shapefile, GeoJSON, KML. For complete information, please visit https://data.gov.
