17 datasets found

Supplementary data to the paper: Toward a Novel Set of Pinna Anthropometric...
data.niaid.nih.gov
Updated Jul 9, 2024
Cite
Avanzini, Federico (2024). Supplementary data to the paper: Toward a Novel Set of Pinna Anthropometric Features for Individualizing Head-Related Transfer Functions [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10805884
Dataset updated
Jul 9, 2024
Dataset provided by
Fantini, Davide
Ntalampiras, Stavros
Presti, Giorgio
Avanzini, Federico
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Supplementary research data to the paper:

Davide Fantini, Federico Avanzini, Stavros Ntalampiras and Giorgio Presti (2024) "Toward a Novel Set of Pinna Anthropometric Features for Individualizing Head-Related Transfer Functions", Sound and Music Computing Conference

The repository includes the research data generated in the abovementioned paper. In particular, the repository includes:

README.md: instructions for the data

pinna_images.mat: pinna depth images extracted from the 3D head meshes of the HUTUBS dataset

landmarks.mat: coordinates of the landmarks manually annotated on pinna depth images

anthropometry.mat: anthropometric parameters automatically extracted from manually annotated landmarks

anthropometry_documentation.pdf: documentation of the pinna anthropometric parameters

poster.pdf: poster presented at the SMC conference 2024

The data are provided in the Matlab file format MAT. Nevertheless, the MAT files can be read with other programming languages, such as Python (scipy.io.loadmat).

A GitHub repository to automatically extract the pinna landmarks and features as described in the paper is available here.
Film Circulation dataset
zenodo.org
data.niaid.nih.gov
bin, csv, png
Updated Jul 12, 2024
Cite
Skadi Loist; Evgenia (Zhenya) Samoilova (2024). Film Circulation dataset [Dataset]. http://doi.org/10.5281/zenodo.7887672
Unique identifier
https://doi.org/10.5281/zenodo.7887672
Dataset updated
Jul 12, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Skadi Loist; Evgenia (Zhenya) Samoilova
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Complete dataset of “Film Circulation on the International Film Festival Network and the Impact on Global Film Culture”

A peer-reviewed data paper for this dataset is in review to be published in NECSUS_European Journal of Media Studies - an open access journal aiming at enhancing data transparency and reusability, and will be available from https://necsus-ejms.org/ and https://mediarep.org

Please cite this when using the dataset.

Detailed description of the dataset:

1 Film Dataset: Festival Programs

The Film Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

The codebook (csv file “1_codebook_film-dataset_festival-program”) offers a detailed description of all variables within the Film Dataset. Along with the definition of variables it lists explanations for the units of measurement, data sources, coding and information on missing data.

The csv file “1_film-dataset_festival-program_long” comprises a dataset of all films and the festivals, festival sections, and the year of the festival edition that they were sampled from. The dataset is structured in the long format, i.e. the same film can appear in several rows when it appeared in more than one sample festival. However, films are identifiable via their unique ID.

The csv file “1_film-dataset_festival-program_wide” consists of the dataset listing only unique films (n=9,348). The dataset is in the wide format, i.e. each row corresponds to a unique film, identifiable via its unique ID. For easy analysis, and since the overlap is only six percent, in this dataset the variable sample festival (fest) corresponds to the first sample festival where the film appeared. For instance, if a film was first shown at Berlinale (in February) and then at Frameline (in June of the same year), the sample festival will list “Berlinale”. This file includes information on unique and IMDb IDs, the film title, production year, length, categorization in length, production countries, regional attribution, director names, genre attribution, the festival, festival section and festival edition the film was sampled from, and information whether there is festival run information available through the IMDb data.
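The relationship between the long and the wide table described above can be sketched in Python. This is only an illustration of the "first sample festival" rule, not the authors' procedure: the column names (`film_id`, `fest`, `fest_date`) and the sample rows are hypothetical, so the codebook should be consulted for the real variable names.

```python
import csv
import io


def first_festival_per_film(rows):
    """Keep, for each film ID, the row with the earliest festival date.

    Assumes ISO-formatted dates (YYYY-MM-DD), which sort correctly as strings.
    """
    first = {}
    for row in rows:
        film_id = row["film_id"]
        if film_id not in first or row["fest_date"] < first[film_id]["fest_date"]:
            first[film_id] = row
    return first


# Made-up excerpt in long format: film f001 appears at two sample festivals.
long_csv = """film_id,title,fest,fest_date
f001,Example Film,Frameline,2013-06-20
f001,Example Film,Berlinale,2013-02-07
f002,Another Film,Frameline,2013-06-21
"""

wide = first_festival_per_film(csv.DictReader(io.StringIO(long_csv)))
for film_id, row in sorted(wide.items()):
    print(film_id, row["fest"])
```

Each film ends up with exactly one row, and a film shown at both festivals is attributed to the earlier one, which mirrors the Berlinale/Frameline example in the description.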

2 Survey Dataset

The Survey Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

The codebook “2_codebook_survey-dataset” includes coding information for both survey datasets. It lists the definition of the variables or survey questions (corresponding to Samoilova/Loist 2019), units of measurement, data source, variable type, range and coding, and information on missing data.

The csv file “2_survey-dataset_long-festivals_shared-consent” consists of a subset (n=161) of the original survey dataset (n=454), where respondents provided festival run data for films (n=206) and gave consent to share their data for research purposes. This dataset consists of the festival data in a long format, so that each row corresponds to the festival appearance of a film.

The csv file “2_survey-dataset_wide-no-festivals_shared-consent” consists of a subset (n=372) of the original dataset (n=454) of survey responses corresponding to sample films. It includes data only for those films for which respondents provided consent to share their data for research purposes. This dataset is shown in wide format of the survey data, i.e. information for each response corresponding to a film is listed in one row. This includes data on film IDs, film title, survey questions regarding completeness and availability of provided information, information on number of festival screenings, screening fees, budgets, marketing costs, market screenings, and distribution. As the file name suggests, no data on festival screenings is included in the wide format dataset.

3 IMDb & Scripts

The IMDb dataset consists of a data scheme image file, one codebook and eight datasets, all in csv format. It also includes the R scripts that we used for scraping and matching.

The codebook “3_codebook_imdb-dataset” includes information for all IMDb datasets. This includes ID information and their data source, coding and value ranges, and information on missing data.

The csv file “3_imdb-dataset_aka-titles_long” contains film title data in different languages scraped from IMDb in a long format, i.e. each row corresponds to a title in a given language.

The csv file “3_imdb-dataset_awards_long” contains film award data in a long format, i.e. each row corresponds to an award of a given film.

The csv file “3_imdb-dataset_companies_long” contains data on production and distribution companies of films. The dataset is in a long format, so that each row corresponds to a particular company of a particular film.

The csv file “3_imdb-dataset_crew_long” contains data on names and roles of crew members in a long format, i.e. each row corresponds to each crew member. The file also contains binary gender assigned to directors based on their first names using the GenderizeR application.

The csv file “3_imdb-dataset_festival-runs_long” contains festival run data scraped from IMDb in a long format, i.e. each row corresponds to the festival appearance of a given film. The dataset does not include each film screening, but the first screening of a film at a festival within a given year. The data includes festival runs up to 2019.

The csv file “3_imdb-dataset_general-info_wide” contains general information about films such as genre as defined by IMDb, languages in which a film was shown, ratings, and budget. The dataset is in wide format, so that each row corresponds to a unique film.

The csv file “3_imdb-dataset_release-info_long” contains data about non-festival releases (e.g., theatrical, digital, TV, DVD/Blu-ray). The dataset is in a long format, so that each row corresponds to a particular release of a particular film.

The csv file “3_imdb-dataset_websites_long” contains data on available websites (official websites, miscellaneous, photos, video clips). The dataset is in a long format, so that each row corresponds to a website of a particular film.

The dataset includes 8 text files containing the scripts for web scraping. They were written using R version 3.6.3 for Windows.

The R script “r_1_unite_data” demonstrates the structure of the dataset that we use in the following steps to identify, scrape, and match the film data.

The R script “r_2_scrape_matches” reads in the dataset with the film characteristics described in “r_1_unite_data” and uses various R packages to create a search URL for each film from the core dataset on the IMDb website. The script attempts to match each film from the core dataset to IMDb records by first conducting an advanced search based on the movie title and year, and then, if the advanced search returns no matches, potentially using an alternative title and a basic search. The script scrapes the title, release year, directors, running time, genre, and IMDb film URL from the first page of the suggested records from the IMDb website. The script then defines a loop that matches each film in the core dataset with the suggested films on the IMDb search page and computes matching scores. Matching used data on directors, production year (+/- one year), and title. Titles were compared with a fuzzy matching approach using two methods, “cosine” and “osa”: cosine similarity matches titles with a high degree of similarity, and the OSA (optimal string alignment) algorithm matches titles that may have typos or minor variations.
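The two title-similarity measures named above, together with the one-year tolerance on production year, can be sketched in Python as follows. This is an illustrative re-implementation and not the authors' R code: the function names, the choice of character bigrams for the cosine profile, and the sample titles are assumptions.

```python
from collections import Counter
from math import sqrt


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: edits plus adjacent transpositions."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ab" -> "ba".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def cosine_similarity(a: str, b: str, q: int = 2) -> float:
    """Cosine similarity between character q-gram count profiles (q=2 assumed)."""
    pa = Counter(a[i:i + q] for i in range(len(a) - q + 1))
    pb = Counter(b[i:i + q] for i in range(len(b) - q + 1))
    dot = sum(count * pb[gram] for gram, count in pa.items())
    norm = sqrt(sum(v * v for v in pa.values())) * sqrt(sum(v * v for v in pb.values()))
    return dot / norm if norm else 0.0


def year_matches(core_year: int, imdb_year: int, tolerance: int = 1) -> bool:
    """Production years agree within +/- one year."""
    return abs(core_year - imdb_year) <= tolerance


# Hypothetical title pair with a transposed letter pair.
print(osa_distance("the lobster", "the lobstre"))
print(round(cosine_similarity("the lobster", "the lobstre"), 3))
print(year_matches(2015, 2016))
```

OSA counts a swapped pair of adjacent letters as a single edit, which is why it suits typos, while the q-gram cosine score is insensitive to where in the title the shared fragments occur.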

The script “r_3_matching” creates a dataset with the matches for a manual check. Each pair of films (the original film from the core dataset and the suggested match from the IMDb website) was categorized into one of the following five categories: a) 100% match (perfect match on title, year, and director); b) likely good match; c) maybe match; d) unlikely match; and e) no match. The script also checks for possible duplicates in the dataset and flags them for a manual check.

The script “r_4_scraping_functions” defines functions for scraping the data from the identified matches (based on the scripts described above and manually checked). These functions are used for scraping the data in the next script.

The script “r_5a_extracting_info_sample” uses the functions defined in “r_4_scraping_functions” to scrape the IMDb data for the identified matches. It does so for the first 100 films only, to check that everything works. Because scraping the entire dataset took a few hours, a test with a subsample of 100 films is advisable.

The script “r_5b_extracting_info_all” extracts the data for the entire dataset of the identified matches.

The script “r_5c_extracting_info_skipped” checks the films with missing data (where data was not scraped) and tries to extract the data one more time, to make sure that the errors were not caused by disruptions in the internet connection or other technical issues.

The script “r_check_logs” is used for troubleshooting and tracking the progress of all of the R scripts used. It gives information on the amount of missing values and errors.

4 Festival Library Dataset

The Festival Library Dataset consists of a data scheme image file, one codebook and one dataset, all in csv format.

The codebook (csv file “4_codebook_festival-library_dataset”) offers a detailed description of all variables within the Library Dataset. It lists the definition of variables, such as location and festival name, and festival categories,
Replication Data and additional supporting files for Sharing Individual...
dataverse.harvard.edu
Updated Aug 1, 2017
Cite
Elizabeth Pisani; Stella Botchway (2017). Replication Data and additional supporting files for Sharing Individual Patient and Parasite-level Data through the WorldWide Antimalarial Resistance Network Platform: a Qualitative Case Study [Dataset]. http://doi.org/10.7910/DVN/V1TKIO
Unique identifier
https://doi.org/10.7910/DVN/V1TKIO
Dataset updated
Aug 1, 2017
Dataset provided by
Harvard Dataverse
Authors
Elizabeth Pisani; Stella Botchway
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
These data relate to the paper: Sharing Individual Patient and Parasite-level Data through the WorldWide Antimalarial Resistance Network Platform: a Qualitative Case Study, by Elizabeth Pisani and Stella Botchway, submitted to Wellcome Open Research, July 2017. We provide the study outline, the coding tree, the completed COREQ reporting form, and a set of data in .xlsx format relating to papers published in relation to the WWARN platform until June 30, 2016.
Focus Groups on Data Sharing and Research Data Management with Scientists...
figshare.com
pdf
Updated Apr 1, 2022
Cite
Devan Ray Donaldson (2022). Focus Groups on Data Sharing and Research Data Management with Scientists from Five Disciplines [Dataset]. http://doi.org/10.6084/m9.figshare.19493060.v1
Unique identifier
https://doi.org/10.6084/m9.figshare.19493060.v1
Dataset updated
Apr 1, 2022
Dataset provided by
Figshare (http://figshare.com/)
Authors
Devan Ray Donaldson
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset resulted from focus groups with scientists from five disciplines (atmospheric and earth science, chemistry, computer science, ecology, and neuroscience). The sessions began with data management and led into a discussion of which features the participants think data repository systems and services need in order to help them implement the data sharing and preservation parts of their data management plans. Participants identified metadata quality control and training as problem areas in data management. Participants discussed several desired repository features, including metadata control, data traceability, security, stable infrastructure, and data use restrictions. Our dataset includes five anonymized focus group transcripts in .pdf file format (one for each focus group with scientists from each discipline), our codebook as an Excel spreadsheet (.xlsx), and coded segments of our transcript text, which visualize our data analysis, as an Excel spreadsheet (.xlsx).
Graph topological features extracted from expression profiles of...
explore.openaire.eu
zenodo.org
Updated Aug 7, 2019
Cite
Léon-Charles Tranchevent; Francisco Azuaje; Jagath C Rajapakse (2019). Graph topological features extracted from expression profiles of neuroblastoma patients [Dataset]. http://doi.org/10.5281/zenodo.3357673
Unique identifier
https://doi.org/10.5281/zenodo.3357673
Dataset updated
Aug 7, 2019
Authors
Léon-Charles Tranchevent; Francisco Azuaje; Jagath C Rajapakse
Description
Introduction

This dataset contains the data described in the paper titled "A deep neural network approach to predicting clinical outcomes of neuroblastoma patients" by Tranchevent, Azuaje and Rajapakse. More precisely, this dataset contains the topological features extracted from graphs built from publicly available expression data (see details below). This dataset does not contain the original expression data, which are available elsewhere. We thank the scientists who generated and shared these data (please see below the relevant links and publications).

Content

File names start with the name of the publicly available dataset they are built on (among "Fischer", "Maris" and "Versteeg"). This name is followed by a tag representing whether they contain raw data ("raw", which means, in this case, the raw topological features) or TF formatted data ("TF", which stands for TensorFlow). This tag is then followed by a unique identifier representing a unique configuration. The configuration file "Global_configuration.tsv" contains details about these configurations, such as which topological features are present and which clinical outcome is considered. The code associated with the same manuscript that uses these data is at https://gitlab.com/biomodlih/SingalunDeep. The procedure by which the raw data are transformed into the TensorFlow ready data is described in the paper.

File format

All files are TSV files that correspond to matrices with samples as rows and features as columns (or clinical data as columns for clinical data files). The data files contain various sets of topological features that were extracted from the sample graphs (or Patient Similarity Networks - PSN). The clinical files contain relevant clinical outcomes. The raw data files only contain the topological data. For instance, the file "Fischer_raw_2d0000_data_tsv" contains 24 values for each sample, corresponding to the 12 centralities computed for both the microarray (Fischer-M) and RNA-seq (Fischer-R) datasets. The TensorFlow ready files do not contain the sample identifiers in the first column. However, they contain two extra columns at the end. The first extra column is the sample weights (for the classifiers and because we very often have a dominant class). The second extra column is the class labels (binary), based on the clinical outcome of interest.

Dataset details

The Fischer dataset is used to train, evaluate and validate the models, so the dataset is split into train / eval / valid files, which contain respectively 249, 125 and 124 rows (samples) of the original 498 samples. In contrast, the other two datasets (Maris and Versteeg) are smaller and are only used for validation (and therefore have no training or evaluation file). The Fischer dataset also has more data files because various configurations were tested (see manuscript). In contrast, the validation using the Maris and Versteeg datasets is only done for a single configuration, and there are therefore fewer files. For Fischer, a few configurations are listed in the global configuration file but there is no corresponding raw data. This is because these items are derived from concatenations of the original raw data (see global configuration file and manuscript for details).

References

This dataset is associated with Tranchevent L., Azuaje F., Rajapakse J.C., A deep neural network approach to predicting clinical outcomes of neuroblastoma patients. If you use these data in your research, please do not forget to also cite the researchers who have generated the original expression datasets.

Fischer dataset: Zhang W. et al., Comparison of RNA-seq and microarray-based models for clinical endpoint prediction. Genome Biology 16(1) (2015). doi:10.1186/s13059-015-0694-1
Wang C. et al., The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Nat. Biotechnol. 32(9), 926–932. doi:10.1038/nbt.3001

Versteeg dataset: Molenaar J.J. et al., Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes. Nature 483(7391), 589–593. doi:10.1038/nature10910

Maris dataset: Wang Q. et al., Integrative genomics identifies distinct molecular classes of neuroblastoma and shows that multiple genes are targeted by regional alterations in DNA copy number. Cancer Res. 66(12), 6050–6062. doi:10.1158/0008-5472.CAN-05-4618

Project supported by the Fonds National de la Recherche (FNR), Luxembourg (SINGALUN project). This research was also partially supported by Tier-2 grant MOE2016-T2-1-029 by the Ministry of Education, Singapore.
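The layout of the TensorFlow ready files in this record (feature columns, then a sample-weight column, then a binary class-label column, with no sample identifiers) can be read with a few lines of Python. This is a minimal sketch under stated assumptions: the files are taken to have no header row, the function name is hypothetical, and the two sample rows are made up rather than taken from the dataset.

```python
import csv
import io


def read_tf_ready(handle):
    """Split each TSV row into (features, sample weight, class label).

    Assumes no header row and that the last two columns are the
    sample weight and the binary label, in that order.
    """
    features, weights, labels = [], [], []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        *values, weight, label = row
        features.append([float(v) for v in values])
        weights.append(float(weight))
        labels.append(int(float(label)))
    return features, weights, labels


# Made-up excerpt: three feature values, one weight, one label per sample.
sample = "0.12\t0.50\t0.33\t1.5\t1\n0.40\t0.10\t0.25\t0.75\t0\n"
features, weights, labels = read_tf_ready(io.StringIO(sample))
print(len(features), weights, labels)
```

For a real file, `read_tf_ready(open(path, newline=""))` would be the equivalent call; whether a header row is present should be checked against the files themselves.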

Backup-As-A-Service Market Analysis North America, APAC, Europe, South...

technavio.com

Updated Jan 11, 2025

Cite

Technavio (2025). Backup-As-A-Service Market Analysis North America, APAC, Europe, South America, Middle East and Africa - US, China, India, UK, Germany, Canada, South Korea, France, Japan, Italy - Size and Forecast 2025-2029 [Dataset]. https://www.technavio.com/report/backup-as-a-service-market-size-industry-analysis

Dataset updated

Jan 11, 2025

Dataset provided by

TechNavio

Authors

Technavio

Time period covered

2021 - 2025

Area covered

Germany, United States, Global

Description

Backup-As-A-Service Market Size 2025-2029

The backup-as-a-service market size is forecast to increase by USD 53.81 billion at a CAGR of 38.4% between 2024 and 2029.

The market is experiencing significant growth due to several key trends. The increasing shift from capital expenditures (CAPEX) to operational expenditures (OPEX) is driving the demand for BaaS solutions. Additionally, the exponential growth in data volumes necessitates efficient and reliable backup solutions. Digital transformation is a significant factor fueling the adoption of BaaS, with advanced technologies such as artificial intelligence (AI), machine learning (ML), virtualization, and cloud computing creating numerous use cases. However, the market also faces challenges such as potential implementation failures, which can lead to data loss and downtime. To mitigate these risks, BaaS providers must focus on offering strong and secure solutions, ensuring seamless implementation, and providing excellent customer support. By addressing these trends and challenges, the market is poised for continued growth in the coming years. Organizations in North America are increasingly adopting BaaS to manage their data backup needs, making it a lucrative market for providers.

What will the size of the market be during the forecast period?

Backup-as-a-Service (BaaS) has emerged as a critical solution for enterprises seeking strong data protection in today's complex IT infrastructure landscape. With the increasing prevalence of SaaS workloads and virtualized infrastructure, traditional backup methods are no longer sufficient. BaaS offers enterprises a cost-effective and efficient alternative to managing their IT infrastructure's data protection in-house. Customer interest in BaaS is on the rise due to its ability to safeguard against various threats, including ransomware attacks, unauthorized access, corruption, hacking, theft, and human error. BaaS providers offer geographical presence and support capabilities that cater to businesses with global operations. Enterprise backup requirements demand advanced features such as data migration, disaster recovery, and hybrid cloud support.



BaaS solutions provide these capabilities, ensuring business continuity and minimizing downtime. The flexibility of BaaS allows for offsite cloud storage, reducing the risk of data loss due to catastrophic errors or natural disasters. BaaS offers several advantages over traditional backup methods. It eliminates the need for IT staff to manage backup processes, freeing up resources for other critical tasks. Additionally, BaaS solutions offer scalability, enabling businesses to easily increase their backup frequency and storage capacity as their data grows. Data protection is a top priority for businesses, and BaaS provides a comprehensive solution for safeguarding files, images, and data sets.

How is this market segmented and which is the largest segment?

The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

End-user

  Large enterprises
  SMEs


Application

  Online backup
  Cloud backup


Geography

  North America

    Canada
    US


  APAC

    China
    India
    Japan
    South Korea


  Europe

    Germany
    UK
    France
    Italy


  South America



  Middle East and Africa

By End-user Insights

The large enterprises segment is estimated to witness significant growth during the forecast period.

Backup-as-a-Service (BaaS) has gained significant traction among large enterprises due to the increasing complexity of data management and the necessity for reliable data protection. With massive volumes of data, enterprises require comprehensive and scalable backup solutions. BaaS providers offer flexibility to accommodate data growth without the burden of acquiring and managing additional hardware. The benefits of BaaS extend beyond cost-effectiveness, as it ensures data resilience, compliance, and operational efficiency. Data loss due to human error, cyberattacks, catastrophic errors, or workload overload can result in substantial financial and reputational damage.

Furthermore, BaaS solutions provide redundancy, ensuring data is protected from potential threats and silos, enabling seamless data recovery. Internal IT teams can focus on core business functions, while BaaS providers manage backup infrastructure, allowing for simplified data management and improved business continuity.

The large enterprises segment was valued at USD 2.07 billion in 2019 and showed a gradual increase during the forecast period.

Regional Analysis

North America is estimated to contribute 32% to the

Data from: SC-CoMIcs (Superconductivity Corpus for Materials Infomatics)
data.mendeley.com
Updated Jun 29, 2021
Cite
Kyosuke Yamaguchi (2021). SC-CoMIcs (Superconductivity Corpus for Materials Infomatics) [Dataset]. http://doi.org/10.17632/xc9fjz2p3h.3
Unique identifier
https://doi.org/10.17632/xc9fjz2p3h.3
Dataset updated
Jun 29, 2021
Authors
Kyosuke Yamaguchi
License
Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Description
A corpus of 1000 Materials Informatics abstracts related to superconductivity. Named entities and relations in these text files are separately annotated in *.ann files in the format of the stand-off annotation.
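A reader for such annotation files might look like the sketch below. It assumes the *.ann files follow the brat-style standoff layout (tab-separated lines, with `T` identifiers for text-bound entities and `R` identifiers for binary relations); the description does not confirm this, and the entity and relation type names in the sample are hypothetical.

```python
def parse_ann(text):
    """Parse brat-style standoff lines into entities and relations.

    Returns (entities, relations), where entities maps an ID such as "T1" to
    (type, start, end, surface_text) and relations is a list of
    (type, arg1_id, arg2_id). Other annotation kinds are ignored.
    """
    entities, relations = {}, []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        ident = fields[0]
        if ident.startswith("T"):
            label, _, span = fields[1].partition(" ")
            fragments = span.split(";")  # discontinuous spans use ';'
            start = int(fragments[0].split()[0])
            end = int(fragments[-1].split()[1])
            surface = fields[2] if len(fields) > 2 else ""
            entities[ident] = (label, start, end, surface)
        elif ident.startswith("R"):
            rel_type, arg1, arg2 = fields[1].split()
            relations.append((rel_type, arg1.split(":")[1], arg2.split(":")[1]))
    return entities, relations


# Made-up annotation lines; the type names are placeholders.
sample = (
    "T1\tMaterial 0 4\tYBCO\n"
    "T2\tProperty 25 27\tTc\n"
    "R1\tTarget Arg1:T2 Arg2:T1\n"
)
entities, relations = parse_ann(sample)
print(entities["T1"], relations)
```

The character offsets refer to the matching .txt file, so an entity's surface text can be checked against `abstract[start:end]` for contiguous spans.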

Materials Informatics (MI) needs textual datasets to accelerate studies in this area, but there are no sizable datasets suitable for our superconducting material search. We therefore decided to create a new corpus from scratch for MI information extraction. SC-CoMIcs is a corpus that can contribute to the advancement of MI studies, especially in superconductivity.

Experiment tools for the dataset can be found in the GitHub repository linked below.

Note that you need to agree with the license displayed here. The set of 1,000 MI abstracts (0001.txt-1000.txt) is specially permitted to be shared in the research community under Creative Commons BY-NC 3.0 by Elsevier under a written agreement (#200221-005626). This is a strict condition that you MUST obey.

NB: There is no difference in the dataset per set from version 1.
The sharing of research raw data in journals indexed in the Cell & Tissue...
data.niaid.nih.gov
zenodo.org
Updated Jan 24, 2020
Cite
Lucas-Domínguez, Rut (2020). The sharing of research raw data in journals indexed in the Cell & Tissue Engineering JCR category (2011-2015) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1162302
Dataset updated
Jan 24, 2020
Dataset provided by
Aleixandre-Benavent, Rafael
Sixto-Costoya, Andrea
Vidal-Infer, Antonio
Lucas-Domínguez, Rut
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
The availability of research data sets is an important milestone since it can enhance the dynamics of research. This study aims to analyze the PubMed Central repository to determine the availability and type of raw data sets in Cell & Tissue Engineering journals indexed in the Journal Citation Reports. A search of the 21 journals from the Cell & Tissue Engineering category of the 2015 Journal Citation Reports was conducted. Information was collected from October to December 2016. A study of the supplementary material of the original articles published between 2011 and 2015 was performed through a search in the PubMed Central repository, which is the most used free full-text repository in biomedicine. Only articles with supplementary material were retrieved. The number and types of files were registered. In cases where a compressed file, such as a .zip or .rar file, was found, it was opened to check what kinds of files it contained.
Data from: Sigfox and LoRaWAN Datasets for Fingerprint Localization in Large...
data.niaid.nih.gov
explore.openaire.eu
Updated Jun 23, 2020
Cite
Weyn, Maarten (2020). Sigfox and LoRaWAN Datasets for Fingerprint Localization in Large Urban and Rural Areas [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1193562
Dataset updated
Jun 23, 2020
Dataset provided by
Berkvens, Rafael
Weyn, Maarten
Van Vlaenderen, Koen
Aernouts, Michiel
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
INTRODUCTION

The goal of these LPWAN datasets is to provide the global research community with a benchmark tool to evaluate fingerprint localization algorithms in large outdoor environments with various properties. An identical collection methodology was used for all datasets: during a period of three months, numerous devices containing a GPS receiver periodically obtained new location data, which was sent to a local data server via a Sigfox or LoRaWAN message. Together with network information such as the receiving time of the message, the base station IDs of all receiving base stations and the Received Signal Strength Indicator (RSSI) per base station, this location data was stored in one of the three LPWAN datasets:

lorawan_dataset_antwerp.csv

130 430 LoRaWAN messages, obtained in the city center of Antwerp

sigfox_dataset_antwerp.csv

14 378 Sigfox messages, obtained in the city center of Antwerp

sigfox_dataset_rural.csv

25 638 Sigfox messages, obtained in a rural area between Antwerp and Ghent

As the rural and urban Sigfox datasets were recorded in adjacent areas, many base stations that are located at the border of these areas can be found in both datasets. However, they do not necessarily share the same identifier: e.g. ‘BS 1’ in the urban Sigfox dataset could be the same base station as ‘BS 36’ in the rural Sigfox dataset. If the user intends to combine both Sigfox datasets, the mapping of the IDs of these base stations can be found in the file:

sigfox_bs_mapping.csv

The collection methodology of the datasets, and the first results of a basic fingerprinting implementation are documented in the following journal paper: http://www.mdpi.com/2306-5729/3/2/13

UPDATES IN VERSION 1.2

In this version of the LPWAN dataset, only the LoRaWAN set has been updated. The Sigfox datasets remain identical to version 1.0 and 1.1. The main updates in the LoRaWAN set are the following:

New data: the LoRaWAN messages in the new set are collected 1 year after the previous dataset version. To be consistent with the previous versions, the new LoRaWAN set is uploaded in the same .CSV format as before. This upload can still be found in this repository as ‘lorawan_dataset_antwerp.csv’.

More gateways: Compared to the previous dataset, 4 gateways were added to the LoRaWAN network. The RSSI values of these gateways are shown in columns ‘BS 69’, ‘BS 70’, ‘BS 71’ and ‘BS 72’. All other ‘BS’ columns are in the same order as in previous dataset versions.

More metadata: In the previous LoRaWAN dataset, metadata was limited to 3 receiving gateways per message. In the new dataset version, metadata from all receiving gateways is included in every message. Moreover, some gateways provide a timestamp with nanosecond precision, which can be used to evaluate Time Difference of Arrival localization methods with LoRaWAN.

2 file formats: As more metadata becomes available, we find it important to share the dataset in a clearer overview. This also allows researchers to evaluate the performance of LoRaWAN in an urban environment. Therefore, we publish the new LoRaWAN dataset as a .CSV file as described above, but also as a .JSON file (lorawan_antwerp_2019_dataset.json.txt; the .txt extension had to be appended because otherwise the file could not be uploaded to Zenodo). An example of one message in this JSON format can be seen below:

JSON format description:

HDOP: Horizontal Dilution of Precision

dev_addr: LoRaWAN device address

dev_eui: LoRaWAN device EUI

sf: Spreading factor

channel: TX channel (EU region)

payload: application payload

adr: Adaptive Data Rate (1 = enabled, 0 = disabled)

counter: device uplink message counter

latitude: Groundtruth TX location latitude

longitude: Groundtruth TX location longitude

airtime: signal airtime (seconds)

gateways:

rssi: Received Signal Strength

esp: Estimated Signal Power

snr: Signal-to-Noise Ratio

ts_type: Timestamp type. If this says "GPS_RADIO", a nanosecond precise timestamp is available

time: time of arrival at the gateway

id: gateway ID

JSON example

{ "hdop": 0.7, "dev_addr": "07000EFE", "payload": "008d000392d54c4284d18c403333333f04682aa9410500e8fd4106cabdbc420f00db0d470ce32ac93f0d582be93f0bfa3f8d3f", "adr": 1, "latitude": 51.20856475830078, "counter": 31952, "longitude": 4.400575637817383, "airtime": 0.112896, "gateways": [ { "rssi": -115, "esp": -115.832695, "snr": 6.75, "rx_time": { "ts_type": "None", "time": "2019-01-04T08:59:53.079+01:00" }, "id": "08060716" }, { "rssi": -116, "esp": -125.51497, "snr": -9.0, "rx_time": { "ts_type": "GPS_RADIO", "time": "2019-01-04T08:59:53.962029179+01:00" }, "id": "FF0178DF" } ], "dev_eui": "3432333853376B18", "sf": 7, "channel": 8 }
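
The following Python sketch shows one way to read such a message and keep only the gateways whose timestamp type is "GPS_RADIO", which are the ones usable for Time Difference of Arrival work. The helper names `to_epoch_ns` and `precise_arrivals` are illustrative, and the embedded message is an abbreviated copy of the example above with the payload left out. The fractional seconds are parsed by hand because Python's `datetime` only stores microseconds.

```python
import json
import re
from datetime import datetime

# Abbreviated copy of the example message above (payload omitted for brevity).
MESSAGE = """
{"hdop": 0.7, "dev_addr": "07000EFE", "adr": 1, "counter": 31952,
 "latitude": 51.20856475830078, "longitude": 4.400575637817383,
 "airtime": 0.112896, "dev_eui": "3432333853376B18", "sf": 7, "channel": 8,
 "gateways": [
   {"rssi": -115, "esp": -115.832695, "snr": 6.75, "id": "08060716",
    "rx_time": {"ts_type": "None", "time": "2019-01-04T08:59:53.079+01:00"}},
   {"rssi": -116, "esp": -125.51497, "snr": -9.0, "id": "FF0178DF",
    "rx_time": {"ts_type": "GPS_RADIO",
                "time": "2019-01-04T08:59:53.962029179+01:00"}}
 ]}
"""

# Date and time, optional fractional seconds, then a UTC offset.
_TIME = re.compile(r"^(.+?)(?:\.(\d+))?([+-]\d{2}:\d{2})$")


def to_epoch_ns(text):
    """Convert an ISO 8601 timestamp to integer nanoseconds since the epoch."""
    match = _TIME.match(text)
    if match is None:
        raise ValueError(f"unexpected timestamp format: {text!r}")
    whole, fraction, offset = match.groups()
    seconds = int(datetime.fromisoformat(whole + offset).timestamp())
    # Pad or cut the fractional part to exactly nine digits (nanoseconds).
    nanos = int((fraction or "0").ljust(9, "0")[:9])
    return seconds * 1_000_000_000 + nanos


def precise_arrivals(message):
    """Map gateway ID to arrival time in ns, for GPS_RADIO timestamps only."""
    return {
        gw["id"]: to_epoch_ns(gw["rx_time"]["time"])
        for gw in message["gateways"]
        if gw["rx_time"]["ts_type"] == "GPS_RADIO"
    }


arrivals = precise_arrivals(json.loads(MESSAGE))
print(arrivals)
```

A time difference of arrival is then the difference between two of these values. In this particular message only one gateway reports a "GPS_RADIO" timestamp, so no difference can be formed from it alone.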
CRAWDAD umass/diesel (v. 2008-09-14)
ieee-dataport.org
Updated Sep 14, 2008
Cite
John Burgess (2008). CRAWDAD umass/diesel (v. 2008-09-14) [Dataset]. http://doi.org/10.15783/C7488P
Explore at:
Unique identifier
https://doi.org/10.15783/C7488P
Dataset updated
Sep 14, 2008
Dataset provided by
IEEE Dataport
Authors
John Burgess
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The bus-based DTN (disruption-tolerant network) traces from the UMass Amherst campus.

Note: This dataset has multiple versions. The file names of the data associated with this version are listed below, under the 'Traceset' heading, and can be downloaded under 'Dataset Files' on the right-hand side of the page.

This dataset includes the real mobility and real transfers of the bus-based DTN (disruption-tolerant network) testbed, called UMassDieselNet, operating from the UMass Amherst campus and the surrounding county.

last modified: 2008-10-21

release date: 2008-10-21

date/time of measurement start: 2005-01-25

date/time of measurement end: 2007-12-14

collection environment: We have constructed a DTN testbed composed of 30-40 buses operated by the UMass Amherst branch of the Pioneer Valley Transport Authority (PVTA) that we have fitted with a custom package of off-the-shelf hardware. This testbed is called UMassDieselNet. The transit buses service an area sparsely covering approximately 150 square miles. The route each bus is placed on each day is chosen by the garage dispatcher and can change during the day. Buses can leave the network at any time. We did not try to automatically determine the routes of buses, though this is possible with some significant effort. We decided against this approach after finding GPS data often inconsistent or containing gaps where line-of-sight to satellites was lost.

network configuration: The testbed began operating in May 2004 with five buses. Each bus carries a HaCom Open Brick computer (P6-compatible 577 MHz CPU, 256 MB RAM). An 802.11b Access Point (AP) is attached to each brick to provide DHCP access to passengers and passersby. A second USB-based 802.11b interface constantly scans the surrounding area for DHCP offers and other buses. Each bus also has a GPS device attached to the brick. Each brick runs Linux on a 40 GB notebook hard drive.

data collection methodology: See the metadata of each traceset or trace for details of collection methodology.

Traceset

umass/diesel/throwbox

Bus-to-bus and bus-to-throwbox transfer records collected from DieselNet during summer 2006.

files: DieselNetThrowbox.tar.gz

description: This traceset was collected during the throwbox deployment in Umass DieselNet in Summer 2006. The traces contain bus-bus transfer records and bus-throwbox transfer records.

measurement purpose: Routing Protocol for DTNs (Disruption Tolerant Networks), User Mobility Characterization

methodology: One solution for improving DTN performance is to place additional stationary nodes in the network, which increases the number and frequency of contact opportunities. We proposed the use of throwboxes within a DTN for this purpose. Throwboxes are inexpensive, battery-powered, stationary nodes with radios and storage. When two nodes pass by the same location at different times, the throwbox acts as a router, creating a new contact opportunity. To support a real-world test of the throwbox, we used our DTN testbed, the UMassDieselNet. The testbed normally consists of 40 buses covering an area of more than 150 square miles. However, when the experiments were performed, during a reduced summer bus schedule, only 10 buses were running on three routes. Each bus is a highly mobile DTN node using a small computer with an attached access point and WiFi interface. Buses constantly scan for other nodes and transfer DTN data whenever a connection can be made. We augmented the equipment on the buses with an XTend radio and added scripts to beacon the position, speed, and direction of motion of the buses once each second. We deployed three always-on throwbox prototypes in fixed locations for three weeks on the UMassDieselNet bus routes.

last modified: 2007-12-05

dataname: umass/diesel/throwbox

version: 20071202

change: the initial version

release date: 2007-12-02

date/time of measurement start: 2005-01-25

date/time of measurement end: 2007-05-14

umass/diesel/throwbox Traces

summer2006: Bus-to-bus and bus-to-throwbox transfer records collected from DieselNet during summer 2006.

configuration: The traces contain connection events between buses (in DieselNet) and between buses and throwboxes placed in the network.

format: The names of the bus nodes follow the pattern "PVTA_(bus number)". The three throwboxes placed in the network have the names "PVTA_TB0", "PVTA_TB1", "PVTA_TB2". The connectivity traces have data about the duration of contact, the time at which the contact took place, the amount of data transferred, the position (longitude and latitude) at which the connections happened, and the speed and direction of the bus motion when the connection event took place. The filenames indicate the date on which the connection trace was collected. For example, 6-23-2006 would mean June 23, 2006.

description: The traces contain connection events between buses (in DieselNet) and between buses and throwboxes placed in the network.

last modified: 2007-12-05

dataname: umass/diesel/throwbox/summer2006

version: 20071202

change: the initial version

release date: 2007-12-02

date/time of measurement start: 2006-06-23

date/time of measurement end: 2006-07-21

url: /download/umass/diesel/DieselNetThrowbox.tar.gz

umass/diesel/transfer

Data transfer logs between buses on UmassDieselNet, a disruption-tolerant network (DTN).

files: dieselnet-fall-2007.tar.gz, mobicom-traces.tar.gz, vifi-release.tar.gz, APConnectivitySpring2007.tar.gz, DieselNetTracesSpring2006.tar.gz, DieselNetTracesSpring2007.tar.gz, UMassDieselNet_Spring2005.tar.gz

description: This set of DieselNet logs was compiled from buses running routes serviced by UmassTransit, which lists its bus routes on the web at http://www.umass.edu/campus_services/transit/. Of UmassTransit's buses, 30-40 were equipped with DieselNet equipment, and a certain portion of those operated daily as dictated by bus failures and maintenance.

measurement purpose: Routing Protocol for DTNs (Disruption Tolerant Networks), User Mobility Characterization

methodology: To maintain and monitor our network, we use numerous external APs along the bus routes that are hosted by third parties and offer free service. We have installed only two APs ourselves: one on campus and one at the bus garage. Whenever the buses have web access, they retrieve software updates from a central server. At that time a bus provides its current GPS location and MAC address, and it uploads logs of its performance during the day, including the throughput of bus-to-bus transfer opportunities, APs contacted, a record of movement, and application records. To enable bus-to-bus transfers, the buses beacon on a single channel once every 100 ms. We programmed the bricks in each bus to transfer the largest amount of data possible using TCP at each transfer opportunity. To allow us to easily test different routing algorithms in a real DTN environment, we set the UMassDieselNet buses to transmit random data to one another whenever they are within range and record the time, transmission size, and buses involved.

last modified: 2008-10-21

dataname: umass/diesel/transfer

version: 20080914

change: The following traces have been added: transfer/spring2006, transfer/spring2007, and transfer/ap_connectivity

release date: 2008-09-14

date/time of measurement start: 2005-01-25

date/time of measurement end: 2007-12-14

hole: We excluded holidays and other occasions causing buses to run infrequently.

umass/diesel/transfer Traces

fall2007: Data transfer logs on UmassDieselNet, a disruption-tolerant network (DTN) in the months October-November 2007.

configuration: The released traces have five directories:

1. gps_logs: This is the directory which has the cumulative gps_logs for all the buses seen during the period in which the traces were collected.

2. mobile-mobile: This is the collection of all the mobile-mobile contact events.

3. basestation-mobile: The traces are similar to the mobile-mobile traces but are for the AP-mobile contact events.

4. mobile-mobile-mesh-relay: These are the traces for the connection events between a mobile node and the six stationary relay/mesh boxes placed in the network.

5. xtendtrace: This has the connection events between mobile nodes over the Xtend Maxstream radio.

If you use these traces in a research paper, please reference the traces as Relays, Base Stations, and Meshes: Enhancing Mobile Networks with Infrastructure. Nilanjan Banerjee, Mark D. Corner, Don Towsley, and Brian Neil Levine. In Proceedings of ACM MobiCom, San Francisco, CA, USA, September 2008.

format: The released traces have five directories:

1. gps_logs: This is the directory which has the cumulative gps_logs for all the buses seen during the period in which the traces were collected. There is a directory corresponding to every bus, and within each bus directory there are files for every day with a set of time-stamped GPS locations.

2. mobile-mobile: This is the collection of all the mobile-mobile contact events. Each line includes the time at which a contact occurred, the amount of data that was transferred, the duration of contact and the position where the contact took place.

3. basestation-mobile: The traces are similar to the mobile-mobile traces but are for the AP-mobile contact events.

4. mobile-mobile-mesh-relay: These are the traces for the connection events between a mobile node and the six stationary relay/mesh boxes placed in the network.

5. xtendtrace: This has the connection events between mobile nodes over the Xtend Maxstream radio.

description: Data transfer logs on UmassDieselNet, a disruption-tolerant network (DTN) in the months October-November 2007.

last modified: 2008-10-21

dataname: umass/diesel/transfer/fall2007

version: 20080914

change: the initial version

release date: 2008-09-14

date/time of measurement start: 2007-10-23

date/time of measurement end: 2007-11-11

url: /download/umass/diesel/dieselnet-fall-2007.tar.gz

ap_connectivity_fall2007: DieselNet traces consisting of Bus-Bus and Bus-AP interactions during the Fall semester of 2007.

configuration: This set of DieselNet traces were compiled during Fall
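
The naming conventions given for the umass/diesel/throwbox traces earlier in this entry (node names of the form "PVTA_(bus number)", throwboxes "PVTA_TB0" to "PVTA_TB2", and filenames such as 6-23-2006) can be handled with a small Python sketch like the one below. The helper names `node_kind` and `trace_date` are illustrative, and the sketch assumes the date appears in month-day-year order somewhere in the filename; any file extension or other naming detail is not documented here.

```python
import re
from datetime import date

_THROWBOX = re.compile(r"^PVTA_TB\d+$")
_BUS = re.compile(r"^PVTA_\w+$")
_FILE_DATE = re.compile(r"(\d{1,2})-(\d{1,2})-(\d{4})")


def node_kind(name):
    """Classify a node name as 'throwbox', 'bus' or 'unknown'."""
    # Throwboxes are tested first because they also start with 'PVTA_'.
    if _THROWBOX.match(name):
        return "throwbox"
    if _BUS.match(name):
        return "bus"
    return "unknown"


def trace_date(filename):
    """Extract the collection date from a filename such as '6-23-2006'."""
    match = _FILE_DATE.search(filename)
    if match is None:
        raise ValueError(f"no month-day-year date found in {filename!r}")
    month, day, year = (int(part) for part in match.groups())
    return date(year, month, day)


print(node_kind("PVTA_TB1"), trace_date("6-23-2006"))
```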
SMARTEOLE Wind Farm Control open dataset
data.niaid.nih.gov
zenodo.org
Updated Nov 25, 2022
Cite
Eric Simley (2022). SMARTEOLE Wind Farm Control open dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7342465
Explore at:
Dataset updated
Nov 25, 2022
Dataset provided by
Eric Simley
Thomas Duc
License
https://github.com/DISIC/politique-de-contribution-open-source/blob/master/LICENSE.pdf
Description
Introduction

This dataset comes from the third and final field campaign of the French national project SMARTEOLE. It consists of data from 7 wind turbines of a single wind farm (Sole du Moulin Vieux, located in France) on which Wind Farm Control field tests were performed to evaluate how well a wake steering strategy improves power production.

The wind farm consists of 7x Senvion MM82 wind turbines (rotor diameter of 82m, nominal power of 2.05 MW).

Description

The tests were carried out between 17 February and 25 May 2020, with wake steering implemented on turbine SMV6. This dataset covers the full period and has been pre-processed to facilitate the analysis of the Wind Farm Control experiment. All timesteps when at least one turbine was stopped were removed, and SCADA nacelle position and wind direction signals have been corrected to remove any north alignment issues. Finally, the time resolution has been standardized to 1 min from the raw data recorded at higher frequencies by the different sensors. For more details about the field campaign and the pre-processing steps followed in the data analysis, please consult the related publication: https://wes.copernicus.org/articles/6/1427/2021/wes-6-1427-2021.html. Some information can also be found in the related IEA task 44 wiki page.

The following files can be found in the dataset:

SMARTEOLE_WakeSteering_SCADA_1minData.csv: the Supervisory Control and Data Acquisition (SCADA) data from the 7 turbines.

SMARTEOLE_WakeSteering_ControlLog_1minData.csv: logs from the control system located on turbine SMV6, responsible for the application of the wake steering. The applied yaw offset on the turbine at each timestep can be found here.

SMARTEOLE_WakeSteering_WindCube_1minData.csv: data from the ground-based WindCube profiler lidar, located between SMV2 and SMV3. This can be used to assess the ambient environmental wind conditions at the farm.

SMARTEOLE_WakeSteering_Coordinates_staticData.csv: file listing the coordinates of the wind turbines in the farm and the WindCube location in the traditional Latitude / Longitude system (WGS84) and the XY metric system (French Lambert 93).

SMARTEOLE_WakeSteering_Map.pdf: the map of the farm showing the location of the wind turbines and the WindCube. This is exactly the same map as the one in the paper indicated above.

SMARTEOLE_WakeSteering_NTF_SMV6_staticData.csv: the transfer function used in the paper to correct the wind speed measured by SMV6 to better match the freestream wind speed at 150 m upstream (i.e. approximately 1.8 diameters), derived using the WindCube nacelle lidar installed on top of the turbine.

SMARTEOLE_WakeSteering_correction_factors_SMV1237_staticData.csv: the transfer function used in the paper to derive and correct the reference power and wind speed signals (defined as the mean values of the power and wind speeds from SMV1, SMV2, SMV3, and SMV7) to remove biases from the values at SMV6 as a function of wind direction and wind speed. These corrected reference signals are used for quantifying the impact of the wake steering.

SMARTEOLE_WakeSteering_GuaranteedPowerCurve_staticData.csv: the warranted power and thrust curves for the standard mode (Mode 0) of the MM82 wind turbine.

SMARTEOLE_WakeSteering_ReadMe.xlsx: README file describing the meaning of the variables in each dataset.

Unfortunately, the WindCube nacelle lidar data from the top of SMV6 could not be shared. Instead, the transfer functions derived from this sensor can be used to correct the SCADA channels. The Wind Energy Science publication describes how these transfer functions were obtained.
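
As a rough illustration of two quantities mentioned in the file list, the Python sketch below computes an uncorrected reference power as the mean over SMV1, SMV2, SMV3 and SMV7 for one timestep, and expresses the 150 m upstream distance in rotor diameters. The dictionary keys and sample values are hypothetical (the actual column names are documented in the ReadMe file), and the bias correction from the correction-factors file is not applied here.

```python
from statistics import mean

REFERENCE_TURBINES = ("SMV1", "SMV2", "SMV3", "SMV7")
ROTOR_DIAMETER_M = 82.0  # Senvion MM82, as stated in the introduction


def reference_power(power_by_turbine):
    """Uncorrected reference power: mean over the four reference turbines."""
    return mean(power_by_turbine[name] for name in REFERENCE_TURBINES)


def in_rotor_diameters(distance_m):
    """Express a distance in metres as a multiple of the rotor diameter."""
    return distance_m / ROTOR_DIAMETER_M


# Hypothetical power values (kW) for a single 1-min timestep.
sample = {"SMV1": 1000.0, "SMV2": 1200.0, "SMV3": 800.0,
          "SMV6": 900.0, "SMV7": 1000.0}
print(reference_power(sample), round(in_rotor_diameters(150.0), 2))
```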

Acknowledgement

This dataset was created within the scope of the French national project SMARTEOLE, supported by the Agence Nationale de la Recherche (grant no. ANR-14-CE05-0034).

Furthermore, we would like to thank ENGIE Green for allowing us to make this dataset publicly available.

How to cite this dataset

When using this dataset in future research, please add the following sentence to the Acknowledgement section of your publication:

"The dataset used in this research has been obtained by ENGIE Green in the scope of French national project SMARTEOLE (grant no. ANR-14-CE05-0034)".

When citing the dataset in the core text of a paper, the reference to Simley et al. can simply be used.

Related datasets and publications

Several field test campaigns were carried out within the scope of the SMARTEOLE project. Although these data are not made publicly available by default, they can be shared on a per-project basis and under the protection of a dedicated NDA. Please refer to the publications listed below to get an idea of the content of the different datasets.

SMARTEOLE Field Test 1

Ahmad T. et al., Field Implementation and Trial of Coordinated Control of WIND Farms, IEEE Transactions on Sustainable Energy, 9(3), 2018, 10.1109/TSTE.2017.2774508.

Duc T., Optimization of wind farm power production using innovative control strategies, Master’s thesis, DTU Wind Energy-M-0161, 2017.

Duc T. et al., Local turbulence parameterization improves the Jensen wake model and its implementation for power optimization of an operating wind farm, Wind Energy Science, 4(2), 2019, 10.5194/wes-4-287-2019.

Torres Garcia E. et al., Statistical characteristics of interacting wind turbine wakes from a 7-month LiDAR measurement campaign, Renewable Energy, 130, 2019, 10.1016/j.renene.2018.06.030.

Hegazy A. et al., LiDAR and SCADA data processing for interacting wind turbine wakes with comparison to analytical wake models, Renewable Energy, 181, 2022, 10.1016/j.renene.2021.09.019.

SMARTEOLE Field Test 2

Tagliatti F., Investigation of Wind Turbine Fatigue Loads under Wind Farm Control: Analysis of Field Measurements, Master’s thesis, DTU Wind Energy-M-0302, 2019.

Göçmen T. et al., FarmConners wind farm flow control benchmark – Part 1: Blind test results, Wind Energy Science, 7(5), 2022, 10.5194/wes-7-1791-2022.

SMARTEOLE Field Test 3

Simley E. et al., Results from a wake-steering experiment at a commercial wind plant: investigating the wind speed dependence of wake-steering performance, Wind Energy Science, 6(6), 2021, 10.5194/wes-6-1427-2021.

Release Notes

v1.0 (2022-11-24) : first version of the dataset.
Interview Margaret Sevenjhazi: Mapping Urban Driven Innovations for...
figshare.com
mpga
Updated Jun 6, 2023
Cite
Sarina Kilham (2023). Interview Margaret Sevenjhazi: Mapping Urban Driven Innovations for Sustainable Food Systems in Sydney, Australia [Dataset]. http://doi.org/10.6084/m9.figshare.23300432.v1
Explore at:
Available download formats: mpga
Unique identifier
https://doi.org/10.6084/m9.figshare.23300432.v1
Dataset updated
Jun 6, 2023
Dataset provided by
figshare
Authors
Sarina Kilham
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Sydney, Australia
Description
GENERAL INFORMATION

Title of Dataset: Interview Margaret Sevenjhazi: Mapping Urban Driven Innovations for Sustainable Food Systems in Sydney, Australia

Participant Name: Margaret Sevenjhazi

Consent for Open-Access Data Repository: Yes

Author Information

A. Principal Investigator Contact Information

Name: SARINA JOAN KILHAM

ORCID: 0000-0001-5234-2764

Institution: CHARLES STURT UNIVERSITY, WAGGA WAGGA, AUSTRALIA

Address:

Email: sarinakilham@gmail.com

Date of data collection (single date, range, approximate date): 2022-2023

Geographic location of data collection : Sydney, Australia

Information about funding sources that supported the collection of the data: City of Sydney Knowledge Exchange Sponsorship [2021/159649 / KEX 202122041] The Australian Sociological Association: Gary Bouma Memorial Workshop Program 2022

Human Research Ethics Protocol: Charles Sturt University Human Research Ethics Committee Protocol Approval: H21466

Description/Abstract: This research project will pilot and adapt a holistic mapping methodology called URBAL (Urban-Driven Innovations for Sustainable Food Systems) to examine urban food system sustainability innovation in the City of Sydney LGA. The research findings will give policy-makers and community led organisations (CLOs) a shared view and understanding of ‘what food systems innovations work’ and ‘why they work’. The results will be used by local governments, NGOs and food-based CLOs to inform, support and develop future food policy and sustainable food projects. This is interdisciplinary research that primarily uses contemporary social science approaches.

Subject: Social Sciences

Australian and New Zealand Standard Research Classification (ANZSRC) Field of Research Code: 160804

Australian and New Zealand Standard Research Classification (ANZSRC) Socio-economic Objective (SEO) Code: 960799

Keywords: food governance, innovation, urban food systems, social innovation,

Contributor Information

A. Research Associate

Name: TANIA LEIMBACH

ORCID: 0000-0002-8144-5065

Email: tanialeimbach@gmail.com

B. Research Associate

Name: TANJA ROSENQVIST

ORCID: 0000-0003-3899-6694

Email: tsrosenqvist@gmail.com

15: Type of Data: qualitative, interview transcripts, audio interviews

16: Format of Data: text README file, .mp3 audio files, .rtf transcripts

17: Software: MSOffice 365 Education version 16.57, 2021. Descript Audio App. Zoom Meeting app. Rode Reporter Audio Recording App.

18: Version: Version 1. First Release

19: Terms and conditions of use: Creative Commons Attribution Non Commercial CC BY-NC

SHARING/ACCESS INFORMATION

Licenses/restrictions placed on the data: Yes

Links to publications that cite or use the data: https://figshare.com/projects/Sydney_Urban_Food_Innovation_Pathways/160649

Links to other publicly accessible locations of the data: https://figshare.com/projects/Sydney_Urban_Food_Innovation_Pathways/160649

Links/relationships to ancillary data sets: Yes. Additional Datasets are available in the fig share link.

Was data derived from another source? No A. If yes, list source(s):

Recommended citation for this dataset: Kilham, Sarina (2023). Interview Margaret Sevenjhazi: Mapping Urban Driven Innovations for Sustainable Food Systems in Sydney, Australia

METHODOLOGICAL INFORMATION

Description of methods used for collection/generation of data: Semi-structured interviews. Data was created through interviews with participants in Australian English. Interviews were audio recorded on an iPhone 8 with a Rode microphone.

Consent: Each participant was provided a Participant Information Form and a consent form. Participants were explicitly asked about inclusion of their material in the open-access databank and provided informed consent.

Ownership and Access: R01-Sarina Kilham. Open-Access Data Archive- Creative Commons Attribution Non Commercial CC BY-NC

Collection Mode: Audio recorded semi-structured interview. Interviews conducted in person or via online meeting software.

Sampling Method: purposeful sampling via open invitation to community of practitioners.

Methods for processing the data: Interviews were recorded in multiple audio files to prevent data loss by stopping and re-starting the audio recorder. Audio files were combined in the software "Descript" and exported as a single audio file. The single audio file was transcribed by the AI tool in the software "Descript" and checked by a human for accuracy. Transcripts were sent to participants for accuracy checking, and the version included here incorporates any amendments or rectifications from the participants.

Instrument- or software-specific information needed to interpret the data: Software: Descript. Used for AI-transcriptions and combining multiple audio files into single file.

Standards and calibration information, if appropriate: n/a

Environmental/experimental conditions: n/a

Describe any quality-assurance procedures performed on the data: Sarina Kilham checked all transcriptions against the original audio for accuracy. Participants were sent transcriptions for checking.

People involved with sample collection, processing, analysis and/or submission: R01-SARINA KILHAM

DATA & FILE OVERVIEW

File List:

001-interviewaudio_p01_margaretsevenjhazi_v01.mp3: single audio file of interview

001-interviewtranscript_p01_margaretsevenjhazi_v01.txt: transcript of audio

Readme.txt: metadata, license and file data.

Relationship between files, if important: Dataset files can be used individually or as a dataset collection.

Additional related data collected that was not included in the current data package: No

4. Are there multiple versions of the dataset? No

DATA-SPECIFIC INFORMATION FOR: 002-interviewtranscript_p02_margaretsevenjhazi_v01.txt

Number of variables: n/a

Number of cases/rows: n/a

Variable List: n/a

Missing data codes: n/a

Specialized formats or other abbreviations used: n/a

6: Kind: text format

7: word count: 5580

DATA-SPECIFIC INFORMATION FOR: 001-interviewaudio_p01_margaretsevenjhazi_v01.mp3

Number of variables: n/a

Number of cases/rows: n/a

Variable List: n/a

Missing data codes: n/a

Specialized formats or other abbreviations used: n/a

Duration: 33 minutes 13 seconds

Audio channels: mono

Sample rate: 44.1kHz

Kind: mp3 audio

CITATION TOPICS

Web of Science Macro Level Citation topic: Social Sciences

Web of Science Meso label: 6.86 Human Geography 6.153 Climate Change 6.153 Climate Change 6.153 Climate Change 6.263 Agricultural Policy 6.263 Agricultural Policy 6.303 Sociology 6.303 Sociology

Web of Science Micro label: 6.86.149 Gentrification 6.153.558 Climate Change Adaptation 6.153.742 Science Communication 6.153.2227 Strategic Environmental Assessment 6.263.898 Farmers 6.263.1407 Urban Agriculture 6.303.1915 Public Sociology 6.303.2393 Social Policies

--END---
Groove2Groove MIDI Dataset: synthetic accompaniments in 3k styles
data.niaid.nih.gov
zenodo.org
Updated Apr 26, 2021
Cite
Ondřej Cífka (2021). Groove2Groove MIDI Dataset: synthetic accompaniments in 3k styles [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3957999
Explore at:
Dataset updated
Apr 26, 2021
Dataset provided by
Gaël Richard
Umut Şimşekli
Ondřej Cífka
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
The Groove2Groove MIDI Dataset is a parallel corpus of synthetic MIDI accompaniments in almost 3000 different styles, created as described in the paper Groove2Groove: One-Shot Accompaniment Style Transfer with Supervision from Synthetic Data [pdf]. See the README.md file or the Groove2Groove website for more information.

The dataset is split into the following sections:

train contains 5744 MIDI files in 2872 styles (exactly 2 files per style). Each file contains 252 measures following a 2 measure count-in.

val and test each contain 1200 files in 40 styles (exactly 30 files per style, 16 bars per file after the count-in). The sets of styles are disjoint from each other and from those in train.

itest is generated from the same chord charts as test, but in 40 styles from the training set.
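
The file counts quoted for the splits follow directly from the number of styles and the number of files per style. The short Python check below reproduces that arithmetic; the helper name `files_in_split` is illustrative only.

```python
def files_in_split(styles, files_per_style):
    """Number of MIDI files in a split with a fixed file count per style."""
    return styles * files_per_style


train_files = files_in_split(styles=2872, files_per_style=2)
val_files = files_in_split(styles=40, files_per_style=30)
test_files = files_in_split(styles=40, files_per_style=30)

# Each train file has 252 measures after a 2-measure count-in.
train_measures_with_count_in = 252 + 2

print(train_files, val_files, test_files, train_measures_with_count_in)
```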

Chord charts for all MIDI files are provided in the ABC format and the Band-in-a-Box (MGU) format. Each chord chart corresponds to at least 2 MIDI files in different styles.

The code used to automate Band-in-a-Box is available in the pybiab package.

If you use the data in your research, please reference the paper (not just the Zenodo record):

@article{groove2groove,
  author={Ond\v{r}ej C\'{i}fka and Umut \c{S}im\c{s}ekli and Ga\"{e}l Richard},
  title={{Groove2Groove}: One-Shot Music Style Transfer with Supervision from Synthetic Data},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  publisher={IEEE},
  year={2020},
  volume={28},
  pages={2638--2650},
  doi={10.1109/TASLP.2020.3019642},
  url={https://doi.org/10.1109/TASLP.2020.3019642}
}
Twitter archive from Future of Libraries Summit, #Fol15 held at The Michael...
figshare.com
xlsx
Updated Jan 20, 2016
Cite
Sarah Gallagher (2016). Twitter archive from Future of Libraries Summit, #Fol15 held at The Michael Fowler Centre, Wellington, New Zealand, 31 July 2015 [Dataset]. http://doi.org/10.6084/m9.figshare.1501373.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.1501373.v1
Dataset updated
Jan 20, 2016
Dataset provided by
figshare (http://figshare.com/)
Authors
Sarah Gallagher
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Wellington, New Zealand
Description
This file contains a data set of 1800 tweets (1136 unique tweets) tagged with #Fol15 (case-insensitive) that were published on Twitter between 26/07/2015 23:21:06 and 04/08/2015 03:12:21 GMT. This file was created and shared by Sarah Gallagher @sarahlibrarina (University of Otago, Dunedin, New Zealand) with a Creative Commons Attribution license (CC-BY) to encourage research in scholarly communication and for educational purposes. The Tweets contained in this file were collected using Martin Hawksey’s [@mhawksey] TAGS 6.0 http://tags.hawksey.info/. An automatic deduplication has been performed and spam tweets have been manually removed. Despite this intervention, the data may require further refining. The contents of each Tweet are the responsibility of the original authors. This data set has been shared “as is”, for educational and research purposes. As noted by Priego (2014) “both research and experience show that the Twitter search API isn't 100% reliable. Large tweet volumes affect the search collection process. The API might "over-represent the more central users", not offering "an accurate picture of peripheral activity" (González-Bailón, Sandra, et al. 2012).” Therefore there is no guarantee that this file contains all Tweets tagged with #Fol15 during the indicated period. This dataset is shared to encourage open research into scholarly activity on Twitter. If you use or refer to this data in any way please cite using the preferred citation above. Contact: @sarahlibrarina
Data from: A dataset of global variations in directional solar radiation...
data.niaid.nih.gov
zenodo.org
Updated Dec 14, 2022
Cite
Durand, Maxime (2022). A dataset of global variations in directional solar radiation exposure for ocular research using the libRadtran radiative transfer model [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7050988
Dataset updated
Dec 14, 2022
Dataset provided by
McLeod, Andrew
Durand, Maxime
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Directional solar photon flux density has particular relevance to eye disease research (keratitis, cataract formation, macular degeneration) because ocular components (cornea, lens, retina) experience different exposures depending on global location, the structural geometry of the eye and human behaviour (Sliney, 1997). The human macula has a field of view of ~17°, or 0.06901537 sr (Strasburger, Rentschler & Jüttner, 2011), and its cone of exposure can be modelled at a range of global locations using a radiative transfer model to estimate different directions of irradiation. This dataset provides examples of spectral radiance within the macular field of view, calculated with the radiative transfer model libRadtran v2.0.3 (Mayer & Kylling, 2005). Three data sets are provided at different latitudes without correction for spectral ocular transmission. Unless otherwise specified, all simulations were parametrized according to local meteorological conditions (altitude, pressure, temperature) and atmospheric conditions on the simulated day (aerosol optical density, water column, O3 and NO2 concentrations). The model was parametrized for a subject looking northward toward the ground (-15° from horizon), at a height of 170 cm above the ground.
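
The relationship between the ~17° field of view and the quoted solid angle can be checked with a short sketch. It assumes the field of view is treated as a right circular cone whose full apex angle is 17°, which the description does not state explicitly; the function name `cone_solid_angle` is illustrative only.

```python
import math


def cone_solid_angle(full_angle_deg: float) -> float:
    """Solid angle (in steradians) of a right circular cone.

    Uses the standard formula 2*pi*(1 - cos(theta)), where theta is the
    half-angle of the cone. Treating the macular field of view as such a
    cone is an assumption made for this sketch.
    """
    half_angle = math.radians(full_angle_deg / 2.0)
    return 2.0 * math.pi * (1.0 - math.cos(half_angle))


# Field of view quoted in the description (~17 degrees, full angle).
macula_sr = cone_solid_angle(17.0)
print(f"{macula_sr:.8f} sr")
```

Under that assumption the result agrees with the quoted 0.06901537 sr to within rounding.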

For each simulation, a separate file is available for each condition (latitude, time, date, see below) that includes radiance at each wavelength. Radiance values are in mW m-2 nm-1 sr-1.

The technique provides future opportunity to model global exposures of different ocular components to spectral solar irradiance using information on ocular transmission, local terrain, albedo and human behaviour in order to explore their relevance in epidemiological studies of age-related eye disease.

Simulation 1: This data set reports the spectral radiance from 250 - 500 nm at:

3 latitudes (61.0: Southern Finland, 50.1: Northern France, 38.0: Central Spain).

4 dates (April 17th, July 1st, September 1st, November 6th 2019).

24 hours.

8 cardinal directions (every 45° from North).

2 aerosol optical densities (0.1 and 2.5).

Simulation 2: This data set reports the spectral radiance from 250 - 2,500 nm at:

3 latitudes (61.0: Southern Finland, 50.1: Northern France, 38.0: Central Spain).

4 dates (April 17th, July 1st, September 1st, November 6th 2019).

24 hours.

1 cardinal direction (North).

2 aerosol optical densities (0.1 and 2.5).

Simulation 3: This data set reports the spectral radiance from 250 - 500 nm at:

1 latitude (61.0: Southern Finland).

4 dates (April 17th, July 1st, September 1st, November 6th 2019).

24 hours.

9 cardinal directions (every 40° from North).

3 bidirectional reflectance distribution functions for the ground (forest, urban, snow).

2 tilt angles for the eye direction (0° from horizon or -15° from horizon, toward the ground).
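
Since a separate file is described for each condition, the factor levels listed for the three simulations give a rough idea of how many conditions to expect. The sketch below is only an estimate: it assumes every combination of levels was simulated and that "24 hours" means one simulation per hour of the day, neither of which is confirmed by the description. The dictionary keys and function names are illustrative.

```python
from math import prod

# Number of levels per factor, transcribed from the lists above.
# "hours" assumes one simulation per hour of the day (an assumption).
SIMULATIONS = {
    "Simulation 1": {"latitudes": 3, "dates": 4, "hours": 24,
                     "directions": 8, "aerosol_optical_densities": 2},
    "Simulation 2": {"latitudes": 3, "dates": 4, "hours": 24,
                     "directions": 1, "aerosol_optical_densities": 2},
    "Simulation 3": {"latitudes": 1, "dates": 4, "hours": 24,
                     "directions": 9, "ground_brdfs": 3, "tilt_angles": 2},
}


def n_conditions(levels: dict) -> int:
    """Number of conditions if every combination of levels is present."""
    return prod(levels.values())


def direction_angles(count: int) -> list:
    """Evenly spaced viewing azimuths in degrees, starting at North (0)."""
    step = 360 / count
    return [i * step for i in range(count)]


for name, levels in SIMULATIONS.items():
    print(name, n_conditions(levels))

print(direction_angles(8))  # every 45 degrees from North
print(direction_angles(9))  # every 40 degrees from North
```

Comparing these counts with the number of files actually present is a quick way to spot missing or extra conditions after download.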
PP-ind: A Repository of Industrial Pair Programming Research Data
data.niaid.nih.gov
Updated Feb 16, 2021
Cite
Prechelt, Lutz (2021). PP-ind: A Repository of Industrial Pair Programming Research Data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4529142
Dataset updated
Feb 16, 2021
Dataset provided by
Prechelt, Lutz
Zieris, Franz
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
PP-ind is a repository of research data on industrial pair programming sessions. Since 2007, our research group has collected audio-video recordings and questionnaire data in 13 companies. A total of 57 developers worked together (mostly in groups of two, but also three or four) in 67 sessions with a mean length of 1:35 hours. A separate tech report provides many details on how this data was collected.

While we cannot share the original video recordings due to confidentiality agreements, we do provide transcripts of the pairs' dialog in this data set. Since we perform our analyses directly on the video material, we only transcribe our data on an as-needed basis, e.g., in preparation for a publication. This data set will therefore contain only a few partial transcripts, which may be amended in future versions.

Files named session-
The motion of trees in the wind - a collection of multiple data sets
data.niaid.nih.gov
explore.openaire.eu
+1 more
Updated Jun 10, 2021
Cite
Van Bloem, Skip (2021). The motion of trees in the wind - a collection of multiple data sets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4911395
Dataset updated
Jun 10, 2021
Dataset provided by
Gardiner, Barry
Wellpott, Axel
Bunce, Amanda
Jackson, Toby
James, Ken
Achim, Alexis
Van Bloem, Skip
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The tree motion data in this repository were collected from multiple sites with different sensors. There are two types of data:

(a) 1-hour sample data from 27 sites (/one_hour_samples/*.csv). Each file represents a single tree and contains two columns representing the two horizontal axes of tree motion. These data are uncalibrated and do not have a time-stamp. File names follow the pattern OwnersInitials_Site_treeID_resolution_SensorType, for example TJ_MY_7_4Hz_Strain. These file names match the first column in the 'TreeSummary.csv' table.

(b) long-term data from 9 data sets (/site_name.zip/). The wind and tree data are stored in separate files for each day of data, labelled with site name, tree ID or wind, year, month, day. For example: Kershope_T25_1990_5_23

In each file, the first column is the time and date in the format YYYY-MM-DD HH:mm:ss.SSS. The following columns are tree motion data or wind data, with units in the variable names. Some of the data were calibrated by pulling the tree with a known force; in these cases calibration coefficients are given in the 'TreeSummary' table.
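
The two file-naming conventions and the timestamp format can be unpacked with a few lines of Python. This is a minimal sketch based only on the examples given here: it assumes the fields are separated by single underscores and that no field in a 1-hour sample name contains an underscore itself, which may not hold for every file. The class and function names are illustrative, and the timestamp in the usage lines is a made-up value in the stated format.

```python
from datetime import date, datetime
from typing import NamedTuple


class SampleName(NamedTuple):
    owner_initials: str
    site: str
    tree_id: str
    resolution: str
    sensor_type: str


class DailyName(NamedTuple):
    site: str
    source: str  # a tree ID, or "wind" for wind files
    day: date


def parse_sample_name(stem: str) -> SampleName:
    """Split a 1-hour sample file name (without extension) into its fields."""
    parts = stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"expected 5 underscore-separated fields: {stem!r}")
    return SampleName(*parts)


def parse_daily_name(stem: str) -> DailyName:
    """Split a long-term daily file name into site, source and date."""
    # Split from the right so the date fields are found first; anything
    # left over before the tree ID is treated as the site name.
    head, year, month, day = stem.rsplit("_", 3)
    site, source = head.rsplit("_", 1)
    return DailyName(site, source, date(int(year), int(month), int(day)))


def parse_timestamp(text: str) -> datetime:
    """Parse a first-column value in YYYY-MM-DD HH:mm:ss.SSS format."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


print(parse_sample_name("TJ_MY_7_4Hz_Strain"))
print(parse_daily_name("Kershope_T25_1990_5_23"))
print(parse_timestamp("1990-05-23 12:30:15.250"))  # hypothetical value
```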

Wind data for the Australian data set are stored in the folder with data for each tree, because the trees were not close enough together to share a wind measurement.

The data for Orange, Storrs and Torrington were collected with inclinometers. The data for all trees is stored together in each daily file.

The long-term tree motion data have not been pre-processed because user choices affect the results. Importantly, you will need to remove a varying offset (caused by sensor drift) from the tree motion data. See discussion in the accompanying publication and references therein.
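
One simple way to remove a slowly varying offset is to subtract a centred running median from the signal. The sketch below illustrates that generic idea only; it is not the procedure recommended in the accompanying publication, and the window length is an arbitrary choice that would need to be matched to the sampling rate and the drift time scale of each sensor.

```python
from statistics import median


def remove_drift(signal, window):
    """Subtract a centred running median from a sequence of samples.

    `window` is the (odd) number of samples used to estimate the local
    offset. Near the ends of the signal the window is truncated, so the
    first and last few values are less reliable.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(signal)
    detrended = []
    for i, value in enumerate(signal):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        # Local offset estimate: median of the samples around position i.
        detrended.append(value - median(signal[lo:hi]))
    return detrended


# Synthetic example: a constant offset of 5.0 is removed completely.
print(remove_drift([5.0] * 10, window=3))
```

This plain-Python version recomputes the median for every sample, so it is slow on day-long, high-frequency records; it is meant to show the idea rather than to be used as-is on the full data.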

References: Publication describing all collated data and necessary pre-processing. Jackson, T. et al (2021) The motion of trees in the wind: a data synthesis. Biogeosciences.

Australian data described in: James, K. et al (2006): Mechanical stability of trees under dynamic loads, Am. J. Bot. doi:10.3732/ajb.93.10.1522

Kershope and Rivox data described in: Gardiner, B. A., et al. "Field and wind tunnel assessments of the implications of respacing and thinning for tree stability." Forestry: An International Journal of Forest Research 70.3 (1997): 233-252.

Kyloe and Clocaenog described in: Hale, Sophie E., et al. "Wind loading of trees: influence of tree size and competition." European Journal of Forest Research 131.1 (2012): 203-217.

Orange, Storrs and Torrington described in: Bunce, Amanda, et al. "Determinants of tree sway frequency in temperate deciduous forests of the Northeast United States." Agricultural and Forest Meteorology 266 (2019): 87-96.

Other large tree motion data sets available at the time of writing (June 2021): https://doi.org/10.7910/DVN/FHJBYG http://www.hydroshare.org/resource/38ae9d9fb88d49f9ad2eed1ee07475c0 https://doi.org/10.5683/SP2/WZIKSR https://data.4tu.nl/articles/dataset/Tree_sway_of_19_Amazon_trees/12714989/1 https://doi.org/10.5285/657f420e-f956-4c33-b7d6-98c7a18aa07a https://doi.org/10.5285/533d87d3-48c1-4c6e-9f2f-fda273ab45bc