100+ datasets found

Dataset: A Systematic Literature Review on the topic of High-value datasets
data.niaid.nih.gov
Updated Jun 23, 2023
Cite
Anastasija Nikiforova (2023). Dataset: A Systematic Literature Review on the topic of High-value datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7944424
Dataset updated
Jun 23, 2023
Dataset provided by
Andrea Miletič
Magdalena Ciesielska
Nina Rizun
Anastasija Nikiforova
Charalampos Alexopoulos
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains data collected during a study ("Towards High-Value Datasets determination for data-driven development: a systematic literature review") conducted by Anastasija Nikiforova (University of Tartu), Nina Rizun, Magdalena Ciesielska (Gdańsk University of Technology), Charalampos Alexopoulos (University of the Aegean) and Andrea Miletič (University of Zagreb). It is being made public both to act as supplementary data for the paper "Towards High-Value Datasets determination for data-driven development: a systematic literature review" (a pre-print is available in Open Access at https://arxiv.org/abs/2305.10234) and to allow other researchers to use these data in their own work.

The protocol is intended for the systematic literature review (SLR) on the topic of high-value datasets (HVD). Its aim is to gather information on how HVD and their determination have been reflected in the literature over the years and what these studies have found to date, including the indicators used in them, the stakeholders involved, data-related aspects, and frameworks. The data in this dataset were collected as a result of the SLR over Scopus, Web of Science, and the Digital Government Research Library (DGRL) in 2023.

Methodology

To understand how HVD determination has been reflected in the literature over the years and what has been found by these studies to date, all relevant literature covering this topic was studied. To this end, the SLR was carried out by searching the digital libraries covered by Scopus, Web of Science (WoS), and the Digital Government Research Library (DGRL).

These databases were queried with the keywords ("open data" OR "open government data") AND ("high-value data*" OR "high value data*"), which were applied to the article title, keywords, and abstract. This limited the results to papers in which these objects were primary research objects rather than merely mentioned in the body, e.g., as future work. After deduplication, 11 unique articles were found and checked for relevance. As a result, a total of 9 articles were examined further. Each study was independently examined by at least two authors.
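The keyword query and deduplication step described above can be illustrated with a small sketch. It is not the authors' tooling: the record fields (`title`, `abstract`, `keywords`), the helper names, and the title-based deduplication key are all assumptions made for illustration.

```python
import re

# Assumed translation of the boolean query into regular expressions:
# ("open data" OR "open government data")
OPEN_DATA = re.compile(r"open (government )?data", re.IGNORECASE)
# ("high-value data*" OR "high value data*"), where * is a wildcard suffix
HVD = re.compile(r"high[- ]value data\w*", re.IGNORECASE)


def matches_query(record: dict) -> bool:
    """Apply the query to title, abstract, and keywords only (not the body)."""
    text = " ".join(record.get(field, "") for field in ("title", "abstract", "keywords"))
    return bool(OPEN_DATA.search(text)) and bool(HVD.search(text))


def deduplicate(records: list) -> list:
    """Keep the first record per normalised title (hypothetical dedup key)."""
    seen = set()
    unique = []
    for record in records:
        key = record["title"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

In practice the same paper may be indexed with slightly different titles in Scopus, WoS, and DGRL, so a DOI-based key would likely be more robust than the title key assumed here.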

To attain the objective of our study, we developed a protocol in which the information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design-related information, (3) quality-related information, (4) HVD determination-related information.

Test procedure

Each study was independently examined by at least two authors. After an in-depth examination of the full text of the article, the structured protocol was filled in for each study. The structure of the survey is available in the supplementary files (see Protocol_HVD_SLR.odt, Protocol_HVD_SLR.docx). The data collected for each study by two researchers were then synthesized into one final version by a third researcher.

Description of the data in this data set

Protocol_HVD_SLR provides the structure of the protocol.

Spreadsheet #1 provides the filled protocol for relevant studies.

Spreadsheet #2 provides the list of results after the search over three indexing databases, i.e. before filtering out irrelevant studies.

The information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design- related information, (3) quality-related information, (4) HVD determination-related information

Descriptive information

1) Article number - a study number, corresponding to the study number assigned in an Excel worksheet
2) Complete reference - the complete source information to refer to the study
3) Year of publication - the year in which the study was published
4) Journal article / conference paper / book chapter - the type of the paper {journal article, conference paper, book chapter}
5) DOI / Website - a link to the website where the study can be found
6) Number of citations - the number of citations of the article in Google Scholar, Scopus, Web of Science
7) Availability in OA - availability of an article in the Open Access
8) Keywords - keywords of the paper as indicated by the authors
9) Relevance for this study - what is the relevance level of the article for this study? {high / medium / low}

Approach- and research design-related information

10) Objective / RQ - the research objective / aim, established research questions
11) Research method (including unit of analysis) - the methods used to collect data, including the unit of analysis (country, organisation, specific unit that has been analysed, e.g., the number of use-cases, scope of the SLR etc.)
12) Contributions - the contributions of the study
13) Method - whether the study uses a qualitative, quantitative, or mixed methods approach?
14) Availability of the underlying research data - whether there is a reference to the publicly available underlying research data e.g., transcriptions of interviews, collected data, or explanation why these data are not shared?
15) Period under investigation - period (or moment) in which the study was conducted
16) Use of theory / theoretical concepts / approaches - does the study mention any theory / theoretical concepts / approaches? If any theory is mentioned, how is theory used in the study?

Quality- and relevance-related information

17) Quality concerns - whether there are any quality concerns (e.g., limited information about the research methods used)?
18) Primary research object - is the HVD a primary research object in the study? (primary - the paper is focused around the HVD determination, secondary - mentioned but not studied (e.g., as part of discussion, future work etc.))

HVD determination-related information

19) HVD definition and type of value - how is the HVD defined in the article and / or any other equivalent term?
20) HVD indicators - what are the indicators to identify HVD? How were they identified? (components & relationships, "input -> output")
21) A framework for HVD determination - is there a framework presented for HVD identification? What components does it consist of and what are the relationships between these components? (detailed description)
22) Stakeholders and their roles - what stakeholders or actors does HVD determination involve? What are their roles?
23) Data - what data do HVD cover?
24) Level (if relevant) - what is the level of the HVD determination covered in the article? (e.g., city, regional, national, international)

Format of the file: .xls, .csv (for the first spreadsheet only), .odt, .docx

Licenses or restrictions: CC-BY

For more info, see README.txt
Dataset 1: Studies included in literature review
catalog.data.gov
data.amerigeoss.org
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Dataset 1: Studies included in literature review [Dataset]. https://catalog.data.gov/dataset/dataset-1-studies-included-in-literature-review
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency http://www.epa.gov/
Description
This dataset contains the results of a literature review of experimental nutrient addition studies to determine which nutrient forms were most often measured in the scientific literature. To obtain a representative selection of relevant studies, we searched Web of Science™ using a search string to target experimental studies in artificial and natural lotic systems while limiting irrelevant papers. We screened the titles and abstracts of returned papers for relevance (experimental studies in streams/stream mesocosms that manipulated nutrients). To supplement this search, we sorted the relevant articles from the Web of Science™ search alphabetically by author and sequentially examined the bibliographies for additional relevant articles (screening titles for relevance, and then screening abstracts of potentially relevant articles) until we had obtained a total of 100 articles. If we could not find a relevant article electronically, we moved to the next article in the bibliography. Our goal was not to be completely comprehensive, but to obtain a fairly large sample of published, peer-reviewed studies from which to assess patterns. We excluded any lentic or estuarine studies from consideration and included only studies that used mesocosms mimicking stream systems (flowing water or stream water source) or that manipulated nutrient concentrations in natural streams or rivers. We excluded studies that used nutrient diffusing substrate (NDS) because these manipulate nutrients on substrates and not in the water column. We also excluded studies examining only nutrient uptake, which rely on measuring dissolved nutrient concentrations with the goal of characterizing in-stream processing (e.g., Newbold et al., 1983). 
From the included studies, we extracted or summarized the following information: study type, study duration, nutrient treatments, nutrients measured, inclusion of TN and/or TP response to nutrient additions, and a description of how results were reported in relation to the research-management mismatch, if it existed.

Below is information on how the search was conducted.

Search string used for Web of Science advanced search (search conducted on 27 September 2016):

TS= (stream OR creek OR river* OR lotic OR brook OR headwater OR tributary) AND TS = (mesocosm OR flume OR "artificial stream" OR "experimental stream" OR "nutrient addition") AND TI= (nitrogen OR phosphorus OR nutrient OR enrichment OR fertilization OR eutrophication)
Data from: Where do engineering students really get their information? :...
opal.latrobe.edu.au
researchdata.edu.au
pdf
Updated Mar 13, 2025
Cite
Clayton Bolitho (2025). Where do engineering students really get their information? : using reference list analysis to improve information literacy programs [Dataset]. http://doi.org/10.4225/22/59d45f4b696e4
Available download formats: pdf
Unique identifier
https://doi.org/10.4225/22/59d45f4b696e4
Dataset updated
Mar 13, 2025
Dataset provided by
La Trobe
Authors
Clayton Bolitho
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
BACKGROUND
An understanding of the resources which engineering students use to write their academic papers provides information about student behaviour as well as the effectiveness of information literacy programs designed for engineering students. One of the most informative sources of information which can be used to determine the nature of the material that students use is the bibliography at the end of the students’ papers. While reference list analysis has been utilised in other disciplines, few studies have focussed on engineering students or used the results to improve the effectiveness of information literacy programs. Gadd, Baldwin and Norris (2010) found that civil engineering students undertaking a final-year research project cited journal articles more than other types of material, followed by books and reports, with web sites ranked fourth. Several studies, however, have shown that in their first year at least, most students prefer to use Internet search engines (Ellis & Salisbury, 2004; Wilkes & Gurney, 2009).

PURPOSE
The aim of this study was to find out exactly what resources undergraduate students studying civil engineering at La Trobe University were using, and in particular, the extent to which students were utilising the scholarly resources paid for by the library. A secondary purpose of the research was to ascertain whether information literacy sessions delivered to those students had any influence on the resources used, and to investigate ways in which the information literacy component of the unit can be improved to encourage students to make better use of the resources purchased by the Library to support their research.

DESIGN/METHOD
The study examined student bibliographies for three civil engineering group projects at the Bendigo Campus of La Trobe University over a two-year period, including two first-year units (CIV1EP – Engineering Practice) and one second-year unit (CIV2GR – Engineering Group Research). All units included a mandatory library session at the start of the project where student groups were required to meet with the relevant faculty librarian for guidance. In each case, the Faculty Librarian highlighted specific resources relevant to the topic, including books, e-books, video recordings, websites and internet documents. The students were also shown tips for searching the Library catalogue, Google Scholar, LibSearch (the LTU Library’s research and discovery tool) and ProQuest Central. Subject-specific databases for civil engineering and science were also referred to.

After the final reports for each project had been submitted and assessed, the Faculty Librarian contacted the lecturer responsible for the unit, requesting copies of the student bibliographies for each group. References for each bibliography were then entered into EndNote. The Faculty Librarian grouped them according to various facets, including the name of the unit and the group within the unit; the material type of the item being referenced; and whether the item required a Library subscription to access it. A total of 58 references were collated for the 2010 CIV1EP unit; 237 references for the 2010 CIV2GR unit; and 225 references for the 2011 CIV1EP unit.

INTERIM FINDINGS
The initial findings showed that student bibliographies for the three group projects were primarily made up of freely available internet resources which required no library subscription. For the 2010 CIV1EP unit, all 58 resources used were freely available on the Internet. For the 2011 CIV1EP unit, 28 of the 225 resources used (12.44%) required a Library subscription or purchase for access, while the second-year students (CIV2GR) used a greater variety of resources, with 71 of the 237 resources used (29.96%) requiring a Library subscription or purchase for access. The results suggest that the library sessions had little or no influence on the 2010 CIV1EP group, but the sessions may have assisted students in the 2011 CIV1EP and 2010 CIV2GR groups to find books, journal articles and conference papers, which were all represented in their bibliographies.

FURTHER RESEARCH
The next step in the research is to investigate ways to increase the representation of scholarly references (found by resources other than Google) in student bibliographies. It is anticipated that such a change would lead to an overall improvement in the quality of the student papers. One way of achieving this would be to make it mandatory for students to include a specified number of journal articles, conference papers, or scholarly books in their bibliographies. It is also anticipated that embedding La Trobe University’s Inquiry/Research Quiz (IRQ) using a constructively aligned approach will further enhance the students’ research skills and increase their ability to find suitable scholarly material which relates to their topic. This has already been done successfully (Salisbury, Yager, & Kirkman, 2012).

CONCLUSIONS & CHALLENGES
The study shows that most students rely heavily on the free Internet for information. Students don’t naturally use Library databases or scholarly resources such as Google Scholar to find information, without encouragement from their teachers, tutors and/or librarians. It is acknowledged that the use of scholarly resources doesn’t automatically lead to a high quality paper. Resources must be used appropriately and students also need to have the skills to identify and synthesise key findings in the existing literature and relate these to their own paper. Ideally, students should be able to see the benefit of using scholarly resources in their papers, and continue to seek these out even when it’s not a specific assessment requirement, though it can’t be assumed that this will be the outcome.

REFERENCES
Ellis, J., & Salisbury, F. (2004). Information literacy milestones: building upon the prior knowledge of first-year students. Australian Library Journal, 53(4), 383-396.
Gadd, E., Baldwin, A., & Norris, M. (2010). The citation behaviour of civil engineering students. Journal of Information Literacy, 4(2), 37-49.
Salisbury, F., Yager, Z., & Kirkman, L. (2012). Embedding Inquiry/Research: Moving from a minimalist model to constructive alignment. Paper presented at the 15th International First Year in Higher Education Conference, Brisbane. Retrieved from http://www.fyhe.com.au/past_papers/papers12/Papers/11A.pdf
Wilkes, J., & Gurney, L. J. (2009). Perceptions and applications of information literacy by first year applied science students. Australian Academic & Research Libraries, 40(3), 159-171.
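The subscription percentages reported in the interim findings can be reproduced from the reference counts given in the description. The sketch below is only a worked check of that arithmetic; the dictionary layout and the function name `subscription_share` are illustrative assumptions, not part of the dataset.

```python
# Reference counts taken from the description above: total references per unit
# and how many required a Library subscription or purchase for access.
counts = {
    "2010 CIV1EP": {"total": 58, "subscription": 0},
    "2011 CIV1EP": {"total": 225, "subscription": 28},
    "2010 CIV2GR": {"total": 237, "subscription": 71},
}


def subscription_share(total: int, subscription: int) -> float:
    """Percentage of references that needed a subscription, to two decimals."""
    return round(100 * subscription / total, 2)


shares = {unit: subscription_share(**c) for unit, c in counts.items()}

for unit, share in shares.items():
    print(f"{unit}: {share}% of references required a subscription")
```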
Conceptualization of public data ecosystems
data.niaid.nih.gov
Updated Sep 26, 2024
Cite
Martin, Lnenicka (2024). Conceptualization of public data ecosystems [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13842001
Dataset updated
Sep 26, 2024
Dataset provided by
Martin, Lnenicka
Anastasija, Nikiforova
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains data collected during a study "Understanding the development of public data ecosystems: from a conceptual model to a six-generation model of the evolution of public data ecosystems" conducted by Martin Lnenicka (University of Hradec Králové, Czech Republic), Anastasija Nikiforova (University of Tartu, Estonia), Mariusz Luterek (University of Warsaw, Warsaw, Poland), Petar Milic (University of Pristina - Kosovska Mitrovica, Serbia), Daniel Rudmark (Swedish National Road and Transport Research Institute, Sweden), Sebastian Neumaier (St. Pölten University of Applied Sciences, Austria), Karlo Kević (University of Zagreb, Croatia), Anneke Zuiderwijk (Delft University of Technology, Delft, the Netherlands), Manuel Pedro Rodríguez Bolívar (University of Granada, Granada, Spain).

As there is a lack of understanding of the elements that constitute different types of value-adding public data ecosystems and how these elements form and shape the development of these ecosystems over time, which can lead to misguided efforts to develop future public data ecosystems, the aim of the study is: (1) to explore how public data ecosystems have developed over time and (2) to identify the value-adding elements and formative characteristics of public data ecosystems. Using an exploratory retrospective analysis and a deductive approach, we systematically review 148 studies published between 1994 and 2023. Based on the results, this study presents a typology of public data ecosystems and develops a conceptual model of elements and formative characteristics that contribute most to value-adding public data ecosystems, and develops a conceptual model of the evolutionary generation of public data ecosystems represented by six generations called Evolutionary Model of Public Data Ecosystems (EMPDE). Finally, three avenues for a future research agenda are proposed.

This dataset is being made public to act as supplementary data for "Understanding the development of public data ecosystems: from a conceptual model to a six-generation model of the evolution of public data ecosystems", Telematics and Informatics, and for the Systematic Literature Review component that informs the study.

Description of the data in this data set

PublicDataEcosystem_SLR provides the structure of the protocol

Spreadsheet #1 provides the list of results after the search over three indexing databases and filtering out irrelevant studies.

Spreadsheet #2 provides the protocol structure.

Spreadsheet #3 provides the filled protocol for relevant studies.

The information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design-related information, (3) quality-related information, (4) HVD determination-related information

Descriptive Information

Article number

A study number, corresponding to the study number assigned in an Excel worksheet

Complete reference

The complete source information to refer to the study (in APA style), including the author(s) of the study, the year in which it was published, the study's title and other source information.

Year of publication

The year in which the study was published.

Journal article / conference paper / book chapter

The type of the paper, i.e., journal article, conference paper, or book chapter.

Journal / conference / book

The journal, conference, or book in which the paper is published.

DOI / Website

A link to the website where the study can be found.

Number of words

The number of words in the study.

Number of citations in Scopus and WoS

The number of citations of the paper in Scopus and WoS digital libraries.

Availability in Open Access

Availability of a study in the Open Access or Free / Full Access.

Keywords

Keywords of the paper as indicated by the authors (in the paper).

Relevance for our study (high / medium / low)

What is the relevance level of the paper for our study?

Approach- and research design-related information

Objective / Aim / Goal / Purpose & Research Questions

The research objective and established RQs.

Research method (including unit of analysis)

The methods used to collect data in the study, including the unit of analysis that refers to the country, organisation, or other specific unit that has been analysed such as the number of use-cases or policy documents, number and scope of the SLR etc.

Study’s contributions

The study’s contribution as defined by the authors

Qualitative / quantitative / mixed method

Whether the study uses a qualitative, quantitative, or mixed methods approach?

Availability of the underlying research data

Whether the paper has a reference to the public availability of the underlying research data e.g., transcriptions of interviews, collected data etc., or explains why these data are not openly shared?

Period under investigation

Period (or moment) in which the study was conducted (e.g., January 2021-March 2022)

Use of theory / theoretical concepts / approaches? If yes, specify them

Does the study mention any theory / theoretical concepts / approaches? If yes, what theory / concepts / approaches? If any theory is mentioned, how is theory used in the study? (e.g., mentioned to explain a certain phenomenon, used as a framework for analysis, tested theory, theory mentioned in the future research section).

Quality-related information

Quality concerns

Whether there are any quality concerns (e.g., limited information about the research methods used)?

Public Data Ecosystem-related information

Public data ecosystem definition

How is the public data ecosystem, or any equivalent term (mostly infrastructure), defined in the paper? If an alternative term is used, what is the public data ecosystem called in the paper?

Public data ecosystem evolution / development

Does the paper define the evolution of the public data ecosystem? If yes, how is it defined and what factors affect it?

What constitutes a public data ecosystem?

What constitutes a public data ecosystem (components & relationships) - their "FORM / OUTPUT" presented in the paper (general description with more detailed answers to further additional questions).

Components and relationships

What components does the public data ecosystem consist of and what are the relationships between these components? Alternative names for components - element, construct, concept, item, helix, dimension etc. (detailed description).

Stakeholders

What stakeholders (e.g., governments, citizens, businesses, Non-Governmental Organisations (NGOs) etc.) does the public data ecosystem involve?

Actors and their roles

What actors does the public data ecosystem involve? What are their roles?

Data (data types, data dynamism, data categories etc.)

What data does the public data ecosystem cover (is intended / designed for)? Refer to all data-related aspects, including but not limited to data types, data dynamism (static data, dynamic, real-time data, stream), prevailing data categories / domains / topics etc.

Processes / activities / dimensions, data lifecycle phases

What processes, activities, dimensions and data lifecycle phases (e.g., locate, acquire, download, reuse, transform, etc.) does the public data ecosystem involve or refer to?

Level (if relevant)

What is the level of the public data ecosystem covered in the paper? (e.g., city, municipal, regional, national (=country), supranational, international).

Other elements or relationships (if any)

What other elements or relationships does the public data ecosystem consist of?

Additional comments

Additional comments (e.g., what other topics affected the public data ecosystems and their elements, what is expected to affect the public data ecosystems in the future, what were important topics by which the period was characterised etc.).

New papers

Does the study refer to any other potentially relevant papers?

Additional references to potentially relevant papers that were found in the analysed paper (snowballing).

Format of the file: .xls, .csv (for the first spreadsheet only), .docx

Licenses or restrictions: CC-BY

For more info, see README.txt
Career promotions, research publications, Open Access dataset
ordo.open.ac.uk
zip
Updated Feb 28, 2022
Cite
Matteo Cancellieri; Nancy Pontika; David Pride; Petr Knoth; Hannah Metzler; Antonia Correia; Helene Brinken; Bikash Gyawali (2022). Career promotions, research publications, Open Access dataset [Dataset]. http://doi.org/10.21954/ou.rd.19228785.v1
Available download formats: zip
Unique identifier
https://doi.org/10.21954/ou.rd.19228785.v1
Dataset updated
Feb 28, 2022
Dataset provided by
The Open University
Authors
Matteo Cancellieri; Nancy Pontika; David Pride; Petr Knoth; Hannah Metzler; Antonia Correia; Helene Brinken; Bikash Gyawali
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is a compilation of processed data on citation and references for research papers including their author, institution and open access info for a selected sample of academics analysed using Microsoft Academic Graph (MAG) data and CORE. The data for this dataset was collected during December 2019 to January 2020.Six countries (Austria, Brazil, Germany, India, Portugal, United Kingdom and United States) were the focus of the six questions which make up this dataset. There is one csv file per country and per question (36 files in total). More details about the creation of this dataset are available on the public ON-MERRIT D3.1 deliverable report.The dataset is a combination of two different data sources, one part is a dataset created on analysing promotion policies across the target countries, while the second part is a set of data points available to understand the publishing behaviour. To facilitate the analysis the dataset is organised in the following seven folders:PRTThe dataset with the file name "PRT_policies.csv" contains the related information as this was extracted from promotion, review and tenure (PRT) policies. Q1: What % of papers coming from a university are Open Access?- Dataset Name format: oa_status_countryname_papers.csv- Dataset Contents: Open Access (OA) status of all papers of all the universities listed in Times Higher Education World University Rankings (THEWUR) for the given country. A paper is marked OA if there is at least an OA link available. OA links are collected using the CORE Discovery API.- Important considerations about this dataset: - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. - The service we used to recognise if a paper is OA, CORE Discovery, does not contain entries for all paperids in MAG. This implies that some of the records in the dataset extracted will not have either a true or false value for the _is_OA_ field. 
- Only those records marked as true for the is_OA field can be said to be OA. Others with false or no value for the is_OA field are of unknown status (i.e. not necessarily closed access).

Q2: How are papers, published by the selected universities, distributed across the three scientific disciplines of our choice?
- Dataset Name format: fsid_countryname_papers.csv
- Dataset Contents: For the given country, all papers for all the universities listed in THEWUR with the information of fieldofstudy they belong to.
- Important considerations about this dataset:
- MAG can associate a paper to multiple fieldofstudyid. If a paper belongs to more than one of our fieldofstudyid, separate records were created for the paper with each of those fieldofstudyids.
- MAG assigns fieldofstudyid to every paper with a score. We preserve only those records whose score is more than 0.5 for any fieldofstudyid it belongs to.
- Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.

Q3: What is the gender distribution in authorship of papers published by the universities?
- Dataset Name format: author_gender_countryname_papers.csv
- Dataset Contents: All papers with their author names for all the universities listed in THEWUR.
- Important considerations about this dataset:
- When there are multiple collaborators (authors) for the same paper, this dataset makes sure that only the records for collaborators from within selected universities are preserved.
- An external script was executed to determine the gender of the authors. The script is available here.

Q4: Distribution of staff seniority (= number of years from their first publication until the last publication) in the given university.
- Dataset Name format: author_ids_countryname_papers.csv
- Dataset Contents: For a given country, all papers for authors with their publication year for all the universities listed in THEWUR.
- Important considerations about this work:
- When there are multiple collaborators (authors) for the same paper, this dataset makes sure that only the records for collaborators from within selected universities are preserved.
- Calculating staff seniority can be achieved in various ways. The most straightforward option is to calculate it as academic_age = MAX(year) - MIN(year) for each authorid.

Q5: Citation counts (incoming) for OA vs Non-OA papers published by the university.
- Dataset Name format: cc_oa_countryname_papers.csv
- Dataset Contents: OA status and OA links for all papers of all the universities listed in THEWUR and, for each of those papers, count of incoming citations available in MAG.
- Important considerations about this dataset:
- CORE Discovery was used to establish the OA status of papers.
- Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to.
- Only those records marked as true for the is_OA field can be said to be OA. Others with false or no value for the is_OA field are of unknown status (i.e. not necessarily closed access).

Q6: Count of OA vs Non-OA references (outgoing) for all papers published by universities.
- Dataset Name format: rc_oa_countryname_-papers.csv
- Dataset Contents: Counts of all OA and unknown papers referenced by all papers published by all the universities listed in THEWUR.
- Important considerations about this dataset:
- CORE Discovery was used to establish the OA status of papers being referenced.
- Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.

Additional files:
- _fieldsofstudy_mag_.csv: this file contains a dump of the fieldsofstudy table of MAG, mapping each of the ids to their actual field of study name.
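The Q4 description above suggests computing staff seniority as academic_age = MAX(year) - MIN(year) per authorid. A minimal sketch of that calculation follows, assuming the records have already been read into (authorid, year) pairs; the function name `academic_age` and the sample rows are hypothetical and are not taken from the dataset files.

```python
from collections import defaultdict


def academic_age(records):
    """Return {authorid: MAX(year) - MIN(year)} for (authorid, year) pairs.

    The pair layout is an assumption for illustration; the actual column
    names in author_ids_countryname_papers.csv may differ.
    """
    years_by_author = defaultdict(list)
    for author_id, year in records:
        years_by_author[author_id].append(int(year))
    # Seniority is the span between first and last publication year.
    return {
        author_id: max(years) - min(years)
        for author_id, years in years_by_author.items()
    }


# Hypothetical sample rows: one author with three papers, one with a single paper.
sample_rows = [("a1", 2005), ("a1", 2012), ("a1", 2019), ("a2", 2016)]
ages = academic_age(sample_rows)
print(ages)
```

An author with a single publication gets a seniority of 0 under this definition, which is one reason the description notes that other ways of calculating seniority are possible.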
Datasets for "The voices of home educated adolescents: a participatory...
researchdata.bath.ac.uk
docx, jpeg, pdf
Updated Jun 26, 2024
Cite
Fadoua Govaerts (2024). Datasets for "The voices of home educated adolescents: a participatory research study exploring their home education experiences" PhD project [Dataset]. http://doi.org/10.15125/BATH-01328
Explore at:
Available download formats: jpeg, pdf, docx
Unique identifier
https://doi.org/10.15125/BATH-01328
Dataset updated
Jun 26, 2024
Dataset provided by
University of Bath
Authors
Fadoua Govaerts
License
Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
Dataset funded by
Self-funded
Description
This dataset relates to the PhD study, "The voices of home educated adolescents: a participatory research study exploring their home education experiences". The study is a participatory research project with young people aged 13–17 who are home educated. They used vlogs, blogs, or visual boards to collect data on their experiences of being home educated, with a particular focus on their perceptions of their educational outcomes and social development.

The dataset includes resources created by participants, including a vlog, three blogs and three visual boards.

The vlog is an insight into how playing video games is an opportunity for learning for the participant: it demonstrates his interest in historical events and weaponry. Furthermore, the research project and creating the vlog itself were a new experience for him, were seen as a learning opportunity, and became integrated into his home education experience. To align with the research methodology and remain socially and culturally appropriate, the participant used this method of data collection as an insight into his lived experience as home educated. Home educated young people have the autonomy and flexibility to learn through various mediums and learning tools that interest and relate to them. Therefore this vlog demonstrates that doing research with children can include various data collection methods that relate to the child's lived experience.

The visual boards are representations of participants' experiences being home educated and their perceptions of their educational outcomes. The blogs are a collection of thoughts or diary entries of their experience being home educated.
Data of the article "Journal research data sharing policies: a study of...
zenodo.org
Updated May 26, 2021
Cite
Antti Rousi (2021). Data of the article "Journal research data sharing policies: a study of highly-cited journals in neuroscience, physics, and operations research" [Dataset]. http://doi.org/10.5281/zenodo.3635511
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.3635511
Dataset updated
May 26, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Antti Rousi
Description
The journals’ author guidelines and/or editorial policies were examined on whether they take a stance with regard to the availability of the underlying data of the submitted article. The mere explicated possibility of providing supplementary material along with the submitted article was not considered as a research data policy in the present study. Furthermore, the present article excluded source codes or algorithms from the scope of the paper and thus policies related to them are not included in the analysis of the present article.

To select journals within the field of neurosciences, Clarivate Analytics’ InCites Journal Citation Reports database was searched using the categories of neurosciences and neuroimaging. From the results, the 40 journals with the highest Impact Factor indicators (for the year 2017) were extracted for scrutiny of research data policies. Likewise, the selection of journals within the field of physics was created by performing a similar search with the categories of physics, applied; physics, atomic, molecular & chemical; physics, condensed matter; physics, fluids & plasmas; physics, mathematical; physics, multidisciplinary; physics, nuclear and physics, particles & fields. From the results, the 40 journals with the highest Impact Factor indicators were again extracted for scrutiny. Similarly, the 40 journals representing the field of operations research were extracted by using the search category of operations research and management.

Journal-specific data policies were sought from journal-specific websites providing journal-specific author guidelines or editorial policies. Within the present study, the examination of journal data policies was done in May 2019. The primary data source was journal-specific author guidelines. If journal guidelines explicitly linked to the publisher’s general policy with regard to research data, these were used in the analyses of the present article. If a journal-specific research data policy, or the lack of one, was inconsistent with the publisher’s general policies, the journal-specific policies and guidelines were prioritized and used in the present article’s data. If a journal’s author guidelines were not openly available online due to, e.g., accepting submissions on an invite-only basis, the journal was not included in the data of the present article. Journals that exclusively publish review articles were also excluded and replaced with the journal having the next highest Impact Factor indicator, so that each set representing the three fields of science consisted of 40 journals. The final data thus consisted of 120 journals in total.

‘Public deposition’ refers to a scenario where a researcher deposits data in a public repository and thus gives the administrative role of the data to the receiving repository. ‘Scientific sharing’ refers to a scenario where a researcher administers his or her data locally and provides it to interested readers on request. Note that none of the journals examined in the present article required that all data types underlying a submitted work be deposited into a public data repository. However, some journals required public deposition of data of specific types. Within the journal research data policies examined in the present article, these data types are well represented by the Springer Nature policy on “Availability of data, materials, code and protocols” (Springer Nature, 2018), that is, DNA and RNA data; protein sequences and DNA and RNA sequencing data; genetic polymorphisms data; linked phenotype and genotype data; gene expression microarray data; proteomics data; macromolecular structures and crystallographic data for small molecules. Furthermore, the registration of clinical trials in a public repository was also considered as a data type in this study. The term ‘specific data types’ used in the custom coding framework of the present study thus refers to both life sciences data and public registration of clinical trials. These data types have community-endorsed public repositories where deposition was most often mandated within the journals’ research data policies.

The term ‘location’ refers to whether the journal’s data policy provides suggestions or requirements for the repositories or services used to share the underlying data of the submitted works. A mere general reference to ‘public repositories’ was not considered a location suggestion; only references to individual repositories and services were. The category of ‘immediate release of data’ examines whether the journal’s research data policy addresses the timing of publication of the underlying data of submitted works. Note that even though a journal may only encourage public deposition of the data, its editorial processes could be set up so that they lead to publication of either the research data or the research data metadata in conjunction with publication of the submitted work.
Data from: Machine Learning for Software Engineering: A Tertiary Study
data.europa.eu
unknown
Updated Jul 3, 2025
Cite
Zenodo (2025). Machine Learning for Software Engineering: A Tertiary Study [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-5715475?locale=da
Explore at:
Available download formats: unknown (1323715)
Dataset updated
Jul 3, 2025
Dataset authored and provided by
Zenodo (http://zenodo.org/)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset of the research paper: Machine Learning for Software Engineering: A Tertiary Study

Machine learning (ML) techniques increase the effectiveness of software engineering (SE) lifecycle activities. We systematically collected, quality-assessed, summarized, and categorized 39 reviews on ML for SE published between 2009–2020, covering 2,506 primary studies. The SE areas most tackled with ML are software testing and quality, while human-centered areas appear more challenging for ML. We propose a number of ML for SE research challenges and actions including: conducting further empirical validation and industrial studies on ML; reconsidering deficient SE methods; documenting and automating data collection and pipeline processes; reexamining how industrial practitioners distribute their proprietary data; and implementing incremental ML approaches.

The following data and source files are included.

review-protocol.md: The protocol employed in this tertiary study
data/
  dl-search/
    input/
      acm_comput_surveys_overviews.bib: Surveys of ACM Computing Surveys journal
      acm_comput_surveys_overviews_titles.txt: Titles of surveys
      acm_comput_ml_surveys.bib: ML-related surveys of ACM Computing Surveys journal
      acm_comput_ml_surveys_titles.txt: Titles of ML-related surveys
      dl_search_queries.txt: Search queries applied to IEEE Xplore, ACM Digital Library, and Elsevier Scopus
      ml_keywords.txt: ML-related keywords extracted from ML-related survey titles and used in the search queries
      se_keywords.txt: SE-related keywords derived from the 15 SWEBOK Knowledge Areas (KAs, except for Computing Foundations, Mathematical Foundations, and Engineering Foundations) and used in the search queries
      secondary_studies_keywords.txt: Survey-related keywords composed of the 15 keywords introduced in the tertiary study on SLRs in SE by Kitchenham et al. (2010), and the survey titles, and used in the search queries
    output/
      acm/
        acm{1–9}.bib: Search results from ACM Digital Library
      ieee.csv: Search results from IEEE Xplore
      scopus_analyze_year.csv: Yearly distribution of ML and SE documents extracted from Scopus's Analyze search results page
      scopus.csv: Search results from Scopus
  study-selection/
    backward_snowballing.csv: Additional secondary studies found through the backward snowballing process
    cohen_kappa_agreement.csv: Inter-rater reliability of reviewers in study selection
    dl_search_results.csv: Aggregated search results of all three digital libraries
    study_selection_reviewer_{1–2}.csv: Divided search results assessed by reviewer 1 and 2, correspondingly, based on IC/EC
  quality-assessment/
    dare_assessment.csv: Quality assessment (QA) of selected secondary studies based on the Database of Abstracts of Reviews of Effects (DARE) criteria by York University, Centre for Reviews and Dissemination
    quality_accepted_studies.csv: Details of quality-accepted studies
    studies_for_review.bib: Bibliography details and QA scores of selected secondary studies
  data-extraction/
    further_research.csv: Recommendations for further research of quality-accepted studies
    knowledge_areas.csv: Classification of quality-accepted studies using the SWEBOK KAs and subareas
    ml_techniques.csv: Classification of the quality-accepted studies based on a four-axis ML classification scheme, along with extracted ML techniques employed in the studies
    primary_studies.csv: Details of reviewed primary studies by the quality-accepted secondary
    research_methods.csv: Citations of the research methods employed by the quality-accepted studies
    research_types_methods.csv: Research types and methods employed by the quality-accepted studies
src/
  data-analysis.ipynb: Analysis of data extraction results (data preprocessing, top authors and institutions, study types, yearly distribution of publishers and QA scores) and creation of all figures included in the study
  scopus-year-analysis.ipynb: Yearly distribution of ML and SE publications retrieved from Elsevier Scopus
  study-selection-preprocessing.ipynb: Processing of digital library search results to conduct the inter-rater reliability estimation and study selection process
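
The file list above mentions cohen_kappa_agreement.csv, which records the inter-rater reliability of the two reviewers during study selection. As a rough illustration of what such a figure measures, the sketch below computes Cohen's kappa from two lists of include/exclude decisions. It is not the authors' notebook code; the function name `cohen_kappa` and the sample decisions are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    marginal label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of labels")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical decisions by two reviewers on four candidate studies.
reviewer_1 = ["include", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(kappa)
```

Kappa corrects raw agreement for agreement expected by chance, so two reviewers who agree on three of four studies can still score well below 0.75.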
Data from: Willingness to Participate in Passive Mobile Data Collection
search.gesis.org
da-ra.de
Updated Mar 27, 2019
Cite
Keusch, Florian (2019). Willingness to Participate in Passive Mobile Data Collection [Dataset]. http://doi.org/10.4232/1.13246
Explore at:
Available download formats: (15751447), (423955)
Unique identifier
https://doi.org/10.4232/1.13246
Dataset updated
Mar 27, 2019
Dataset provided by
GESIS Data Archive
GESIS search
Authors
Keusch, Florian
License
https://www.gesis.org/en/institute/data-usage-terms
Time period covered
Dec 12, 2016 - Feb 22, 2017
Description
The goal of this study is to measure willingness to participate in passive mobile data collection among German smartphone owners. The data come from a two-wave web survey among German smartphone users 18 years and older who were recruited from a German nonprobability online panel. In December 2016, 2,623 participants completed the Wave 1 questionnaire on smartphone use and skills, privacy and security concerns, and general attitudes towards survey research and research institutions. In January 2017, all respondents from Wave 1 were invited to participate in a second web survey which included vignettes that varied the levels of several dimensions of a hypothetical study using passive mobile data collection, and respondents were asked to rate their willingness to participate in such a study. A total of 1,957 respondents completed the Wave 2 questionnaire.

Wave 1

Topics: Ownership of smartphone, mobile phone, PC, tablet, and/or e-book reader; type of smartphone; frequency of smartphone use; smartphone activities (browsing, e-mails, taking photos, view/ post social media content, shopping, online banking, installing apps, using GPS-enabled apps, connecting via Bluetooth, play games, stream music/ videos); self-assessment of smartphone skills; attitude towards surveys and participation in research studies (personal interest, waste of time, sales pitch, interesting experience, useful); trust in institutions regarding data privacy (market research companies, university researchers, statistical office, mobile service provider, app companies, credit card companies, online retailer, and social networks); concerns regarding the disclosure of personal data by the aforementioned institutions; general privacy concern; privacy violated by (banks/ credit card companies, tax authorities, government agencies, market research companies, social networks, apps, internet browsers); concern regarding data security with smartphone activities for research (online survey, survey apps, research apps, SMS survey, camera, activity data, GPS location, Bluetooth); number of online surveys in which the respondent has participated in the last 30 days; panel memberships other than that of mingle; previous participation in a study with downloading a research app to the smartphone (passive mobile data collection).

Wave 2

Topics: Willingness to participate in passive mobile data collection (using eight vignettes with different scenarios that varied the levels of several dimensions of a hypothetical study using passive mobile data collection. The research app collects the following data for research purposes: technical characteristics of the smartphone (e.g. phone brand, screen size), the currently used telephone network (e.g. signal strength), the current location (every 5 minutes), which apps are used and which websites are visited, number of incoming and outgoing calls and SMS messages on the smartphone); reason why the respondent wouldn't (respectively would) participate in the research study used in the first scenario (open answer); recognition of differences between the eight scenarios; kind of recognized difference (open answer); remembered data the research app collects (recall); previous invitation for research app download; research app download.

Demography: sex; age; federal state; highest level of school education; highest level of vocational qualification.

Additionally coded was: running number; respondent ID; duration (response time in seconds); device type used to fill out the questionnaire; vignette text; vignette intro time; vignette time.
Replication Data for: A Three-Year Mixed Methods Study of Undergraduates’...
dataverse.no
dataverse.azure.uit.no
Updated Oct 8, 2024
Cite
Ellen Nierenberg (2024). Replication Data for: A Three-Year Mixed Methods Study of Undergraduates’ Information Literacy Development: Knowing, Doing, and Feeling [Dataset]. http://doi.org/10.18710/SK0R1N
Explore at:
Available download formats: txt(21865), txt(19475), csv(55030), txt(14751), txt(26578), txt(16861), txt(28211), pdf(107685), pdf(657212), txt(12082), txt(16243), text/x-fixed-field(55030), pdf(65240), txt(8172), pdf(634629), txt(31896), application/x-spss-sav(51476), txt(4141), pdf(91121), application/x-spss-sav(31612), txt(35011), txt(23981), text/x-fixed-field(15653), txt(25369), txt(17935), csv(15653)
Unique identifier
https://doi.org/10.18710/SK0R1N
Dataset updated
Oct 8, 2024
Dataset provided by
DataverseNO
Authors
Ellen Nierenberg
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Time period covered
Aug 8, 2019 - Jun 10, 2022
Area covered
Norway
Description
This data set contains the replication data and supplements for the article "Knowing, Doing, and Feeling: A three-year, mixed-methods study of undergraduates’ information literacy development." The survey data is from two samples:
- cross-sectional sample (different students at the same point in time)
- longitudinal sample (the same students at different points in time)

Surveys were distributed via Qualtrics during the students' first and sixth semesters. Quantitative and qualitative data were collected and used to describe students' IL development over 3 years. Statistics from the quantitative data were analyzed in SPSS. The qualitative data was coded and analyzed thematically in NVivo. The qualitative, textual data is from semi-structured interviews with sixth-semester students in psychology at UiT, both focus groups and individual interviews. All data were collected as part of the contact author's PhD research on information literacy (IL) at UiT.

The following files are included in this data set:
1. A README file which explains the quantitative data files. (2 file formats: .txt, .pdf)
2. The consent form for participants (in Norwegian). (2 file formats: .txt, .pdf)
3. Six data files with survey results from UiT psychology undergraduate students for the cross-sectional (n=209) and longitudinal (n=56) samples, in 3 formats (.dat, .csv, .sav). The data was collected in Qualtrics from fall 2019 to fall 2022.
4. Interview guide for 3 focus group interviews. File format: .txt
5. Interview guides for 7 individual interviews - first round (n=4) and second round (n=3). File format: .txt
6. The 21-item IL test (Tromsø Information Literacy Test = TILT), in English and Norwegian. TILT is used for assessing students' knowledge of three aspects of IL: evaluating sources, using sources, and seeking information. The test is multiple choice, with four alternative answers for each item. This test is a "KNOW-measure," intended to measure what students know about information literacy. (2 file formats: .txt, .pdf)
7. Survey questions related to interest - specifically students' interest in being or becoming information literate - in 3 parts (all in English and Norwegian): a) information and questions about the 4 phases of interest; b) interest questionnaire with 26 items in 7 subscales (Tromsø Interest Questionnaire - TRIQ); c) survey questions about IL and interest, need, and intent. (2 file formats: .txt, .pdf)
8. Information about the assignment-based measures used to measure what students do in practice when evaluating and using sources. Students were evaluated with these measures in their first and sixth semesters. (2 file formats: .txt, .pdf)
9. The Norwegian Centre for Research Data's (NSD) 2019 assessment of the notification form for personal data for the PhD research project. In Norwegian. (Format: .pdf)
Data from: List of studies reviewed in “Virtual Reality Research in...
jstagedata.jst.go.jp
xlsx
Updated Jul 27, 2023
Cite
Reo Fukuda (2023). List of studies reviewed in “Virtual Reality Research in Marketing Focusing on Consumers” [Dataset]. http://doi.org/10.50998/data.marketing.21816264.v2
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.50998/data.marketing.21816264.v2
Dataset updated
Jul 27, 2023
Dataset provided by
Japan Marketing Academy
Authors
Reo Fukuda
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This document provides an overview of the characteristics of 49 empirical papers and 63 studies that were reviewed for "Virtual Reality Research in Marketing Focusing on Consumers”. This study used two distinct strategies to identify relevant research articles for inclusion in the analysis. Strategy A: Utilizing the SCImago Institutions Rankings as a reference, we targeted the top 50 ranked academic journals and extracted articles that incorporated "Virtual Reality" within their titles, abstracts, or keywords. Articles were obtained directly from the publisher of each journal. In cases where keyword search functionality was absent (e.g., Emerald), we limited the search to papers containing "Virtual Reality" in either their titles or abstracts. This approach resulted in the identification of 65 articles. Strategy B: Our search extended to various databases featured on the EBSCOhost platform, including "Academic Search Premier," "Business Source Premier," "Psychology and Behavioral Sciences Collection," "ERIC," "EconLit with Full Text," and "Teacher Reference Center." We focused on articles that pertained to both "Marketing" and "Virtual Reality," ensuring that they were peer-reviewed and available in full text. After removing duplicates identified in Strategy A, we extracted an additional 58 articles. In total, 123 articles were retrieved from both strategies. We meticulously reviewed the abstracts and keywords of each article to exclude those unrelated to Virtual Reality or not targeting consumers (e.g., articles on education, research, and development). Consequently, our final dataset included 51 articles (49 empirical papers, comprising 48 quantitative and 1 qualitative study, and 2 framework papers) for further examination.
Data from: Replication package for the paper: "A Study on the Pythonic...
zenodo.org
zip
Updated Nov 10, 2023
Cite
Anonymous (2023). Replication package for the paper: "A Study on the Pythonic Functional Constructs' Understandability" [Dataset]. http://doi.org/10.5281/zenodo.10101383
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.10101383
Dataset updated
Nov 10, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Anonymous
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Replication Package for A Study on the Pythonic Functional Constructs' Understandability
This package contains several folders and files with code and data used in the study.

examples/
Contains the code snippets used as objects of the study, named as reported in Table 1, summarizing the experiment design.
RQ1-RQ2-files-for-statistical-analysis/
Contains three .csv files used as input for conducting the statistical analysis and drawing the graphs for addressing the first two research questions of the study. Specifically:
- ConstructUsage.csv contains the declared frequency usage of the three functional constructs object of the study. This file is used to draw Figure 4.
- RQ1.csv contains the collected data used for the mixed-effect logistic regression relating the use of functional constructs with the correctness of the change task, and the logistic regression relating the use of map/reduce/filter functions with the correctness of the change task.
- RQ1Paired-RQ2.csv contains the collected data used for the ordinal logistic regression of the relationship between the perceived ease of understanding of the functional constructs and (i) participants' usage frequency, and (ii) constructs' complexity (except for map/reduce/filter).
inter-rater-RQ3-files/
Contains four .csv files used as input for computing the inter-rater agreement for the manual labeling used for addressing RQ3. Specifically, you will find one file for each functional construct, i.e., comprehension.csv, lambda.csv, and mrf.csv, and a different file used for highlighting the reasons why participants prefer to use the procedural paradigm, i.e., procedural.csv.
Questionnaire-Example.pdf
This file contains the questionnaire submitted to one of the ten experimental groups within our controlled experiment. Other questionnaires are similar, except for the code snippets used for the first section, i.e., change tasks, and the second section, i.e., comparison tasks.
RQ2ManualValidation.csv
This file contains the results of the manual validation being done to sanitize the answers provided by our participants used for addressing RQ2. Specifically, we coded the behavior description using four different levels: (i) correct, (ii) somewhat correct, (iii) wrong, and (iv) automatically generated.
RQ3ManualValidation.xlsx
This file contains the results of the open coding applied to address our third research question. Specifically, you will find four sheets, one for each functional construct and one for the procedural paradigm. For each sheet, you will find the provided answers together with the categories assigned to them.
Appendix.pdf
This file contains the results of the logistic regression relating the use of map, filter, and reduce functions with the correctness of the change task, not shown in the paper.
FuncConstructs-Statistics.r
This file contains an R script that you can reuse to re-run all the analyses conducted and discussed in the paper.
FuncConstructs-Statistics.ipynb
This file contains the code to re-execute all the analysis conducted in the paper as a notebook.
Leading areas where B2B marketers used marketing automation worldwide 2025
statista.com
Updated Jul 3, 2025
Cite
Statista (2025). Leading areas where B2B marketers used marketing automation worldwide 2025 [Dataset]. https://www.statista.com/statistics/1607438/top-marketing-automation-areas-b2b-marketers/
Explore at:
Dataset updated
Jul 3, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
During a survey among business-to-business (B2B) marketers published in February 2025, approximately ** percent of participants said they utilized marketing automation in email marketing. Social media management followed, mentioned by ** percent of respondents. According to the same study, B2B marketers' top goals when improving marketing automation included better data quality and ideal customer and prospect identification.
Data from: Population Assessment of Tobacco and Health (PATH) Study [United...
icpsr.umich.edu
Updated Jun 27, 2025
Cite
Inter-university Consortium for Political and Social Research [distributor] (2025). Population Assessment of Tobacco and Health (PATH) Study [United States] Restricted-Use Files [Dataset]. http://doi.org/10.3886/ICPSR36231.v42
Explore at:
Unique identifier
https://doi.org/10.3886/ICPSR36231.v42
Dataset updated
Jun 27, 2025
Dataset provided by
Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
License
https://www.icpsr.umich.edu/web/ICPSR/studies/36231/terms
Area covered
United States
Description
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco.

45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohorts who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Dataset 0002 (DS0002) contains the data from the State Design Data. This file contains 7 variables and 82,139 cases. The state identifier in the State Design file reflects the participant's state of residence at the time of selection and recruitment for the PATH Study.

Dataset 1011 (DS1011) contains the data from the Wave 1 Adult Questionnaire. This data file contains 2,021 variables and 32,320 cases. Each of the cases represents a single, completed interview.

Dataset 1012 (DS1012) contains the data from the Wave 1 Youth and Parent Questionnaire. This file contains 1,431 variables and 13,651 cases.

Dataset 1411 (DS1411) contains the Wave 1 State Identifier data for Adults and has 5 variables and 32,320 cases. Dataset 1412 (DS1412) contains the Wave 1 State Identifier data for Youth (and Parents) and has 5 variables and 13,651 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state Federal Information Processing System (FIPS), state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 1, which is also their state of residence at the time of recruitment.

Dataset 1611 (DS1611) contains the Tobacco Universal Product Code (UPC) data from Wave 1. This data file contains 32 variables and 8,601 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 1. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used
PRIEST study anonymised dataset
orda.shef.ac.uk
figshare.shef.ac.uk
Updated May 30, 2023
Cite
Benjamin Thomas; Laura Sutton; Steve Goodacre; Katie Biggs; Amanda Loban (2023). PRIEST study anonymised dataset [Dataset]. http://doi.org/10.15131/shef.data.13194845.v1
Explore at:
Unique identifier
https://doi.org/10.15131/shef.data.13194845.v1
Dataset updated
May 30, 2023
Dataset provided by
The University of Sheffield
Authors
Benjamin Thomas; Laura Sutton; Steve Goodacre; Katie Biggs; Amanda Loban
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The PRIEST study used patient data from the early phases of the COVID-19 pandemic. The PRIEST study provided descriptive statistics of UK patients with suspected COVID-19 in an emergency department cohort, analysis of existing triage tools, and derivation and validation of a COVID-19 specific tool for adults with suspected COVID-19. For more details please go to the study website: https://www.sheffield.ac.uk/scharr/research/centres/cure/priest

Files contained in PRIEST study data repository

Main files include:
PRIEST.csv dataset contains 22445 observations and 119 variables. Data include initial presentation and follow-up, one row per participant.
PRIEST_variables.csv contains variable names, values and brief description.

Additional files include:
Follow-up v4.0 PDF - Blank 30-day follow-up data collection tool
Pandemic Respiratory Infection Form v7 PDF - Blank baseline data collection tool
PRIEST protocol v11.0_17Aug20 PDF - Study protocol
PRIEST_SAP_v1.0_19jun20 PDF - Statistical analysis plan

The PRIEST data sharing plan follows a controlled access model as described in Good Practice Principles for Sharing Individual Participant Data from Publicly Funded Clinical Trials. Data sharing requests should be emailed to priest-study@sheffield.ac.uk. Data sharing requests will be considered carefully as to whether it is necessary to fulfil the purpose of the data sharing request. For approval of a data sharing request an approved ethical review and study protocol must be provided. The PRIEST study was approved by NRES Committee North West - Haydock. REC reference: 12/NW/0303
Promise: A review of protocols of clinical trials to summarise what types of...
osf.io
Updated Dec 9, 2024
Cite
Antonia Marsden; Jack Wilkinson; Sarah Cotterill; Selman Mirza (2024). Promise: A review of protocols of clinical trials to summarise what types of evidence are currently used for demonstrating the promise of an intervention [Dataset]. https://osf.io/4ns9q
Explore at:
Dataset updated
Dec 9, 2024
Dataset provided by
Center For Open Science
Authors
Antonia Marsden; Jack Wilkinson; Sarah Cotterill; Selman Mirza
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Increasingly funders are asking researchers to provide evidence of 'promise' (or 'evidence the intervention can work') when submitting bids for randomised controlled trials (RCTs) or feasibility studies of interventions. To avoid research waste, it is important to establish that an intervention has some promise before proceeding with a series of expensive clinical research studies that may be unwarranted if an intervention has little or no effect on patient outcomes. There is no shared understanding among stakeholders about what constitutes ‘promise’, and what study designs and research methods are appropriate for 'promise' studies. Nor is it clear whether it is appropriate to undertake 'promise' research alongside or separately from feasibility and pilot studies. This uncertainty leads to research waste, as expensive studies may be performed on unpromising treatments, while promising treatments may not be pursued. Guidance is needed on how to evaluate the promise of an intervention, to ensure that effective interventions are introduced in an efficient fashion. Other terms used to describe ‘promise’ are ‘proof of concept’ and ‘evidence of efficacy’.

This review represents the second stage in a project (the Promise study) aiming to develop guidance on how to define, report, and evaluate ‘promise’, and will review protocols for clinical trials to understand what research designs for ‘promise’ are reported by applicants. The first stage was a review to examine what funders are looking for in terms of ‘promise’ (Stage 1 protocol - OSF Registries | Promise: A review of research funder guidance relating to promise of the intervention). Subsequent stages will develop guidance on suitable research designs and methods for 'promise' in clinical research, to guide relevant stakeholders, including researchers, funders and research users.

Part of the scope of the Promise study is to define ‘promise of the intervention’. Here, we adopt an inclusive working definition, which includes any evidence presented by the protocol authors intended to suggest that there is likely to be a benefit of the intervention.
Researchers of Tomorrow, 2009-2011
beta.ukdataservice.ac.uk
Updated 2012
Cite
Joint Information Systems Committee Higher Education Funding Councils; The Research Partnership (2012). Researchers of Tomorrow, 2009-2011 [Dataset]. http://doi.org/10.5255/ukda-sn-7029-1
Explore at:
Unique identifier
https://doi.org/10.5255/ukda-sn-7029-1
Dataset updated
2012
Dataset provided by
DataCite https://www.datacite.org/
UK Data Service https://ukdataservice.ac.uk/
Authors
Joint Information Systems Committee Higher Education Funding Councils; The Research Partnership
Description
Researchers of Tomorrow (RoT) was a three-year study, sponsored by the British Library (BL) and the Joint Information Systems Committee (JISC). The study was carried out by Education for Change, with The Research Partnership. The study tracked a cohort of 70 young 'Generation Y' doctoral students (the children of the 'Baby Boomers'), defined in this study as those born between 1982 and 1994. The students were based at UK colleges and universities. The study used quantitative context-setting surveys and qualitative research to examine the students' information-seeking behaviour, analysing their habits in online and physical research environments and assessing their usage of library and information sources on- and off-line.

The study aimed to establish a benchmark for research behaviour against which subsequent generations of scholars can be measured. Its ultimate aim is to provide guidance to academic institutions, libraries and information specialists on how best to meet the research needs of Generation Y scholars and their immediate successors. The main focus areas of the study were:
mapping emerging research behaviour trends across the main subject disciplines
investigating how doctoral scholars, in particular those from Generation Y, seek information both on- and off-line
measuring the relative use of digital resources and physical resources (including research spaces)
understanding how Generation Y students search for and use digital content for research, and
if and how they use emergent technologies to do so.
Further information on the project may be found on the Education for Change Exploration for Change: Researchers of Tomorrow and the JISC Mapping the needs of a generation webpage.

The UK Data Archive holds data from the three context-setting surveys spanning 2009-2011, but does not currently hold any qualitative materials from the study.
Dataset for a research study on scientific productivity of Polish technical...
mostwiedzy.pl
zip
Updated Jun 22, 2021
Cite
Magdalena Szuflita-Żurawska (2021). Dataset for a research study on scientific productivity of Polish technical universities (Koszalin University of Technology 2016-2020). [Dataset]. http://doi.org/10.34808/db42-xw09
Explore at:
Available download formats: zip (46650)
Unique identifier
https://doi.org/10.34808/db42-xw09
Dataset updated
Jun 22, 2021
Authors
Magdalena Szuflita-Żurawska
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Koszalin
Description
This dataset was created for the purpose of research on scientific productivity at Polish technical universities. The raw data was retrieved in June 2021 by the SciVal benchmarking tool in xlsx format and will be used to create the research profiles of the universities and the underlying data of journal articles. The most common definition of research productivity (used interchangeably with academic productivity or scientific productivity) is the number of publications per researcher. However, a more extended definition of scientific productivity, combining quantity and quality approaches, involves producing quality research represented by publishing academic papers in reputable international journals, citing these papers, gaining research funding (e.g. national and international grants), and collaborating in scientific teams.
Researchers of Tomorrow, 2009-2011 - Dataset - B2FIND
b2find.eudat.eu
Updated Jun 10, 2023
Cite
(2023). Researchers of Tomorrow, 2009-2011 - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/dd49d5e2-c0c7-5232-b042-4a23fcde8eb4
Explore at:
Dataset updated
Jun 10, 2023
Description
Abstract copyright UK Data Service and data collection copyright owner.

Researchers of Tomorrow (RoT) was a three-year study, sponsored by the British Library (BL) and the Joint Information Systems Committee (JISC). The study was carried out by Education for Change, with The Research Partnership. The study tracked a cohort of 70 young 'Generation Y' doctoral students (the children of the 'Baby Boomers'), defined in this study as those born between 1982 and 1994. The students were based at UK colleges and universities. The study used quantitative context-setting surveys and qualitative research to examine the students' information-seeking behaviour, analysing their habits in online and physical research environments and assessing their usage of library and information sources on- and off-line.

The study aimed to establish a benchmark for research behaviour against which subsequent generations of scholars can be measured. Its ultimate aim is to provide guidance to academic institutions, libraries and information specialists on how best to meet the research needs of Generation Y scholars and their immediate successors. The main focus areas of the study were:
mapping emerging research behaviour trends across the main subject disciplines
investigating how doctoral scholars, in particular those from Generation Y, seek information both on- and off-line
measuring the relative use of digital resources and physical resources (including research spaces)
understanding how Generation Y students search for and use digital content for research, and
if and how they use emergent technologies to do so.
Further information on the project may be found on the Education for Change Exploration for Change: Researchers of Tomorrow and the JISC Mapping the needs of a generation webpage.

The UK Data Archive holds data from the three context-setting surveys spanning 2009-2011, but does not currently hold any qualitative materials from the study.

Main Topics: Across the three years, the surveys covered: personal characteristics; doctoral training; training in and techniques used for finding information and research resources; research, technology and information seeking support; institutional research support; the research process; social media; openness and sharing in research; details regarding doctorate; funding.

Volunteer sample. Self-completion. Online web-based survey.
Animals used for study of disease in the European Union 27 2011, by disease...
statista.com
Updated Dec 16, 2013
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Statista (2013). Animals used for study of disease in the European Union 27 2011, by disease class [Dataset]. https://www.statista.com/statistics/634300/animals-used-for-study-of-disease-european-union-eu/
Explore at:
Dataset updated
Dec 16, 2013
Dataset authored and provided by
Statista http://statista.com/
Time period covered
2011
Area covered
Europe
Description
This statistic displays the share of animals used in the study of disease in the European Union- 27 countries in 2011. The majority of animals were used in the study of human diseases, of which ***** percent were used in the study of human cancer.


Cite

Anastasija Nikiforova (2023). Dataset: A Systematic Literature Review on the topic of High-value datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7944424

Dataset: A Systematic Literature Review on the topic of High-value datasets

Explore at:

Dataset updated

Jun 23, 2023

Dataset provided by

Andrea Miletič
Magdalena Ciesielska
Nina Rizun
Anastasija Nikiforova
Charalampos Alexopoulos

License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset contains data collected during a study ("Towards High-Value Datasets determination for data-driven development: a systematic literature review") conducted by Anastasija Nikiforova (University of Tartu), Nina Rizun, Magdalena Ciesielska (Gdańsk University of Technology), Charalampos Alexopoulos (University of the Aegean) and Andrea Miletič (University of Zagreb). It is being made public both to act as supplementary data for the "Towards High-Value Datasets determination for data-driven development: a systematic literature review" paper (pre-print is available in Open Access here -> https://arxiv.org/abs/2305.10234) and in order for other researchers to use these data in their own work.

The protocol is intended for the Systematic Literature Review (SLR) on the topic of High-value Datasets, with the aim of gathering information on how the topic of High-value datasets (HVD) and their determination has been reflected in the literature over the years and what has been found by these studies to date, incl. the indicators used in them, involved stakeholders, data-related aspects, and frameworks. The data in this dataset were collected as a result of the SLR over Scopus, Web of Science, and Digital Government Research library (DGRL) in 2023.

Methodology

To understand how HVD determination has been reflected in the literature over the years and what has been found by these studies to date, all relevant literature covering this topic has been studied. To this end, the SLR was carried out by searching digital libraries covered by Scopus, Web of Science (WoS), and Digital Government Research library (DGRL).

These databases were queried for keywords ("open data" OR "open government data") AND ("high-value data*" OR "high value data*"), which were applied to the article title, keywords, and abstract to limit the number of papers to those, where these objects were primary research objects rather than mentioned in the body, e.g., as a future work. After deduplication, 11 articles were found unique and were further checked for relevance. As a result, a total of 9 articles were further examined. Each study was independently examined by at least two authors.
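The search and deduplication step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout (`title`, `keywords`, `abstract` fields), the function names, and the title-based deduplication rule are hypothetical and are not taken from the study, which does not say how duplicates were detected.

```python
import re

# ("open data" OR "open government data")
OPEN_DATA = re.compile(r"open (government )?data", re.IGNORECASE)
# ("high-value data*" OR "high value data*"); the trailing * is a prefix wildcard
HVD = re.compile(r"high[- ]value data\w*", re.IGNORECASE)


def matches_query(record):
    """Return True if both keyword groups occur in title, keywords, or abstract."""
    # Only these three fields are searched, mirroring the restriction in the text.
    text = " ".join(record.get(f, "") for f in ("title", "keywords", "abstract"))
    return bool(OPEN_DATA.search(text)) and bool(HVD.search(text))


def deduplicate(records):
    """Keep the first record per normalised title (assumed rule, for illustration)."""
    seen = set()
    unique = []
    for record in records:
        key = re.sub(r"\W+", " ", record["title"]).strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical records standing in for exports from the three databases.
records = [
    {"title": "Identifying High-Value Datasets", "keywords": "open data", "abstract": ""},
    {"title": "identifying high-value datasets", "keywords": "open government data", "abstract": ""},
    {"title": "Open data portals", "keywords": "", "abstract": "A high value dataset study."},
    {"title": "Data quality", "keywords": "open data", "abstract": "No mention."},
]
hits = [r for r in records if matches_query(r)]
unique_hits = deduplicate(hits)
```

Restricting the match to the three metadata fields is what keeps out papers that mention HVD only in passing in the body text; the relevance check that follows deduplication remains a manual step.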

To attain the objective of our study, we developed the protocol, where the information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design- related information, (3) quality-related information, (4) HVD determination-related information.

Test procedure

Each study was independently examined by at least two authors: after an in-depth examination of the full text of the article, the structured protocol was filled in for each study. The structure of the survey is available in the supplementary file (see Protocol_HVD_SLR.odt, Protocol_HVD_SLR.docx). The data collected for each study by two researchers were then synthesized into one final version by the third researcher.

Description of the data in this data set

Protocol_HVD_SLR provides the structure of the protocol.
Spreadsheet #1 provides the filled protocol for relevant studies.
Spreadsheet #2 provides the list of results after the search over three indexing databases, i.e. before filtering out irrelevant studies.

The information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design- related information, (3) quality-related information, (4) HVD determination-related information

Descriptive information
1) Article number - a study number, corresponding to the study number assigned in an Excel worksheet
2) Complete reference - the complete source information to refer to the study
3) Year of publication - the year in which the study was published
4) Journal article / conference paper / book chapter - the type of the paper - {journal article, conference paper, book chapter}
5) DOI / Website - a link to the website where the study can be found
6) Number of citations - the number of citations of the article in Google Scholar, Scopus, Web of Science
7) Availability in OA - availability of an article in the Open Access
8) Keywords - keywords of the paper as indicated by the authors
9) Relevance for this study - what is the relevance level of the article for this study? {high / medium / low}

Approach- and research design-related information
10) Objective / RQ - the research objective / aim, established research questions
11) Research method (including unit of analysis) - the methods used to collect data, including the unit of analysis (country, organisation, specific unit that has been analysed, e.g., the number of use-cases, scope of the SLR etc.)
12) Contributions - the contributions of the study
13) Method - whether the study uses a qualitative, quantitative, or mixed methods approach?
14) Availability of the underlying research data - whether there is a reference to the publicly available underlying research data e.g., transcriptions of interviews, collected data, or explanation why these data are not shared?
15) Period under investigation - period (or moment) in which the study was conducted
16) Use of theory / theoretical concepts / approaches - does the study mention any theory / theoretical concepts / approaches? If any theory is mentioned, how is theory used in the study?

Quality- and relevance-related information
17) Quality concerns - whether there are any quality concerns (e.g., limited information about the research methods used)?
18) Primary research object - is the HVD a primary research object in the study? (primary - the paper is focused around the HVD determination, secondary - mentioned but not studied (e.g., as part of discussion, future work etc.))

HVD determination-related information
19) HVD definition and type of value - how is the HVD defined in the article and / or any other equivalent term?
20) HVD indicators - what are the indicators to identify HVD? How were they identified? (components & relationships, "input -> output")
21) A framework for HVD determination - is there a framework presented for HVD identification? What components does it consist of and what are the relationships between these components? (detailed description)
22) Stakeholders and their roles - what stakeholders or actors does HVD determination involve? What are their roles?
23) Data - what data do HVD cover?
24) Level (if relevant) - what is the level of the HVD determination covered in the article? (e.g., city, regional, national, international)

Format of the file: .xls, .csv (for the first spreadsheet only), .odt, .docx

Licenses or restrictions: CC-BY

For more info, see README.txt

