100+ datasets found
  1. Dataset 1: Studies included in literature review

    • catalog.data.gov
    • data.amerigeoss.org
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Dataset 1: Studies included in literature review [Dataset]. https://catalog.data.gov/dataset/dataset-1-studies-included-in-literature-review
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    This dataset contains the results of a literature review of experimental nutrient addition studies to determine which nutrient forms were most often measured in the scientific literature. To obtain a representative selection of relevant studies, we searched Web of Science™ using a search string to target experimental studies in artificial and natural lotic systems while limiting irrelevant papers. We screened the titles and abstracts of returned papers for relevance (experimental studies in streams/stream mesocosms that manipulated nutrients). To supplement this search, we sorted the relevant articles from the Web of Science™ search alphabetically by author and sequentially examined the bibliographies for additional relevant articles (screening titles for relevance, and then screening abstracts of potentially relevant articles) until we had obtained a total of 100 articles. If we could not find a relevant article electronically, we moved to the next article in the bibliography. Our goal was not to be completely comprehensive, but to obtain a fairly large sample of published, peer-reviewed studies from which to assess patterns.

    We excluded any lentic or estuarine studies from consideration and included only studies that used mesocosms mimicking stream systems (flowing water or stream water source) or that manipulated nutrient concentrations in natural streams or rivers. We excluded studies that used nutrient diffusing substrate (NDS) because these manipulate nutrients on substrates and not in the water column. We also excluded studies examining only nutrient uptake, which rely on measuring dissolved nutrient concentrations with the goal of characterizing in-stream processing (e.g., Newbold et al., 1983).

    From the included studies, we extracted or summarized the following information: study type, study duration, nutrient treatments, nutrients measured, inclusion of TN and/or TP response to nutrient additions, and a description of how results were reported in relation to the research-management mismatch, if it existed.

    Below is information on how the search was conducted.

    Search string used for Web of Science advanced search (search conducted on 27 September 2016):

    TS= (stream OR creek OR river* OR lotic OR brook OR headwater OR tributary) AND TS = (mesocosm OR flume OR "artificial stream" OR "experimental stream" OR "nutrient addition") AND TI= (nitrogen OR phosphorus OR nutrient OR enrichment OR fertilization OR eutrophication)

  2. Data from: Where do engineering students really get their information? :...

    • opal.latrobe.edu.au
    • researchdata.edu.au
    pdf
    Updated Mar 13, 2025
    Cite
    Clayton Bolitho (2025). Where do engineering students really get their information? : using reference list analysis to improve information literacy programs [Dataset]. http://doi.org/10.4225/22/59d45f4b696e4
    Available download formats: pdf
    Dataset updated
    Mar 13, 2025
    Dataset provided by
    La Trobe
    Authors
    Clayton Bolitho
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background
    An understanding of the resources which engineering students use to write their academic papers provides information about student behaviour as well as the effectiveness of information literacy programs designed for engineering students. One of the most informative sources of information which can be used to determine the nature of the material that students use is the bibliography at the end of the students’ papers. While reference list analysis has been utilised in other disciplines, few studies have focussed on engineering students or used the results to improve the effectiveness of information literacy programs. Gadd, Baldwin and Norris (2010) found that civil engineering students undertaking a final-year research project cited journal articles more than other types of material, followed by books and reports, with web sites ranked fourth. Several studies, however, have shown that in their first year at least, most students prefer to use Internet search engines (Ellis & Salisbury, 2004; Wilkes & Gurney, 2009).

    PURPOSE
    The aim of this study was to find out exactly what resources undergraduate students studying civil engineering at La Trobe University were using and, in particular, the extent to which students were utilising the scholarly resources paid for by the Library. A secondary purpose of the research was to ascertain whether information literacy sessions delivered to those students had any influence on the resources used, and to investigate ways in which the information literacy component of the unit can be improved to encourage students to make better use of the resources purchased by the Library to support their research.

    DESIGN/METHOD
    The study examined student bibliographies for three civil engineering group projects at the Bendigo Campus of La Trobe University over a two-year period, including two first-year units (CIV1EP – Engineering Practice) and one second-year unit (CIV2GR – Engineering Group Research). All units included a mandatory library session at the start of the project where student groups were required to meet with the relevant faculty librarian for guidance. In each case, the Faculty Librarian highlighted specific resources relevant to the topic, including books, e-books, video recordings, websites and internet documents. The students were also shown tips for searching the Library catalogue, Google Scholar, LibSearch (the LTU Library’s research and discovery tool) and ProQuest Central. Subject-specific databases for civil engineering and science were also referred to. After the final reports for each project had been submitted and assessed, the Faculty Librarian contacted the lecturer responsible for the unit, requesting copies of the student bibliographies for each group. References for each bibliography were then entered into EndNote. The Faculty Librarian grouped them according to various facets, including the name of the unit and the group within the unit; the material type of the item being referenced; and whether the item required a Library subscription to access it. A total of 58 references were collated for the 2010 CIV1EP unit; 237 references for the 2010 CIV2GR unit; and 225 references for the 2011 CIV1EP unit.

    INTERIM FINDINGS
    The initial findings showed that student bibliographies for the three group projects were primarily made up of freely available internet resources which required no library subscription. For the 2010 CIV1EP unit, all 58 resources used were freely available on the Internet. For the 2011 CIV1EP unit, 28 of the 225 resources used (12.44%) required a Library subscription or purchase for access, while the second-year students (CIV2GR) used a greater variety of resources, with 71 of the 237 resources used (29.96%) requiring a Library subscription or purchase for access. The results suggest that the library sessions had little or no influence on the 2010 CIV1EP group, but the sessions may have assisted students in the 2011 CIV1EP and 2010 CIV2GR groups to find books, journal articles and conference papers, which were all represented in their bibliographies. (A small illustrative tally of this kind appears in the sketch after the references below.)

    FURTHER RESEARCH
    The next step in the research is to investigate ways to increase the representation of scholarly references (found by resources other than Google) in student bibliographies. It is anticipated that such a change would lead to an overall improvement in the quality of the student papers. One way of achieving this would be to make it mandatory for students to include a specified number of journal articles, conference papers, or scholarly books in their bibliographies. It is also anticipated that embedding La Trobe University’s Inquiry/Research Quiz (IRQ) using a constructively aligned approach will further enhance the students’ research skills and increase their ability to find suitable scholarly material which relates to their topic. This has already been done successfully (Salisbury, Yager, & Kirkman, 2012).

    CONCLUSIONS & CHALLENGES
    The study shows that most students rely heavily on the free Internet for information. Students don’t naturally use Library databases or scholarly resources such as Google Scholar to find information without encouragement from their teachers, tutors and/or librarians. It is acknowledged that the use of scholarly resources doesn’t automatically lead to a high-quality paper. Resources must be used appropriately, and students also need to have the skills to identify and synthesise key findings in the existing literature and relate these to their own paper. Ideally, students should be able to see the benefit of using scholarly resources in their papers, and continue to seek these out even when it’s not a specific assessment requirement, though it can’t be assumed that this will be the outcome.

    REFERENCES
    Ellis, J., & Salisbury, F. (2004). Information literacy milestones: Building upon the prior knowledge of first-year students. Australian Library Journal, 53(4), 383-396.
    Gadd, E., Baldwin, A., & Norris, M. (2010). The citation behaviour of civil engineering students. Journal of Information Literacy, 4(2), 37-49.
    Salisbury, F., Yager, Z., & Kirkman, L. (2012). Embedding Inquiry/Research: Moving from a minimalist model to constructive alignment. Paper presented at the 15th International First Year in Higher Education Conference, Brisbane. Retrieved from http://www.fyhe.com.au/past_papers/papers12/Papers/11A.pdf
    Wilkes, J., & Gurney, L. J. (2009). Perceptions and applications of information literacy by first year applied science students. Australian Academic & Research Libraries, 40(3), 159-171.
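    For illustration, a minimal sketch of the kind of tally reported under INTERIM FINDINGS, computing the share of subscription-only references per unit (the input file name and its unit/requires_subscription columns are hypothetical; the study itself grouped references in EndNote):

        # Minimal sketch: share of references requiring a Library subscription, per unit.
        # Hypothetical input: bibliographies.csv with columns "unit" (e.g. "2011 CIV1EP"),
        # "material_type" and "requires_subscription" ("yes"/"no"), exported from EndNote.
        import csv
        from collections import defaultdict

        totals = defaultdict(int)
        subscribed = defaultdict(int)

        with open("bibliographies.csv", newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                totals[row["unit"]] += 1
                if row["requires_subscription"].strip().lower() == "yes":
                    subscribed[row["unit"]] += 1

        for unit in sorted(totals):
            pct = 100 * subscribed[unit] / totals[unit]
            print(f"{unit}: {subscribed[unit]} of {totals[unit]} references ({pct:.2f}%) required a subscription")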

  3. Dataset: Users' Software Description vs. Feedback: What Users Write About...

    • data.niaid.nih.gov
    Updated Nov 19, 2022
    Cite
    Alexander Specht (2022). Dataset: Users' Software Description vs. Feedback: What Users Write About Existing Software [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7335224
    Dataset updated
    Nov 19, 2022
    Dataset provided by
    Martin Obaidi
    Alexander Specht
    Kurt Schneider
    Michael Anders
    Barbara Paech
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In 2022, we conducted a study on the awareness of users concerning existing software. For this, we focused on the description and feedback given by the 100 participating users.

    This data set contains the 100 participants' descriptions and feedback about the Komoot hiking app. Both features and software aspects were manually coded in the data set.

  4. Collection of example datasets used for the book - R Programming -...

    • figshare.com
    txt
    Updated Dec 4, 2023
    Cite
    Kingsley Okoye; Samira Hosseini (2023). Collection of example datasets used for the book - R Programming - Statistical Data Analysis in Research [Dataset]. http://doi.org/10.6084/m9.figshare.24728073.v1
    Available download formats: txt
    Dataset updated
    Dec 4, 2023
    Dataset provided by
    figshare
    Authors
    Kingsley Okoye; Samira Hosseini
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This book is written for statisticians, data analysts, programmers, researchers, teachers, students, professionals, and general consumers on how to perform different types of statistical data analysis for research purposes using the R programming language. R is an open-source, object-oriented programming language with a development environment (IDE) called RStudio for computing statistics and graphical displays through data manipulation, modelling, and calculation. R packages and supported libraries provide a wide range of functions for programming and analyzing data. Unlike many existing statistical software packages, R has the added benefit of allowing users to write more efficient code by using command-line scripting and vectors. It has several built-in functions and libraries that are extensible and allow users to define their own (customized) functions for how they expect the program to behave while handling the data, which can also be stored in the simple object system.

    For all intents and purposes, this book serves as both a textbook and a manual for R statistics, particularly in academic research, data analytics, and computer programming, targeted to help inform and guide the work of R users and statisticians. It provides information about different types of statistical data analysis and methods, and the best scenarios for use of each case in R. It gives a hands-on, step-by-step practical guide on how to identify and conduct the different parametric and non-parametric procedures. This includes a description of the different conditions or assumptions that are necessary for performing the various statistical methods or tests, and how to understand the results of the methods. The book also covers the different data formats and sources, and how to test for reliability and validity of the available datasets. Different research experiments, case scenarios and examples are explained in this book. It is the first book to provide a comprehensive description and step-by-step practical hands-on guide to carrying out the different types of statistical analysis in R, particularly for research purposes, with examples: from how to import and store datasets in R as objects, how to code and call the methods or functions for manipulating the datasets or objects, factorization, and vectorization, to better reasoning, interpretation, and storage of the results for future use, and graphical visualizations and representations. It thus represents a congruence of statistics and computer programming for research.

  5. Map of articles about "Teaching Open Science"

    • zenodo.org
    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Isabel Steinhardt; Isabel Steinhardt (2020). Map of articles about "Teaching Open Science" [Dataset]. http://doi.org/10.5281/zenodo.3371415
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Isabel Steinhardt
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This description is part of the blog post "Systematic Literature Review of teaching Open Science" https://sozmethode.hypotheses.org/839

    In my opinion, we do not pay enough attention to teaching Open Science in higher education. Therefore, I designed a seminar to teach students the practices of Open Science by doing qualitative research. About this seminar, I wrote the article "Teaching Open Science and qualitative methods". For that article, I started to review the literature on "Teaching Open Science". The result of my literature review is that certain aspects of Open Science are used for teaching. However, Open Science with all its aspects (Open Access, Open Data, Open Methodology, Open Science Evaluation and Open Science Tools) is not an issue in publications about teaching.

    Based on this insight, I have started a systematic literature review. I realized quickly that I need help to analyse and interpret the articles and to evaluate my preliminary findings. Especially different disciplinary cultures of teaching different aspects of Open Science are challenging, as I myself, as a social scientist, do not have enough insight to be able to interpret the results correctly. Therefore, I would like to invite you to participate in this research project!

    I am now looking for people who would like to join a collaborative process to further explore and write the systematic literature review on "Teaching Open Science", because I want to turn this project into a Massive Open Online Paper (MOOP). According to the 10 rules of Tennant et al. (2019) on MOOPs, it is crucial to find a core group that is enthusiastic about the topic. Therefore, I am looking for people who are interested in creating the structure of the paper and writing the paper together with me. I am also looking for people who want to search for and review literature or evaluate the literature I have already found. Together with the interested persons, I would then define the rules for the project (cf. Tennant et al. 2019). So if you are interested in contributing to the further search for articles and/or to enhancing the interpretation and writing of results, please get in touch. For everyone interested in contributing, the list of articles collected so far is freely accessible at Zotero: https://www.zotero.org/groups/2359061/teaching_open_science. The figure shown below provides a first overview of my ongoing work. I created the figure with the free software yEd and uploaded the file to Zenodo, so everyone can download and work with it:

    To make transparent what I have done so far, I will first introduce what a systematic literature review is. Secondly, I describe the decisions I made to start with the systematic literature review. Third, I present the preliminary results.

    Systematic literature review – an Introduction

    Systematic literature reviews “are a method of mapping out areas of uncertainty, and identifying where little or no relevant research has been done.” (Petticrew/Roberts 2008: 2). Fink defines the systematic literature review as a “systemic, explicit, and reproducible method for identifying, evaluating, and synthesizing the existing body of completed and recorded work produced by researchers, scholars, and practitioners.” (Fink 2019: 6). The aim of a systematic literature review is to surpass the subjectivity of a researcher’s search for literature. However, there can never be an objective selection of articles. This is because the researcher has, for example, already made a preselection by deciding on search strings such as “Teaching Open Science”. In this respect, transparency is the core criterion for a high-quality review.

    In order to achieve high quality and transparency, Fink (2019: 6-7) proposes the following seven steps:

    1. Selecting a research question.
    2. Selecting the bibliographic database.
    3. Choosing the search terms.
    4. Applying practical screening criteria.
    5. Applying methodological screening criteria.
    6. Doing the review.
    7. Synthesizing the results.

    I have adapted these steps for the “Teaching Open Science” systematic literature review. In the following, I will present the decisions I have made.

    Systematic literature review – decisions I made

    1. Research question: I am interested in the following research questions: How is Open Science taught in higher education? Is Open Science taught in its full range with all aspects like Open Access, Open Data, Open Methodology, Open Science Evaluation and Open Science Tools? Which aspects are taught? Are there disciplinary differences as to which aspects are taught and, if so, why are there such differences?
    2. Databases: I started my search at the Directory of Open Access Journals (DOAJ). “DOAJ is a community-curated online directory that indexes and provides access to high quality, open access, peer-reviewed journals.” (https://doaj.org/) Secondly, I used the Bielefeld Academic Search Engine (BASE). BASE is operated by Bielefeld University Library and is “one of the world’s most voluminous search engines especially for academic web resources” (base-search.net). Both platforms are non-commercial and focus on Open Access publications, and thus differ from commercial publication databases such as Web of Science and Scopus. For this project, I deliberately decided against commercial providers and against restricting the search to indexed journals, because my explicit aim was to find articles that are open in the context of Open Science.
    3. Search terms: To identify articles about teaching Open Science, I used the following search strings: “teaching open science” OR teaching “open science” OR teach “open science”. The topic search looked for the search strings in the title, abstract and keywords of articles. Since these are very narrow search terms, I decided to broaden the method: I searched the reference lists of all articles returned by this search for further relevant literature, and using Google Scholar I checked which other authors cited the articles in the sample. If the articles found in this way met my methodological criteria, I included them in the sample and likewise looked through their reference lists and citations on Google Scholar. This process has not yet been completed.
    4. Practical screening criteria: I have included English and German articles in the sample, as I speak these languages (articles in other languages are very welcome, if there are people who can interpret them!). Only journal articles, articles in edited volumes, working papers and conference papers from proceedings were included in the sample. I checked whether the journals were predatory journals – such articles were not included. I did not include blog posts, books or articles from newspapers. I only included articles whose full texts are accessible via my institution (University of Kassel). As a result, recently published articles at Elsevier could not be included because of the special situation in Germany regarding Project DEAL (https://www.projekt-deal.de/about-deal/). For articles that are not freely accessible, I checked whether there is an accessible version in a repository or whether a preprint is available. If this was not the case, the article was not included. I started the analysis in May 2019.
    5. Methodological criteria: The method described above to check the reference lists has the problem of subjectivity. Therefore, I hope that other people will be interested in this project and evaluate my decisions. I have used the following criteria as the basis for my decisions: First, the articles must focus on teaching. For example, this means that articles must describe how a course was designed and carried out. Second, at least one aspect of Open Science has to be addressed. The aspects can be very diverse (FOSS, repositories, wiki, data management, etc.) but have to comply with the principles of openness. This means, for example, I included an article when it deals with the use of FOSS in class and addresses the aspects of openness of FOSS. I did not include articles when the authors describe the use of a particular free and open source software for teaching but did not address the principles of openness or re-use.
    6. Doing the review: Due to the methodical approach of going through the reference lists, it is possible to create a map of how the articles relate to each other. This results in thematic clusters and connections between clusters. The starting point for the map was four articles (Cook et al. 2018; Marsden, Thompson, and Plonsky 2017; Petras et al. 2015; Toelch and Ostwald 2018) that I found using the databases and criteria described above. I used yEd to generate the network. “yEd is a powerful desktop application that can be used to quickly and effectively generate high-quality diagrams.” (https://www.yworks.com/products/yed) In the network, arrows show which articles are cited in an article and which articles are cited by others as well. In addition, I made an initial rough classification of the content using colours, based on the contents mentioned in the articles’ title and abstract. This rough content classification requires a more exact, i.e., content-based subdivision and

  6. A Messy Handwriting Dataset with Student Crossouts and Corrections...

    • researchdata.edu.au
    • research-repository.rmit.edu.au
    Updated Nov 20, 2023
    Cite
    Hiqmat Nisa (2023). A Messy Handwriting Dataset with Student Crossouts and Corrections (Line-version) [Dataset]. http://doi.org/10.25439/RMT.24419986.V1
    Dataset updated
    Nov 20, 2023
    Dataset provided by
    RMIT University, Australia
    Authors
    Hiqmat Nisa
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This is the line version of the Student Messy Handwritten Dataset (SMHD) (Nisa, Hiqmat; Thom, James; Ciesielski, Vic; Tennakoon, Ruwan (2023). Student Messy Handwritten Dataset (SMHD). RMIT University. Dataset. https://doi.org/10.25439/rmt.24312715.v1).

    Within the central repository, there are subfolders in which each document has been converted into lines. All images are in .png format. In the main folder there are three .txt files.

    1) SMHD.txt contains all the line-level transcriptions in the form
    image name, threshold value, label
    for example:
    0001-000,178 Bombay Phenotype :-

    2) SMHD-Cross-outsandInsertions.txt lists all the line images from the dataset that contain crossed-out and inserted text.

    3) Class_Notes_SMHD.txt contains more complex cases with cross-outs, insertions and overwriting. This can be used as a test set. The images in this file are not included in SMHD.txt.

    In the transcription files, any crossed-out content is denoted by the '#' symbol, facilitating the easy identification of files with or without such modifications.
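    For illustration, a minimal sketch of how the transcription file described above could be parsed (it assumes the "image name, threshold value, label" layout and the '#' convention shown here; adjust the splitting if the actual files differ):

        # Minimal sketch: parse SMHD.txt-style lines and count lines with crossed-out text.
        from pathlib import Path

        def parse_smhd_line(line):
            """Split one transcription line into (image_name, threshold, label)."""
            name, rest = line.rstrip("\n").split(",", 1)    # image name comes before the first comma
            threshold, label = rest.split(" ", 1)           # threshold value, then the label text
            return name, int(threshold), label

        lines = [parse_smhd_line(l) for l in Path("SMHD.txt").read_text(encoding="utf-8").splitlines() if l.strip()]
        crossed_out = [name for name, _, label in lines if "#" in label]   # '#' marks crossed-out content
        print(f"{len(crossed_out)} of {len(lines)} lines contain crossed-out text")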

    Dataset Description:

    We have incorporated contributions from more than 500 students to construct the dataset. Handwritten examination papers are primary sources in academic institutes to assess student learning. In our experience as academics, we have found that student examination papers tend to be messy, with all kinds of insertions and corrections, and would thus be a great source of documents for investigating handwritten text recognition (HTR) in the wild. Unfortunately, student examination papers are not available due to ethical considerations. So, we created an exam-like situation to collect handwritten samples from students. The corpus of the collected data is academic-based. Usually, in academia, handwritten papers have lines in them. For this purpose, we drew lines using light colors on white paper. The height of a line is 1.5 pt and the space between two lines is 40 pt. The filled handwritten documents were scanned at a resolution of 300 dpi at a grey-level resolution of 8 bits.

    Collection Process: The collection process was done in four different ways. In the first exercise, we asked participants to summarize a given text in their own words. We called it a summary-based dataset. In the summary writing task, we included 60 undergraduate students studying the English language as a subject. After getting their consent, we distributed printed text articles and we asked them to choose one article, read it and summarize it in a paragraph in 15 minutes. The corpus of the printed text articles given to the participants was collected from the Internet on different topics. The articles were related to current political situations, daily life activities, and the Covid-19 pandemic.

    In the second exercise, we asked participants to write an essay from a given list of topics, or they could write on any topic of their choice. We called it an essay-based dataset. This dataset was collected from 250 high school students. We gave them 30 minutes to think about the topic and write for this task.

    In the third exercise, we selected participants from different subjects and asked them to write on a topic from their current study. We called it a subject-based dataset. For this study, we used undergraduate students from different subjects, including 33 students from Mathematics, 71 from Biological Sciences, 24 from Environmental Sciences, 17 from Physics, and more than 84 from English studies.

    Finally, for the class-notes dataset, we collected class notes from almost 31 students on the same topic. We asked students to take notes of every possible sentence the speaker delivered during the lecture. After finishing the lesson in almost 10 minutes, we asked students to recheck their notes and compare them with other classmates. We did not impose any time restrictions for rechecking. We observed more cross-outs and corrections in class-notes compared to summary-based and academic-based collections.

    In all four exercises, we did not impose any rules on them, for example, spacing, usage of a pen, etc. We asked them to cross out the text if it seemed inappropriate. Although usually writers made corrections in a second read, we also gave an extra 5 minutes for correction purposes.

  7. Student Messy Handwritten Dataset (SMHD)

    • research-repository.rmit.edu.au
    • researchdata.edu.au
    application/x-rar
    Updated Oct 16, 2023
    Cite
    Hiqmat Nisa; James Thom; Vic Ciesielski; Ruwan Tennakoon (2023). Student Messy Handwritten Dataset (SMHD) [Dataset]. http://doi.org/10.25439/rmt.24312715.v1
    Available download formats: application/x-rar
    Dataset updated
    Oct 16, 2023
    Dataset provided by
    RMIT University
    Authors
    Hiqmat Nisa; James Thom; Vic Ciesielski; Ruwan Tennakoon
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Within the central repository, there are subfolders for different categories. Each of these subfolders contains both images and their corresponding transcriptions, saved as .txt files. As an example, the folder 'summary-based-0001-0055' encompasses 55 handwritten image documents pertaining to the summary task, with the images ranging from 0001 to 0055 within this category. In the transcription files, any crossed-out content is denoted by the '#' symbol, facilitating the easy identification of files with or without such modifications. Moreover, there exists a document detailing the transcription rules utilized for transcribing the dataset. Following these guidelines will enable the seamless addition of more images.

    Dataset Description: We have incorporated contributions from more than 500 students to construct the dataset. Handwritten examination papers are primary sources in academic institutes to assess student learning. In our experience as academics, we have found that student examination papers tend to be messy, with all kinds of insertions and corrections, and would thus be a great source of documents for investigating handwritten text recognition (HTR) in the wild. Unfortunately, student examination papers are not available due to ethical considerations. So, we created an exam-like situation to collect handwritten samples from students. The corpus of the collected data is academic-based. Usually, in academia, handwritten papers have lines in them. For this purpose, we drew lines using light colors on white paper. The height of a line is 1.5 pt and the space between two lines is 40 pt. The filled handwritten documents were scanned at a resolution of 300 dpi at a grey-level resolution of 8 bits.

    Collection Process: The collection process was done in four different ways. In the first exercise, we asked participants to summarize a given text in their own words. We called it a summary-based dataset. In the summary writing task, we included 60 undergraduate students studying the English language as a subject. After getting their consent, we distributed printed text articles and asked them to choose one article, read it and summarize it in a paragraph in 15 minutes. The corpus of the printed text articles given to the participants was collected from the Internet on different topics. The articles were related to current political situations, daily life activities, and the Covid-19 pandemic.

    In the second exercise, we asked participants to write an essay from a given list of topics, or they could write on any topic of their choice. We called it an essay-based dataset. This dataset was collected from 250 high school students. We gave them 30 minutes to think about the topic and write for this task.

    In the third exercise, we selected participants from different subjects and asked them to write on a topic from their current study. We called it a subject-based dataset. For this study, we used undergraduate students from different subjects, including 33 students from Mathematics, 71 from Biological Sciences, 24 from Environmental Sciences, 17 from Physics, and more than 84 from English studies.

    Finally, for the class-notes dataset, we collected class notes from almost 31 students on the same topic. We asked students to take notes of every possible sentence the speaker delivered during the lecture. After finishing the lesson in almost 10 minutes, we asked students to recheck their notes and compare them with other classmates. We did not impose any time restrictions for rechecking. We observed more cross-outs and corrections in class-notes compared to summary-based and academic-based collections.

    In all four exercises, we did not impose any rules on them, for example, spacing, usage of a pen, etc. We asked them to cross out the text if it seemed inappropriate. Although usually writers made corrections in a second read, we also gave an extra 5 minutes for correction purposes.

  8. Visitor Intake Processing Re-write Management Information

    • catalog.data.gov
    • datasets.ai
    Updated Jul 4, 2025
    Cite
    Social Security Administration (2025). Visitor Intake Processing Re-write Management Information [Dataset]. https://catalog.data.gov/dataset/visitor-intake-processing-re-write-management-information
    Dataset updated
    Jul 4, 2025
    Dataset provided by
    Social Security Administration (http://ssa.gov/)
    Description

    The data store houses detailed information pertaining to visitors' wait times, visits, calls, and other customer relationship information relating to VIPR and CHIP. The data store is used for ad-hoc querying for management information.

  9. Data from: Write Up! A Study of Copyright Information on Library-Published...

    • dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Schlosser, Melanie (2023). Data from: Write Up! A Study of Copyright Information on Library-Published Journals [Dataset]. http://doi.org/10.7910/DVN/R36SVZ
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Schlosser, Melanie
    Description

    This data corresponds to the article "Write Up! A Study of Copyright Information on Library-Published Journals," and covers the copyright and licensing policies of library-published scholarly journals.

  10. writingPromptAug

    • huggingface.co
    Updated Aug 31, 2023
    Cite
    Fabricio Braz (2023). writingPromptAug [Dataset]. https://huggingface.co/datasets/fabraz/writingPromptAug
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 31, 2023
    Authors
    Fabricio Braz
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for Writing Prompt Augmentation Dataset

      Dataset Summary

    Writing Prompt Augmentation Dataset was built to expand samples of FAIR Writing Prompt Dataset, for feeding Open Assistant.

      Languages

    English

      Dataset Structure

      Data Instances

    {"splitLineIndex":36888, "text":"User: write me a story about: Most responses on here have a twist , and all of them are fictional . Show us a piece of your actual life ; let the reader experience… See the full description on the dataset page: https://huggingface.co/datasets/fabraz/writingPromptAug.

  11. Open Data Dictionary Template Individual

    • opendata.dc.gov
    • catalog.data.gov
    Updated Jan 5, 2023
    Cite
    City of Washington, DC (2023). Open Data Dictionary Template Individual [Dataset]. https://opendata.dc.gov/documents/cb6a686b1e344eeb8136d0103c942346
    Dataset updated
    Jan 5, 2023
    Dataset authored and provided by
    City of Washington, DC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This template covers section 2.5 Resource Fields: Entity and Attribute Information of the Data Discovery Form cited in the Open Data DC Handbook (2022). It completes documentation elements that are required for publication. Each field column (attribute) in the dataset needs a description clarifying the contents of the column. Data originators are encouraged to enter the code values (domains) of the column to help end-users translate the contents of the column where needed, especially when lookup tables do not exist.

  12. Data from: So how do i describe it? - a discursive rhetorical model of the...

    • scielo.figshare.com
    tiff
    Updated Jun 1, 2023
    Cite
    Enrique Sologuren; René Venegas (2023). So how do i describe it? - a discursive rhetorical model of the project report genre in spanish: an approach to academic writing in civil engineering [Dataset]. http://doi.org/10.6084/m9.figshare.19969868.v1
    Available download formats: tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    SciELO journals
    Authors
    Enrique Sologuren; René Venegas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT: The project report is a genre widely used in the processes of academic training in computer science civil engineering. Students write this genre in the university classroom, guided by diverse academic and professional purposes. Despite its relevance, empirical rhetorical-discursive descriptions that highlight the value of student writing are still scarce in Spanish. We therefore describe the rhetorical-discursive organization of the project report genre in this subdiscipline to gain more knowledge about written production at the rhetorical level. For this purpose, we follow a methodological design anchored in Genre Analysis with Swalesian roots, and we analyze a corpus of 58 texts. This method helped us to determine the macromoves, moves and steps of the genre, its communicative functions and its text characteristics. The resulting rhetorical model shows 16 moves and 36 steps. The model demonstrates the high stability of this genre, its mesogeneric nature and its role as an interconnection between academic and professional discourse. Finally, implications for genre theory, for the description of the Spanish language and for genre pedagogy emerge from the results.

  13. Data from: Demography, education, and research trends in the...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 17, 2024
    Cite
    Sampson, Laura (2024). Demography, education, and research trends in the interdisciplinary field of disease ecology [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5812145
    Dataset updated
    Jul 17, 2024
    Dataset provided by
    Becker, Daniel J
    Forbes, Kristian M
    Sampson, Laura
    Brandell, Ellen E
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description of Supporting Files

    Demography, education, and research trends in the interdisciplinary field of disease ecology

    Ellen E. Brandell, Daniel J. Becker, Laura Sampson, Kristian M. Forbes

    TopArticles_Inclusion.xlsx

    This Excel file provides a list of influential articles that were written in by survey participants at least two times.

    Sheet “table”: just tabular information

    Sheet “withNotes”: includes notes about data, number of citations from survey participants, and percent inclusion calculations.

    Columns are:

    ‘INCLUDED’: if the article appeared in the corpus (1) or not (0)

    ‘COUNT’: the number of times survey participants wrote in the article

    ‘ARTICLE’: article citation

    Percent of articles included in the corpus is calculated for 4 or more write-ins, 3 write-ins, 2 write-ins, and across all articles written in twice (see the sketch below).
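    As an illustration, a minimal sketch of how the percent-inclusion figures could be recomputed from the "table" sheet (the INCLUDED and COUNT columns follow the description above; treating the write-in counts as thresholds is an assumption):

        # Minimal sketch: recompute percent inclusion by write-in count from TopArticles_Inclusion.xlsx.
        import pandas as pd

        df = pd.read_excel("TopArticles_Inclusion.xlsx", sheet_name="table")

        for threshold in (4, 3, 2):
            subset = df[df["COUNT"] >= threshold]      # articles written in at least `threshold` times (assumed)
            pct = 100 * subset["INCLUDED"].mean()      # INCLUDED is coded 1/0, so the mean is the share in the corpus
            print(f">= {threshold} write-ins: {pct:.1f}% of {len(subset)} articles appear in the corpus")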

    IRB_Correspondence_STUDY00010582.pdf

    Institutional Review Board correspondence and approval from Pennsylvania State University. Survey response data may be available upon request from the corresponding author. To protect participants, any potentially identifying information will be removed prior to filling a request. See the online Supporting Information for this article for extensive reporting of survey results prior to a request.

    FullSurvey.pdf

    A PDF of the full survey form.

    CorpusFrequencyAnalysis.ipynb

    This is the Python script used for corpus organization and the topic detection analysis. It includes some plot generation.

  14. Data articles in journals

    • data.niaid.nih.gov
    Updated Sep 22, 2023
    Cite
    Balsa-Sanchez, Carlota (2023). Data articles in journals [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3753373
    Dataset updated
    Sep 22, 2023
    Dataset provided by
    Loureiro, Vanesa
    Balsa-Sanchez, Carlota
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Version: 5

    Authors: Carlota Balsa-Sánchez, Vanesa Loureiro

    Date of data collection: 2023/09/05

    General description: The publication of datasets according to the FAIR principles could be achieved by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

    • data_articles_journal_list_v5.xlsx: full list of 140 academic journals in which data papers or/and software papers could be published
    • data_articles_journal_list_v5.csv: full list of 140 academic journals in which data papers or/and software papers could be published

    Relationship between files: both files have the same information. Two different formats are offered to improve reuse

    Type of version of the dataset: final processed version

    Versions of the files: 5th version - Information updated: number of journals, URL, document types associated to a specific journal.

    Version: 4

    Authors: Carlota Balsa-Sánchez, Vanesa Loureiro

    Date of data collection: 2022/12/15

    General description: The publication of datasets according to the FAIR principles could be achieved by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

    • data_articles_journal_list_v4.xlsx: full list of 140 academic journals in which data papers or/and software papers could be published
    • data_articles_journal_list_v4.csv: full list of 140 academic journals in which data papers or/and software papers could be published

    Relationship between files: both files have the same information. Two different formats are offered to improve reuse

    Type of version of the dataset: final processed version

    Versions of the files: 4th version - Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types - Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR), Scopus and Web of Science (WOS), Journal Master List.

    Version: 3

    Authors: Carlota Balsa-Sánchez, Vanesa Loureiro

    Date of data collection: 2022/10/28

    General description: The publication of datasets according to the FAIR principles could be achieved by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

    • data_articles_journal_list_v3.xlsx: full list of 124 academic journals in which data papers or/and software papers could be published
    • data_articles_journal_list_3.csv: full list of 124 academic journals in which data papers or/and software papers could be published

    Relationship between files: both files have the same information. Two different formats are offered to improve reuse

    Type of version of the dataset: final processed version

    Versions of the files: 3rd version - Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types - Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR).

    Erratum - Data articles in journals Version 3:

    Botanical Studies -- ISSN 1999-3110 -- JCR (JIF) Q2
    Data -- ISSN 2306-5729 -- JCR (JIF) n/a
    Data in Brief -- ISSN 2352-3409 -- JCR (JIF) n/a

    Version: 2

    Author: Francisco Rubio, Universitat Politècnica de València.

    Date of data collection: 2020/06/23

    General description: The publication of datasets according to the FAIR principles could be achieved by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

    • data_articles_journal_list_v2.xlsx: full list of 56 academic journals in which data papers or/and software papers could be published
    • data_articles_journal_list_v2.csv: full list of 56 academic journals in which data papers or/and software papers could be published

    Relationship between files: both files have the same information. Two different formats are offered to improve reuse

    Type of version of the dataset: final processed version

    Versions of the files: 2nd version - Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types - Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Scimago Journal and Country Rank (SJR)

    Total size: 32 KB

    Version 1: Description

    This dataset contains a list of journals that publish data articles, code, software articles and database articles.

    The search strategy in DOAJ and Ulrichsweb was to search for the word "data" in the title of the journals.

    Acknowledgements: Xaquín Lores Torres for his invaluable help in preparing this dataset.

  15. Writing Skills for Undergraduate Students in Fiji: Tackling Educational...

    • researchdata.edu.au
    Updated Mar 1, 2023
    Cite
    Iyengar Arvind; Ndhlovu Finex; Goundar Prashneel; Ravisan Goundar Prashneel; Ravisan Goundar Prashneel; Prashneel Goundar; Finex Ndhlovu; Arvind Iyengar (2023). Writing Skills for Undergraduate Students in Fiji: Tackling Educational Inequalities, Facilitating Epistemic Access - Dataset [Dataset]. http://doi.org/10.25952/2TMF-5M80
    Dataset updated
    Mar 1, 2023
    Dataset provided by
    University of New England, Australia
    University of New England
    Authors
    Iyengar Arvind; Ndhlovu Finex; Goundar Prashneel; Ravisan Goundar Prashneel; Prashneel Goundar; Finex Ndhlovu; Arvind Iyengar
    Description

    The fieldwork component of this study comprised academic English language tests with 120 participants and 30 open-ended in-depth interviews with first-year undergraduate university students in Fiji. To this end, the fieldwork involved administering academic English language tests, using writing interventions, and using these to evaluate educational inequalities faced by the students. This process was aided by the use of open-ended questions. The participants were required to sit two academic English language writing tests, one at the beginning of their first year and one at the end of the first year. This research was carried out as a longitudinal study by administering a writing test in the second week of the first year (beginning) of their university program, followed by a second test at the end of their first year, namely, in the final week of classes in semester two of the year. Each test lasted one hour. There were three writing interventions, and feedback was given throughout the year-long study. The writing interventions were academic essays, paragraph writing and summary writing. Tasks in the writing interventions required students to write and submit their work to the researcher in their leisure time. I provided feedback on each of the three interventions individually to the cohort after assessing them throughout the year. Feedback involved highlighting nonstandard forms of writing style or grammar, discussing ways of improving the writing pieces and suggesting resources on academic writing. A total of 30 interviews (30-40 minutes each) were conducted at the end of the participants' first year via Zoom and on Skype. Volunteers from the same cohort of 120 participants were recruited at random based on their performance in the tests; both high performers and low performers were interviewed. The interviews were conducted after the end of the students' one-year university program.

  16. Data outcomes mapping exercise

    • figshare.com
    docx
    Updated Apr 21, 2016
    Cite
    Heather Coates (2016). Data outcomes mapping exercise [Dataset]. http://doi.org/10.6084/m9.figshare.3168763.v1
    Available download formats: docx
    Dataset updated
    Apr 21, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Heather Coates
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This exercise is used in various workshops and trainings to help students, particularly graduate students, think through the ways in which their research design will affect data collection, data management, analysis, write-up/reporting, and reuse.

  17. Sharing data on request when contacting a corresponding author versus all...

    • osf.io
    url
    Updated Jun 3, 2024
    Cite
    Petar Dolonga; Ruzica Bojcic; Mirko Gabelica; Livia Puljak (2024). Sharing data on request when contacting a corresponding author versus all authors: randomized controlled trial [Dataset]. http://doi.org/10.17605/OSF.IO/SG8V7
    Available download formats: url
    Dataset updated
    Jun 3, 2024
    Dataset provided by
    Center for Open Science (https://cos.io/)
    Authors
    Petar Dolonga; Ruzica Bojcic; Mirko Gabelica; Livia Puljak
    Description

    A data availability statement (DAS) is part of a research manuscript that contains information about where the raw data from the study can be accessed. Many journals do not require authors to write a DAS, and in that case most authors will not include such a statement [1]. In journals that require authors to write a DAS in their manuscripts, most authors state in the DAS that their data are available on request from the corresponding author. However, it has been reported that the overwhelming majority of those corresponding authors do not even respond to a message with the data request, and very few share their data [2].

    Other than genuinely not wanting to share their data, potential reasons for not even answering such messages are that the message requesting data ended up in a spam folder, that the authors are too busy, or that other team member(s) hold the raw data.

    The aim of this study is to assess whether more raw data can be accessed if the data sharing request is sent to all authors versus only requesting data from the corresponding author.

  18. Data from: Developing Students’ Statistical Expertise Through Writing in the...

    • tandf.figshare.com
    pdf
    Updated Jun 6, 2025
    Cite
    Laura S. DeLuca; Alex Reinhart; Gordon Weinberg; Michael Laudenbach; Sydney Miller; David West Brown (2025). Developing Students’ Statistical Expertise Through Writing in the Age of AI [Dataset]. http://doi.org/10.6084/m9.figshare.28883205.v2
    Available download formats: pdf
    Dataset updated
    Jun 6, 2025
    Dataset provided by
    Taylor & Francis
    Authors
    Laura S. DeLuca; Alex Reinhart; Gordon Weinberg; Michael Laudenbach; Sydney Miller; David West Brown
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As large language models (LLMs) such as GPT have become more accessible, concerns about their potential effects on students’ learning have grown. In data science education, the specter of students’ turning to LLMs raises multiple issues, as writing is a means not just of conveying information but of developing their statistical reasoning. In our study, we engage with questions surrounding LLMs and their pedagogical impact by: (a) quantitatively and qualitatively describing how select LLMs write report introductions and complete data analysis reports; and (b) comparing patterns in texts authored by LLMs to those authored by students and by published researchers. Our results show distinct differences between machine-generated and human-generated writing, as well as between novice and expert writing. Those differences are evident in how writers manage information, modulate confidence, signal importance, and report statistics. The findings can help inform classroom instruction, whether that instruction is aimed at dissuading the use of LLMs or at guiding their use as a productivity tool. It also has implications for students’ development as statistical thinkers and writers. What happens when they offload the work of data science to a model that doesn’t write quite like a data scientist? Supplementary materials for this article are available online.

  19. Up-to-date mapping of COVID-19 treatment and vaccine development...

    • zenodo.org
    bin, csv, png
    Updated Jul 19, 2024
    Cite
    Tomáš Wagner; Ivana Mišová; Ján Frankovský (2024). Up-to-date mapping of COVID-19 treatment and vaccine development (covid19-help.org data dump) [Dataset]. http://doi.org/10.5281/zenodo.4601446
    Explore at:
    Available download formats: csv, png, bin
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Zenodo, http://zenodo.org/
    Authors
    Tomáš Wagner; Ivana Mišová; Ján Frankovský
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The free database mapping COVID-19 treatment and vaccine development based on global scientific research is available at https://covid19-help.org/.

    Files provided here are curated partial data exports in the form of .csv files, or a full data export as a .sql script generated with pg_dump from our PostgreSQL 12 database. You can also find a .png file with the ER diagram of the tables defined in the .sql file in this repository.

    Structure of CSV files

    *On our site, compounds are referred to as substances. A short example of loading and joining these files follows the field lists below.

    compounds.csv

    1. Id - Unique identifier in our database (unsigned integer)

    2. Name - Name of the Substance/Compound (string)

    3. Marketed name - The marketed name of the Substance/Compound (string)

    4. Synonyms - Known synonyms (string)

    5. Description - Description (HTML code)

    6. Dietary sources - Dietary sources where the Substance/Compound can be found (string)

    7. Dietary sources URL - Dietary sources URL (string)

    8. Formula - Compound formula (HTML code)

    9. Structure image URL - URL to our website with the structure image (string)

    10. Status - Status of approval (string)

    11. Therapeutic approach - Approach in which Substance/Compound works (string)

    12. Drug status - Availability of Substance/Compound (string)

    13. Additional data - Additional data in stringified JSON format, such as prescribing information and notes (string)

    14. General information - General information about Substance/Compound (HTML code)

    references.csv

    1. Id - Unique identifier in our database (unsigned integer)

    2. Impact factor - Impact factor of the scientific article (string)

    3. Source title - Title of the scientific article (string)

    4. Source URL - URL link of the scientific article (string)

    5. Tested on species - What testing model was used for the study (string)

    6. Published at - Date of publication of the scientific article (Date in ISO 8601 format)

    clinical-trials.csv

    1. Id - Unique identifier in our database (unsigned integer)

    2. Title - Title of the clinical trial study (string)

    3. Acronym title - Acronym of title of the clinical trial study (string)

    4. Source id - Unique identifier in the source database

    5. Source id optional - Optional identifier in other databases (string)

    6. Interventions - Description of interventions (string)

    7. Study type - Type of the conducted study (string)

    8. Study results - Has results? (string)

    9. Phase - Current phase of the clinical trial (string)

    10. Url - URL to clinical trial study page on clinicaltrials.gov (string)

    11. Status - Status in which study currently is (string)

    12. Start date - Date at which study was started (Date in ISO 8601 format)

    13. Completion date - Date at which study was completed (Date in ISO 8601 format)

    14. Additional data - Additional data in the form of stringified JSON with fields such as study locations, study design, enrollment, age, and outcome measures (string); a parsing sketch follows this list
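
    The "Additional data" columns hold stringified JSON rather than flat values, so they need to be decoded before use. Below is a minimal sketch in Python, assuming the export has a header row and the column is literally named "Additional data"; the header names and the "enrollment" key are assumptions taken from the field list above, not verified against the actual file.

    import json

    import pandas as pd

    # Load the clinical trials export described above.
    trials = pd.read_csv("clinical-trials.csv")

    # "Additional data" is stringified JSON (locations, study design, enrollment, ...),
    # so decode it into a dict per row; empty or missing cells become {}.
    trials["additional"] = trials["Additional data"].apply(
        lambda s: json.loads(s) if isinstance(s, str) and s.strip() else {}
    )

    # Example: pull an assumed "enrollment" key out of the decoded JSON where present.
    trials["enrollment"] = trials["additional"].apply(lambda d: d.get("enrollment"))
    print(trials[["Title", "Phase", "enrollment"]].head())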

    compound-reference-relations.csv

    1. Reference id - Id of a reference in our DB (unsigned integer)

    2. Compound id - Id of a substance in our DB (unsigned integer)

    3. Note - Note on the compound-reference relation (string)

    4. Is supporting - Is evidence supporting or contradictory (Boolean, true if supporting)

    compound-clinical-trial.csv

    1. Clinical trial id - Id of a clinical trial in our DB (unsigned integer)

    2. Compound id - Id of a Substance/Compound in our DB (unsigned integer)

    tags.csv

    1. Id - Unique identifier in our database (unsigned integer)

    2. Name - Name of the tag (string)

    tags-entities.csv

    1. Tag id - Id of a tag in our DB (unsigned integer)

    2. Reference id - Id of a reference in our DB (unsigned integer)
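
    Because the exports are split into entity files and relation files, a typical first step is to join them. Below is a minimal sketch in Python with pandas, assuming the CSV headers match the field names listed above ("Id", "Name", "Compound id", "Reference id", "Source title", "Source URL", "Is supporting"); adjust the names if the actual headers differ.

    import pandas as pd

    # Entity tables and the relation table that links them.
    compounds = pd.read_csv("compounds.csv")
    references = pd.read_csv("references.csv")
    relations = pd.read_csv("compound-reference-relations.csv")

    # Each relation row links one compound to one reference and flags whether
    # the evidence is supporting ("Is supporting" true) or contradictory.
    evidence = (
        relations
        .merge(compounds[["Id", "Name"]], left_on="Compound id", right_on="Id")
        .merge(references[["Id", "Source title", "Source URL"]],
               left_on="Reference id", right_on="Id",
               suffixes=("_compound", "_reference"))
    )

    print(evidence[["Name", "Source title", "Is supporting"]].head())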

    API Specification

    Our project also provides an Open API that gives access to our data in a format suitable for automated processing, specifically JSON.

    https://covid19-help.org/api-specification

    Services are split into five endpoints:

    • Substances - /api/substances

    • References - /api/references

    • Substance-reference relations - /api/substance-reference-relations

    • Clinical trials - /api/clinical-trials

    • Clinical trials-substances relations - /api/clinical-trials-substances

    Method of providing data

    • All dates are text strings formatted in compliance with ISO 8601 as YYYY-MM-DD

    • If the request syntax is incorrect (missing or incorrectly formatted parameters), an HTTP 400 Bad Request response will be returned. The body of the response may include an explanation.

    • The updated_at value (used for querying with changed-from) refers only to the particular entity, not to its logical relations. Example: if a new substance-reference relation is added but the substance detail has not changed, a new entity with an id and current dates in the created_at and updated_at fields appears in the substance-reference relations endpoint, while nothing changes in the substances or references endpoints.

    The recommended way of sequential download

    • During the first download, it is possible to obtain all data by entering an old enough date as the value of the changed-from parameter, for example changed-from=2020-01-01. It is important to write down the date on which the download was initiated, let’s say 2020-10-20.

    • For repeated data downloads, it is sufficient to receive only the records in which something has changed. The request can therefore be made with the parameter changed-from=2020-10-20 (the date noted in the previous bullet). Again, it is important to write down the date on which the update was downloaded; this date will be used in the next update (refresh) of the data. A minimal sketch of this bookkeeping follows.
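
    A minimal sketch of the date bookkeeping in Python, assuming the API is served from https://covid19-help.org (base URL inferred from the links above, not verified) and that the last sync date is kept in a local text file of our own choosing (last_sync_date.txt is hypothetical):

    from datetime import date
    from pathlib import Path

    import requests

    BASE_URL = "https://covid19-help.org"      # assumed API base URL
    STATE_FILE = Path("last_sync_date.txt")    # hypothetical local bookkeeping file

    # First run: no recorded date yet, so ask for everything with an old enough date.
    changed_from = STATE_FILE.read_text().strip() if STATE_FILE.exists() else "2020-01-01"
    today = date.today().isoformat()           # noted before downloading, per the guidance above

    response = requests.get(f"{BASE_URL}/api/references",
                            params={"changed-from": changed_from, "limit": 100})
    response.raise_for_status()
    print(response.json()["number_of_remaining_ids"], "records remaining beyond this page")

    # Only record today's date once the download succeeded; it becomes the
    # changed-from value for the next refresh.
    STATE_FILE.write_text(today)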

    Services for entities

    List of endpoint URLs: the five endpoints listed above (/api/substances, /api/references, /api/substance-reference-relations, /api/clinical-trials, /api/clinical-trials-substances).

    Format of the request

    All endpoints have these parameters in common:

    • changed-from - a parameter to return only the entities that have been modified on a given date or later.

    • continue-after-id - a parameter to return only the entities that have a larger ID than specified in the parameter.

    • limit - a parameter to return at most the number of records specified (up to 1000). The default is 100.

    Request example:

    /api/references?changed-from=2020-01-01&continue-after-id=1&limit=100

    Format of the response

    The response format is the same for all endpoints.

    • number_of_remaining_ids - the number of remaining entities that meet the specified criteria but are not included in the current response. An integer of virtually unlimited size.

    • entities - an array of entity details in JSON format.

    Response example:

    {
      "number_of_remaining_ids": 100,
      "entities": [
        {
          "id": 3,
          "url": "https://www.ncbi.nlm.nih.gov/pubmed/32147628",
          "title": "Discovering drugs to treat coronavirus disease 2019 (COVID-19).",
          "impact_factor": "Discovering drugs to treat coronavirus disease 2019 (COVID-19).",
          "tested_on_species": "in silico",
          "publication_date": "2020-02-22",
          "created_at": "2020-03-30",
          "updated_at": "2020-03-31",
          "deleted_at": null
        },
        {
          "id": 4,
          "url": "https://www.ncbi.nlm.nih.gov/pubmed/32157862",
          "title": "CT Manifestations of Novel Coronavirus Pneumonia: A Case Report",
          "impact_factor": "CT Manifestations of Novel Coronavirus Pneumonia: A Case Report",
          "tested_on_species": "Patient",
          "publication_date": "2020-06-03",
          "created_at": "2020-03-30",
          "updated_at": "2020-03-30",
          "deleted_at": null
        }
      ]
    }
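
    Putting the request parameters and the response format together, below is a minimal paging sketch in Python; the base URL https://covid19-help.org is an assumption inferred from the links above.

    import requests

    BASE_URL = "https://covid19-help.org"  # assumed API base URL

    def fetch_all(endpoint: str, changed_from: str = "2020-01-01", limit: int = 100):
        """Collect every entity from one endpoint, paging with continue-after-id."""
        entities = []
        last_id = 0
        while True:
            resp = requests.get(
                f"{BASE_URL}{endpoint}",
                params={"changed-from": changed_from,
                        "continue-after-id": last_id,
                        "limit": limit},
            )
            resp.raise_for_status()
            payload = resp.json()
            entities.extend(payload["entities"])
            # Stop when the server reports nothing left or returns an empty page.
            if not payload["entities"] or payload["number_of_remaining_ids"] == 0:
                return entities
            last_id = payload["entities"][-1]["id"]

    # Example: download all references changed since 2020-01-01.
    references = fetch_all("/api/references")
    print(len(references))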

    Endpoint details

    Substances

    URL: /api/substances


  20. Data from: Workshop FAIR Data and Data Reuse for Environmental Science Group Researchers

    • figshare.com
    zip
    Updated Nov 1, 2022
    Cite
    L. (Luc) Steinbuch; Cindy Quik (2022). Workshop FAIR Data and Data Reuse for Environmental Science Group Researchers [Dataset]. http://doi.org/10.4121/21399975.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 1, 2022
    Dataset provided by
    4TU.ResearchData
    Authors
    L. (Luc) Steinbuch; Cindy Quik
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    We designed and organized a one-day workshop where, in the context of FAIR, the following themes were discussed and practiced: scientific transparency and reproducibility; how to write a README; data and code licenses; spatial data; programming code; examples of published datasets; data reuse; and discipline and motivation. The intended audience was researchers at the Environmental Science Group of Wageningen University and Research. All workshop materials were designed with further development and reuse in mind and are shared through this dataset.
