100+ datasets found

Dataset for "Do LiU researchers publish data – and where? Dataset analysis...
researchdata.se
Updated Mar 18, 2025
Cite
Kaori Hoshi Larsson (2025). Dataset for "Do LiU researchers publish data – and where? Dataset analysis using ODDPub" [Dataset]. http://doi.org/10.5281/ZENODO.15017715
Explore at:
Unique identifier
https://doi.org/10.5281/ZENODO.15017715
Dataset updated
Mar 18, 2025
Dataset provided by
Linköping University
Authors
Kaori Hoshi Larsson
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains the results from the ODDPub text mining algorithm and the findings from manual analysis. Full-text PDFs of all articles parallel-published by Linköping University in 2022 were extracted from the institute's repository, DiVA. These were analyzed using the ODDPub (https://github.com/quest-bih/oddpub) text mining algorithm to determine the extent of data sharing and identify the repositories where the data was shared. In addition to the results from ODDPub, manual analysis was conducted to confirm the presence of data sharing statements, assess data availability, and identify the repositories used.
Analysis of CBCS publications for Open Access, data availability statements...
figshare.scilifelab.se
researchdata.se
txt
Updated Jan 15, 2025
Cite
Theresa Kieselbach (2025). Analysis of CBCS publications for Open Access, data availability statements and persistent identifiers for supplementary data [Dataset]. http://doi.org/10.17044/scilifelab.23641749.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.17044/scilifelab.23641749.v1
Dataset updated
Jan 15, 2025
Dataset provided by
Umeå University
Authors
Theresa Kieselbach
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
General description

This dataset contains some markers of Open Science in the publications of the Chemical Biology Consortium Sweden (CBCS) between 2010 and July 2023. The sample of CBCS publications during this period consists of 188 articles. Every publication was visited manually at its DOI URL to answer the following questions.
1. Is the research article an Open Access publication?
2. Does the research article have a Creative Commons license or a similar license?
3. Does the research article contain a data availability statement?
4. Did the authors submit data of their study to a repository such as EMBL, Genbank, Protein Data Bank PDB, Cambridge Crystallographic Data Centre CCDC, Dryad or a similar repository?
5. Does the research article contain supplementary data?
6. Do the supplementary data have a persistent identifier that makes them citable as a defined research output?

Variables

The data were compiled in a Microsoft Excel 365 document that includes the following variables.
1. DOI URL of research article
2. Year of publication
3. Research article published with Open Access
4. License for research article
5. Data availability statement in article
6. Supplementary data added to article
7. Persistent identifier for supplementary data
8. Authors submitted data to NCBI or EMBL or PDB or Dryad or CCDC

Visualization

Parts of the data were visualized in two figures as bar diagrams using Microsoft Excel 365. The first figure displays the number of publications per year, the number of publications published with Open Access, and the number of publications that contain a data availability statement (Figure 1). The second figure shows the number of publications per year and how many publications contain supplementary data. This figure also shows how many of the supplementary datasets have a persistent identifier (Figure 2).

File formats and software

The file formats used in this dataset are:
.csv (Text file)
.docx (Microsoft Word 365 file)
.jpg (JPEG image file)
.pdf/A (Portable Document Format for archiving)
.png (Portable Network Graphics image file)
.pptx (Microsoft PowerPoint 365 file)
.txt (Text file)
.xlsx (Microsoft Excel 365 file)
All files can be opened with Microsoft Office 365 and likely also work with the older versions Office 2019 and 2016.

MD5 checksums

Here is a list of all files of this dataset and of their MD5 checksums.
1. Readme.txt (MD5: 795f171be340c13d78ba8608dafb3e76)
2. Manifest.txt (MD5: 46787888019a87bb9d897effdf719b71)
3. Materials_and_methods.docx (MD5: 0eedaebf5c88982896bd1e0fe57849c2)
4. Materials_and_methods.pdf (MD5: d314bf2bdff866f827741d7a746f063b)
5. Materials_and_methods.txt (MD5: 26e7319de89285fc5c1a503d0b01d08a)
6. CBCS_publications_until_date_2023_07_05.xlsx (MD5: 532fec0bd177844ac0410b98de13ca7c)
7. CBCS_publications_until_date_2023_07_05.csv (MD5: 2580410623f79959c488fdfefe8b4c7b)
8. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.xlsx (MD5: 9c67dd84a6b56a45e1f50a28419930e5)
9. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.csv (MD5: fb3ac69476bfc57a8adc734b4d48ea2b)
10. Aggregated_data_from_CBCS_publications_until_2023_07_05.xlsx (MD5: 6b6cbf3b9617fa8960ff15834869f793)
11. Aggregated_data_from_CBCS_publications_until_2023_07_05.csv (MD5: b2b8dd36ba86629ed455ae5ad2489d6e)
12. Figure_1_CBCS_publications_until_2023_07_05_Open_Access_and_data_availablitiy_statement.xlsx (MD5: 9c0422cf1bbd63ac0709324cb128410e)
13. Figure_1.pptx (MD5: 55a1d12b2a9a81dca4bb7f333002f7fe)
14. Image_of_figure_1.jpg (MD5: 5179f69297fbbf2eaaf7b641784617d7)
15. Image_of_figure_1.png (MD5: 8ec94efc07417d69115200529b359698)
16. Figure_2_CBCS_publications_until_2023_07_05_supplementary_data_and_PID_for_supplementary_data.xlsx (MD5: f5f0d6e4218e390169c7409870227a0a)
17. Figure_2.pptx (MD5: 0fd4c622dc0474549df88cf37d0e9d72)
18. Image_of_figure_2.jpg (MD5: c6c68b63b7320597b239316a1c15e00d)
19. Image_of_figure_2.png (MD5: 24413cc7d292f468bec0ac60cbaa7809)
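The published MD5 checksums can be used to confirm that downloaded files arrived intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify` are hypothetical, and the download directory is assumed to be the current working directory.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(expected, directory="."):
    """Map each file name to True/False (digest matches) or None (file missing)."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if path.exists():
            results[name] = md5_of_file(path) == checksum.lower()
        else:
            results[name] = None
    return results


# Two of the checksums listed in the dataset description.
EXPECTED = {
    "Readme.txt": "795f171be340c13d78ba8608dafb3e76",
    "Manifest.txt": "46787888019a87bb9d897effdf719b71",
}

if __name__ == "__main__":
    for name, ok in verify(EXPECTED).items():
        print(name, {True: "OK", False: "MISMATCH", None: "not found"}[ok])
```

A result of `MISMATCH` suggests a corrupted or modified download; `not found` only means the file is not in the directory that was checked.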
Secondary Data from Insights from Publishing Open Data in Industry-Academia...
zenodo.org
bin, json +2
Updated Sep 16, 2024
Cite
Per Erik Strandberg; Philipp Peterseil; Julian Karoliny; Johanna Kallio; Johannes Peltola (2024). Secondary Data from Insights from Publishing Open Data in Industry-Academia Collaboration [Dataset]. http://doi.org/10.5281/zenodo.13767153
Explore at:
Available download formats: json, text/x-python, bin, txt
Unique identifier
https://doi.org/10.5281/zenodo.13767153
Dataset updated
Sep 16, 2024
Dataset provided by
Zenodo http://zenodo.org/
Authors
Per Erik Strandberg; Philipp Peterseil; Julian Karoliny; Johanna Kallio; Johannes Peltola
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Secondary Data from Insights from Publishing Open Data in Industry-Academia Collaboration

Authors

Per Erik Strandberg [1], Philipp Peterseil [2], Julian Karoliny [3], Johanna Kallio [4], and Johannes Peltola [4].

[1] Westermo Network Technologies AB (Sweden).
[2] Johannes Kepler University Linz (Austria)
[3] Silicon Austria Labs GmbH (Austria).
[4] VTT Technical Research Centre of Finland Ltd. (Finland).

Description

This data accompanies a paper submitted to Elsevier's Data in Brief in 2024, with the title Insights from Publishing Open Data in Industry-Academia Collaboration.

Tentative Abstract: Effective data management and sharing are critical success factors in industry-academia collaboration. This paper explores the motivations and lessons learned from publishing open data sets in such collaborations. Through a survey of participants in a European research project that published 13 data sets, and an analysis of metadata from almost 281 thousand datasets in Zenodo, we collected qualitative and quantitative results on motivations, achievements, research questions, licences and file types. Through inductive reasoning and statistical analysis we found that planning the data collection is essential, and that only few datasets (2.4%) had accompanying scripts for improved reuse. We also found that authors are not well aware of the importance of licences or which licence to choose. Finally, we found that data with a synthetic origin, collected with simulations and potentially mixed with real measurements, can be very meaningful, as predicted by Gartner and illustrated by many datasets collected in our research project.

Secondary data from Survey

The file survey.txt contains secondary data from a survey of participants that published open data sets in the 3-year European research project InSecTT.

Secondary data from Zenodo

The file secondary_data_zenodo.json contains secondary data from an analysis of data sets published in Zenodo. It is accompanied by a .py file and an .ipynb file that serve as examples.

License

This data is licenced with the Creative Commons Attribution 4.0 International license. You are free to use the data if you attribute the authors. Read the license text for details.
Dataset of books called Data analysis with SPSS : a first course in applied...
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called Data analysis with SPSS : a first course in applied statistics [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Data+analysis+with+SPSS+%3A+a+first+course+in+applied+statistics
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 2 rows and is filtered where the book is Data analysis with SPSS : a first course in applied statistics. It features 7 columns including author, publication date, language, and book publisher.
Data from: Analysis of shared research data in Spanish scientific papers...
zenodo.org
explore.openaire.eu
Updated Sep 30, 2022
Cite
Roxana Cerda-Cosme; Eva Méndez (2022). Analysis of shared research data in Spanish scientific papers about COVID-19: a first approach [Dataset]. http://doi.org/10.5281/zenodo.7125642
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.7125642
Dataset updated
Sep 30, 2022
Dataset provided by
Zenodo http://zenodo.org/
Authors
Roxana Cerda-Cosme; Eva Méndez
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction: During the coronavirus pandemic, changes occurred in the way science is done and shared, which motivates meta-research to help understand science communication in crises and improve its effectiveness. Objective: To study how many Spanish scientific papers on COVID-19 published during 2020 share their research data. Methodology: Qualitative and descriptive study applying nine attributes: (1) availability, (2) accessibility, (3) format, (4) licensing, (5) linkage, (6) funding, (7) editorial policy, (8) content and (9) statistics. Results: Of the 1340 papers analyzed, 1173 (87.5%) did not have research data. The remaining 12.5% share their research data: 2.1% share their data in repositories, 5% share their data through a simple request, 0.2% do not have permission to share their data, and 5.2% share their data as supplementary material. Conclusions: Only a small percentage of papers share their research data, and the way they do so demonstrates the researchers' poor knowledge of how to properly share research data and of what research data is.
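The reported percentages can be checked with a few lines of arithmetic. This sketch assumes that the four sub-shares (2.1%, 5%, 0.2%, 5.2%) are expressed as percentages of all 1340 papers, which is the reading under which they add up to the reported 12.5%; the variable names are illustrative only.

```python
total_papers = 1340
without_data = 1173
with_data = total_papers - without_data  # papers that mention research data

share_without = 100 * without_data / total_papers
share_with = 100 * with_data / total_papers

# Reported breakdown, assumed to be in percent of all analyzed papers.
breakdown = {
    "repositories": 2.1,
    "on request": 5.0,
    "no permission to share": 0.2,
    "supplementary material": 5.2,
}
breakdown_total = round(sum(breakdown.values()), 1)

print(f"papers with data: {with_data}")
print(f"without data: {share_without:.1f}%")
print(f"with data: {share_with:.1f}%")
print(f"sum of breakdown: {breakdown_total}%")
```

Under that assumption the breakdown is internally consistent with the 87.5% / 12.5% split.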
Data and program files associated with the publication: Effective Programs...
openicpsr.org
delimited, zip
Updated Jan 4, 2021
Cite
Marta Pellegrini; Cynthia Lake; Amanda Neitzel; Robert E. Slavin (2021). Data and program files associated with the publication: Effective Programs in Elementary Mathematics: A Meta-Analysis [Dataset]. http://doi.org/10.3886/E130284V2
Explore at:
Available download formats: zip, delimited
Unique identifier
https://doi.org/10.3886/E130284V2
Dataset updated
Jan 4, 2021
Dataset provided by
University of Florence
Johns Hopkins University
Authors
Marta Pellegrini; Cynthia Lake; Amanda Neitzel; Robert E. Slavin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The data include information about 85 rigorous experimental studies that evaluated 64 programs in grades K-5 mathematics. These data were collected by the research team from studies included in a systematic review of programs for elementary mathematics. The data contain study and finding level information to examine what types of programs are most effective.
PAH Published Dataset Data In Brief
catalog.data.gov
s.cnmilf.com
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). PAH Published Dataset Data In Brief [Dataset]. https://catalog.data.gov/dataset/pah-published-dataset-data-in-brief
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency http://www.epa.gov/
Description
PAH method development and sample collection. This dataset is associated with the following publication: Wallace, M., J. Pleil, D. Whitaker, and K. Oliver. Dataset of polycyclic aromatic hydrocarbon recoveries from a selection of sorbent tubes for thermal desorption-gas chromatography/mass spectrometry analysis. Data in Brief. Elsevier B.V., Amsterdam, NETHERLANDS, 29: 105252, (2020).
Data from: FLiPPR: A Processor for Limited Proteolysis (LiP) Mass...
acs.figshare.com
xlsx
Updated May 24, 2024
Cite
Edgar Manriquez-Sandoval; Joy Brewer; Gabriela Lule; Samanta Lopez; Stephen D. Fried (2024). FLiPPR: A Processor for Limited Proteolysis (LiP) Mass Spectrometry Data Sets Built on FragPipe [Dataset]. http://doi.org/10.1021/acs.jproteome.3c00887.s002
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/acs.jproteome.3c00887.s002
Dataset updated
May 24, 2024
Dataset provided by
ACS Publications
Authors
Edgar Manriquez-Sandoval; Joy Brewer; Gabriela Lule; Samanta Lopez; Stephen D. Fried
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Here, we present FLiPPR, or FragPipe LiP (limited proteolysis) Processor, a tool that facilitates the analysis of data from limited proteolysis mass spectrometry (LiP-MS) experiments following primary search and quantification in FragPipe. LiP-MS has emerged as a method that can provide proteome-wide information on protein structure and has been applied to a range of biological and biophysical questions. Although LiP-MS can be carried out with standard laboratory reagents and mass spectrometers, analyzing the data can be slow and poses unique challenges compared to typical quantitative proteomics workflows. To address this, we leverage FragPipe and then process its output in FLiPPR. FLiPPR formalizes a specific data imputation heuristic that carefully uses missing data in LiP-MS experiments to report on the most significant structural changes. Moreover, FLiPPR introduces a data merging scheme and a protein-centric multiple hypothesis correction scheme, enabling processed LiP-MS data sets to be more robust and less redundant. These improvements strengthen statistical trends when previously published data are reanalyzed with the FragPipe/FLiPPR workflow. We hope that FLiPPR will lower the barrier for more users to adopt LiP-MS, standardize statistical procedures for LiP-MS data analysis, and systematize output to facilitate eventual larger-scale integration of LiP-MS data.
Replication Data for "Upcoming issues, new methods: using Interactive...
search.dataone.org
Updated Nov 12, 2023
Cite
Behling, Gustavo; Lenzi, Fernando César; Rossetto, Carlos Ricardo (2023). Replication Data for "Upcoming issues, new methods: using Interactive Qualitative Analysis (IQA) in Management Research" published by RAC. Revista de Administração Contemporânea [Dataset]. http://doi.org/10.7910/DVN/LTULNQ
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/LTULNQ
Dataset updated
Nov 12, 2023
Dataset provided by
Harvard Dataverse
Authors
Behling, Gustavo; Lenzi, Fernando César; Rossetto, Carlos Ricardo
Description
These data refer to the paper “Upcoming issues, new methods: using Interactive Qualitative Analysis (IQA) in Management Research”. This article is a guide to the application of the IQA method in management research, and the available files are:
1. 1-Affinities, definitions, and cards produced by focus group.docx: all cards, affinities and definitions created by focus group session.docx
2. 2-Step-by-step - Analysis procedures.docx: detailed data analysis procedures.docx
3. 3-Axial Coding Tables – Individual Interviews.docx: detailed axial coding procedures.docx
4. 4-Theoretical Coding Table – Individual Interviews.docx: detailed theoretical coding procedures.docx
COVID-19 Data Tracker Publishing & Privacy Guidelines
data.sfgov.org
gimi9.com
application/rdfxml +5
Updated Jul 3, 2020
Cite
Department of Public Health (2020). COVID-19 Data Tracker Publishing & Privacy Guidelines [Dataset]. https://data.sfgov.org/COVID-19/COVID-19-Data-Tracker-Publishing-Privacy-Guideline/9aj4-um47
Explore at:
Available download formats: csv, tsv, application/rssxml, xml, json, application/rdfxml
Dataset updated
Jul 3, 2020
Dataset authored and provided by
Department of Public Health
Description
A. SUMMARY
It is the policy of the San Francisco Department of Public Health to comply with patient/client/resident rights regarding Protected Health Information (PHI) as set forth in the Health Insurance Portability and Accountability Act of 1996 (HIPAA). These guidelines exist to provide guidance only as it relates to the public release of COVID-19 data through the tracker webpages, so that public reporting of de-identified information on residents’ health status, demographic and other characteristics, and geographical information reflects consistent reporting practices and meaningful differences in health outcomes, conditions that impact health, and delivery of services while safeguarding patient/client/resident rights regarding PHI.

COVID-19 related data will be released routinely in a variety of data products related to the tracker, including datasets through SF OpenData. Some data products may include data by county or smaller analysis unit such as ZIP code, neighborhood, or census tract.

Download the attached PDF for the policy.
Data and Code for: Methods Matter: P-Hacking and Publication Bias in Causal...
openicpsr.org
Updated Mar 21, 2022
Cite
Abel Brodeur; Anthony Heyes; Nikolai Cook (2022). Data and Code for: Methods Matter: P-Hacking and Publication Bias in Causal Analysis in Economics: Reply [Dataset]. http://doi.org/10.3886/E165621V1
Explore at:
Unique identifier
https://doi.org/10.3886/E165621V1
Dataset updated
Mar 21, 2022
Dataset provided by
American Economic Association http://www.aeaweb.org/
Authors
Abel Brodeur; Anthony Heyes; Nikolai Cook
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the replication package of Methods Matter: P-Hacking and Publication Bias in Causal Analysis in Economics: Reply.
World’s Top 2% of Scientists list by Stanford University: An Analysis of its...
data.mendeley.com
Updated Nov 17, 2023
Cite
JOHN Philip (2023). World’s Top 2% of Scientists list by Stanford University: An Analysis of its Robustness [Dataset]. http://doi.org/10.17632/td6tdp4m6t.1
Explore at:
Unique identifier
https://doi.org/10.17632/td6tdp4m6t.1
Dataset updated
Nov 17, 2023
Authors
JOHN Philip
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
John Ioannidis and co-authors [1] created a publicly available database of top-cited scientists in the world. This database, intended to address the misuse of citation metrics, has generated a lot of interest among the scientific community, institutions, and media. Many institutions used this as a yardstick to assess the quality of researchers. At the same time, some people look at this list with skepticism, citing problems with the methodology used. Two separate databases are created, based on career-long and single recent year impact. This database is created using Scopus data from Elsevier [1-3]. The scientists included in this database are classified into 22 scientific fields and 174 sub-fields. The parameters considered for this analysis are total citations from 1996 to 2022 (nc9622), h index in 2022 (h22), c-score, and world rank based on c-score (Rank ns). Citations without self-cites are considered in all cases (indicated as ns). In the single-year case, citations during 2022 (nc2222) are considered instead of nc9622.

To evaluate the robustness of c-score-based ranking, I have done a detailed analysis of the matrix parameters of the last 25 years (1998-2022) of Nobel laureates of physics, chemistry, and medicine, and compared them with the top 100 rank holders in the list. The latest career-long and single-year-based databases (2022) were used for this analysis. The details of the analysis are presented below: Though the article says the selection is based on the top 100,000 scientists by c-score (with and without self-citations) or a percentile rank of 2% or above in the sub-field, the actual career-based ranking list has 204644 names [1]. The single-year database contains 210199 names. So, the list published contains ~ the top 4% of scientists. In the career-based rank list, for the person with the lowest rank of 4809825, the nc9622, h22, and c-score were 41, 3, and 1.3632, respectively. Whereas for the person with the No.1 rank in the list, the nc9622, h22, and c-score were 345061, 264, and 5.5927, respectively. Three people on the list had less than 100 citations during 1996-2022, 1155 people had an h22 less than 10, and 6 people had a c-score less than 2.
In the single year-based rank list, for the person with the lowest rank (6547764), the nc2222, h22, and c-score were 1, 1, and 0.6, respectively. Whereas for the person with the No.1 rank, the nc9622, h22, and c-score were 34582, 68, and 5.3368, respectively. 4463 people on the list had less than 100 citations in 2022, 71512 people had an h22 less than 10, and 313 people had a c-score less than 2. The entry of many authors having a single-digit h index and a very meager total number of citations indicates serious shortcomings of the c-score-based ranking methodology.
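One way to sanity-check the "~ top 4%" figure is to divide each list size by the lowest rank reported for that list. This is only a rough sketch: it assumes that the lowest rank approximates the number of ranked scientists, which the description does not state, and the variable names are illustrative.

```python
# Figures quoted in the description.
career_list_size = 204644
career_lowest_rank = 4809825
single_year_list_size = 210199
single_year_lowest_rank = 6547764

# Assumption: the lowest rank on a list is close to the size of the
# ranked population, so list size / lowest rank estimates the share.
career_share = 100 * career_list_size / career_lowest_rank
single_year_share = 100 * single_year_list_size / single_year_lowest_rank

print(f"career-long list: about {career_share:.1f}% of ranked scientists")
print(f"single-year list: about {single_year_share:.1f}% of ranked scientists")
```

Under this assumption both lists cover noticeably more than 2% of the ranked population, which is in line with the rough "top 4%" statement.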
Data from: Large Landing Trajectory Data Set for Go-Around Analysis
data.niaid.nih.gov
zenodo.org
Updated Dec 16, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Manuel Waltert (2022). Large Landing Trajectory Data Set for Go-Around Analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7148116
Explore at:
Dataset updated
Dec 16, 2022
Dataset provided by
Raphael Monstein
Manuel Waltert
Benoit Figuet
Timothé Krauth
Marcel Dettling
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Large go-around, also referred to as missed approach, data set. The data set is in support of the paper presented at the OpenSky Symposium on November the 10th.

If you use this data for a scientific publication, please consider citing our paper.

The data set contains landings from 176 (mostly) large airports from 44 different countries. The landings are labelled as performing a go-around (GA) or not. In total, the data set contains almost 9 million landings with more than 33000 GAs. The data was collected from OpenSky Network's historical data base for the year 2019. The published data set contains multiple files:

go_arounds_minimal.csv.gz

Compressed CSV containing the minimal data set. It contains a row for each landing and a minimal amount of information about the landing, and if it was a GA. The data is structured in the following way:

Column name | Type | Description
time | date time | UTC time of landing or first GA attempt
icao24 | string | Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
callsign | string | Aircraft identifier in air-ground communications
airport | string | ICAO airport code where the aircraft is landing
runway | string | Runway designator on which the aircraft landed
has_ga | string | "True" if at least one GA was performed, otherwise "False"
n_approaches | integer | Number of approaches identified for this flight
n_rwy_approached | integer | Number of unique runways approached by this flight

The last two columns, n_approaches and n_rwy_approached, are useful for filtering out training and calibration flights. These usually have a large number of n_approaches, so an easy way to exclude them is to drop the flights with n_approaches > 2.
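The filtering step described above can be sketched with pandas. This reading assumes that flights with more than two approaches are the ones to drop; the helper name `drop_training_flights` and the sample values are hypothetical, and only the column names come from the description.

```python
import pandas as pd


def drop_training_flights(landings, max_approaches=2):
    """Keep only rows with at most `max_approaches` identified approaches."""
    return landings[landings["n_approaches"] <= max_approaches].copy()


# Tiny hand-made frame using column names from the data set description;
# the values are invented for illustration only.
sample = pd.DataFrame(
    {
        "callsign": ["ABC123", "TRAIN01", "XYZ789"],
        "has_ga": [False, True, True],
        "n_approaches": [1, 7, 2],
        "n_rwy_approached": [1, 1, 2],
    }
)

regular = drop_training_flights(sample)
print(regular["callsign"].tolist())
```

The same function can be applied to the frame loaded from go_arounds_minimal.csv.gz; the threshold is a parameter so that a stricter or looser cut can be tried.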

go_arounds_augmented.csv.gz

Compressed CSV containing the augmented data set. It contains a row for each landing and additional information about the landing, and if it was a GA. The data is structured in the following way:

Column name | Type | Description
time | date time | UTC time of landing or first GA attempt
icao24 | string | Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
callsign | string | Aircraft identifier in air-ground communications
airport | string | ICAO airport code where the aircraft is landing
runway | string | Runway designator on which the aircraft landed
has_ga | string | "True" if at least one GA was performed, otherwise "False"
n_approaches | integer | Number of approaches identified for this flight
n_rwy_approached | integer | Number of unique runways approached by this flight
registration | string | Aircraft registration
typecode | string | Aircraft ICAO typecode
icaoaircrafttype | string | ICAO aircraft type
wtc | string | ICAO wake turbulence category
glide_slope_angle | float | Angle of the ILS glide slope in degrees
has_intersection | string | Boolean that is true if the runway has another runway intersecting it, otherwise false
rwy_length | float | Length of the runway in kilometres
airport_country | string | ISO Alpha-3 country code of the airport
airport_region | string | Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)
operator_country | string | ISO Alpha-3 country code of the operator
operator_region | string | Geographical region of the operator of the aircraft (either Europe, North America, South America, Asia, Africa, or Oceania)
wind_speed_knts | integer | METAR, surface wind speed in knots
wind_dir_deg | integer | METAR, surface wind direction in degrees
wind_gust_knts | integer | METAR, surface wind gust speed in knots
visibility_m | float | METAR, visibility in m
temperature_deg | integer | METAR, temperature in degrees Celsius
press_sea_level_p | float | METAR, sea level pressure in hPa
press_p | float | METAR, QNH in hPa
weather_intensity | list | METAR, list of present weather codes: qualifier - intensity
weather_precipitation | list | METAR, list of present weather codes: weather phenomena - precipitation
weather_desc | list | METAR, list of present weather codes: qualifier - descriptor
weather_obscuration | list | METAR, list of present weather codes: weather phenomena - obscuration
weather_other | list | METAR, list of present weather codes: weather phenomena - other

This data set is augmented with data from various public data sources. Aircraft-related data is mostly from the OpenSky Network's aircraft data base, the METAR information is from the Iowa State University, and the rest is mostly scraped from different web sites. If you need help with the METAR information, you can consult the WMO's Aerodrome Reports and Forecasts handbook.

go_arounds_agg.csv.gz

Compressed CSV containing the aggregated data set. It contains a row for each airport-runway, i.e. every runway at every airport for which data is available. The data is structured in the following way:

Column name | Type | Description
airport | string | ICAO airport code where the aircraft is landing
runway | string | Runway designator on which the aircraft landed
n_landings | integer | Total number of landings observed on this runway in 2019
ga_rate | float | Go-around rate, per 1000 landings
glide_slope_angle | float | Angle of the ILS glide slope in degrees
has_intersection | string | Boolean that is true if the runway has another runway intersecting it, otherwise false
rwy_length | float | Length of the runway in kilometres
airport_country | string | ISO Alpha-3 country code of the airport
airport_region | string | Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)

This aggregated data set is used in the paper for the generalized linear regression model.
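To illustrate how a per-runway go-around rate "per 1000 landings" relates to the per-landing rows, here is a minimal pandas sketch. It is not the authors' aggregation code, which the description does not show; the function name `aggregate_go_arounds`, the intermediate column `n_ga`, and the sample values are assumptions, and `has_ga` is assumed to have been parsed as a boolean column.

```python
import pandas as pd


def aggregate_go_arounds(landings):
    """Count landings and go-arounds per airport-runway and derive the rate."""
    grouped = landings.groupby(["airport", "runway"])["has_ga"]
    agg = grouped.agg(n_landings="size", n_ga="sum").reset_index()
    # Go-around rate expressed per 1000 landings, as in the ga_rate column.
    agg["ga_rate"] = 1000 * agg["n_ga"] / agg["n_landings"]
    return agg


# Invented sample: four landings on one runway (one go-around), one on another.
sample = pd.DataFrame(
    {
        "airport": ["EGLC"] * 5,
        "runway": ["09", "09", "09", "09", "27"],
        "has_ga": [True, False, False, False, False],
    }
)

agg = aggregate_go_arounds(sample)
print(agg)
```

With real data the rates are far smaller than in this toy sample, since the data set reports more than 33000 go-arounds among almost 9 million landings.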

Downloading the trajectories

Users of this data set with access to OpenSky Network's Impala shell can download the historical trajectories from the historical data base with a few lines of Python code. For example, to get all the go-arounds of the 4th of January 2019 at London City Airport (EGLC), you can use the Traffic library for easy access to the database:

import datetime
from tqdm.auto import tqdm
import pandas as pd
from traffic.data import opensky
from traffic.core import Traffic

# load minimum data set
df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)
df["time"] = pd.to_datetime(df["time"])

# select London City Airport, go-arounds, and 2019-01-04
airport = "EGLC"
start = datetime.datetime(year=2019, month=1, day=4).replace(
    tzinfo=datetime.timezone.utc
)
stop = datetime.datetime(year=2019, month=1, day=5).replace(
    tzinfo=datetime.timezone.utc
)

df_selection = df.query("airport==@airport & has_ga & (@start <= time <= @stop)")

# iterate over flights and pull the data from OpenSky Network
flights = []
delta_time = pd.Timedelta(minutes=10)
for _, row in tqdm(df_selection.iterrows(), total=df_selection.shape[0]):
    # take at most 10 minutes before and 10 minutes after the landing or go-around
    start_time = row["time"] - delta_time
    stop_time = row["time"] + delta_time

    # fetch the data from OpenSky Network
    flights.append(
        opensky.history(
            start=start_time.strftime("%Y-%m-%d %H:%M:%S"),
            stop=stop_time.strftime("%Y-%m-%d %H:%M:%S"),
            callsign=row["callsign"],
            return_flight=True,
        )
    )

# The flights can be converted into a Traffic object
Traffic.from_flights(flights)

Additional files

Additional files are available to check the quality of the classification into GA/not GA and the selection of the landing runway. These are:

validation_table.xlsx: This Excel sheet was manually completed during the review of the samples for each runway in the data set. It provides an estimate of the false positive and false negative rate of the go-around classification. It also provides an estimate of the runway misclassification rate when the airport has two or more parallel runways. The columns with the headers highlighted in red were filled in manually, the rest is generated automatically.

validation_sample.zip: For each runway, 8 batches of 500 randomly selected trajectories (or as many as available, if fewer than 4000) classified as not having a GA and up to 8 batches of 10 random landings, classified as GA, are plotted. This allows the interested user to visually inspect a random sample of the landings and go-arounds easily.
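
The batching rule described for validation_sample.zip (8 batches of 500, or as many trajectories as are available if fewer than 4000) can be sketched as follows. This is not the authors' sampling code; the function name sample_batches and the fixed seed are assumptions made for illustration only.

```python
import random


def sample_batches(items, n_batches=8, batch_size=500, seed=0):
    """Draw up to n_batches * batch_size items without replacement, split into batches."""
    rng = random.Random(seed)  # seed is an assumption, used for repeatability
    pool = list(items)
    # Take everything available when the pool is smaller than the target size
    k = min(len(pool), n_batches * batch_size)
    chosen = rng.sample(pool, k)
    # The last batch may be shorter when k is not a multiple of batch_size
    return [chosen[i:i + batch_size] for i in range(0, k, batch_size)]
```

Calling the same function with batch_size=10 would correspond to the "up to 8 batches of 10" drawn from landings classified as GA.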
Data publication: Matlab scripts for PSD measurments and rate spectra...
rodare.hzdr.de
Updated Jan 21, 2025
Cite
Zhou, Wenyu; Fischer, Cornelius (2025). Data publication: Matlab scripts for PSD measurments and rate spectra analysis [Dataset]. http://doi.org/10.14278/rodare.3394
Explore at:
Unique identifier
https://doi.org/10.14278/rodare.3394
Dataset updated
Jan 21, 2025
Dataset provided by
Helmholtz-Zentrum Dresden-Rossendorf
Authors
Zhou, Wenyu; Fischer, Cornelius
Description
The open datasets provide the Matlab scripts for the calculation of Power Spectral Density and Rate Spectra in the manuscript 'How crystal surface reactivity controls the evolution of surface microtopography during dissolution'.
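
The Matlab scripts themselves are not reproduced in this listing. As a rough orientation only, a one-sided power spectral density of an evenly sampled one-dimensional profile can be estimated with a plain periodogram. The Python sketch below is a generic textbook estimate, not the authors' method; the function name periodogram_psd and the sample_spacing parameter are assumptions.

```python
import numpy as np


def periodogram_psd(signal, sample_spacing=1.0):
    """One-sided periodogram PSD of an evenly sampled 1-D signal."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the mean so the DC bin does not dominate
    n = x.size
    spectrum = np.fft.rfft(x)
    psd = (np.abs(spectrum) ** 2) * sample_spacing / n
    # Fold negative frequencies in: double every bin except DC (and Nyquist if n is even)
    if n % 2 == 0:
        psd[1:-1] *= 2
    else:
        psd[1:] *= 2
    freqs = np.fft.rfftfreq(n, d=sample_spacing)
    return freqs, psd
```

With this normalization, summing the PSD over frequency and multiplying by the frequency step recovers the variance of the mean-removed signal.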
Dataset of books called Longitudinal data analysis
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called Longitudinal data analysis [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Longitudinal+data+analysis
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is Longitudinal data analysis. It features 7 columns including author, publication date, language, and book publisher.
Data from: Analysis of Bibliographic Production on Problem-Based Learning...
scielo.figshare.com
jpeg
Updated Jun 6, 2023
Cite
Ana Neiline Cavalcante; Geison Vasconcelos Lira; Pedro Gomes Cavalcante Neto; Roberta Cavalcante Muniz Lira (2023). Analysis of Bibliographic Production on Problem-Based Learning (PBL) in Four Selected Journals [Dataset]. http://doi.org/10.6084/m9.figshare.5979958.v1
Explore at:
Available download formats: jpeg
Unique identifier
https://doi.org/10.6084/m9.figshare.5979958.v1
Dataset updated
Jun 6, 2023
Dataset provided by
SciELO journals
Authors
Ana Neiline Cavalcante; Geison Vasconcelos Lira; Pedro Gomes Cavalcante Neto; Roberta Cavalcante Muniz Lira
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
ABSTRACT The present study is an integrative review. The aim was to analyze the bibliometric characteristics of empirical scientific literature on PBL in reference medical education journals in Brazil and in the world from 2005 to 2014, and recommend directions for future research. The following data were extracted: year of publication, study setting, study goals, study type, key findings and directions for future research indicated by the authors. The settings in the articles published in the three international journals were medical schools in all continents, where n = 6 studies were multicentric. When we analyze the data for the distribution of national studies among states, we can see that they are concentrated in the South and Southeast of Brazil. As regards the objectives of the studies analyzed, we observed a predominance of attempts to verify the effectiveness of PBL, that is, looking at results in terms of student performance, whether in training or medical practice, and comparing such results to those obtained via traditional methods. As regards the trend in the number of articles on PBL published in the last 10 years, we observed a significant reduction in the number of articles published in the three international journals from 2005 to 2010. In the case of the national periodical, our data suggest that scientific literature on PBL in Brazil remains at an incipient stage. As regards the research methods used in the studies published in the four selected journals, we noted the predominance of quantitative methods, primarily through the use of surveys (n = 26). The main conclusions of the studies follow the same line as their goals. They show the positive results of PBL in medical education and its implications for professional practice. In terms of guidelines for future research, we see that there is an inclination to carry out further studies to investigate the effectiveness of PBL, as well as more comparative studies.
In conclusion, research on PBL is still in its infancy; further studies are needed that seek to answer more theoretical and methodological issues and employ epistemological methods. The quality of research carried out also requires attention, because the higher the quality, the better the research will support decision-making. A very promising finding is that PBL has aroused the interest of researchers all over the world. This shows that the method, despite its rigorous steps, can adapt to different cultures and different educational contexts. However, we cannot fail to mention the limitations of this study. To enable the research, the search was restricted to a limited number of journals and a timeframe of 10 years. Thus it is recommended that further research be conducted, encompassing more publications and a longer period of time, in order to draw a more complete picture of scientific literature on PBL.
Data from: Data accessibility in the chemical sciences: an analysis of...
data.niaid.nih.gov
zenodo.org
Updated Oct 14, 2024
Cite
Bloodworth, Sally (2024). Data accessibility in the chemical sciences: an analysis of recent practice in organic chemistry journals [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11068278
Explore at:
Dataset updated
Oct 14, 2024
Dataset provided by
Willoughby, Cerys
Coles, Simon J.
Bloodworth, Sally
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The data are an analysis of the data outputs of 240 randomly selected research papers from 12 top-ranked journals published in early 2023. We investigate author compliance with recommended (but not compulsory) data policies, whether there is evidence to suggest that authors apply FAIR data guidance in their data publishing, and whether the existence of specific recommendations for publishing NMR data by some journals encourages compliance. Files in the data package have been provided in both human- and machine-readable forms. The main dataset is available in the Excel file Data worksheet.XLSX, the contents of which can also be found in Main_dataset.CSV, Data_types.CSV, and Article_selection.CSV, with explanations of the variable coding used in the studies in Variable_names.CSV, Codes.CSV, and FAIR_variable_coding.CSV. The R code used for the article selection can be found in Article_selection.R. Data about article types from the journals that contain original research data is in Article_types.CSV. Data collected for analysis in our sister paper[4] can be found in Extended_Adherence.CSV, Extended_Crystallography.CSV, Extended_DAS.CSV, Extended_File_Types.CSV, and Extended_Submission_Process.CSV. A full list of files in the data package and a short description for each is given in README.TXT.
Dataset of books called Applied missing data analysis
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called Applied missing data analysis [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Applied+missing+data+analysis
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is Applied missing data analysis. It features 7 columns including author, publication date, language, and book publisher.
Dataset of author, BNB id, book publisher, and publication date of Microsoft...
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of author, BNB id, book publisher, and publication date of Microsoft Excel 2013 : data analysis and business modeling [Dataset]. https://www.workwithdata.com/datasets/books?col=author%2Cbnb_id%2Cbook%2Cbook%2Cbook_publisher%2Cpublication_date&f=1&fcol0=book&fop0=%3D&fval0=Microsoft+Excel+2013+%3A+data+analysis+and+business+modeling
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is Microsoft Excel 2013 : data analysis and business modeling. It features 5 columns: author, BNB id, book, book publisher, and publication date.
Global Publication Support Services Market Technological Advancements...
statsndata.org
excel, pdf
Updated Jun 2025
Cite
Stats N Data (2025). Global Publication Support Services Market Technological Advancements 2025-2032 [Dataset]. https://www.statsndata.org/report/publication-support-services-market-130419
Explore at:
Available download formats: pdf, excel
Dataset updated
Jun 2025
Authors
Stats N Data
License
https://www.statsndata.org/how-to-order
Area covered
Global
Description
The Publication Support Services market has witnessed significant growth and evolution over the years, driven by the increasing demand for high-quality publishing solutions in various industries such as academia, healthcare, and business. As organizations seek to disseminate information effectively, the need for com
