kpriyanshu256/dwl-data-fix-t6 dataset hosted on Hugging Face and contributed by the HF Datasets community
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Datafix is a dataset for object detection tasks - it contains Signlanguage annotations for 2,898 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Replication Package of the paper "From Reports to Bug-Fix Commits: A 10 Years Dataset of Bug-Fixing Activity from 55 Apache's Open Source Projects"

ABSTRACT: Bugs appear in almost any software development. Solving all or at least a large part of them requires a great deal of time, effort, and budget. Software projects typically use issue tracking systems as a way to report and monitor bug-fixing tasks. In recent years, several researchers have been conducting bug tracking analysis to better understand the problem and thus provide means to reduce costs and improve the efficiency of the bug-fixing task. In this paper, we introduce a new dataset composed of more than 70,000 bug-fix reports from 10 years of bug-fixing activity of 55 projects from the Apache Software Foundation, distributed in 9 categories. We have mined this information from Jira issue track system concerning two different perspectives of reports with closed/resolved status: static (the latest version of reports) and dynamic (the changes that have occurred in reports over time). We also extract information from the commits (if they exist) that fix such bugs from their respective version-control system (Git). We also provide a change analysis that occurs in the reports as a way of illustrating and characterizing the proposed dataset. Once the data extraction process is an error-prone nontrivial task, we believe such initiatives like this could be useful to support researchers in further more detailed investigations.

You can find the full paper at: https://doi.org/10.1145/3345629.3345639

If you use this dataset for your research, please reference the following paper:

@inproceedings{Vieira:2019:RBC:3345629.3345639,
 author = {Vieira, Renan and da Silva, Ant\^{o}nio and Rocha, Lincoln and Gomes, Jo\~{a}o Paulo},
 title = {From Reports to Bug-Fix Commits: A 10 Years Dataset of Bug-Fixing Activity from 55 Apache's Open Source Projects},
 booktitle = {Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering},
 series = {PROMISE'19},
 year = {2019},
 isbn = {978-1-4503-7233-6},
 location = {Recife, Brazil},
 pages = {80--89},
 numpages = {10},
 url = {http://doi.acm.org/10.1145/3345629.3345639},
 doi = {10.1145/3345629.3345639},
 acmid = {3345639},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Bug-Fix Dataset, Mining Software Repositories, Software Traceability},
}

P.S.: We added a new dataset version (v1.0.1). In this version, we fix the git commit features that track the src and test files. More info can be found in the fix-script.py file.
This dataset was created by Thanh Vu
Description not available
This dataset was created by Fellonnisa
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comprehensive dataset containing 11 verified Fix locations in United States with complete contact information, ratings, reviews, and location data.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comprehensive dataset containing 26 verified Fix On Road locations in United States with complete contact information, ratings, reviews, and location data.
License: MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
In our work, we have designed and implemented a novel workflow with several heuristic methods to combine state-of-the-art methods for gathering CVE fix commits. As a consequence of our improvements, we have been able to gather the largest programming language-independent real-world dataset of CVE vulnerabilities with the associated fix commits. Our dataset, containing 29,203 unique CVEs coming from 7,238 unique GitHub projects, is, to the best of our knowledge, by far the biggest CVE vulnerability dataset with fix commits available today. These CVEs are associated with 35,276 unique commits stored as SQL and 39,931 patch commit files that fixed those vulnerabilities (some patch files can't be saved as SQL for several technical reasons). Our larger dataset thus substantially improves over the current real-world vulnerability datasets and enables further progress in research on vulnerability detection and software security. We used the NVD (nvd.nist.gov) and the GitHub Security Advisory Database as the main sources of our pipeline.
We release to the community a 16GB PostgreSQL database that contains information on CVEs up to 2024-09-26, CWEs of each CVE, files and methods changed by each commit, and repository metadata. Additionally, patch files related to the fix commits are available as a separate package. Furthermore, we make our dataset collection tool also available to the community.
The cvedataset-patches.zip file contains fix patches, and postgrescvedumper.sql.zip contains a PostgreSQL dump of fixes, together with several other fields such as CVEs, CWEs, repository metadata, commit data, file changes, methods changed, etc.
The MoreFixes data-storage strategy is based on CVEFixes, storing CVE fix commits from open-source repositories, and uses a modified version of Prospector (part of ProjectKB from SAP) as a module to detect the fix commits of a CVE. Our full methodology is presented in the paper "MoreFixes: A Large-Scale Dataset of CVE Fix Commits Mined through Enhanced Repository Discovery", which will be published in the PROMISE conference (2024).
For more information about usage and sample queries, visit the GitHub repository: https://github.com/JafarAkhondali/Morefixes
If you are using this dataset, please be aware that the repositories we mined carry different licenses, and you are responsible for handling any licensing issues. The same applies to CVEFixes.
This product uses the NVD API but is not endorsed or certified by the NVD.
This research was partially supported by the Dutch Research Council (NWO) under the project NWA.1215.18.008 Cyber Security by Integrated Design (C-SIDe).
To restore the dataset, you can use the docker-compose file available in the GitHub repository. Default dataset credentials after restoring the dump:
POSTGRES_USER=postgrescvedumper
POSTGRES_DB=postgrescvedumper
POSTGRES_PASSWORD=a42a18537d74c3b7e584c769152c3d
Please use this for citation:
title={MoreFixes: A large-scale dataset of CVE fix commits mined through enhanced repository discovery},
author={Akhoundali, Jafar and Nouri, Sajad Rahim and Rietveld, Kristian and Gadyatskaya, Olga},
booktitle={Proceedings of the 20th International Conference on Predictive Models and Data Analytics in Software Engineering},
pages={42--51},
year={2024}
}
GPS radio collars (Telonics Model #TGW-3790) were programmed to record a location every hour from 15 May through 15 November. Collars contained an ultrahigh frequency (UHF) radio transmitter, and locations were downloaded using an airplane fitted with a UHF receiver. Collars also collected activity data synchronous with each GPS location using an onboard mercury switch. Activity levels were expressed as the percentage of time active during the data collection interval. We screened GPS locations for accuracy and removed relocations with a positional dilution of precision (PDOP) greater than 10 [50]. We restricted bear locations to the period of 10 July through the end of August each year to coincide with the annual sockeye salmon run. The sample comprised 51 female bears.
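The screening described above (drop relocations with PDOP greater than 10, then keep only 10 July through 31 August) can be sketched with pandas. This is a minimal illustration, not the authors' code: the column names `bear_id`, `timestamp`, and `pdop`, the function `screen_fixes`, and all data values are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical GPS fixes; column names and values are invented for illustration.
fixes = pd.DataFrame(
    {
        "bear_id": ["F01", "F01", "F02", "F02"],
        "timestamp": pd.to_datetime(
            [
                "2015-06-20 08:00",
                "2015-07-15 09:00",
                "2015-08-02 10:00",
                "2015-08-30 11:00",
            ]
        ),
        "pdop": [3.2, 12.5, 4.1, 9.8],
    }
)


def screen_fixes(df: pd.DataFrame, max_pdop: float = 10.0) -> pd.DataFrame:
    """Keep fixes with PDOP <= max_pdop recorded from 10 July through 31 August."""
    month = df["timestamp"].dt.month
    day = df["timestamp"].dt.day
    # 10 July onward, plus all of August, in any year.
    in_season = ((month == 7) & (day >= 10)) | (month == 8)
    # The text removes relocations with PDOP greater than 10, so 10 itself is kept.
    accurate = df["pdop"] <= max_pdop
    return df[in_season & accurate].reset_index(drop=True)


screened = screen_fixes(fixes)
print(screened)
```

In this toy input, the June fix falls outside the season and the 15 July fix fails the PDOP threshold, so only the two August fixes remain.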
This dataset provides information about the number of properties, residents, and average property values for Fix Street cross streets in Three Springs, PA.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Data Skripsi Fix Bismillah is a dataset for object detection tasks - it contains Rice Diseases Pest annotations for 3,409 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
Activity Title: "Fix the Gaps: Data Hospital Simulation" (This activity is created for students to practice techniques to handle missing data)
Description: Provide each team with a “broken patient record” dataset (incomplete entries with NaNs or blanks). Teams act as data doctors:
- Diagnose the type of missingness (MCAR, MAR, MNAR)
- Choose suitable imputation techniques (mean, median, KNN, regression)
- Compare outcomes from different methods
Tools: Jupyter notebook / Pandas
Outcome: Group presentation on the impact of imputation and justification of the method used.
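As a starting point for the activity, the comparison of imputation methods can be sketched in pandas. This is a minimal sketch that covers only mean and median imputation (KNN and regression are left out to avoid extra dependencies); the `records` table, its column names, and the helper `impute` are hypothetical and not part of the original activity material.

```python
import pandas as pd

NAN = float("nan")

# Hypothetical "broken patient record" data; all values are invented.
records = pd.DataFrame(
    {
        "age": [34.0, NAN, 58.0, 41.0, NAN],
        "systolic_bp": [120.0, 135.0, NAN, 128.0, 190.0],
    }
)


def impute(df: pd.DataFrame, strategy: str) -> pd.DataFrame:
    """Fill missing values column by column with the column mean or median."""
    if strategy == "mean":
        fill_values = df.mean()
    elif strategy == "median":
        fill_values = df.median()
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return df.fillna(fill_values)


mean_filled = impute(records, "mean")
median_filled = impute(records, "median")

# Diagnose first: how many values are missing in each column?
missing_per_column = records.isna().sum()

# Compare outcomes: column means after each imputation method.
comparison = pd.DataFrame(
    {
        "mean_imputed": mean_filled.mean(),
        "median_imputed": median_filled.mean(),
    }
)
print(missing_per_column)
print(comparison)
```

Mean imputation leaves each column mean unchanged, while median imputation pulls it toward the median. With the outlier of 190 in `systolic_bp`, that difference gives teams something concrete to discuss when justifying a method.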
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comprehensive dataset containing 2,798 verified Fix Price locations in Russia with complete contact information, ratings, reviews, and location data.
This dataset provides information about the number of properties, residents, and average property values for Fix Court cross streets in San Ysidro, CA.
This dataset contains the predicted prices of the asset Glitch Fix over the next 16 years. This data is calculated initially using a default 5 percent annual growth rate, and after page load, it features a sliding scale component where the user can then further adjust the growth rate to their own positive or negative projections. The maximum positive adjustable growth rate is 100 percent, and the minimum adjustable growth rate is -100 percent.
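A projection of this kind can be sketched as follows. The description does not say how the growth rate is applied, so annual compounding is an assumption here, as are the function name `project_prices` and the starting price of 1.0.

```python
def project_prices(
    start_price: float, annual_growth_pct: float = 5.0, years: int = 16
) -> list[float]:
    """Return projected prices for years 1..years, assuming annual compounding."""
    # Clamp to the adjustable range stated in the description: -100 to 100 percent.
    rate = max(-100.0, min(100.0, annual_growth_pct)) / 100.0
    return [start_price * (1.0 + rate) ** year for year in range(1, years + 1)]


# Hypothetical starting price of 1.0 with the default 5 percent growth rate.
projection = project_prices(1.0)
print(projection[0], projection[-1])
```

Clamping keeps out-of-range inputs within the limits the slider allows. At -100 percent the projected price drops to zero in the first year.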
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comprehensive dataset containing 3 verified Phone Fix locations in New York, United States with complete contact information, ratings, reviews, and location data.
Reload to correct some errors.
Raw radio tracking data used to determine the precise distance to Venus (and improve knowledge of the Astronomical Unit) from the Galileo flyby on 10 February 1990.
Non-traditional data signals from social media and employment platforms for FIX stock analysis