100+ datasets found

javascript-dataset
huggingface.co
Updated Sep 3, 2024
Cite
Akshay Nambiar (2024). javascript-dataset [Dataset]. https://huggingface.co/datasets/axay/javascript-dataset
Explore at:
Croissant (Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.)
Dataset updated
Sep 3, 2024
Authors
Akshay Nambiar
Description
axay/javascript-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Developer Expertise Dataset on JavaScript Libraries
data.niaid.nih.gov
Updated Jan 24, 2020
Cite
Montandon, João Eduardo (2020). Developer Expertise Dataset on JavaScript Libraries [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1484497
Dataset updated
Jan 24, 2020
Dataset provided by
Valente, Marco Tulio
Montandon, João Eduardo
Silva, Luciana Lourdes
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains an anonymized list of surveyed developers who provided their expertise level on three popular JavaScript libraries:

ReactJS, a library for building enriched web interfaces

MongoDB, a driver for accessing MongoDB databases

Socket.IO, a library for realtime communication
code-text-javascript
huggingface.co
Updated Jul 18, 2023
Cite
Semeru Lab (2023). code-text-javascript [Dataset]. https://huggingface.co/datasets/semeru/code-text-javascript
Dataset updated
Jul 18, 2023
Dataset authored and provided by
Semeru Lab
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset is imported from CodeXGLUE and pre-processed using their script.

Where to find in Semeru:

The dataset can be found at /nfs/semeru/semeru_datasets/code_xglue/code-to-text/javascript in Semeru

CodeXGLUE -- Code-To-Text Task Definition

The task is to generate natural language comments for code; results are evaluated by smoothed BLEU-4 score.
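To make the metric concrete, here is a minimal sketch of a sentence-level BLEU-4 with add-one smoothing on every n-gram order. This is one common smoothing variant chosen for illustration; the CodeXGLUE evaluation script may tokenize and smooth differently, and the function names below are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def smoothed_bleu4(reference, hypothesis):
    """Sentence-level BLEU-4 with add-one smoothing (illustrative variant only)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not hyp:
        return 0.0
    log_precision = 0.0
    for n in range(1, 5):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clipped matches: each hypothesis n-gram counts at most as often as in the reference.
        matches = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        total = sum(hyp_counts.values())
        # Add-one smoothing keeps the logarithm defined when there are no matches.
        log_precision += math.log((matches + 1) / (total + 1)) / 4
    # Brevity penalty for hypotheses that are not longer than the reference.
    brevity = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return brevity * math.exp(log_precision)


print(smoothed_bleu4("returns the sum of two numbers", "return sum"))
```

A generated comment identical to the reference scores 1.0 under this variant, and shorter or less overlapping comments score lower.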

Dataset

The dataset we use comes from CodeSearchNet and we filter the dataset as the following:… See the full description on the dataset page: https://huggingface.co/datasets/semeru/code-text-javascript.
Enhanced Bug Prediction in JavaScript Programs with Hybrid Call-Graph Based...
data.niaid.nih.gov
zenodo.org
Updated Nov 21, 2020
Cite
Tóth, Zoltán Gábor (2020). Enhanced Bug Prediction in JavaScript Programs with Hybrid Call-Graph Based Invocation Metrics (Training Dataset) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4281475
Dataset updated
Nov 21, 2020
Dataset provided by
Tóth, Zoltán Gábor
Antal, Gábor
Hegedűs, Péter
Ferenc, Rudolf
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset consists of multiple files which contain bug prediction training data.

The entries in the dataset are JavaScript functions that are either buggy or non-buggy. Bug-related information was obtained from the ESLint project contained in BugsJS (https://github.com/BugsJS/eslint). The buggy instances were collected throughout the lifetime of the project; however, we added non-buggy entries from the latest version that is tagged as fix (entries that were previously included as buggy were not included as non-buggy later on).

The dataset is based on hybrid call graphs which are constructed by https://github.com/sed-szeged/hcg-js-framework. The result of this tool is a call graph where the edges are associated with a confidence level which shows how likely the given edge is a valid call edge.

We used different threshold values from which we considered the edges to be valid. The following threshold values were used:

0.00

0.05

0.20

0.30

The prefix in each dataset file name comes from the threshold used. The datasets include the coupling metrics NII (Number of Incoming Invocations) and NOI (Number of Outgoing Invocations), which were calculated by a static source code analyzer called SourceMeter. Hybrid counterparts of these metrics (HNII and HNOI) are based on the given threshold values.
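As a rough illustration of how a threshold turns confidence-weighted call edges into hybrid coupling counts, the sketch below discards edges below the threshold and counts what remains per function. The edge tuple layout, the function name, the `>=` comparison, and counting every surviving edge once are all assumptions for illustration; the output format of hcg-js-framework and the exact HNII/HNOI definitions are not given in this description.

```python
from collections import Counter


def hybrid_invocation_counts(edges, threshold):
    """Count incoming/outgoing calls using only edges whose confidence meets the threshold.

    edges: iterable of (caller, callee, confidence) tuples (hypothetical layout).
    Returns (incoming, outgoing) Counters keyed by function name.
    """
    incoming, outgoing = Counter(), Counter()
    for caller, callee, confidence in edges:
        if confidence >= threshold:  # assumption: an edge exactly at the threshold is valid
            outgoing[caller] += 1
            incoming[callee] += 1
    return incoming, outgoing


# Toy call graph invented for this example.
EDGES = [
    ("main", "parse", 1.00),
    ("main", "lint", 0.25),
    ("parse", "lint", 0.04),
]

for threshold in (0.00, 0.05, 0.20, 0.30):
    inc, out = hybrid_invocation_counts(EDGES, threshold)
    print(threshold, dict(inc), dict(out))
```

Raising the threshold can only remove edges, so the hybrid counts for a function never increase from one threshold to the next.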

There are four variants for all of these datasets:

Both static (NII, NOI) and hybrid (HNII, HNOI) coupling metrics are included with additional static source code metrics and information about the entries (file without any postfix). Columns contained only in this dataset are:

ID

Name

Longname

Parent ID

Component ID

Path

Line

Column

EndLine

EndColumn

Both static (NII, NOI) and hybrid (HNII, HNOI) coupling metrics are included with additional static source code metrics (file with '_h+s' postfix)

Only static (NII, NOI) coupling metrics are included with additional static source code metrics (file with '_s' postfix)

Only hybrid (HNII, HNOI) coupling metrics are included with additional static source code metrics (file with '_h' postfix)

Static source code metrics which are contained in all datasets are the following:

McCC - McCabe Cyclomatic Complexity

NL - Nesting Level

NLE - Nesting Level Else If

CD - Comment Density

CLOC - Comment Lines of Code

DLOC - Documentation Lines of Code

TCD - Total Comment Density (Comment Lines in an embedded function will be also considered)

TCLOC - Total Comment Lines of Code (Comment Lines in an embedded function will be also considered)

LLOC - Logical Lines of Code (Comment and empty lines not counted)

LOC - Lines of Code (Comment and empty lines are counted)

NOS - Number of Statements

NUMPAR - Number of Parameters

TLLOC - Logical Lines of Code (Lines in embedded functions are also counted)

TLOC - Lines of Code (Lines in embedded functions are also counted)

TNOS - Total Number of Statements (Statements in embedded functions are also counted)
Annotated UI Element Dataset for Desktop Environments
data.niaid.nih.gov
Updated Sep 9, 2024
Cite
González Enríquez, José (2024). Annotated UI Element Dataset for Desktop Environments [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10822751
Dataset updated
Sep 9, 2024
Dataset provided by
Martínez-Rojas, Antonio
Jiménez-Ramírez, Andrés
González Enríquez, José
Rodríguez-Ruíz, Antonio
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introducing a specialized dataset containing high-resolution screenshots from various desktop environments, focusing on annotating individual UI components. This dataset is designed to enhance the accuracy of UI element identification and classification within desktop applications, enabling the extraction of hierarchical structures.

Desktop UI Detection Dataset.zip

This resource contains a set of 100 general-purpose screenshots intended for the training set.

Test Desktop UI Detection Dataset.zip

This resource is aimed at evaluating the models trained with the previous screenshots, using a set of captures from a specific business process.

The images have been organized into six groups (G), each representing a unique set of screenshots from the same type of application. These groups are distinguished by their levels of complexity and the depth of their UI hierarchies:

G1: PDF Reader. A combination of web and native applications for interacting with PDF documents.

G2: Public Administration Courses Manager. A web-based application used in the student admission process for public learning programs (these images are not provided due to privacy issues concerning the business process).

G3: Customer Relationship Management (CRM) System. A web-based application for managing business data.

G4: Email Client. A mix of web and native applications used for managing email communications.

G5: File Explorer. A native application for navigating the file system in Windows.

G6: Learning Management System (LMS). A web-based application used to manage courses and students for a given educational institution.

Two folders are provided, each containing all the groups mentioned. These folders represent captures with the application from the corresponding group in fullscreen, or captures with the application from the corresponding group overlapping another application randomly selected from one of the remaining groups (Overlapped).
Dataset of books called Reliable JavaScript
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called Reliable JavaScript [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Reliable+JavaScript
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is Reliable JavaScript. It features 7 columns including author, publication date, language, and book publisher.
dataset-JavaScript-general-coding
huggingface.co
Updated Feb 17, 2025
Cite
David Meldrum (2025). dataset-JavaScript-general-coding [Dataset]. https://huggingface.co/datasets/dmeldrum6/dataset-JavaScript-general-coding
Dataset updated
Feb 17, 2025
Authors
David Meldrum
Description
Dataset Card for dataset-JavaScript-general-coding

This dataset has been created with distilabel.

Dataset Summary

This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/dmeldrum6/dataset-JavaScript-general-coding/raw/main/pipeline.yaml"

or explore the configuration: distilabel pipeline info --config… See the full description on the dataset page: https://huggingface.co/datasets/dmeldrum6/dataset-JavaScript-general-coding.
Dataset Collected by JSObserver
zenodo.org
zip
Updated Jun 4, 2020
Cite
Mingxue Zhang; Wei Meng (2020). Dataset Collected by JSObserver [Dataset]. http://doi.org/10.5281/zenodo.3874944
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.3874944
Dataset updated
Jun 4, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Mingxue Zhang; Wei Meng
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is a sampled dataset collected by JSObserver on Alexa top 100K websites. We analyze the log files to identify JavaScript global identifier conflicts, i.e., variable value conflicts, variable type conflicts and function definition conflicts.

We release the log files on websites where we detect the above conflicts, and split the whole dataset into 10 subsets, i.e., 1-50K-0.zip ~ 50K-100K-4.zip.

The writes to a memory location in JavaScript are saved in [rank].[main/sub].[frame_cnt].asg (e.g., 1.main.0.asg) files.

JavaScript global function definitions are saved in [rank].[main/sub].[frame_cnt].func (e.g., 1.main.0.func) files.

The maps from script IDs to script URLs are saved in [rank].[main/sub].[frame_cnt].id2url (e.g., 1.main.0.id2url) files.

The source code of scripts is saved in [rank].[main/sub].[frame_cnt].[script_ID].script (e.g., 1.main.0.17.script) files.

We also sample 100 websites on which we did not detect any conflicts. The log files collected on those websites are available in sampled_no_conflict.zip.
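The naming scheme above can be parsed mechanically. The following is a minimal sketch that assumes rank, frame_cnt, and script_ID are always decimal integers, as in the examples given; the `LogFileName` type and `parse_log_name` function are hypothetical names, not part of JSObserver.

```python
import re
from typing import NamedTuple, Optional


class LogFileName(NamedTuple):
    rank: int                 # website rank
    frame_kind: str           # "main" or "sub"
    frame_cnt: int            # frame counter
    script_id: Optional[int]  # only present for .script files
    kind: str                 # "asg", "func", "id2url", or "script"


_PATTERN = re.compile(
    r"^(?P<rank>\d+)\.(?P<frame>main|sub)\.(?P<cnt>\d+)"
    r"(?:\.(?P<kind>asg|func|id2url)|\.(?P<sid>\d+)\.script)$"
)


def parse_log_name(name: str) -> Optional[LogFileName]:
    """Parse a JSObserver-style log file name; return None if it does not match."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    kind = match.group("kind") or "script"
    sid = int(match.group("sid")) if match.group("sid") is not None else None
    return LogFileName(
        int(match.group("rank")), match.group("frame"), int(match.group("cnt")), sid, kind
    )


print(parse_log_name("1.main.0.asg"))
print(parse_log_name("1.main.0.17.script"))
```

Grouping the parsed names by `(rank, frame_kind, frame_cnt)` would then bring together the assignment log, function definitions, URL map, and script sources of one frame.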
High-temperature multi-element 2021 (HME21) dataset
figshare.com
txt
Updated May 30, 2023
Cite
So Takamoto; Chikashi Shinagawa; Daisuke Motoki; Kosuke Nakago; Wenwen Li; Iori Kurata; Taku Watanabe; Yoshihiro Yayama; Hiroki Iriguchi; Yusuke Asano; Tasuku Onodera; Takafumi Ishii; Takao Kudo; Hideki Ono; Ryohto Sawada; Ryuichiro Ishitani; Marc Ong; Taiki Yamaguchi; Toshiki Kataoka; Akihide Hayashi; Nontawat Charoenphakdee; Takeshi Ibuka (2023). High-temperature multi-element 2021 (HME21) dataset [Dataset]. http://doi.org/10.6084/m9.figshare.19658538.v2
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.19658538.v2
Dataset updated
May 30, 2023
Dataset provided by
figshare
Authors
So Takamoto; Chikashi Shinagawa; Daisuke Motoki; Kosuke Nakago; Wenwen Li; Iori Kurata; Taku Watanabe; Yoshihiro Yayama; Hiroki Iriguchi; Yusuke Asano; Tasuku Onodera; Takafumi Ishii; Takao Kudo; Hideki Ono; Ryohto Sawada; Ryuichiro Ishitani; Marc Ong; Taiki Yamaguchi; Toshiki Kataoka; Akihide Hayashi; Nontawat Charoenphakdee; Takeshi Ibuka
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
HME21 is an atomic structure dataset intended for neural network potential development. It was created during the development of PFP, a universal neural network potential for material discovery [1]. It contains multiple elements in a single structure and was sampled through a high-temperature molecular dynamics simulation. There are a total of 37 elements in the HME21 dataset, i.e., H, Li, C, N, O, F, Na, Mg, Al, Si, P, S, Cl, K, Ca, Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, Mo, Ru, Rh, Pd, Ag, In, Sn, Ba, Ir, Pt, Au, and Pb. The values are calculated by spin-polarized DFT calculations using the PBE exchange-correlation functional implemented in VASP [2] version 5.4.4. All structures are under periodic boundary conditions. For the details of the DFT calculation conditions and the structure sampling method, please see reference [1]. Please cite reference [1] if you use this dataset.

Files

HME21 consists of three files in extxyz format:

train.xyz: 19956 structures

valid.xyz: 2498 structures

test.xyz: 2495 structures

The structures were randomly split into training, validation, and test sub-datasets at a ratio of 8:1:1. They are used as training, validation, and test datasets for the benchmark of neural network potentials [1]. The target values are energy and atomic forces. The energy is shifted such that the energy of a single atom located in a vacuum becomes zero. The length is in angstroms (10^−10 m), and the energy is in electronvolts (eV). As a supplement, vasp_shift_energies.json, which contains the reference energy of a single atom for each element, is also included.
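The energy shift described above amounts to subtracting a per-element single-atom reference energy for every atom in a structure. The sketch below illustrates that arithmetic under the assumption that the reference energies form a simple element-to-energy mapping; the actual layout of vasp_shift_energies.json is not described here, and the numbers used are invented.

```python
from collections import Counter


def shifted_energy(total_energy, symbols, reference_energies):
    """Subtract per-element single-atom reference energies from a total energy (eV).

    symbols: list of element symbols, one per atom in the structure.
    reference_energies: mapping element -> single-atom energy in eV (assumed layout).
    """
    counts = Counter(symbols)
    return total_energy - sum(reference_energies[el] * n for el, n in counts.items())


# Invented reference energies in eV; these are NOT values from vasp_shift_energies.json.
REFERENCE = {"H": -1.0, "O": -2.0}

print(shifted_energy(-1.0, ["H"], REFERENCE))             # an isolated H atom
print(shifted_energy(-14.0, ["H", "H", "O"], REFERENCE))  # a three-atom structure
```

With this convention an isolated atom has zero shifted energy by construction, which matches the statement in the description.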
Logical Element Dataset
universe.roboflow.com
zip
Updated Mar 18, 2025
Cite
MySpace (2025). Logical Element Dataset [Dataset]. https://universe.roboflow.com/myspace-4sicm/components-pose-logical-element/dataset/1
Available download formats: zip
Dataset updated
Mar 18, 2025
Dataset authored and provided by
MySpace
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Logical Elements
Description
Logical Element

Overview

Logical Element is a dataset for computer vision tasks - it contains Logical Elements annotations for 460 images.

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

License

This dataset is available under the CC BY 4.0 license (https://creativecommons.org/licenses/by/4.0/).
Data from: Mining Rule Violations in JavaScript Code Snippets
data.niaid.nih.gov
explore.openaire.eu
Updated Jan 24, 2020
Cite
Bonifácio, Rodrigo (2020). Mining Rule Violations in JavaScript Code Snippets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_2593817
Dataset updated
Jan 24, 2020
Dataset provided by
Moraes, João Pedro
Ferreira Campos, Uriel
Pinto, Gustavo
Bonifácio, Rodrigo
Smethurst, Guilherme
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Content of this repository

This is the repository that contains the scripts and dataset for the MSR 2019 mining challenge.

GitHub repository with the software used: here.

DATASET

The dataset was retrieved using Google BigQuery and dumped to a CSV file for further processing. This original, untreated file is called jsanswers.csv. It contains the following information:

1. The Id of the question (PostId)

2. The Content (in this case the code block)

3. The length of the code block

4. The line count of the code block

5. The score of the post

6. The title

A quick look at this file shows that a PostId can have multiple rows related to it; that is how multiple code blocks are saved in the database.

Filtered Dataset:

Extracting code from CSV

We used a Python script called "ExtractCodeFromCSV.py" to extract the code from the original CSV and merge all the code blocks into their respective JavaScript file, with the PostId as the name. This resulted in 336 thousand files.
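The merge step can be pictured as grouping rows by PostId and joining their code blocks. The sketch below is not the original "ExtractCodeFromCSV.py", which is not shown here; the column names `PostId` and `Content`, the newline separator, and the function name are assumptions for illustration.

```python
import csv
import io
from collections import defaultdict


def merge_code_blocks(csv_text, id_column="PostId", code_column="Content"):
    """Group code blocks by post id and join them in row order (column names assumed)."""
    merged = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        merged[row[id_column]].append(row[code_column])
    return {post_id: "\n".join(blocks) for post_id, blocks in merged.items()}


# Tiny invented sample: post 1 has two code blocks, post 2 has one.
SAMPLE = "PostId,Content\n1,var a = 1;\n2,alert('x');\n1,console.log(a);\n"

for post_id, source in merge_code_blocks(SAMPLE).items():
    # In a real run each entry would be written to a file named after the post id, e.g. "1.js".
    print(post_id, repr(source))
```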

Running ESLint

Because ESLint is single-threaded, running it on 336 thousand files took a huge toll on the machine, so we created a script to run it. This script, named "ESlintRunnerScript.py", splits the files into 20 evenly distributed parts and runs 20 ESLint processes to generate the reports; as such, it generates 20 JSON files.
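A minimal sketch of the splitting idea follows. It is not the original "ESlintRunnerScript.py"; the helper names and the use of `npx` are assumptions, and the command is only built, not executed. The `--format json` and `--output-file` options are standard ESLint CLI flags.

```python
def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # the first `extra` chunks get one more item
        chunks.append(items[start:start + size])
        start += size
    return chunks


def eslint_command(files, report_path):
    """Build an ESLint invocation that writes a JSON report (assumes npx is available)."""
    return ["npx", "eslint", "--format", "json", "--output-file", report_path, *files]


# Hypothetical file names standing in for the extracted snippets.
files = [f"{post_id}.js" for post_id in range(45)]
for index, chunk in enumerate(split_evenly(files, 20)):
    command = eslint_command(chunk, f"report_{index}.json")
    print(index, len(chunk), command[:6])
```

Each command could then be started as its own process, for example with `subprocess.Popen`, so that the 20 parts are linted in parallel.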

Number of Violations per Rule

This information was extracted using the script named "parser.py". It generated the file named "NumberofViolationsPerRule.csv", which contains the number of violations per rule used in the linter configuration in the dataset.

Number of violations per Category

To produce relevant statistics about the dataset, we generated the number of violations per rule category as defined on the ESLint website. This information was extracted using the same "parser.py" script.

Individual Reports

This information was extracted from the JSON reports. It is a CSV file with PostID and violations per rule.

Rules

The file Rules with categories contains all the rules used and their categories.
Windows Element Dataset
universe.roboflow.com
zip
Updated Oct 10, 2023
Cite
HAIE LAB (2023). Windows Element Dataset [Dataset]. https://universe.roboflow.com/haie-lab/windows-element/dataset/2
Available download formats: zip
Dataset updated
Oct 10, 2023
Dataset authored and provided by
HAIE LAB
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Window Element Bounding Boxes
Description
Windows Element

Overview

Windows Element is a dataset for object detection tasks - it contains Window Element annotations for 273 images.

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

License

This dataset is available under the CC BY 4.0 license (https://creativecommons.org/licenses/by/4.0/).
Dataset of books called Eloquent JavaScript : a modern introduction to...
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called Eloquent JavaScript : a modern introduction to programming [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Eloquent+JavaScript+%3A+a+modern+introduction+to+programming
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is Eloquent JavaScript : a modern introduction to programming. It features 7 columns including author, publication date, language, and book publisher.
Finite element simulation dataset
ieee-dataport.org
Updated Oct 4, 2022
Cite
Pengfei Zhang (2022). Finite element simulation dataset [Dataset]. https://ieee-dataport.org/documents/finite-element-simulation-dataset
Dataset updated
Oct 4, 2022
Authors
Pengfei Zhang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Housing Element Annual Progress Report (APR) Data by Jurisdiction and Year
data.ca.gov
catalog.data.gov
csv, docx
Updated Aug 1, 2025
Cite
California Department of Housing and Community Development (2025). Housing Element Annual Progress Report (APR) Data by Jurisdiction and Year [Dataset]. https://data.ca.gov/dataset/housing-element-annual-progress-report-apr-data-by-jurisdiction-and-year
Available download formats: csv(6821), csv(1953), csv(57957396), csv(29286), csv(316897), csv(151937987), docx(25091), docx(22168), docx(24264), docx(21179), csv(959397), csv(50592), docx(26167), csv(44189160), csv(52186), docx(26988), csv(1172005), docx(27660), docx(32505), docx(23077), docx(22688), docx(29410)
Dataset updated
Aug 1, 2025
Dataset provided by
California Department of Housing & Community Development (https://hcd.ca.gov/)
Authors
California Department of Housing and Community Development
Description
Government Code section 65400 requires that each city, county, or city and county, including charter cities, prepare an annual progress report (APR) on the status of the housing element of its general plan and progress in its implementation. This dataset includes information reported to the Department of Housing and Community Development (HCD) by local jurisdictions on their APR form. Additional information about annual progress reports (APR), including the form, instructions, and definition can be found on HCD’s website here: https://www.hcd.ca.gov/planning-and-community-development/annual-progress-reports.
Pcb Element Dataset
universe.roboflow.com
zip
Updated Jul 24, 2024
Cite
project (2024). Pcb Element Dataset [Dataset]. https://universe.roboflow.com/project-6p64l/pcb-element/dataset/2
Available download formats: zip
Dataset updated
Jul 24, 2024
Dataset authored and provided by
project
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Electronic Elements Bounding Boxes
Description
PCB Element

Overview

PCB Element is a dataset for object detection tasks - it contains Electronic Elements annotations for 504 images.

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

License

This dataset is available under the CC BY 4.0 license (https://creativecommons.org/licenses/by/4.0/).
Dataset of books called The joy of JavaScript
workwithdata.com
Updated Apr 17, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Work With Data (2025). Dataset of books called The joy of JavaScript [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=The+joy+of+JavaScript
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is The joy of JavaScript. It features 7 columns including author, publication date, language, and book publisher.
Dataset - Towards a Prototype Based Explainable JavaScript Vulnerability...
zenodo.org
csv
Updated May 7, 2021
Cite
Balázs Mosolygó; Norbert Vándor; Gábor Antal; Péter Hegedűs; Rudolf Ferenc (2021). Dataset - Towards a Prototype Based Explainable JavaScript Vulnerability Prediction Model [Dataset]. http://doi.org/10.5281/zenodo.4742139
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.4742139
Dataset updated
May 7, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Balázs Mosolygó; Norbert Vándor; Gábor Antal; Péter Hegedűs; Rudolf Ferenc
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the dataset we used in our paper entitled "Towards a Prototype Based Explainable JavaScript Vulnerability Prediction Model". The manually validated dataset contains several static source code metrics along with vulnerability-fixing hashes for numerous vulnerabilities. For more details, you can read the paper here.

Security has become a central and unavoidable aspect of today’s software development. Practitioners and researchers have proposed many code analysis tools and techniques to mitigate security risks. These tools apply static and dynamic analysis or, more recently, machine learning. Machine learning models can achieve impressive results in finding and forecasting possible security issues in programs. However, there are at least two areas where most of the current approaches fall short of developer demands: explainability and granularity of predictions. In this paper, we propose a novel and simple, yet promising, approach to identify potentially vulnerable source code in JavaScript programs. The model improves the state-of-the-art in terms of explainability and prediction granularity as it gives results at the level of individual source code lines, which is fine-grained enough for developers to take immediate actions. Additionally, the model explains each predicted line (i.e., provides the most similar vulnerable line from the training set) using a prototype-based approach. In a study of 186 real-world and confirmed JavaScript vulnerability fixes of 91 projects, the approach could flag 60% of the known vulnerable lines on average by marking only 10% of the code-base, but in certain cases the model identified 100% of the vulnerable code lines while flagging only 8.72% of the code-base.

If you wish to use our dataset, please cite this dataset, or the corresponding paper:

@inproceedings{mosolygo2021towards,
  title={Towards a Prototype Based Explainable JavaScript Vulnerability Prediction Model},
  author={Mosolyg{\'o}, Bal{\'a}zs and V{\'a}ndor, Norbert and Antal, G{\'a}bor and Heged{\H{u}}s, P{\'e}ter and Ferenc, Rudolf},
  booktitle={2021 International Conference on Code Quality (ICCQ)},
  pages={15--25},
  year={2021},
  organization={IEEE}
}
Dataset of books called D3.js 4.x data visualization : learn to visualize...
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called D3.js 4.x data visualization : learn to visualize your data with JavaScript [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=D3.js+4.x+data+visualization+%3A+learn+to+visualize+your+data+with+JavaScript
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is D3.js 4.x data visualization : learn to visualize your data with JavaScript. It features 7 columns including author, publication date, language, and book publisher.
Data from: BreCaHAD: A Dataset for Breast Cancer Histopathological...
figshare.com
png
Updated Jan 28, 2019
Cite
Alper Aksac; Douglas J. Demetrick; Tansel Özyer; Reda Alhajj (2019). BreCaHAD: A Dataset for Breast Cancer Histopathological Annotation and Diagnosis [Dataset]. http://doi.org/10.6084/m9.figshare.7379186.v3
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.7379186.v3
Dataset updated
Jan 28, 2019
Dataset provided by
figshare
Authors
Alper Aksac; Douglas J. Demetrick; Tansel Özyer; Reda Alhajj
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset consists of 1 .xlsx file, 2 .png files, 1 .json file and 1 .zip file:

annotation_details.xlsx: The distribution of annotations in the previously mentioned six classes (mitosis, apoptosis, tumor nuclei, non-tumor nuclei, tubule, and non-tubule) is presented in an Excel spreadsheet.

original.png: The input image.

annotated.png: An example from the dataset. In the annotated image, blue circles indicate the tumor nuclei; pink circles show non-tumor nuclei such as blood cells, stroma nuclei, and lymphocytes; orange and green circles are mitosis and apoptosis, respectively; light blue circles are true lumen for tubules; and yellow circles represent white regions (non-lumen) such as fat, blood vessel, and broken tissues.

data.json: The annotations for the BreCaHAD dataset are provided in JSON (JavaScript Object Notation) format. In the given example, the JSON file (ground truth) contains two mitosis annotations and only one tumor nuclei annotation. Here, x and y are the coordinates of the centroid of the annotated object, and the values are between 0 and 1.

BreCaHAD.zip: An archive file containing the dataset. Three folders are included: images (original images), groundTruth (json files), and groundTruth_display (groundTruth applied on original images).
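Because x and y are normalized to the range 0 to 1, they have to be scaled by the image size before they can be drawn on an image. The sketch below shows that conversion under an assumed JSON layout (class name mapped to a list of objects with `x` and `y` keys); the real structure of data.json, the class keys, and the image size used here are hypothetical.

```python
import json


def to_pixels(annotation_json, width, height):
    """Convert normalized (x, y) centroids to integer pixel coordinates per class."""
    data = json.loads(annotation_json)
    return {
        label: [(round(p["x"] * width), round(p["y"] * height)) for p in points]
        for label, points in data.items()
    }


# Invented annotation sample and image size, for illustration only.
SAMPLE = '{"mitosis": [{"x": 0.25, "y": 0.5}], "tumor": [{"x": 0.1, "y": 0.9}]}'
print(to_pixels(SAMPLE, 1000, 800))
```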

javascript-dataset

axay/javascript-dataset

152 scholarly articles cite this dataset (View in Google Scholar)
