94 datasets found

javascript-dataset
huggingface.co
Updated Sep 3, 2024
Cite
Akshay Nambiar (2024). javascript-dataset [Dataset]. https://huggingface.co/datasets/axay/javascript-dataset
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Sep 3, 2024
Authors
Akshay Nambiar
Description
axay/javascript-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Developer Expertise Dataset on JavaScript Libraries
data.niaid.nih.gov
Updated Jan 24, 2020
Cite
Montandon, João Eduardo (2020). Developer Expertise Dataset on JavaScript Libraries [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1484497
Dataset updated
Jan 24, 2020
Dataset provided by
Valente, Marco Tulio
Montandon, João Eduardo
Silva, Luciana Lourdes
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains an anonymized list of surveyed developers who provided their expertise level on three popular JavaScript libraries:

ReactJS, a library for building enriched web interfaces

MongoDB, a driver for accessing MongoDB databases

Socket.IO, a library for realtime communication
axay-javascript-dataset-pn
huggingface.co
Updated Oct 4, 2024
Cite
Israel Antonio Rosales Laguan (2024). axay-javascript-dataset-pn [Dataset]. https://huggingface.co/datasets/israellaguan/axay-javascript-dataset-pn
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Oct 4, 2024
Authors
Israel Antonio Rosales Laguan
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
DPO JavaScript Dataset

This repository contains a modified version of the JavaScript dataset originally sourced from axay/javascript-dataset-pn. The dataset has been adapted to fit the DPO (Direct Preference Optimization) format, making it compatible with the LLaMA-Factory project.

License

This dataset is licensed under the Apache 2.0 License.

Dataset Overview

The dataset consists of JavaScript code snippets that have been restructured and enhanced for use in… See the full description on the dataset page: https://huggingface.co/datasets/israellaguan/axay-javascript-dataset-pn.
Enhanced Bug Prediction in JavaScript Programs with Hybrid Call-Graph Based...
data.niaid.nih.gov
zenodo.org
Updated Nov 21, 2020
Cite
Hegedűs, Péter (2020). Enhanced Bug Prediction in JavaScript Programs with Hybrid Call-Graph Based Invocation Metrics (Training Dataset) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4281475
Dataset updated
Nov 21, 2020
Dataset provided by
Antal, Gábor
Hegedűs, Péter
Ferenc, Rudolf
Tóth, Zoltán Gábor
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset consists of multiple files which contain bug prediction training data.

The entries in the dataset are JavaScript functions that are either buggy or non-buggy. Bug-related information was obtained from the ESLint project contained in BugsJS (https://github.com/BugsJS/eslint). The buggy instances were collected throughout the lifetime of the project; however, we added non-buggy entries from the latest version that is tagged as fix (entries which were previously included as buggy were not included as non-buggy later on).

The dataset is based on hybrid call graphs, which are constructed by https://github.com/sed-szeged/hcg-js-framework. The result of this tool is a call graph where each edge is associated with a confidence level that shows how likely the given edge is to be a valid call edge.

We used different threshold values above which we considered the edges to be valid. The following threshold values were used:

0.00

0.05

0.20

0.30

The prefix in the dataset file names comes from the threshold used. The datasets include the coupling metrics NII (Number of Incoming Invocations) and NOI (Number of Outgoing Invocations), which were calculated by a static source code analyzer called SourceMeter. Hybrid counterparts of these metrics (HNII and HNOI) are based on the given threshold values.
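As a rough illustration of how a threshold can turn a confidence-weighted call graph into hybrid invocation counts, the sketch below counts incoming and outgoing edges whose confidence meets the threshold. The edge tuples, the function name `hybrid_counts`, and the use of `>=` as the comparison are assumptions made for illustration; the description does not say how the original tooling compares or counts edges.

```python
from collections import Counter

# Hypothetical edges (caller, callee, confidence); not taken from the dataset.
EDGES = [
    ("a", "b", 1.00),
    ("a", "c", 0.25),
    ("b", "c", 0.04),
    ("c", "a", 0.10),
]

def hybrid_counts(edges, threshold):
    """Return (HNII, HNOI) counters for edges with confidence >= threshold."""
    hnii, hnoi = Counter(), Counter()
    for caller, callee, confidence in edges:
        if confidence >= threshold:  # assumed comparison, see lead-in
            hnoi[caller] += 1  # one more outgoing invocation for the caller
            hnii[callee] += 1  # one more incoming invocation for the callee
    return hnii, hnoi

# The four threshold values listed in the description.
for t in (0.00, 0.05, 0.20, 0.30):
    hnii, hnoi = hybrid_counts(EDGES, t)
    print(f"threshold={t:.2f} HNII={dict(hnii)} HNOI={dict(hnoi)}")
```

Raising the threshold can only drop edges, so the hybrid counts for a function never increase as the threshold grows.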

There are four variants for all of these datasets:

Both static (NII, NOI) and hybrid (HNII, HNOI) coupling metrics are included with additional static source code metrics and information about the entries (file without any postfix). Columns contained only in this dataset are:

ID

Name

Longname

Parent ID

Component ID

Path

Line

Column

EndLine

EndColumn

Both static (NII, NOI) and hybrid (HNII, HNOI) coupling metrics are included with additional static source code metrics (file with '_h+s' postfix)

Only static (NII, NOI) coupling metrics are included with additional static source code metrics (file with '_s' postfix)

Only hybrid (HNII, HNOI) coupling metrics are included with additional static source code metrics (file with '_h' postfix)

The static source code metrics contained in all datasets are the following:

McCC - McCabe Cyclomatic Complexity

NL - Nesting Level

NLE - Nesting Level Else If

CD - Comment Density

CLOC - Comment Lines of Code

DLOC - Documentation Lines of Code

TCD - Total Comment Density (comment lines in an embedded function are also considered)

TCLOC - Total Comment Lines of Code (comment lines in an embedded function are also considered)

LLOC - Logical Lines of Code (Comment and empty lines not counted)

LOC - Lines of Code (Comment and empty lines are counted)

NOS - Number of Statements

NUMPAR - Number of Parameters

TLLOC - Logical Lines of Code (Lines in embedded functions are also counted)

TLOC - Lines of Code (Lines in embedded functions are also counted)

TNOS - Total Number of Statements (Statements in embedded functions are also counted)
CodeSearchNet Dataset
paperswithcode.com
opendatalab.com
Updated Dec 30, 2024
Cite
Hamel Husain; Ho-Hsiang Wu; Tiferet Gazit; Miltiadis Allamanis; Marc Brockschmidt (2024). CodeSearchNet Dataset [Dataset]. https://paperswithcode.com/dataset/codesearchnet
Dataset updated
Dec 30, 2024
Authors
Hamel Husain; Ho-Hsiang Wu; Tiferet Gazit; Miltiadis Allamanis; Marc Brockschmidt
Description
The CodeSearchNet Corpus is a large dataset of functions with associated documentation written in Go, Java, JavaScript, PHP, Python, and Ruby from open source projects on GitHub. The CodeSearchNet Corpus includes:

Six million methods overall

Two million of which have associated documentation (docstrings, JavaDoc, and more)

Metadata that indicates the original location (repository or line number, for example) where the data was found
Dataset Collected by JSObserver
zenodo.org
zip
Updated Jun 4, 2020
Cite
Mingxue Zhang; Wei Meng (2020). Dataset Collected by JSObserver [Dataset]. http://doi.org/10.5281/zenodo.3874944
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.3874944
Dataset updated
Jun 4, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Mingxue Zhang; Wei Meng
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is a sampled dataset collected by JSObserver on Alexa top 100K websites. We analyze the log files to identify JavaScript global identifier conflicts, i.e., variable value conflicts, variable type conflicts and function definition conflicts.

We release the log files on websites where we detect the above conflicts, and split the whole dataset into 10 subsets, i.e., 1-50K-0.zip ~ 50K-100K-4.zip.

The writes to a memory location in JavaScript are saved in [rank].[main/sub].[frame_cnt].asg (e.g., 1.main.0.asg) files.

JavaScript global function definitions are saved in [rank].[main/sub].[frame_cnt].func (e.g., 1.main.0.func) files.

The maps from script IDs to script URLs are saved in [rank].[main/sub].[frame_cnt].id2url (e.g., 1.main.0.id2url) files.

The source code of scripts is saved in [rank].[main/sub].[frame_cnt].[script_ID].script (e.g., 1.main.0.17.script) files.

We also sample 100 websites on which we did not detect any conflicts. The log files collected on those websites are available in sampled_no_conflict.zip
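The file-naming scheme above lends itself to a small parser. The following is a minimal sketch, assuming that rank, frame count, and script ID are always decimal integers (as in the examples given) and that only `.script` files carry a script ID; the function name `parse_log_name` and the returned dictionary layout are hypothetical.

```python
import re

# [rank].[main/sub].[frame_cnt].<kind>, with an optional [script_ID]
# component that is expected only before the ".script" extension.
LOG_NAME = re.compile(
    r"^(?P<rank>\d+)\.(?P<frame>main|sub)\.(?P<frame_cnt>\d+)"
    r"(?:\.(?P<script_id>\d+))?\.(?P<kind>asg|func|id2url|script)$"
)

def parse_log_name(name):
    """Parse a log file name into its components, or return None."""
    match = LOG_NAME.match(name)
    if match is None:
        return None
    info = match.groupdict()
    # A script ID must be present for .script files and absent otherwise.
    if (info["kind"] == "script") != (info["script_id"] is not None):
        return None
    info["rank"] = int(info["rank"])
    info["frame_cnt"] = int(info["frame_cnt"])
    if info["script_id"] is not None:
        info["script_id"] = int(info["script_id"])
    return info

for example in ("1.main.0.asg", "1.main.0.17.script", "1.main.0.txt"):
    print(example, "->", parse_log_name(example))
```

Grouping the parsed entries by `(rank, frame, frame_cnt)` would then collect all logs that belong to the same frame of the same website.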
Dataset of books called Reliable JavaScript
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called Reliable JavaScript [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Reliable+JavaScript
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is Reliable JavaScript. It features 7 columns including author, publication date, language, and book publisher.
Data from: Towards a Prototype Based Explainable JavaScript Vulnerability...
zenodo.org
data.niaid.nih.gov
csv
Updated May 7, 2021
Cite
Balázs Mosolygó; Norbert Vándor; Gábor Antal; Péter Hegedűs; Rudolf Ferenc (2021). Towards a Prototype Based Explainable JavaScript Vulnerability Prediction Model [Dataset]. http://doi.org/10.5281/zenodo.4742161
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.4742161
Dataset updated
May 7, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Balázs Mosolygó; Norbert Vándor; Gábor Antal; Péter Hegedűs; Rudolf Ferenc
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the dataset we used in our paper entitled "Towards a Prototype Based Explainable JavaScript Vulnerability Prediction Model". The manually validated dataset contains several static source code metrics along with vulnerability-fixing hashes for numerous vulnerabilities. For more details, you can read the paper here.

Security has become a central and unavoidable aspect of today’s software development. Practitioners and researchers have proposed many code analysis tools and techniques to mitigate security risks. These tools apply static and dynamic analysis or, more recently, machine learning. Machine learning models can achieve impressive results in finding and forecasting possible security issues in programs. However, there are at least two areas where most of the current approaches fall short of developer demands: explainability and granularity of predictions. In this paper, we propose a novel, simple, yet promising approach to identify potentially vulnerable source code in JavaScript programs. The model improves the state of the art in terms of explainability and prediction granularity, as it gives results at the level of individual source code lines, which is fine-grained enough for developers to take immediate action. Additionally, the model explains each predicted line (i.e., provides the most similar vulnerable line from the training set) using a prototype-based approach. In a study of 186 real-world and confirmed JavaScript vulnerability fixes of 91 projects, the approach could flag 60% of the known vulnerable lines on average by marking only 10% of the code base, but in certain cases the model identified 100% of the vulnerable code lines while flagging only 8.72% of the code base.

If you wish to use our dataset, please cite this dataset, or the corresponding paper:

@inproceedings{mosolygo2021towards,
  title={Towards a Prototype Based Explainable JavaScript Vulnerability Prediction Model},
  author={Mosolyg{\'o}, Bal{\'a}zs and V{\'a}ndor, Norbert and Antal, G{\'a}bor and Heged{\H{u}}s, P{\'e}ter and Ferenc, Rudolf},
  booktitle={2021 International Conference on Code Quality (ICCQ)},
  pages={15--25},
  year={2021},
  organization={IEEE}
}
code-text-javascript
huggingface.co
Updated Jul 18, 2023
Cite
Semeru Lab (2023). code-text-javascript [Dataset]. https://huggingface.co/datasets/semeru/code-text-javascript
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jul 18, 2023
Dataset authored and provided by
Semeru Lab
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset is imported from CodeXGLUE and pre-processed using their script.

Where to find in Semeru:

The dataset can be found at /nfs/semeru/semeru_datasets/code_xglue/code-to-text/javascript in Semeru

CodeXGLUE -- Code-To-Text Task Definition

The task is to generate natural language comments for a piece of code, evaluated by the smoothed BLEU-4 score.

Dataset

The dataset we use comes from CodeSearchNet and we filter the dataset as the following:… See the full description on the dataset page: https://huggingface.co/datasets/semeru/code-text-javascript.
Data from: Mining Rule Violations in JavaScript Code Snippets
data.niaid.nih.gov
explore.openaire.eu
Updated Jan 24, 2020
Cite
Smethurst, Guilherme (2020). Mining Rule Violations in JavaScript Code Snippets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_2593817
Dataset updated
Jan 24, 2020
Dataset provided by
Moraes, João Pedro
Bonifácio, Rodrigo
Smethurst, Guilherme
Ferreira Campos, Uriel
Pinto, Gustavo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Content of this repository

This repository contains the scripts and dataset for the MSR 2019 Mining Challenge.

GitHub repository with the software used: here.

Dataset

The dataset was retrieved using Google BigQuery and dumped to a CSV file for further processing. This original, untreated file is called jsanswers.csv and contains the following information:

1. The Id of the question (PostId)

2. The content (in this case, the code block)

3. The length of the code block

4. The line count of the code block

5. The score of the post

6. The title

A quick look at this file shows that a PostId can have multiple rows related to it; that is how multiple code blocks are saved in the database.

Filtered Dataset:

Extracting code from CSV

We used a Python script called "ExtractCodeFromCSV.py" to extract the code from the original CSV and merge all the code blocks into their respective JavaScript file, with the PostId as its name. This resulted in 336 thousand files.

Running ESLint

Because ESLint is single-threaded, running it on 336 thousand files took a huge toll on the machine, so we created a script to run it, named "ESlintRunnerScript.py". It splits the files into 20 evenly distributed parts and runs 20 ESLint processes to generate the reports, producing 20 JSON files.

Number of violations per rule

This information was extracted using the script named "parser.py", which generated the file "NumberofViolationsPerRule.csv" containing the number of violations per rule used in the linter configuration.

Number of violations per category

To produce relevant statistics about the dataset, we generated the number of violations per rule category as defined on the ESLint website; this information was extracted using the same "parser.py" script.

Individual reports

This information was extracted from the JSON reports; it is a CSV file with the PostId and violations per rule.

Rules

The file Rules with categories contains all the rules used and their categories.
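The two processing steps described above (splitting the files into evenly sized parts and counting violations per rule) can be sketched as follows. This is not the authors' "ESlintRunnerScript.py" or "parser.py", whose contents are not shown here; the function names are hypothetical, and the report layout assumed is the one produced by ESLint's JSON formatter (an array of per-file results, each with a `messages` list whose entries carry a `ruleId`).

```python
import json
from collections import Counter

def split_evenly(items, parts):
    """Split items into `parts` chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks

def violations_per_rule(report_json):
    """Count messages per ruleId in one ESLint JSON-formatter report."""
    counts = Counter()
    for file_result in json.loads(report_json):
        for message in file_result.get("messages", []):
            # ruleId is null for parse errors; group those under one key.
            counts[message.get("ruleId") or "<no rule>"] += 1
    return counts

# Hypothetical file names and report content, for illustration only.
files = [f"{post_id}.js" for post_id in range(45)]
print([len(chunk) for chunk in split_evenly(files, 20)])

report = json.dumps([
    {"filePath": "123.js", "messages": [
        {"ruleId": "no-undef"}, {"ruleId": "semi"}, {"ruleId": "no-undef"},
    ]},
    {"filePath": "456.js", "messages": [{"ruleId": None}]},
])
print(violations_per_rule(report))
```

Summing the counters from all 20 reports would give dataset-wide totals of the kind stored in "NumberofViolationsPerRule.csv".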
javascript-github-code
huggingface.co
Updated Dec 13, 2022
Cite
Angelica Chen (2022). javascript-github-code [Dataset]. https://huggingface.co/datasets/angie-chen55/javascript-github-code
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Dec 13, 2022
Authors
Angelica Chen
Description
angie-chen55/javascript-github-code dataset hosted on Hugging Face and contributed by the HF Datasets community
TFix's Code Patches Data Dataset
paperswithcode.com
opendatalab.com
Updated Jul 17, 2021
Cite
Berkay Berabi; Jingxuan He; Veselin Raychev; Martin Vechev (2021). TFix's Code Patches Data Dataset [Dataset]. https://paperswithcode.com/dataset/tfix-s-code-patch-data
Dataset updated
Jul 17, 2021
Authors
Berkay Berabi; Jingxuan He; Veselin Raychev; Martin Vechev
Description
The dataset contains more than 100k code patch pairs extracted from open source projects on GitHub. Each pair comes with the erroneous and the fixed version of the corresponding code snippet. Instead of the whole file, the code snippets are extracted to focus on the problematic region (error line + other lines around it). For each sample, the repository name, the commit id, and the file names are provided so that one can access the complete files in case of interest.

The dataset contains only JavaScript programs, and the errors are detected by the popular static code analyzer ESLint. The dataset can be used in the fields of program repair, code generation, bug finding, transfer learning, and many more fields related to machine learning for code.
Data from: Dynamic Security Analysis of JavaScript: Are We There Yet?:...
zenodo.org
bin
Updated Feb 3, 2025
Cite
Stefano Calzavara; Samuele Casarin; Riccardo Focardi (2025). Dynamic Security Analysis of JavaScript: Are We There Yet?: Dataset [Dataset]. http://doi.org/10.5281/zenodo.14774184
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.14774184
Dataset updated
Feb 3, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Stefano Calzavara; Samuele Casarin; Riccardo Focardi
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description

This dataset was employed to systematically evaluate dynamic security analysis tools for JavaScript. It includes compatibility data for deployed scripts, as well as various details about their execution on both the analysis tools and regular browsers. Data collection was conducted on the top 10k domains from the Tranco ranking generated on September 27, 2024.
Dataset of books called The joy of JavaScript
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called The joy of JavaScript [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=The+joy+of+JavaScript
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is The joy of JavaScript. It features 7 columns including author, publication date, language, and book publisher.
Dataset collected by JSIsolate
zenodo.org
zip
Updated Aug 26, 2021
Cite
Mingxue Zhang; Wei Meng (2021). Dataset collected by JSIsolate [Dataset]. http://doi.org/10.5281/zenodo.5242976
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.5242976
Dataset updated
Aug 26, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Mingxue Zhang; Wei Meng
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains: 1) the object access logs, 2) script isolation policies and 3) script write conflicts collected by JSIsolate on Alexa top 1K websites. We analyze the access logs to generate the conflict summary files and script isolation policies that assign static scripts to an execution context.

We split the whole dataset of object access logs into 10 subsets, i.e., access-0.zip ~ access-9.zip.

The isolation policies are released in url-level-policies.zip and domain-level-policies.zip.

The object accesses (i.e., reads and writes) are saved in [rank].[main/sub].[frame_cnt].access (e.g., 1.main.0.access) files.

The URLs of frames (i.e., main frames and iframes) are saved in [rank].[main/sub].[frame_cnt].frame (e.g., 1.main.0.frame) files.

The maps from script IDs to script URLs are saved in [rank].[main/sub].[frame_cnt].id2url (e.g., 1.main.0.id2url) files.

The maps from script IDs to their parent script (script that includes it,

The source code of scripts is saved in [rank].[main/sub].[frame_cnt].[script_ID].script (e.g., 1.main.0.17.script) files.

Note that we perform monkey testing during the data collection, which may cause the page to navigate to a different URL. Therefore, there could be multiple main frame files.

The conflicts are dumped to [rank].conflicts (e.g., 1.conflicts) files.

The isolation policies are dumped to [rank].configs (e.g., 1.configs) and [rank].configs-simple (e.g., 1.configs-simple) files.

Note that the *.configs files also include the read/write operations that cause JSIsolate to assign a script from third-party domain to the first-party context.
HumanEval-X Dataset
paperswithcode.com
opendatalab.com
Updated Mar 31, 2023
Cite
Qinkai Zheng; Xiao Xia; Xu Zou; Yuxiao Dong; Shan Wang; Yufei Xue; Zihan Wang; Lei Shen; Andi Wang; Yang Li; Teng Su; Zhilin Yang; Jie Tang (2023). HumanEval-X Dataset [Dataset]. https://paperswithcode.com/dataset/humaneval-x
Dataset updated
Mar 31, 2023
Authors
Qinkai Zheng; Xiao Xia; Xu Zou; Yuxiao Dong; Shan Wang; Yufei Xue; Zihan Wang; Lei Shen; Andi Wang; Yang Li; Teng Su; Zhilin Yang; Jie Tang
Description
HumanEval-X is a benchmark for evaluating the multilingual ability of code generative models. It consists of 820 high-quality human-crafted data samples (each with test cases) in Python, C++, Java, JavaScript, and Go, and can be used for various tasks, such as code generation and translation.
Data from: BreCaHAD: A Dataset for Breast Cancer Histopathological...
figshare.com
png
Updated Jan 28, 2019
Cite
Alper Aksac; Douglas J. Demetrick; Tansel Özyer; Reda Alhajj (2019). BreCaHAD: A Dataset for Breast Cancer Histopathological Annotation and Diagnosis [Dataset]. http://doi.org/10.6084/m9.figshare.7379186.v3
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.7379186.v3
Dataset updated
Jan 28, 2019
Dataset provided by
figshare
Authors
Alper Aksac; Douglas J. Demetrick; Tansel Özyer; Reda Alhajj
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset consists of 1 .xlsx file, 2 .png files, 1 .json file and 1 .zip file:

annotation_details.xlsx: The distribution of annotations in the previously mentioned six classes (mitosis, apoptosis, tumor nuclei, non-tumor nuclei, tubule, and non-tubule) is presented in an Excel spreadsheet.

original.png: The input image.

annotated.png: An example from the dataset. In the annotated image, blue circles indicate the tumor nuclei; pink circles show non-tumor nuclei such as blood cells, stroma nuclei, and lymphocytes; orange and green circles are mitosis and apoptosis, respectively; light blue circles are true lumen for tubules; and yellow circles represent white regions (non-lumen) such as fat, blood vessels, and broken tissues.

data.json: The annotations for the BreCaHAD dataset are provided in JSON (JavaScript Object Notation) format. In the given example, the JSON file (ground truth) contains two mitosis annotations and only one tumor nuclei annotation. Here, x and y are the coordinates of the centroid of the annotated object, and the values are between 0 and 1.

BreCaHAD.zip: An archive file containing the dataset. Three folders are included: images (original images), groundTruth (JSON files), and groundTruth_display (groundTruth applied on original images).
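Because the centroid coordinates are normalized to the range 0 to 1, they have to be scaled by the image size before they can be drawn on an image. The sketch below shows that conversion under an assumed JSON layout (class name mapped to a list of `{"x", "y"}` objects) and an arbitrary image size; the actual key names in data.json and the real image dimensions are not given in the description and may differ.

```python
import json

# Hypothetical ground-truth layout: class name -> list of normalized centroids.
SAMPLE = json.dumps({
    "mitosis": [{"x": 0.5, "y": 0.25}, {"x": 0.1, "y": 0.9}],
    "tumor": [{"x": 0.75, "y": 0.5}],
})

def to_pixels(annotation_json, width, height):
    """Map normalized (x, y) centroids to integer pixel coordinates."""
    pixels = {}
    for label, points in json.loads(annotation_json).items():
        pixels[label] = [
            (round(p["x"] * width), round(p["y"] * height)) for p in points
        ]
    return pixels

# The 1000 x 800 image size is an arbitrary value chosen for the example.
print(to_pixels(SAMPLE, width=1000, height=800))
```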
Dataset of books called D3.js 4.x data visualization : learn to visualize...
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called D3.js 4.x data visualization : learn to visualize your data with JavaScript [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=D3.js+4.x+data+visualization+%3A+learn+to+visualize+your+data+with+JavaScript
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is D3.js 4.x data visualization : learn to visualize your data with JavaScript. It features 7 columns including author, publication date, language, and book publisher.
CodeQA Dataset
paperswithcode.com
Updated Dec 29, 2023
Cite
Chenxiao Liu; Xiaojun Wan (2023). CodeQA Dataset [Dataset]. https://paperswithcode.com/dataset/codeqa
Dataset updated
Dec 29, 2023
Authors
Chenxiao Liu; Xiaojun Wan
Description
CodeQA is a free-form question answering dataset for the purpose of source code comprehension: given a code snippet and a question, a textual answer is required to be generated. CodeQA contains a Java dataset with 119,778 question-answer pairs and a Python dataset with 70,085 question-answer pairs.

Description from: CodeQA: A Question Answering Dataset for Source Code Comprehension
The Klarna Product-Page Dataset
data.niaid.nih.gov
researchdata.se
Updated Nov 7, 2024
Cite
Moradi, Aref (2024). The Klarna Product-Page Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_12605479
Dataset updated
Nov 7, 2024
Dataset provided by
Risuleo, Riccardo Sven
Magureanu, Stefan
Moradi, Aref
Lagergren, Jens
Hotti, Alexandra
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description

The Klarna Product Page Dataset is a dataset of publicly available pages corresponding to products sold online on various e-commerce websites. The dataset contains offline snapshots of 51,701 product pages collected from 8,175 distinct merchants across 8 different markets (US, GB, SE, NL, FI, NO, DE, AT) between 2018 and 2019. On each page, analysts labelled 5 elements of interest: the price of the product, its image, its name and the add-to-cart and go-to-cart buttons (if found). These labels are present in the HTML code as an attribute called klarna-ai-label taking one of the values: Price, Name, Main picture, Add to cart and Cart.
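To make the labelling scheme concrete, the sketch below collects every element that carries a `klarna-ai-label` attribute from a piece of HTML, using only Python's standard `html.parser` module. The class name `KlarnaLabelParser` and the HTML snippet are hypothetical, and the sketch assumes plain HTML as input; extracting the HTML part from an MHTML or WTL snapshot is a separate step that is not covered here.

```python
from html.parser import HTMLParser

class KlarnaLabelParser(HTMLParser):
    """Collect (label, tag) pairs for elements carrying klarna-ai-label."""

    def __init__(self):
        super().__init__()
        self.labels = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names.
        for name, value in attrs:
            if name == "klarna-ai-label":
                self.labels.append((value, tag))

# Hypothetical snippet, not taken from the dataset.
SNIPPET = """
<div>
  <h1 klarna-ai-label="Name">Blue mug</h1>
  <span klarna-ai-label="Price">$9.99</span>
  <img klarna-ai-label="Main picture" src="mug.png">
  <button klarna-ai-label="Add to cart">Add</button>
</div>
"""

parser = KlarnaLabelParser()
parser.feed(SNIPPET)
print(parser.labels)
```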

The snapshots are available in 3 formats: as MHTML files (~24GB), as WebTraversalLibrary (WTL) snapshots (~7.4GB), and as screenshots (~8.9GB). The MHTML format is less lossy: a browser can render these pages, though any JavaScript on the page is lost. The WTL snapshots are produced by loading the MHTML pages into a Chromium-based browser. To keep the WTL dataset compact, the screenshots of the rendered MHTML are provided separately; here we provide the HTML of the rendered DOM tree and additional page and element metadata with rendering information (bounding boxes of elements, font sizes etc.). The folder structure of the screenshot dataset is identical to that of the WTL dataset and can be used to complete the WTL snapshots with image information. For convenience, the datasets are provided with a train/test split in which no merchants in the test set are present in the training set.

Corresponding Publication

For more information about the contents of the datasets (statistics etc.) please refer to the following TMLR paper.

GitHub Repository

The code needed to re-run the experiments in the publication accompanying the dataset can be accessed here.

Citing

If you found this dataset useful in your research, please cite the paper as follows:

@article{hotti2024the,
  title={The Klarna Product Page Dataset: Web Element Nomination with Graph Neural Networks and Large Language Models},
  author={Alexandra Hotti and Riccardo Sven Risuleo and Stefan Magureanu and Aref Moradi and Jens Lagergren},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2024},
  url={https://openreview.net/forum?id=zz6FesdDbB},
  note={}
}

axay/javascript-dataset: 145 scholarly articles cite this dataset (per Google Scholar).
