CC0 1.0 https://creativecommons.org/publicdomain/zero/1.0/
This dataset was generated to pull public repository attributes for the top 10,000 most popular NPM packages for analysis. The initial goal was numeric prediction and nominal classification around security advisories, to identify key attributes for consideration in just-in-time supply chain analysis of third-party dependencies. Ultimately, there were no key public attributes that were statistically significant enough to build confidence in a package's overall security posture.
The data was generated using various APIs and then aggregated, including Deps.dev - OSI, GitHub API, and npms.io. Each API was chosen for its data consistency and reliability, and because its data was fed downstream from overlapping sources. Here is how the scripts put this together.
The dataset represents a point in time of the security advisories and is not a complete picture of the security health of the overall project. Furthermore, only 13% of the entire dataset has an advisory. As such, it is difficult to draw any accurate conclusions, and the data correlates poorly.
I think this set serves as a good starting point for others who are interested in deriving more information about repositories, such as whether other attributes in the dataset correlate or whether additional APIs could bring more attributes into the fold. For example, topic tagging would be ideal. Potential problems in the future might be around assessing project health and DevOps related to repository popularity.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset for use in Node Code Query, contains package information in a tab separated csv file. The unzipped size is ~700MB.
You do not need to manually download this file for use in NCQ, the setup scripts will handle this for you automatically.
The dataset contains the following fields:
Mined from the NPM registry:
Package name
Description
Keywords
License
repositoryUrl
timeModified
Derived from data on the NPM registry:
Array of Node.js code snippets extracted from the package README using https://github.com/Brittany-Reid/npm-code-snippets
Number of markdown code blocks in the README (this may exceed the number of Node.js snippets, because these blocks are unfiltered)
Number of lines in the README
If an install example exists in the README (if a code block exists with npm install or an install header exists)
If a run example exists in the README (if a code block exists with npm run or a usage header exists)
Mined from GitHub for packages with a GitHub repository (values will be 0 or false for packages missing this data)
Number of stars
Is a fork?
Number of forks
Number of watchers
If a test directory exists (if the top level directory contains a folder called test or tests)
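The README-derived fields above can be approximated with a few regular expressions. The following is a minimal sketch, not the dataset authors' actual extraction code: the function name `readme_features`, the returned key names, and the exact matching rules (substring checks for "npm install" and "npm run" in fenced blocks, and for "install" and "usage" in headers) are assumptions made for illustration.

```python
import re

# Fence marker built from single backticks so the sketch is easy to embed.
TICKS = "`" * 3

# A fenced block: an opening fence line, the body, then a closing fence line.
FENCE = re.compile(
    rf"^{TICKS}[^\n]*\n(.*?)^{TICKS}[ \t]*$", re.MULTILINE | re.DOTALL
)
# An ATX-style markdown header such as "## Install".
HEADER = re.compile(r"^#{1,6}\s+(.*)$", re.MULTILINE)


def readme_features(readme: str) -> dict:
    """Compute README-derived fields (hypothetical key names)."""
    blocks = [m.group(1) for m in FENCE.finditer(readme)]
    # Remove code blocks first so shell comments are not mistaken for headers.
    prose = FENCE.sub("", readme)
    headers = [h.lower() for h in HEADER.findall(prose)]
    return {
        "num_code_blocks": len(blocks),
        "num_lines": len(readme.splitlines()),
        "has_install_example": any("npm install" in b for b in blocks)
        or any("install" in h for h in headers),
        "has_run_example": any("npm run" in b for b in blocks)
        or any("usage" in h for h in headers),
    }
```

Note that the count here covers all fenced blocks regardless of language, which matches the unfiltered block count described above rather than the filtered Node.js snippet array.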
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
signal data
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Open-source Dataset
This dataset contains the NPM packages that we built using our tool-chain. It consists of the diffoscope outputs, the versions built by our tool-chain, and the pre-built packages present on the npmjs registry.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This zip file contains a JSON file of package and repository information for 12 million NPM packages, sourced from Libraries.io (10.5281/zenodo.3626071, https://libraries.io/data, January 12, 2020). The file is 2.39GB uncompressed.
The file was generated from the 'Projects with related Repository fields' csv file, filtering for NPM packages only. The original file has inconsistent columns between the first row, which contains labels, and the subsequent rows, which contain data. However, the additional, unlabelled column is empty for all NPM packages, so it has been ignored and the values in the subsequent rows shifted into their correct positions.
The dataset has not otherwise been modified.
Includes data from Libraries.io.
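The column realignment described above could look roughly like the sketch below. It is one reading of the description, not the script used to build the file: the function name `realign_rows` is hypothetical, the position of the unlabelled column (`extra_index`) is not stated in the description and must be supplied by the caller, and the `Platform` field name and `NPM` value are assumptions about the source csv.

```python
import csv


def realign_rows(lines, extra_index, platform="NPM", platform_field="Platform"):
    """Yield one dict per matching row, with values aligned to the header labels.

    lines: any iterable of csv lines (an open file, a list of strings, ...).
    extra_index: position of the unlabelled extra column (an assumption;
                 the dataset description does not say where it is).
    """
    reader = csv.reader(lines)
    header = next(reader)
    for row in reader:
        extra = ""
        if len(row) == len(header) + 1:
            # Drop the unlabelled field so later values line up with their labels.
            extra = row.pop(extra_index)
        record = dict(zip(header, row))
        if record.get(platform_field) != platform:
            continue
        if extra:
            # The description says this column is empty for all NPM packages.
            raise ValueError("unlabelled column is unexpectedly non-empty")
        yield record
```

Each yielded record can then be serialised with the standard `json` module. Raising on a non-empty extra field keeps the sketch from silently discarding data if the stated assumption does not hold.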
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data contains the results of best practice violations in NodeJS projects
An NPM package for getting data of the Lëtzebuerger Online Dictionnaire (LOD) from data.public.lu. Repo on GitHub: https://github.com/robertoentringer/lod-opendata npm package: https://www.npmjs.com/package/lod-opendata
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Developers often share their code snippets by packaging them and making them available to others through software packages. How much a package does and how big it is can be seen as positive or negative. Recent studies showed that many packages that exist in the npm ecosystem are trivial and may introduce high dependency overhead.
Hence, one question that arises is why developers choose to publish these trivial packages. Therefore, in this paper, we perform a developer-centered study to empirically examine why developers choose to publish such trivial packages. Specifically, we ask 1) why developers publish trivial packages, 2) what they believe to be the possible negative impacts of these packages, and 3) how such negative issues can be mitigated. The survey response of 59 JavaScript developers who publish trivial npm packages showed that the main reasons for publishing these trivial packages are to provide reusable components, testing & documentation, and separation of concerns. Even the developers who publish these trivial packages admitted to having issues when they publish such packages, which include the maintenance of multiple packages, dependency hell, finding the right package, and the increase of duplicated packages in the ecosystems. Furthermore, we found that the majority of the developers suggested grouping these trivial packages to cope with the problems associated with publishing them. Then, to quantitatively investigate the impact of these trivial packages on the npm ecosystem and its users, we examine grouping these trivial packages. We found that if trivial packages that are always used together are grouped, the ecosystem can reduce the number of dependencies by approximately 13%. Our findings shed light on the impact of publishing trivial packages and show that ecosystems and developer communities need to rethink their publishing policies since it can negatively impact the developers and the entire ecosystem.
The published data set contains the following:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Nowadays, developing software would be unthinkable without the use of third-party packages. Although such code reuse helps to achieve rapid continuous delivery of software to end-users, blindly reusing code has its pitfalls. For example, prior work has investigated the rationale for using packages that implement simple functionalities, known as trivial packages. This prior work showed that although these trivial packages were simple, they were popular and prevalent in the npm ecosystem. This popularity and prevalence of trivial packages piqued our interest in questioning the ‘triviality of trivial packages’.
To better understand and examine the triviality of trivial packages, we mine a large set of JavaScript projects that use trivial npm packages and evaluate their relative centrality. Specifically, we evaluate the triviality from two complementary points of view: based on project usage and ecosystem usage of these trivial packages. Our results show that trivial packages are being used in central JavaScript files of a software project. Additionally, by analyzing all external package API calls in these JavaScript files, we found that a high percentage of these API calls are attributed to trivial packages. Therefore, these packages play a significant role in JavaScript files. Furthermore, in the package dependency network, we observed that 16.8% of packages are trivial and in some cases removing a trivial package can impact approximately 29% of the ecosystem. Overall, our findings indicate that although smaller in size and complexity, trivial packages are highly depended upon by JavaScript projects. Additionally, our study shows that although they might be called trivial, nothing about trivial packages is trivial.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This package contains the material used in the manuscript.
For more information on how to understand the package structure, please read the README.md.
Following an outbreak of violence on 25 August 2017 in Rakhine State, Myanmar, a new massive influx of Rohingya refugees to Cox’s Bazar, Bangladesh started in late August 2017. Most of the Rohingya refugees settled in Ukhia and Teknaf Upazilas of Cox’s Bazar, a district bordering Myanmar identified as the main entry area for border crossings.
These datasets present the result of the NPM Round 10 Baseline and Site Assessment exercises, which collected information related to the Rohingya population distribution and needs during the months of April and May 2018.
The data collection for the NPM Baseline survey was conducted between 1 and 17 April 2018 and provides an update about the population distribution and movements. The data collection for the NPM Site Assessment survey was conducted between 1 and 20 May 2018; in addition to an update about the population figures, it includes a multi-sectoral needs assessment.
The full maps and GIS packages by camp produced based on NPM Baseline and Site Assessment 10 are available at the links below:
Rohingya refugee population distribution by para in Teknaf upazila. Data collected during NPM Site Assessment 10 between 1 and 20 May 2018.
Bangladesh
Observation data/ratings [obs]
Access updated Panasonic Npm import data India with HS Code, price, importers list, Indian ports, exporting countries, and verified Panasonic Npm buyers in India.
Npm Impex Solution Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Subscribers can find out export and import data of 23 countries by HS code or product’s name. This demo is helpful for market analysis.
Npm Process Equipments Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
This dataset contains the predicted prices of the asset $NPM Hack Threatens JavaScript Ecosystem over the next 16 years. This data is calculated initially using a default 5 percent annual growth rate, and after page load, it features a sliding scale component where the user can then further adjust the growth rate to their own positive or negative projections. The maximum positive adjustable growth rate is 100 percent, and the minimum adjustable growth rate is -100 percent.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Following an outbreak of violence on 25 August 2017 in Rakhine State, Myanmar, a new massive influx of Rohingya refugees to Cox’s Bazar, Bangladesh started in late August 2017. Most of the Rohingya refugees settled in Ukhia and Teknaf Upazilas of Cox’s Bazar, a district bordering Myanmar identified as the main entry area for border crossings.
This dataset presents the result of the NPM Round 11 exercise, which collected information related to the Rohingya refugee population distribution and needs during the months of June and July 2018.
The full maps and GIS packages by camp produced based on NPM Baseline and Site Assessment 11 are available at the links below:
Rohingya refugee population distribution by para in Teknaf upazila.
Npm C And O Europe Ou Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Blockchain data query: Aerodrome npm
Npm Silmet O Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.