http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
Distributed microservice-based applications are typically accessed via APIs, either through apps or directly by programmatic means. API access is often abused by attackers trying to exploit the business logic exposed by these APIs, and the way normal users access these APIs differs from how attackers access them. Many applications have hundreds of APIs that are called in a specific order, and depending on factors such as browser refreshes, session refreshes, network errors, or programmatic access, these behaviors are not static and can vary for the same user. API calls in long-running sessions form access graphs that need to be analysed in order to discover attack patterns and anomalies. Graphs do not lend themselves to numerical computation. We address this issue and provide a dataset where user access behavior is quantified as numerical features. In addition, we provide a dataset containing the raw API call graphs. To support the use of these datasets, two notebooks covering classification, node embeddings, and clustering are also provided.
There are four files provided: two in CSV format and two in JSON format. The CSV files contain user behavior graphs represented as behavior metrics; the JSON files contain the actual API call graphs. The two datasets can be joined on a key, so those who want to combine graphs with metrics can do so in novel ways.
This data set captures API access patterns in terms of behavior metrics. Behaviors are captured by tracking users' API call graphs which are then summarized in terms of metrics. In some sense a categorical sequence of entities has been reduced to numerical metrics.
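As an illustration of reducing a call graph to numerical metrics, the sketch below computes a few simple per-graph statistics from an edge list. The edges and metric names here are hypothetical stand-ins, not the dataset's actual feature set:

```python
from collections import defaultdict

def graph_metrics(edges):
    """Compute simple numerical metrics from a directed edge list."""
    out_deg = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        out_deg[src] += 1
        nodes.update((src, dst))
    n_nodes, n_edges = len(nodes), len(edges)
    return {
        "num_nodes": n_nodes,
        "num_edges": n_edges,
        "avg_out_degree": n_edges / n_nodes if n_nodes else 0.0,
        "max_out_degree": max(out_deg.values(), default=0),
    }

# Hypothetical API call graph observed in one session.
edges = [("/login", "/account"), ("/account", "/orders"), ("/account", "/profile")]
print(graph_metrics(edges))
```

Metrics of this kind turn a categorical sequence of API calls into a fixed-length numerical vector that standard models can consume.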
There are two files provided. The first, supervised_dataset.csv, has behaviors labeled as normal or outlier. The second, remaining_behavior_ext.csv, has a larger number of samples that are not labeled, but includes additional insights as well as a classification created by another algorithm.
Each row is one instance of an observed behavior that has been manually classified as normal or outlier.
There are two JSON files provided, corresponding to the two CSV files.
Each item has an _id field that can be used to join against the CSV datasets. Each item also contains the API behavior graph, represented as a list of edges.
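A minimal sketch of this join, using only the standard library on inline sample records. The _id key and edge-list shape follow the description above; the remaining column names are hypothetical:

```python
import csv, json, io

# Hypothetical CSV metrics keyed by _id (stands in for the CSV datasets).
csv_text = "_id,num_edges,label\nb1,3,normal\nb2,7,outlier\n"
# Hypothetical JSON items keyed by _id, each carrying a list of edges.
json_text = '[{"_id": "b1", "edges": [["/login", "/account"]]}, {"_id": "b2", "edges": []}]'

metrics = {row["_id"]: row for row in csv.DictReader(io.StringIO(csv_text))}
graphs = {item["_id"]: item["edges"] for item in json.loads(json_text)}

# Join the two datasets on _id, attaching each graph to its metrics row.
combined = {k: {**metrics[k], "edges": graphs[k]} for k in metrics.keys() & graphs.keys()}
print(combined["b1"])
```

The same pattern applies with pandas or a database, as long as _id is used as the join key.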
The task is classification on a label with a skewed distribution of normal and abnormal cases, and with very few labeled samples available. Use supervised_dataset.csv for the labeled samples and remaining_behavior_ext.csv for the larger unlabeled set.
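One common way to handle a skewed label distribution like this is class weighting. The sketch below uses synthetic stand-in features rather than the dataset's real columns:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic stand-in for behavior metrics: many "normal" (0), few "outlier" (1).
X = np.vstack([rng.normal(0, 1, (95, 4)), rng.normal(3, 1, (5, 4))])
y = np.array([0] * 95 + [1] * 5)

# class_weight="balanced" reweights classes inversely to their frequency,
# so the rare outlier class is not ignored during training.
clf = LogisticRegression(class_weight="balanced").fit(X, y)
preds = clf.predict(X)
print("outliers flagged:", int(preds.sum()))
```

With very few labeled samples, the unlabeled remaining_behavior_ext.csv could also feed semi-supervised or clustering approaches, as the accompanying notebooks suggest.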
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Number of Nights Classification
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This list contains the government API cases collected, cleaned and analysed in the APIs4DGov study "Web API landscape: relevant general purpose ICT standards, technical specifications and terms".
The list does not represent a complete list of all government cases in Europe, as it is built to support the goals of the study and is limited to the analysis and data gathered from the following sources:
The EU open data portal
The European data portal
The INSPIRE catalogue
JoinUp: The API cases collected from the European Commission JoinUp platform
Literature-document review: the API cases gathered from the research activities of the study performed through the end of 2019
ProgrammableWeb: the ProgrammableWeb API directory
Smart 2015/0041: the database of 395 cases created by the study 'Towards faster implementation and uptake of open government' (SMART 2015/0041).
Workshops/meetings/interviews: a list of API cases collected in the workshops, surveys and interviews organised within the APIs4DGov study
Each API case is classified according to the following rationale:
Unique id: a unique key of each case, obtained by concatenating the following fields: (Country Code) + (Governmental level) + (Name Id) + (Type of API)
API Country or type of provider: the country in which the API case has been published
API provider: the specific provider that published and maintains the API case
Name Id: an acronym of the name of the API case (it may not be unique)
Short description
Type of API: (i) API registry, a set, catalogue, registry or directory of APIs; (ii) API platform: a platform that supports the use of APIs; (iii) API tool: a tool used to manage APIs; (iv) API standard: a set of standards related to government APIs; (v) Data catalogue, an API published to access metadata of datasets, normally published by a data catalogue; (vi) Specific API, a unique (can have many endpoints) API built for a specific purpose
Number of APIs: normally one; in the case of an API registry, the number of APIs published by the registry as of 31/12/2019
Theme: list of domains related to the API case (controlled vocabulary)
Governmental level: the geographical scope of the API (city, regional, national or international)
Country code: the country's two-letter internal code
Source: the source (among those listed above) from which the API case was gathered
DomainIQ is a comprehensive global Domain Name dataset for organizations that want to build cyber security, data cleaning and email marketing applications. The dataset consists of the DNS records for over 267 million domains, updated daily, representing more than 90% of all public domains in the world.
The data is enriched by over thirty unique data points, including identifying the mailbox provider for each domain and using AI-based predictive analytics to identify elevated-risk domains from both a cyber security and an email sending reputation perspective.
DomainIQ from Datazag offers layered intelligence through a highly flexible API and as a dataset, available for both cloud and on-premises applications. Standard formats include CSV, JSON, Parquet, and DuckDB.
Custom options are available for any other file or database format. With daily updates and constant research from Datazag, organizations can develop their own market-leading cyber security, data cleaning and email marketing applications supported by comprehensive and accurate data. Dataset updates are available on a daily, weekly or monthly basis; API data is updated daily.
https://www.ine.es/aviso_legal
Table of INEBase Number of titles by Autonomous Communities and Autonomous Cities, subject categories and type of publication (simplified UNESCO classification). Autonomous Communities and Autonomous Cities where they were edited. Statistics on Book Publishing Production
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Issue tracking systems enable users and developers to comment on problems plaguing a software system. Empirical Software Engineering (ESE) researchers study (open-source) project issues and the comments and threads within to discover, among other things, challenges developers face when, e.g., incorporating new technologies, platforms, and programming language constructs. However, issue discussion threads accumulate over time and thus can become unwieldy, hindering any insight that researchers may gain. While existing approaches alleviate this burden by classifying issue thread comments, there is a gap between searching popular open-source software repositories (e.g., those on GitHub) for issues containing particular keywords and feeding the results into a classification model. In this paper, we demonstrate a research infrastructure tool called QuerTCI that bridges this gap by integrating the GitHub issue comment search API with the classification models found in existing approaches. Using queries, ESE researchers can retrieve GitHub issues containing particular keywords, e.g., those related to a certain programming language construct, and subsequently classify the kinds of discussions occurring in those issues. We hope that, using our tool, ESE researchers can uncover challenges related to particular technologies through popular open-source repositories more seamlessly than previously possible. A tool demonstration video may be found at: https://youtu.be/fADKSxn0QUk.
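The keyword search over GitHub issues that QuerTCI builds on can be sketched against GitHub's documented search endpoint. The query terms and repository below are hypothetical, and the request URL is only constructed, not sent:

```python
from urllib.parse import urlencode

def github_issue_search_url(keywords, repo=None):
    """Build a GitHub search-API URL for issues matching the given keywords."""
    q = " ".join(keywords) + " is:issue"
    if repo:
        q += f" repo:{repo}"
    return "https://api.github.com/search/issues?" + urlencode({"q": q})

# Hypothetical query: issues mentioning a particular language construct.
url = github_issue_search_url(["lambda", "NullPointerException"], repo="openjdk/jdk")
print(url)
```

The returned issues (and their comment threads) would then be fed to a comment-classification model, which is the gap the tool closes.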
Malware calls are classified and labeled '1' and benign software calls are labeled '0'. The calls are presented in sequential order. CSDM_API_Train.csv contains 388 logs. CSDM_API_TestData.csv contains 378 unclassified logs. CSDM_API_TestLable.csv contains the classifications for CSDM_API_TestData.csv. This data was collected by API monitors during a data mining competition at the International Conference on Neural Information Processing (ICONIP) in Sydney, Australia, in 2010.
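Because the calls are sequential, order-aware features are natural here. A minimal stdlib sketch (using synthetic call sequences, not the actual CSDM files) turns an API call sequence into bigram counts that a classifier could consume:

```python
from collections import Counter

def bigram_features(calls):
    """Count adjacent API-call pairs in a sequence of call names."""
    return Counter(zip(calls, calls[1:]))

# Synthetic example sequences: a benign-looking log and a malware-like log.
benign = ["OpenFile", "ReadFile", "CloseFile"]
malware = ["OpenProcess", "WriteProcessMemory", "CreateRemoteThread"]

print(bigram_features(benign))
```

Bigram (or higher n-gram) counts preserve local call order, which plain bag-of-calls features discard.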
[Metadata] This dataset contains those marine Water Quality Standards Classifications surrounding the main Hawaiian Islands as specified in DOH Administrative Rules, specifically Hawaii Administrative Rules Title 11, Department of Health, Chapter 54 Water Quality Standards (see p. 19). June 2024: Hawaii Statewide GIS Program staff removed extraneous fields that had been added as part of a 2016 GIS database conversion and were no longer needed. For additional information, please refer to metadata at https://files.hawaii.gov/dbedt/op/gis/data/water_qual_class.pdf or contact Hawaii Statewide GIS Program, Office of Planning, State of Hawaii; PO Box 2359, Honolulu, HI 96804; (808) 587-2846; email: gis@hawaii.gov; Website: https://planning.hawaii.gov/gis.
The Vermont Water Quality Standards (VTWQS) are rules intended to achieve the goals of the Vermont Surface Water Strategy, as well as the objective of the federal Clean Water Act, which is to restore and maintain the chemical, physical, and biological integrity of the Nation's waters. The classification of waters is included in the VTWQS. The classification of all waters has been established by a combination of legislative acts and by classification or reclassification decisions issued by the Water Resources Board or Secretary pursuant to 10 V.S.A. § 1253. Those waters reclassified by the Secretary to Class A(1), A(2), or B(1) for any use shall include all waters within the entire watershed of the reclassified waters unless expressly provided otherwise in the rule. All waters above 2,500 feet altitude, National Geodetic Vertical Datum, are designated Class A(1) for all uses, unless specifically designated Class A(2) for use as a public water source. All waters at or below 2,500 feet altitude, National Geodetic Vertical Datum, are designated Class B(2) for all uses, unless specifically designated as Class A(1), A(2), or B(1) for any use.
The Census data API provides access to the most comprehensive set of data on current month and cumulative year-to-date exports using the North American Industry Classification System (NAICS). The NAICS endpoint in the Census data API also provides value, shipping weight, and method of transportation totals at the district level for all U.S. trading partners. The Census data API will help users research new markets for their products, establish pricing structures for potential export markets, and conduct economic planning. If you have any questions regarding U.S. international trade data, please call us at 1(800)549-0595 option #4 or email us at eid.international.trade.data@census.gov.
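A hedged sketch of querying the NAICS exports endpoint follows. The endpoint path and variable names below are assumptions based on the Census data API's usual URL pattern (base URL, a `get` list of variables, and filter parameters); the URL is only constructed here, not requested:

```python
from urllib.parse import urlencode

# Assumed endpoint path for the NAICS exports time series.
BASE = "https://api.census.gov/data/timeseries/intltrade/exports/naics"

def census_naics_url(variables, time, **filters):
    """Build a Census data API query URL for the NAICS exports endpoint."""
    params = {"get": ",".join(variables), "time": time, **filters}
    return BASE + "?" + urlencode(params)

# Hypothetical query: monthly export value by NAICS code for March 2024.
url = census_naics_url(["NAICS", "ALL_VAL_MO"], time="2024-03")
print(url)
```

The API returns JSON arrays (a header row followed by data rows), so the response can be loaded directly into a table.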
This dataset was created by swagatron
Table of INEBase Number of copies (Books and leaflets) (thousands) by subject categories (UNESCO classification) and period. National. Statistics on Book Publishing Production
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for andrewmonostate/test-api-auth-real
Dataset Description
This dataset was generated using Vibe Data Director, a tool for creating and curating text classification datasets.
Dataset Summary
Session ID: session_39190326
Generated: 2025-08-22T23:32:41.745101
Total Samples: 2
Classes: test
Styles: none
Dataset Structure
Data Fields
text (string): The text content of the sample
class (string): The classification label
style… See the full description on the dataset page: https://huggingface.co/datasets/andrewmonostate/test-api-auth-real.
https://www.wiseguyreports.com/pages/privacy-policy
BASE YEAR | 2024 |
HISTORICAL DATA | 2019 - 2024 |
REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
MARKET SIZE 2023 | 53.27 (USD Billion) |
MARKET SIZE 2024 | 57.35 (USD Billion) |
MARKET SIZE 2032 | 103.46 (USD Billion) |
SEGMENTS COVERED | Engine Type, API Classification, Viscosity Grade, Application, Technology, Regional |
COUNTRIES COVERED | North America, Europe, APAC, South America, MEA |
KEY MARKET DYNAMICS | Growing demand for energy-efficient vehicles; stringent emission regulations; technological advancements; rising preference for high-performance vehicles; expansion of the automotive industry |
MARKET FORECAST UNITS | USD Billion |
KEY COMPANIES PROFILED | Pennzoil, ExxonMobil, Fuchs, Castrol, Motul, Mobil 1, Shell, Valvoline, Liqui Moly, Idemitsu, Chevron, Eneos, BP, Amsoil, Total |
MARKET FORECAST PERIOD | 2025 - 2032 |
KEY MARKET OPPORTUNITIES | 1. Growing demand for fuel-efficient vehicles; 2. Increasing environmental regulations; 3. Rising disposable income; 4. Technological advancements in synthetic motor oils; 5. Expansion of the automotive aftermarket |
COMPOUND ANNUAL GROWTH RATE (CAGR) | 7.65% (2025 - 2032) |
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Full list available in additional materials.
Classifying trees from point cloud data is useful in applications such as high-quality 3D basemap creation, urban planning, and forestry workflows. Trees have a complex geometrical structure that is hard to capture using traditional means. Deep learning models are highly capable of learning these complex structures and giving superior results.

Using the model
Follow the guide to use the model. The model can be used with the 3D Basemaps solution and ArcGIS Pro's Classify Point Cloud Using Trained Model tool. Before using this model, ensure that the supported deep learning framework libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.

Input
The model accepts unclassified point clouds with the attributes: X, Y, Z, and Number of Returns.
Note: This model is trained to work on unclassified point clouds that are in a projected coordinate system, where the units of X, Y, and Z are based on the metric system of measurement. If the dataset is in degrees or feet, it needs to be re-projected accordingly. The provided deep learning model was trained using a training dataset with the full set of points. Therefore, it is important to make the full set of points available to the neural network while predicting, allowing it to better discriminate points of the class of interest versus background points. It is recommended to use the selective/target classification and class preservation functionalities during prediction to have better control over the classification. This model was trained on airborne lidar datasets and is expected to perform best with similar datasets. Classification of terrestrial point cloud datasets may work but has not been validated. For such cases, this pre-trained model may be fine-tuned to save on cost, time and compute resources while improving accuracy. When fine-tuning this model, the target training data characteristics such as class structure, maximum number of points per block, and extra attributes should match those of the data originally used for training this model (see the Training data section below).

Output
The model will classify the point cloud into the following 2 classes, with their meaning as defined by the American Society for Photogrammetry and Remote Sensing (ASPRS):
0: Background
5: Trees / High-vegetation

Applicable geographies
This model is expected to work well in all regions globally, with the exception of mountainous regions. However, results can vary for datasets that are statistically dissimilar to the training data.

Model architecture
This model uses the PointCNN model architecture implemented in ArcGIS API for Python.

Accuracy metrics
The table below summarizes the accuracy of the predictions on the validation dataset.
Class | Precision | Recall | F1-score
Trees / High-vegetation (5) | 0.975374 | 0.965929 | 0.970628

Training data
This model is trained on a subset of the UK Environment Agency's open dataset. The training data used has the following characteristics:
X, Y and Z linear unit: meter
Z range: -19.29 m to 314.23 m
Number of Returns: 1 to 5
Intensity: 1 to 4092
Point spacing: 0.6 ± 0.3
Scan angle: -23 to +23
Maximum points per block: 8192
Extra attributes: Number of Returns
Class structure: [0, 5]

Sample results
Here are a few results from the model.
Roadway Functional Classification consists of linear features which specifically show the functional classification of public roadways in the State of Maryland. Roadway Functional Classification is defined as the role each roadway plays in moving vehicles throughout a network of highways. Roadway Functional Classification is primarily used for general planning purposes, and for Federal Highway Administration (FHWA) Highway Performance Monitoring System (HPMS) annual submission and coordination. The Maryland Department of Transportation State Highway Administration (MDOT SHA) currently reports this data only on the inventory direction (generally North or East) side of the roadway. Roadway Functional Classification data is not a complete representation of all roadway geometry. Maryland's roadway system is a vast network that connects places and people within and across county borders. Planners and engineers have developed elements of this network with particular travel objectives in mind. These objectives range from serving long-distance passenger and freight needs to serving neighborhood travel from residential developments to nearby shopping centers. The functional classification of roadways defines the role each element of the roadway network plays in serving these travel needs. Over the years, functional classification has come to assume additional significance beyond its purpose as a framework for identifying the particular role of a roadway in moving vehicles through a network of highways. Functional classification carries with it expectations about roadway design, including its speed, capacity and relationship to existing and future land use development. Federal legislation continues to use functional classification in determining eligibility for funding under the Federal-aid program. Transportation agencies describe roadway system performance, benchmarks and targets by functional classification.
As agencies continue to move towards a more performance-based management approach, functional classification will be an increasingly important consideration in setting expectations and measuring outcomes for preservation, mobility and safety. Roadway Functional Classification data is developed as part of the Highway Performance Monitoring System (HPMS), which maintains and reports transportation-related information to the Federal Highway Administration (FHWA) on an annual basis. HPMS is maintained by the Maryland Department of Transportation State Highway Administration (MDOT SHA), under the Office of Planning and Preliminary Engineering (OPPE) Data Services Division (DSD). This data is used by various business units throughout MDOT, as well as many other Federal, State and local government agencies. Roadway Functional Classification data is key to understanding the role each roadway plays in moving vehicles throughout Maryland's network of highways. Roadway Functional Classification data is updated and published on an annual basis for the prior year. This data is for the year 2017. View the most current Roadway Functional Classification data in the MDOT SHA Roadway Functional Classes Application.
For additional information, contact the MDOT SHA Geospatial Technologies team. Email: GIS@mdot.state.md.us
For additional information related to the Maryland Department of Transportation (MDOT): https://www.mdot.maryland.gov/
For additional information related to the Maryland Department of Transportation State Highway Administration (MDOT SHA): https://roads.maryland.gov/Home.aspx
MDOT SHA Geospatial Data Legal Disclaimer: The Maryland Department of Transportation State Highway Administration (MDOT SHA) makes no warranty, expressed or implied, as to the use or appropriateness of geospatial data, and there are no warranties of merchantability or fitness for a particular purpose or use.
The information contained in geospatial data is from publicly available sources, but no representation is made as to the accuracy or completeness of geospatial data. MDOT SHA shall not be subject to liability for human error, error due to software conversion, defect, or failure of machines, or any material used in connection with the machines, including tapes, disks, CD-ROMs or DVD-ROMs and energy. MDOT SHA shall not be liable for any lost profits, consequential damages, or claims against MDOT SHA by third parties.
This is an MD iMAP hosted service layer. Find more information at https://imap.maryland.gov.
Map Service Link: https://mdgeodata.md.gov/imap/rest/services/Transportation/MD_HighwayPerformanceMonitoringSystem/MapServer/2
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Number of labels returned.
Land cover describes the surface of the earth. Land cover maps are useful in urban planning, resource management, change detection, agriculture, and a variety of other applications in which information related to the earth's surface is required. Land cover classification is a complex exercise and is hard to capture using traditional means. Deep learning models are highly capable of learning these complex semantics and can produce superior results.

Using the model
Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.

Fine-tuning the model
This model can be fine-tuned using the Train Deep Learning Model tool. Follow the guide to fine-tune this model.

Input
8-bit, 3-band high-resolution (80 - 100 cm) imagery.

Output
Classified raster with the same classes as in the Chesapeake Bay Landcover dataset (2013/2014). By default, the output raster contains 9 classes. A simpler classification with 6 classes can be performed by setting the 'detailed_classes' model argument to false.
Note: The output classified raster will not contain the 'Aberdeen Proving Ground' class. Find class descriptions here.

Applicable geographies
This model is applicable in the United States and is expected to produce best results in the Chesapeake Bay Region.

Model architecture
This model uses the UNet model architecture implemented in ArcGIS API for Python.

Accuracy metrics
This model has an overall accuracy of 86.5% for classification into 9 land cover classes and 87.86% for 6 classes.
The table below summarizes the precision, recall and F1-score of the model on the validation dataset, for classification into 9 land cover classes:
Class | Precision | Recall | F1 Score
Water | 0.93614 | 0.93046 | 0.93329
Wetlands | 0.81659 | 0.75905 | 0.78677
Tree Canopy | 0.90477 | 0.93143 | 0.91791
Shrubland | 0.51625 | 0.18643 | 0.27394
Low Vegetation | 0.85977 | 0.86676 | 0.86325
Barren | 0.67165 | 0.50922 | 0.57927
Structures | 0.8051 | 0.84887 | 0.82641
Impervious Surfaces | 0.73532 | 0.68556 | 0.70957
Impervious Roads | 0.76281 | 0.81238 | 0.78682

The table below summarizes the precision, recall and F1-score of the model on the validation dataset, for classification into 6 land cover classes:
Class | Precision | Recall | F1 Score
Water | 0.95 | 0.94 | 0.95
Tree Canopy and Shrubs | 0.91 | 0.92 | 0.92
Low Vegetation | 0.85 | 0.85 | 0.85
Barren | 0.79 | 0.69 | 0.74
Impervious Surfaces | 0.84 | 0.84 | 0.84
Impervious Roads | 0.82 | 0.83 | 0.82

Training data
This model has been trained on the Chesapeake Bay high-resolution 2013/2014 NAIP Landcover dataset (produced by Chesapeake Conservancy with their partners University of Vermont Spatial Analysis Lab (UVM SAL) and Worldview Solutions, Inc. (WSI)) and other high-resolution imagery. Find more information about the dataset here.

Sample results
Here are a few results from the model.
Functional Classification, Streets. Adopted 5/28/2015 by Bill No. 12-15. Item Updated: 11-14-2019 12:58 PM