Mapping incident locations from a CSV file in a web map (YouTube video).
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
List of valid web domain names collected by the (bulk) crawling bots (stage-1 bots) running on varocarbas.com.
These bots perform a blind recursive analysis of links, based on the "everything is connected" idea: they start from a given webpage and are expected to retrieve a significant proportion of all the existing domain names.
Instructions on how to create a layer of recent earthquakes in a Web Map from a CSV file downloaded from GNS Science's GeoNet website. The CSV file must contain latitude and longitude fields for the earthquake locations so that it can be added to a Web Map as a point layer. This document is designed to support the Natural Hazards - Earthquakes story map.
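As a rough illustration of that requirement, the following Python/pandas sketch (the file name and exact column spellings are assumptions, not taken from the guide) checks that a downloaded GeoNet CSV has the latitude and longitude fields needed for a point layer:

    import pandas as pd

    # Hypothetical GeoNet export; the real column names may differ (e.g. "latitude"/"longitude")
    quakes = pd.read_csv("earthquakes.csv")

    required = {"latitude", "longitude"}
    missing = required - {c.lower() for c in quakes.columns}
    if missing:
        raise ValueError(f"CSV cannot be added as a point layer; missing fields: {missing}")
    print(f"{len(quakes)} earthquakes ready to be added to the Web Map as a point layer")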
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Semicolon-delimited text file equivalent of the Rdata file. See the Rdata file for a description of the data in each column.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the results of the interlinking process between selected CSV datasets harvested by the European Data Portal and the DBpedia knowledge graph.
We aim to answer the following questions:
What are the most popular column types? This will provide insight into what the datasets hold and how they can be joined. It will also provide insight into which specific linking schemes could be applied in the future.
What datasets have columns of the same type? This will suggest datasets that may be similar or related.
What entities appear in most datasets (co-referent entities)? This will suggest entities for which more data is published.
What datasets share a particular entity? This will suggest datasets that may be joined, or are related through that particular entity.
Results are provided as augmented tables that contain the columns of the original CSV, plus a metadata file in JSON-LD format. The metadata files can be loaded into an RDF store and queried.
Refer to the accompanying report of activities for more details on the methodology and how to query the dataset.
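For example, a minimal sketch of that workflow with rdflib in Python (the metadata file name is a placeholder; the actual vocabulary is described in the accompanying report):

    from rdflib import Graph

    # Load one of the JSON-LD metadata files into an in-memory RDF store
    g = Graph()
    g.parse("metadata.jsonld", format="json-ld")

    # List every predicate used, as a first look at how the columns were annotated
    for row in g.query("SELECT DISTINCT ?p WHERE { ?s ?p ?o }"):
        print(row.p)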
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset is a set of network traffic traces in pcap/csv format captured from a single user. The traffic is classified into 5 different activities (Video, Bulk, Idle, Web, and Interactive), and the label is shown in the filename. There is also a file (mapping.csv) with the mapping between the host's IP address, the csv/pcap filename, and the activity label.
Activities:
Interactive: applications that perform real-time interactions to provide a suitable user experience, such as editing a file in Google Docs and remote CLI sessions over SSH.
Bulk data transfer: applications that transfer large-volume files over the network. Examples are SCP/FTP applications and direct downloads of large files from web servers such as Mediafire, Dropbox, or the university repository, among others.
Web browsing: all the traffic generated while searching and consuming different web pages. Examples of those pages are several blogs and news sites and the university Moodle.
Video playback: traffic from applications that consume video in streaming or pseudo-streaming. The best-known servers used are Twitch and YouTube, but the university online classroom has also been used.
Idle behaviour: the background traffic generated by the user's computer when the user is idle. This traffic has been captured with every application closed and with some open pages such as Google Docs, YouTube, and several web pages, but always without user interaction.
The capture is performed on a network probe attached to the router that forwards the user's network traffic, using a SPAN port. The traffic is stored in pcap format with the full packet payload. In the csv files, every non-TCP/UDP packet is filtered out, as well as every packet with no payload. The fields in the csv files are the following (one line per packet): timestamp, protocol, payload size, source and destination IP address, and source and destination UDP/TCP port. The fields are also included as a header in every csv file.
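A minimal sketch of reading one of those CSV traces with pandas (the file name and the exact header spellings below are assumptions based on the field list above):

    import pandas as pd

    # One line per packet; header names are assumed from the description above
    pkts = pd.read_csv("video_capture_01.csv")

    # Per-protocol packet counts and total payload bytes, as a quick sanity check
    print(pkts.groupby("protocol")[["payload_size"]].agg(["count", "sum"]))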
The amount of data is stated as follows:
Bulk: 19 traces, 3599 s of total duration, 8704 MBytes of pcap files
Video: 23 traces, 4496 s, 1405 MBytes
Web: 23 traces, 4203 s, 148 MBytes
Interactive: 42 traces, 8934 s, 30.5 MBytes
Idle: 52 traces, 6341 s, 0.69 MBytes
The code of our machine learning approach is also included. There is a README.txt file with the documentation of how to use the code.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A selection of analytics metrics for the data.gov.au service. Starting from January 2015 these metrics are aggregated by month and include;
If you have suggestions for additional analytics please send an email to data@pmc.gov.au for consideration.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset includes data from various Decentralized Autonomous Organization (DAO) platforms, namely Aragon, DAOHaus, DAOstack, Realms, Snapshot, and Tally. DAOs are a new form of self-governed online communities deployed on the blockchain. DAO members typically use governance tokens to participate in the DAO decision-making process, often through a voting system where members submit proposals and vote on them.
The description of the methods used for the generation of data, for processing it and the quality-assurance procedures performed on the data can be found here:
https://doi.org/10.1145/3589335.3651481
The dataset comprises three CSV files: deployments.csv, proposals.csv, and votes.csv, each containing essential information regarding DAO deployments, their proposals, and the corresponding votes.
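For illustration, a hedged pandas sketch of combining the three files (the join key proposal_id is an assumption; the real column names are documented in the paper linked above):

    import pandas as pd

    deployments = pd.read_csv("deployments.csv")
    proposals = pd.read_csv("proposals.csv")
    votes = pd.read_csv("votes.csv")

    # Hypothetical join: attach each vote to its proposal, then count votes per proposal
    votes_per_proposal = (votes.merge(proposals, on="proposal_id", how="left")
                               .groupby("proposal_id").size())
    print(votes_per_proposal.describe())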
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A selection of analytics metrics for the NationalMap service. Starting from September 2015 these metrics are aggregated by month and include;
If you have suggestions for additional analytics please send an email to data@pmc.gov.au for consideration.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This folder contains the annotated corpus in CSV format, organized as follows:
* page-data.csv: contains all the annotated web pages with their HTML content.
* document-data.csv: contains the documents extracted from the web pages, where each document contains a single paragraph and has a set of related tables.
* table-data.csv: contains the tables related to each document. It also contains the HTML content of the table extracted from the web page.
* mention-data.csv: contains all the quantity mentions with ground-truth mapping extracted from the documents.
* mention_table-data.csv: contains the related table for each mention.
* annotations-GT.csv: contains the collected ground-truth annotations.
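As a sketch of how the files relate, the following pandas snippet joins the mention, mention-table, and document files (the id columns used for the joins, mention_id and document_id, are hypothetical; adjust them to the actual headers):

    import pandas as pd

    documents = pd.read_csv("document-data.csv")
    mentions = pd.read_csv("mention-data.csv")
    mention_tables = pd.read_csv("mention_table-data.csv")

    # Hypothetical keys: link each quantity mention to its related table and source document
    linked = (mentions.merge(mention_tables, on="mention_id", how="left")
                      .merge(documents, on="document_id", how="left"))
    print(linked.head())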
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The MaRV dataset consists of 693 manually evaluated code pairs extracted from 126 GitHub Java repositories, covering four types of refactoring. The dataset also includes metadata describing the refactored elements. Each code pair was assessed by two reviewers selected from a pool of 40 participants. The MaRV dataset is continuously evolving and is supported by a web-based tool for evaluating refactoring representations. This dataset aims to enhance the accuracy and reliability of state-of-the-art models in refactoring tasks, such as refactoring candidate identification and code generation, by providing high-quality annotated data.
Our dataset is located at the path dataset/MaRV.json
The guidelines for replicating the study are provided below:
Dependencies are listed in requirements.txt. Create a .env file based on .env.example in the src folder and set the variables:
CSV_PATH: Path to the CSV file containing the list of repositories to be processed.
CLONE_DIR: Directory where repositories will be cloned.
JAVA_PATH: Path to the Java executable.
REFACTORING_MINER_PATH: Path to RefactoringMiner.
Install the dependencies with pip install -r requirements.txt.
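For illustration only, a filled-in .env might look like the following (all paths are placeholders, not part of the replication package):

    CSV_PATH=/home/user/marv/repositories.csv
    CLONE_DIR=/home/user/marv/clones
    JAVA_PATH=/usr/bin/java
    REFACTORING_MINER_PATH=/opt/RefactoringMiner/bin/RefactoringMiner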
The CSV file referenced by CSV_PATH should contain a column named name with GitHub repository names (format: username/repo). Once the .env file and the repositories CSV are set up, run:
python3 src/run_rm.py
The script clones each repository into CLONE_DIR, retrieves the default branch, and runs RefactoringMiner to analyze it. RefactoringMiner results are saved as .json files in CLONE_DIR, and logs as .log files in the same directory. To count the detected refactorings, run:
python3 src/count_refactorings.py
The output, refactoring_count_by_type_and_file, shows the number of refactorings for each technique, grouped by repository. To collect snippets before and after refactoring and their metadata, run:
python3 src/diff.py '[refactoring technique]'
Replace [refactoring technique] with the desired technique name (e.g., Extract Method).
The script creates a directory for each repository and subdirectories named with the commit SHA. Each commit may have one or more refactorings.
Dataset Availability: The dataset is available in the dataset directory. To generate the SQL file for the Web tool, run:
python3 src/generate_refactorings_sql.py
The Web tool is located in the web directory. Populate the data/output/snippets folder with the output of src/diff.py, run the sql/create_database.sql script in your database, load the SQL file generated by src/generate_refactorings_sql.py, and run dataset.php to generate the MaRV dataset file. The MaRV dataset is also available in the dataset directory of the replication package.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset of DNS over HTTPS traffic from Firefox (Comcast, CZNIC, DNSForge, DNSSB, DOHli). The dataset contains DoH and HTTPS traffic that was captured in a virtualized environment (Docker) and generated automatically by the Firefox browser with DoH enabled towards 5 different DoH servers (Comcast, CZNIC, DNSForge, DNSSB, DOHli), with web page loads towards a sample of web pages taken from the Majestic Million dataset. The data are provided in the form of PCAP files. However, we also provide TLS-enriched flow data generated with the open-source ipfixprobe flow exporter. Information other than TLS-related fields is not relevant, since the dataset comprises only encrypted TLS traffic. The TLS-enriched flow data are provided in the form of CSV files with the following columns:
Column Name | Column Description
DST_IP | Destination IP address
SRC_IP | Source IP address
BYTES | The number of transmitted bytes from Source to Destination
BYTES_REV | The number of transmitted bytes from Destination to Source
TIME_FIRST | Timestamp of the first packet in the flow in format YYYY-MM-DDTHH-MM-SS
TIME_LAST | Timestamp of the last packet in the flow in format YYYY-MM-DDTHH-MM-SS
PACKETS | The number of packets transmitted from Source to Destination
PACKETS_REV | The number of packets transmitted from Destination to Source
DST_PORT | Destination port
SRC_PORT | Source port
PROTOCOL | The number of the transport protocol
TCP_FLAGS | Logical OR across all TCP flags in the packets transmitted from Source to Destination
TCP_FLAGS_REV | Logical OR across all TCP flags in the packets transmitted from Destination to Source
TLS_ALPN | The value of the Application Protocol Negotiation Extension sent by the Server
TLS_JA3 | The JA3 fingerprint
TLS_SNI | The value of the Server Name Indication Extension sent by the Client
The DoH resolvers in the dataset can be identified by the IP addresses listed in the doh_resolver_ip.csv file.
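For example, a hedged sketch of splitting one of the flow CSVs into DoH and non-DoH traffic with pandas (the flow file name and the column name inside doh_resolver_ip.csv, assumed here to be ip, are assumptions):

    import pandas as pd

    flows = pd.read_csv("firefox_flows.csv")          # one of the TLS flow CSVs
    resolvers = pd.read_csv("doh_resolver_ip.csv")    # resolver IPs; column name assumed

    doh_ips = set(resolvers["ip"])
    is_doh = flows["DST_IP"].isin(doh_ips) | flows["SRC_IP"].isin(doh_ips)

    print(f"DoH flows: {is_doh.sum()}, non-DoH flows: {(~is_doh).sum()}")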
The main part of the dataset is located in DoH-Gen-F-CCDDD.tar.gz and has the following structure:
.
└── data                 - Main directory with data
    └── generated        - Directory with generated captures
        ├── pcap         - Generated PCAPs
        │   └── firefox
        └── tls-flow-csv - Generated CSV flow data
            └── firefox
Total stats of generated data:
Total Data Size: 40.2 GB
Total files: 10
DoH extracted TLS flows: ~100 K
Non-DoH extracted TLS flows: ~315 K
DoH Server information:
Name | Provider | DoH query URL
Comcast | https://corporate.comcast.com | https://doh.xfinity.com/dns-query
CZNIC | https://www.nic.cz | https://odvr.nic.cz/doh
DNSForge | https://dnsforge.de | https://dnsforge.de/dns-query
DNSSB | https://dns.sb/doh/ | https://doh.dns.sb/dns-query
DOHli | https://doh.li | https://doh.li/dns-query
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Dataset contains ratios of 13C/12C and 15N/14N during different time intervals for top predator fish and select lower trophic level organisms for use as input for "Stable C and N changes in LML food web R code".
biogas/biogas_0/supplydata197.csv (in step 2, where supply data are specified). This dataset is associated with the following publication: Hu, Y., W. Zhang, P. Tominac, M. Shen, D. Göreke, E. Martín-Hernández, M. Martín, G.J. Ruiz-Mercado, and V.M. Zavala. ADAM: A web platform for graph-based modeling and optimization of supply chains. COMPUTERS AND CHEMICAL ENGINEERING. Elsevier Science Ltd, New York, NY, USA, 165: 107911, (2022).
This script will go through an entire ArcGIS Online Organization or a Portal Organization and look through all of the Web Maps. It will then check all of the URLs of the map services within each Web Map to determine whether they are valid. If a URL is not valid, the script writes the result to a CSV file so it can be taken care of; the CSV file can then be used to aid the administrator in cleaning up the map services with invalid URLs. This is a Jupyter Notebook written using the ArcGIS Python API.
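Below is a condensed sketch of that workflow using the ArcGIS API for Python (the credentials, the ?f=json availability check, and the output file name are illustrative assumptions; the actual notebook may differ):

    import csv
    import requests
    from arcgis.gis import GIS

    gis = GIS("https://www.arcgis.com", "admin_user", "password")   # or your Portal URL

    with open("invalid_service_urls.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["webmap", "layer_title", "url"])
        # Walk every Web Map in the organization and ping each operational layer's service URL
        for item in gis.content.search(query="", item_type="Web Map", max_items=1000):
            for layer in item.get_data().get("operationalLayers", []):
                url = layer.get("url")
                if not url:
                    continue
                try:
                    ok = requests.get(url, params={"f": "json"}, timeout=10).status_code == 200
                except requests.RequestException:
                    ok = False
                if not ok:
                    writer.writerow([item.title, layer.get("title", ""), url])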
This feature service shows Chicago crimes in August 2017.
Abstract: WEB-IDS23 is a network intrusion detection dataset that includes over 12 million flows, categorizing 20 attack types across FTP, HTTP/S, SMTP, SSH, and network scanning activities. The dataset is documented in the paper "Technical Report: Generating the WEB-IDS23 Dataset," which provides insights into the generation, structure, and key characteristics of the dataset. Data: the dataset is available as CSV files under web-ids23. Each file includes the data of one class, and each row corresponds to a flow extracted using Zeek FlowMeter. In total, the dataset includes over 12 million samples. Short documentation: a short documentation of the data and the corresponding labels can be found in the files readme.md and readme.pdf.
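As an illustration, a small sketch of loading those per-class files with pandas and deriving the label from the file name (the directory layout shown is an assumption based on the description above):

    import glob
    import os
    import pandas as pd

    frames = []
    for path in glob.glob("web-ids23/*.csv"):
        df = pd.read_csv(path)
        # Each file holds one class, so use the file name as the attack/benign label
        df["label"] = os.path.splitext(os.path.basename(path))[0]
        frames.append(df)

    dataset = pd.concat(frames, ignore_index=True)
    print(dataset["label"].value_counts())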
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This web-scraped dataset, collected from the cricbuzz website, contains the top 100 Test batsmen, ranked by performance with the best-performing player at the top, in the file top100batsman.csv. The dataset includes only the top 100 players who have performed best in Test cricket, and the data was collected on 7th January 2023.
The dataset contains the following columns:
test_ranking: the current Test ranking of the player.
player_id: a unique player id assigned by cricbuzz.
batsman: the name of the batsman.
rating: the rating assigned by the ICC.
team: the name of the team to which the player belongs.
matches: the number of matches played by the player to date.
innings: the number of times in a match the player has batted.
runs: total number of runs scored by the batsman.
high_score: the highest score achieved by the batsman.
average: the ratio of total runs scored to the number of times the batsman got out.
strike_rate: the overall strike rate of the batsman, calculated as runs scored divided by balls faced.
century: number of centuries scored by the batsman.
double_century: number of double centuries scored by the batsman.
half_century: number of half-centuries scored by the batsman.
fours: total number of fours hit to date.
sixes: total number of sixes hit to date.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General Information
Cebulka (Polish dark web cryptomarket and image board) messages data.
Haitao Shi (The University of Edinburgh, UK); Patrycja Cheba (Jagiellonian University); Leszek Świeca (Kazimierz Wielki University in Bydgoszcz, Poland).
The dataset is part of the research supported by the Polish National Science Centre (Narodowe Centrum Nauki) grant 2021/43/B/HS6/00710.
Project title: “Rhizomatic networks, circulation of meanings and contents, and offline contexts of online drug trade” (2022-2025; PLN 956 620; funding institution: Polish National Science Centre [NCN], call: OPUS 22; Principal Investigator: Piotr Siuda [Kazimierz Wielki University in Bydgoszcz, Poland]).
Data Collection Context
Polish dark web cryptomarket and image board called Cebulka (http://cebulka7uxchnbpvmqapg5pfos4ngaxglsktzvha7a5rigndghvadeyd.onion/index.php).
This dataset was developed within the abovementioned project. The project focuses on studying internet behavior concerning disruptive actions, particularly emphasizing the online narcotics market in Poland. The research seeks to (1) investigate how the open internet, including social media, is used in the drug trade; (2) outline the significance of darknet platforms in the distribution of drugs; and (3) explore the complex exchange of content related to the drug trade between the surface web and the darknet, along with understanding meanings constructed within the drug subculture.
Within this context, Cebulka is identified as a critical digital venue in Poland’s dark web illicit substances scene. Besides serving as a marketplace, it plays a crucial role in shaping the narratives and discussions prevalent in the drug subculture. The dataset has proved to be a valuable tool for performing the analyses needed to achieve the project’s objectives.
Data Content
The data was collected in three periods, i.e., in January 2023, June 2023, and January 2024.
The dataset comprises a sample of messages posted on Cebulka from its inception until January 2024 (including all the messages with drug advertisements). These messages include the initial posts that start each thread and the subsequent posts (replies) within those threads. The dataset is organized into two directories. The “cebulka_adverts” directory contains posts related to drug advertisements (both advertisements and comments). In contrast, the “cebulka_community” directory holds a sample of posts from other parts of the cryptomarket, i.e., those not related directly to trading drugs but rather focusing on discussing illicit substances. The dataset consists of 16,842 posts.
The data has been cleaned and processed using regular expressions in Python, and all personal information was removed in the same way. Identifiers related to instant messaging apps and email addresses have been hashed, and all usernames appearing in messages have been eliminated.
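To illustrate the kind of step described above, here is a minimal regex-plus-hashing sketch in Python (the patterns and the truncated SHA-256 replacement are illustrative assumptions, not the project's actual code):

    import hashlib
    import re

    EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
    HANDLE_RE = re.compile(r"@\w{5,32}")   # e.g. messenger usernames

    def hash_identifier(match):
        # Irreversibly replace an identifier with a short, stable hash
        return hashlib.sha256(match.group(0).encode()).hexdigest()[:12]

    def anonymize(text):
        text = EMAIL_RE.sub(hash_identifier, text)
        text = HANDLE_RE.sub(hash_identifier, text)
        return text

    print(anonymize("contact me at dealer@example.com or @seller12345"))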
The dataset consists of the following files:
Zipped .txt files (“cebulka_adverts.zip” and “cebulka_community.zip”) containing all messages. These files are organized into individual directories that mirror the folder structure found on Cebulka.
Two .csv files that list all the messages, including file names and the content of each post. The first .csv lists messages from “cebulka_adverts.zip,” and the second .csv lists messages from “cebulka_community.zip.”
Ethical Considerations
A set of data handling policies aimed at ensuring safety and ethics has been outlined in the following paper:
Harviainen, J.T., Haasio, A., Ruokolainen, T., Hassan, L., Siuda, P., Hamari, J. (2021). Information Protection in Dark Web Drug Markets Research [in:] Proceedings of the 54th Hawaii International Conference on System Sciences, HICSS 2021, Grand Hyatt Kauai, Hawaii, USA, 4-8 January 2021, Maui, Hawaii, (ed.) Tung X. Bui, Honolulu, HI, pp. 4673-4680.
The primary safeguard was the early-stage hashing of usernames and identifiers from the messages, utilizing automated systems for irreversible hashing. Recognizing that automatic name removal might not catch all identifiers, the data underwent manual review to ensure compliance with research ethics and thorough anonymization.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In our manuscript, we introduce "ImmunoCheckDB," a web-based tool developed using the Shiny framework, which addresses the need for comprehensive analysis of ICI efficacy data and multiomic markers across different cancer types. ImmunoCheckDB enables users to conduct online meta-analyses and multiomic analyses by collecting and organizing extensive data from published clinical trials and multiomic experiments.