100+ datasets found
  1. example-space-to-dataset-image-zip

    • huggingface.co
    Updated Jun 16, 2023
    + more versions
    Cite
    Lucain Pouget (2023). example-space-to-dataset-image-zip [Dataset]. https://huggingface.co/datasets/Wauplin/example-space-to-dataset-image-zip
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Jun 16, 2023
    Authors
    Lucain Pouget
    Description
  2. example-space-to-dataset-json

    • huggingface.co
    Updated May 26, 2025
    + more versions
    Cite
    m (2025). example-space-to-dataset-json [Dataset]. https://huggingface.co/datasets/mmwmm/example-space-to-dataset-json
    Explore at:
    Dataset updated
    May 26, 2025
    Authors
    m
    Description

    mmwmm/example-space-to-dataset-json dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. Country Polygons as GeoJSON

    • datahub.io
    Updated Sep 1, 2017
    + more versions
    Cite
    (2017). Country Polygons as GeoJSON [Dataset]. https://datahub.io/core/geo-countries
    Explore at:
    Dataset updated
    Sep 1, 2017
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    A geodata data package providing GeoJSON polygons for all the world's countries.

  4. USA states GeoJson

    • kaggle.com
    Updated Aug 18, 2020
    Cite
    Kate Gallo (2020). USA states GeoJson [Dataset]. https://www.kaggle.com/pompelmo/usa-states-geojson/discussion
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Aug 18, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kate Gallo
    Area covered
    United States
    Description

    Context

    I created this dataset to help people create choropleth maps of US states.

    Content

    One GeoJSON file to plot the state borders, and one CSV from the Census Bureau with the US population per state.

    Inspiration

    I think the best way to use this dataset is to join it with other data. For example, I used it to plot police killings with the data from https://www.kaggle.com/jpmiller/police-violence-in-the-us
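
    As an illustration of that join, here is a minimal choropleth sketch using folium and pandas; the file and column names are hypothetical and should be adapted to the actual files in this dataset:

    import folium
    import pandas as pd

    # Hypothetical file and column names; adapt to the dataset's actual files
    geojson_path = "us-states.geojson"
    pop = pd.read_csv("us_population_by_state.csv")  # assumed columns: state, population

    m = folium.Map(location=[39.8, -98.6], zoom_start=4)
    folium.Choropleth(
        geo_data=geojson_path,
        data=pop,
        columns=["state", "population"],   # assumed CSV columns
        key_on="feature.properties.name",  # assumed GeoJSON property holding the state name
        fill_color="YlGn",
    ).add_to(m)
    m.save("choropleth.html")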

  5. Data from: 3DHD CityScenes: High-Definition Maps in High-Density Point...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 16, 2024
    Cite
    Fricke, Jenny (2024). 3DHD CityScenes: High-Definition Maps in High-Density Point Clouds [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7085089
    Explore at:
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Sertolli, Benjamin
    Fricke, Jenny
    Klingner, Marvin
    Fingscheidt, Tim
    Plachetka, Christopher
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview

    3DHD CityScenes is the most comprehensive, large-scale high-definition (HD) map dataset to date, annotated in the three spatial dimensions of globally referenced, high-density LiDAR point clouds collected in urban domains. Our HD map covers 127 km of road sections in the inner city of Hamburg, Germany, including 467 km of individual lanes. In total, our map comprises 266,762 individual items.

    Our corresponding paper (published at ITSC 2022) is available here. Further, we have applied 3DHD CityScenes to map deviation detection here.

    Moreover, we release code to facilitate the application of our dataset and the reproducibility of our research. Specifically, our 3DHD_DevKit comprises:

    Python tools to read, generate, and visualize the dataset,

    3DHDNet deep learning pipeline (training, inference, evaluation) for map deviation detection and 3D object detection.

    The DevKit is available here:

    https://github.com/volkswagen/3DHD_devkit.

    The dataset and DevKit have been created by Christopher Plachetka as project lead during his PhD period at Volkswagen Group, Germany.

    When using our dataset, you are welcome to cite:

    @INPROCEEDINGS{9921866,
    author={Plachetka, Christopher and Sertolli, Benjamin and Fricke, Jenny and Klingner, Marvin and Fingscheidt, Tim},
    booktitle={2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC)},
    title={3DHD CityScenes: High-Definition Maps in High-Density Point Clouds},
    year={2022},
    pages={627-634}}

    Acknowledgements

    We thank the following interns for their exceptional contributions to our work.

    Benjamin Sertolli: Major contributions to our DevKit during his master thesis

    Niels Maier: Measurement campaign for data collection and data preparation

    The European large-scale project Hi-Drive (www.Hi-Drive.eu) supports the publication of 3DHD CityScenes and encourages the general publication of information and databases facilitating the development of automated driving technologies.

    The Dataset

    After downloading, the 3DHD_CityScenes folder provides five subdirectories, which are explained briefly in the following.

    1. Dataset

    This directory contains the training, validation, and test set definition (train.json, val.json, test.json) used in our publications. Respective files contain samples that define a geolocation and the orientation of the ego vehicle in global coordinates on the map.

    During dataset generation (done by our DevKit), samples are used to take crops from the larger point cloud. Also, map elements in reach of a sample are collected. Both modalities can then be used, e.g., as input to a neural network such as our 3DHDNet.

    To read any JSON-encoded data provided by 3DHD CityScenes in Python, you can use the following code snippet as an example.

    import json

    json_path = r"E:\3DHD_CityScenes\Dataset\train.json"
    with open(json_path) as jf:
        data = json.load(jf)
    print(data)

    2. HD_Map

    Map items are stored as lists of items in JSON format. In particular, we provide:

    traffic signs,

    traffic lights,

    pole-like objects,

    construction site locations,

    construction site obstacles (point-like such as cones, and line-like such as fences),

    line-shaped markings (solid, dashed, etc.),

    polygon-shaped markings (arrows, stop lines, symbols, etc.),

    lanes (ordinary and temporary),

    relations between elements (only for construction sites, e.g., sign to lane association).

    3. HD_Map_MetaData

    Our high-density point cloud, used as the basis for annotating the HD map, is split into 648 tiles. This directory contains the geolocation for each tile as a polygon on the map. You can view the respective tile definitions using QGIS. Alternatively, we also provide the respective polygons as lists of UTM coordinates in JSON.

    Files with the extensions .dbf, .prj, .qpj, .shp, and .shx belong to the tile definition as a "shapefile" (commonly used in geodesy) that can be viewed using QGIS. The JSON file contains the same information in a different format used by our Python API.

    4. HD_PointCloud_Tiles

    The high-density point cloud tiles are provided in global UTM32N coordinates and are encoded in a proprietary binary format. The first 4 bytes (integer) encode the number of points contained in that file. Subsequently, all point cloud values are provided as arrays. First all x-values, then all y-values, and so on. Specifically, the arrays are encoded as follows.

    x-coordinates: 4 byte integer

    y-coordinates: 4 byte integer

    z-coordinates: 4 byte integer

    intensity of reflected beams: 2 byte unsigned integer

    ground classification flag: 1 byte unsigned integer

    After reading, the respective values have to be unnormalized. As an example, you can use the following code snippet to read the point cloud data. For visualization, you can use the pptk package, for instance.

    import numpy as np
    import pptk

    file_path = r"E:\3DHD_CityScenes\HD_PointCloud_Tiles\HH_001.bin"
    pc_dict = {}
    key_list = ['x', 'y', 'z', 'intensity', 'is_ground']
    type_list = ['<i4', '<i4', '<i4', '<u2', '<u1']  # truncated in the source; dtypes reconstructed from the byte layout above
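
    The snippet above is cut off in the source listing. Below is a minimal sketch of a complete reader, reconstructed from the byte layout documented above (a leading 4-byte point count, followed by one contiguous array per attribute); the unnormalization factors are not given here and should be taken from the DevKit:

    import numpy as np

    file_path = r"E:\3DHD_CityScenes\HD_PointCloud_Tiles\HH_001.bin"
    key_list = ['x', 'y', 'z', 'intensity', 'is_ground']
    type_list = ['<i4', '<i4', '<i4', '<u2', '<u1']

    with open(file_path, 'rb') as f:
        # First 4 bytes: number of points in the tile
        num_points = int(np.fromfile(f, dtype='<i4', count=1)[0])
        pc_dict = {}
        for key, dtype in zip(key_list, type_list):
            # Attributes are stored back to back: all x-values, then all y-values, ...
            pc_dict[key] = np.fromfile(f, dtype=dtype, count=num_points)

    # The values still need to be unnormalized; see the 3DHD_devkit for the scale factors.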

  6. Country State GeoJSON

    • kaggle.com
    zip
    Updated Apr 27, 2020
    Cite
    Mukesh Chapagain (2020). Country State GeoJSON [Dataset]. https://www.kaggle.com/chapagain/country-state-geo-location
    Explore at:
    Available download formats: zip (286136 bytes)
    Dataset updated
    Apr 27, 2020
    Authors
    Mukesh Chapagain
    Description

    About

    World country and state coordinates for plotting geospatial maps.

    Source

    Files source:

    1. Folium GitHub Repository:

  7. Wireless HotSpots (GEOJSON)

    • data.gov.sg
    Updated Jun 6, 2024
    Cite
    Info-communications Media Development Authority (2024). Wireless HotSpots (GEOJSON) [Dataset]. https://data.gov.sg/datasets/d_d8644084f8b54f851a1acbb2f04d5089/view
    Explore at:
    Dataset updated
    Jun 6, 2024
    Dataset provided by
    Infocomm Media Development Authority (http://www.imda.gov.sg/)
    Authors
    Info-communications Media Development Authority
    License

    https://data.gov.sg/open-data-licence

    Description

    Dataset from Info-communications Media Development Authority. For more information, visit https://data.gov.sg/datasets/d_d8644084f8b54f851a1acbb2f04d5089/view

  8. JSON Repository

    • data.amerigeoss.org
    csv, geojson, json +1
    Updated Jun 4, 2025
    Cite
    UN Humanitarian Data Exchange (2025). JSON Repository [Dataset]. https://data.amerigeoss.org/dataset/json-repository
    Explore at:
    Available download formats: csv(9901), csv(779), csv(462610), json(3411081), geojson(543777), geojson(545299), geojson(365288), json(1132925), geojson(366788), csv(177073), geojson(162605), json(2064743), json(520472), geojson(953043), geojson(886086), json(457832), geojson(222216), geojson(9124), csv(85982), geojson(164379), csv(457), csv(242), json(3401512), csv(669568), json(461423), json(876253), csv(6789), csv(536), json(640845), json(707249), csv(358964), geojson(135805), csv(4907), csv(177), json(327649), csv(9980), geojson(709673), geojson(54889), geojson(2396630), json(632081), topojson(2728099), csv(845984), geojson(178718), json(559095), json(1975854), geojson(74470), geojson(219728), geojson(1324722), json(3478518)
    Dataset updated
    Jun 4, 2025
    Dataset provided by
    United Nations (http://un.org/)
    Description

    This dataset contains resources transformed from other datasets on HDX. They exist here only in a format modified to support visualization on HDX and may not be as up to date as the source datasets from which they are derived.

    Source datasets: https://data.hdx.rwlabs.org/dataset/idps-data-by-region-in-mali

  9. json_large_sample

    • kaggle.com
    Updated Dec 1, 2023
    + more versions
    Cite
    Noura Aly (2023). json_large_sample [Dataset]. https://www.kaggle.com/datasets/nouraaly/json-large-sample
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Dec 1, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Noura Aly
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Noura Aly

    Released under Apache 2.0

    Contents

  10. example-space-to-dataset-json

    • huggingface.co
    + more versions
    Cite
    Ahmad Sohrabi, example-space-to-dataset-json [Dataset]. https://huggingface.co/datasets/CognitiveScience/example-space-to-dataset-json
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Authors
    Ahmad Sohrabi
    Description

    CognitiveScience/example-space-to-dataset-json dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. Dataset of IEEE 802.11 probe requests from an uncontrolled urban environment...

    • data.niaid.nih.gov
    Updated Jan 6, 2023
    Cite
    Andrej Hrovat (2023). Dataset of IEEE 802.11 probe requests from an uncontrolled urban environment [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7509279
    Explore at:
    Dataset updated
    Jan 6, 2023
    Dataset provided by
    Mihael Mohorčič
    Miha Mohorčič
    Andrej Hrovat
    Aleš Simončič
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    The 802.11 standard includes several management features and corresponding frame types. One of them is the Probe Request (PR), which is sent by mobile devices in an unassociated state to scan the nearby area for existing wireless networks. The frame part of a PR consists of variable-length fields, called Information Elements (IE), which represent the capabilities of a mobile device, such as supported data rates.

    This dataset contains PRs collected over a seven-day period by four gateway devices in an uncontrolled urban environment in the city of Catania.

    It can be used for various use cases, e.g., analyzing MAC randomization, determining the number of people in a given location at a given time or in different time periods, analyzing trends in population movement (streets, shopping malls, etc.) in different time periods, etc.

    Related dataset

    The same authors also produced the Labeled dataset of IEEE 802.11 probe requests, with the same data layout and recording equipment.

    Measurement setup

    The system for collecting PRs consists of a Raspberry Pi 4 (RPi) with an additional WiFi dongle to capture WiFi signal traffic in monitoring mode (gateway device). Passive PR monitoring is performed by listening to 802.11 traffic and filtering out PR packets on a single WiFi channel.

    The following information about each received PR is collected:
    - MAC address
    - supported data rates
    - extended supported rates
    - HT capabilities
    - extended capabilities
    - data under extended tag and vendor specific tag
    - interworking
    - VHT capabilities
    - RSSI
    - SSID
    - timestamp when the PR was received

    The collected data was forwarded to a remote database via a secure VPN connection. A Python script was written using the Pyshark package to collect, preprocess, and transmit the data.

    Data preprocessing

    The gateway collects PRs during each successive predefined scan interval (10 seconds). During this interval, the data is preprocessed before being transmitted to the database. For each detected PR in the scan interval, the IE fields are saved in the following JSON structure:

    PR_IE_data = {
        'DATA_RTS': {'SUPP': DATA_supp, 'EXT': DATA_ext},
        'HT_CAP': DATA_htcap,
        'EXT_CAP': {'length': DATA_len, 'data': DATA_extcap},
        'VHT_CAP': DATA_vhtcap,
        'INTERWORKING': DATA_inter,
        'EXT_TAG': {'ID_1': DATA_1_ext, 'ID_2': DATA_2_ext, ...},
        'VENDOR_SPEC': {
            VENDOR_1: {'ID_1': DATA_1_vendor1, 'ID_2': DATA_2_vendor1, ...},
            VENDOR_2: {'ID_1': DATA_1_vendor2, 'ID_2': DATA_2_vendor2, ...},
            ...
        }
    }

    Supported data rates and extended supported rates are represented as arrays of values that encode information about the rates supported by a mobile device. The rest of the IE data is represented in hexadecimal format. The Vendor Specific Tag is structured differently than the other IEs: this field can contain multiple vendor IDs, each with multiple data IDs and corresponding data. Similarly, the extended tag can contain multiple data IDs with corresponding data.
    Missing IE fields in the captured PR are not included in PR_IE_data.

    When a new MAC address is detected in the current scan time interval, the data from the PR is stored in the following structure:

    {'MAC': MAC_address, 'SSIDs': [ SSID ], 'PROBE_REQs': [PR_data] },

    where PR_data is structured as follows:

    { 'TIME': [ DATA_time ], 'RSSI': [ DATA_rssi ], 'DATA': PR_IE_data }.

    This data structure makes it possible to store only 'TIME' and 'RSSI' for all PRs originating from the same MAC address and containing the same 'PR_IE_data'. All SSIDs from the same MAC address are also stored. The data of a newly detected PR is compared with the already stored data for the same MAC in the current scan time interval. If identical PR IE data from the same MAC address is already stored, only the values for the keys 'TIME' and 'RSSI' are appended. If identical PR IE data from the same MAC address has not yet been received, the PR_data structure of the new PR for that MAC address is appended to the 'PROBE_REQs' key. The preprocessing procedure is shown in Figure ./Figures/Preprocessing_procedure.png.

    At the end of each scan time interval, all processed data is sent to the database along with additional metadata about the collected data, such as the serial number of the wireless gateway and the timestamps for the start and end of the scan. For an example of a single PR capture, see the Single_PR_capture_example.json file.

    Folder structure

    For ease of processing, the dataset is divided into 7 folders, each covering a 24-hour period. Each folder contains four files, one per gateway device.

    The folders are named after the start and end time (in UTC). For example, the folder 2022-09-22T22-00-00_2022-09-23T22-00-00 contains samples collected between 23 September 2022 00:00 local time and 24 September 2022 00:00 local time.

    Files represent their location via the following mapping:
    - 1.json -> location 1
    - 2.json -> location 2
    - 3.json -> location 3
    - 4.json -> location 4
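
    A minimal loading sketch, assuming each location file holds a JSON array of the per-MAC records described above (see Single_PR_capture_example.json for the exact layout):

    import json

    # Hypothetical path following the folder naming scheme above
    path = "2022-09-22T22-00-00_2022-09-23T22-00-00/1.json"
    with open(path) as f:
        records = json.load(f)

    # Count distinct (possibly randomized) MAC addresses seen at this location
    macs = {record['MAC'] for record in records}
    print(len(macs))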

    Environments description

    The measurements were carried out in the city of Catania, in Piazza Università and Piazza del Duomo. The gateway devices (RPis with WiFi dongles) were set up and gathering data before the start time of this dataset. As of September 23, 2022, the devices were placed in their final configuration and personally checked for correct installation and for the data status of the entire data collection system. Devices were connected either to a nearby Ethernet outlet or via WiFi to the access point provided.

    Four Raspberry Pis were used:
    - location 1 -> Piazza del Duomo - Chierici building (balcony near Fontana dell’Amenano)
    - location 2 -> southernmost window in the building of Via Etnea near Piazza del Duomo
    - location 3 -> northernmost window in the building of Via Etnea near Piazza Università
    - location 4 -> first window to the right of the entrance of the University of Catania

    Locations were suggested by the authors and adjusted during deployment based on physical constraints (locations of electrical outlets or internet access). Under ideal circumstances, the locations of the devices and their coverage areas would cover both squares and the part of Via Etnea between them, with a partial overlap of signal detection. The locations of the gateways are shown in Figure ./Figures/catania.png.

    Known dataset shortcomings

    Due to technical and physical limitations, the dataset contains some identified deficiencies.

    PRs are collected and transmitted in 10-second chunks. Due to the limited capabilities of the recording devices, some time (in the range of seconds) may not be accounted for between chunks if the transmission of the previous packet took too long or an unexpected error occurred.

    Every 20 minutes, the service on the recording device is restarted. This is a workaround for undefined behavior of the USB WiFi dongle, which can stop responding. For this reason, up to 20 seconds of data are not recorded in each 20-minute period.

    The devices had a scheduled reboot at 4:00 each day, which shows up as missing data of up to a few minutes.

    Location 1 - Piazza del Duomo - Chierici

    The gateway device (RPi) is located on the second-floor balcony and is hardwired to the Ethernet port. This device appears to have functioned stably throughout the data collection period. Its location was constant and undisturbed, and the dataset appears to have complete coverage from this location.

    Location 2 - Via Etnea - Piazza del Duomo

    The device is located inside the building. During working hours (approximately 9:00-17:00), the device was placed on the windowsill; however, its exact movements cannot be confirmed. As the device was moved back and forth, power outages and internet connection issues occurred. The last three days of the record contain no PRs from this location.

    Location 3 - Via Etnea - Piazza Università

    Similar to location 2, the device was placed on the windowsill and moved around by people working in the building, e.g., placed on the windowsill during the day and moved behind a thick wall when no people were present. This device appears to have been collecting data throughout the whole dataset period.

    Location 4 - Piazza Università

    This location is wirelessly connected to the access point. The device was placed statically on a windowsill overlooking the square. Due to physical limitations, the device lost power several times during the deployment, and the internet connection was also interrupted sporadically.

    Recognitions

    The data was collected within the scope of the Resiloc project with the help of the City of Catania and project partners.

  12. URA Parking Lot (GEOJSON)

    • data.gov.sg
    Updated Jun 6, 2024
    Cite
    Urban Redevelopment Authority (2024). URA Parking Lot (GEOJSON) [Dataset]. https://data.gov.sg/datasets/d_d959102fa76d58f2de276bfbb7e8f68e/view
    Explore at:
    Dataset updated
    Jun 6, 2024
    Dataset authored and provided by
    Urban Redevelopment Authority (http://ura.gov.sg/)
    License

    https://data.gov.sg/open-data-licence

    Description

    Dataset from Urban Redevelopment Authority. For more information, visit https://data.gov.sg/datasets/d_d959102fa76d58f2de276bfbb7e8f68e/view

  13. Spider Realistic Dataset In Structure-Grounded Pretraining for Text-to-SQL

    • zenodo.org
    bin, json, txt
    Updated Aug 16, 2021
    + more versions
    Cite
    Xiang Deng; Ahmed Hassan Awadallah; Christopher Meek; Oleksandr Polozov; Huan Sun; Matthew Richardson (2021). Spider Realistic Dataset In Structure-Grounded Pretraining for Text-to-SQL [Dataset]. http://doi.org/10.5281/zenodo.5205322
    Explore at:
    Available download formats: txt, json, bin
    Dataset updated
    Aug 16, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Xiang Deng; Ahmed Hassan Awadallah; Christopher Meek; Oleksandr Polozov; Huan Sun; Matthew Richardson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This folder contains the Spider-Realistic dataset used for evaluation in the paper "Structure-Grounded Pretraining for Text-to-SQL". The dataset is created based on the dev split of the Spider dataset (2020-06-07 version from https://yale-lily.github.io/spider). We manually modified the original questions to remove the explicit mention of column names while keeping the SQL queries unchanged to better evaluate the model's capability in aligning the NL utterance and the DB schema. For more details, please check our paper at https://arxiv.org/abs/2010.12773.

    It contains the following files:

    - spider-realistic.json
    # The spider-realistic evaluation set
    # Examples: 508
    # Databases: 19
    - dev.json
    # The original dev split of Spider
    # Examples: 1034
    # Databases: 20
    - tables.json
    # The original DB schemas from Spider
    # Databases: 166
    - README.txt
    - license

    The Spider-Realistic dataset is created based on the dev split of the Spider dataset released by Yu, Tao, et al., "Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task." It is a subset of the original dataset with explicit mentions of the column names removed. The SQL queries and databases are kept unchanged.
    For the format of each json file, please refer to the github page of Spider https://github.com/taoyds/spider.
    For the database files please refer to the official Spider release https://yale-lily.github.io/spider.
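
    As a quick sanity check, the evaluation set can be loaded like any Spider-format JSON file (a sketch assuming the file holds a JSON array of examples):

    import json

    with open("spider-realistic.json") as f:
        examples = json.load(f)
    print(len(examples))        # expected: 508 examples
    print(sorted(examples[0]))  # field names follow the Spider format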

    This dataset is distributed under the CC BY-SA 4.0 license.

    If you use the dataset, please cite the following papers, including the original Spider dataset, Finegan-Dollak et al. (2018), and the original datasets for Restaurants, GeoQuery, Scholar, Academic, IMDB, and Yelp.

    @article{deng2020structure,
    title={Structure-Grounded Pretraining for Text-to-SQL},
    author={Deng, Xiang and Awadallah, Ahmed Hassan and Meek, Christopher and Polozov, Oleksandr and Sun, Huan and Richardson, Matthew},
    journal={arXiv preprint arXiv:2010.12773},
    year={2020}
    }

    @inproceedings{Yu&al.18c,
    year = 2018,
    title = {Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task},
    booktitle = {EMNLP},
    author = {Tao Yu and Rui Zhang and Kai Yang and Michihiro Yasunaga and Dongxu Wang and Zifan Li and James Ma and Irene Li and Qingning Yao and Shanelle Roman and Zilin Zhang and Dragomir Radev }
    }

    @InProceedings{P18-1033,
    author = "Finegan-Dollak, Catherine
    and Kummerfeld, Jonathan K.
    and Zhang, Li
    and Ramanathan, Karthik
    and Sadasivam, Sesh
    and Zhang, Rui
    and Radev, Dragomir",
    title = "Improving Text-to-SQL Evaluation Methodology",
    booktitle = "Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    year = "2018",
    publisher = "Association for Computational Linguistics",
    pages = "351--360",
    location = "Melbourne, Australia",
    url = "http://aclweb.org/anthology/P18-1033"
    }

    @InProceedings{data-sql-imdb-yelp,
    dataset = {IMDB and Yelp},
    author = {Navid Yaghmazadeh, Yuepeng Wang, Isil Dillig, and Thomas Dillig},
    title = {SQLizer: Query Synthesis from Natural Language},
    booktitle = {International Conference on Object-Oriented Programming, Systems, Languages, and Applications, ACM},
    month = {October},
    year = {2017},
    pages = {63:1--63:26},
    url = {http://doi.org/10.1145/3133887},
    }

    @article{data-academic,
    dataset = {Academic},
    author = {Fei Li and H. V. Jagadish},
    title = {Constructing an Interactive Natural Language Interface for Relational Databases},
    journal = {Proceedings of the VLDB Endowment},
    volume = {8},
    number = {1},
    month = {September},
    year = {2014},
    pages = {73--84},
    url = {http://dx.doi.org/10.14778/2735461.2735468},
    }

    @InProceedings{data-atis-geography-scholar,
    dataset = {Scholar, and Updated ATIS and Geography},
    author = {Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Jayant Krishnamurthy, and Luke Zettlemoyer},
    title = {Learning a Neural Semantic Parser from User Feedback},
    booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
    year = {2017},
    pages = {963--973},
    location = {Vancouver, Canada},
    url = {http://www.aclweb.org/anthology/P17-1089},
    }

    @inproceedings{data-geography-original,
    dataset = {Geography, original},
    author = {John M. Zelle and Raymond J. Mooney},
    title = {Learning to Parse Database Queries Using Inductive Logic Programming},
    booktitle = {Proceedings of the Thirteenth National Conference on Artificial Intelligence - Volume 2},
    year = {1996},
    pages = {1050--1055},
    location = {Portland, Oregon},
    url = {http://dl.acm.org/citation.cfm?id=1864519.1864543},
    }

    @inproceedings{data-restaurants-logic,
    author = {Lappoon R. Tang and Raymond J. Mooney},
    title = {Automated Construction of Database Interfaces: Integrating Statistical and Relational Learning for Semantic Parsing},
    booktitle = {2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora},
    year = {2000},
    pages = {133--141},
    location = {Hong Kong, China},
    url = {http://www.aclweb.org/anthology/W00-1317},
    }

    @inproceedings{data-restaurants-original,
    author = {Ana-Maria Popescu, Oren Etzioni, and Henry Kautz},
    title = {Towards a Theory of Natural Language Interfaces to Databases},
    booktitle = {Proceedings of the 8th International Conference on Intelligent User Interfaces},
    year = {2003},
    location = {Miami, Florida, USA},
    pages = {149--157},
    url = {http://doi.acm.org/10.1145/604045.604070},
    }

    @inproceedings{data-restaurants,
    author = {Alessandra Giordani and Alessandro Moschitti},
    title = {Automatic Generation and Reranking of SQL-derived Answers to NL Questions},
    booktitle = {Proceedings of the Second International Conference on Trustworthy Eternal Systems via Evolving Software, Data and Knowledge},
    year = {2012},
    location = {Montpellier, France},
    pages = {59--76},
    url = {https://doi.org/10.1007/978-3-642-45260-4_5},
    }

  14. User memories from Cultural Heritage Search

    • data.europa.eu
    unknown
    Cite
    User memories from Cultural Heritage Search [Dataset]. https://data.europa.eu/data/datasets/https-data-norge-no-node-2123?locale=en
    Explore at:
    Available download formats: unknown
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    User memories from Cultural Heritage Search is a dataset (in the form of a file dump) that consists of the audience's own registrations in the online solution Kulturminnesøk. Cultural Heritage Search is an intermediary service from the Directorate for Cultural Heritage and is run by an editorial board. The information about most of the cultural monuments in Cultural Heritage Search comes from the cultural heritage database Askeladden, which is managed by the Directorate for Cultural Heritage, but users can also contribute their own user memories to the solution. The dataset follows the GeoJSON-LD standard: it can be read and used as regular GeoJSON, but it also has a semantic component that allows it to be processed as JSON-LD. For more information, see http://geojson.org/geojson-ld/. Each document can be linked with Cultural Heritage Search. For example, the document with ID http://kulturminnesok.no/fm/gilahytta-1 can be retrieved from Cultural Heritage Search as follows: https://kulturminnesok.no/minne/?queryString=http://kulturminnesok.no/fm/gilahytta-1

  15. Concurrent LC MHM Polygons

    • globe-data-igestrategies.hub.arcgis.com
    • geospatial.strategies.org
    Updated Jan 7, 2023
    + more versions
    Cite
    Institute for Global Environmental Strategies (2023). Concurrent LC MHM Polygons [Dataset]. https://globe-data-igestrategies.hub.arcgis.com/datasets/concurrent-lc-mhm-polygons
    Explore at:
    Dataset updated
    Jan 7, 2023
    Dataset authored and provided by
    Institute for Global Environmental Strategies
    Area covered
    Description

    This feature layer consists of paired GLOBE Observer Mosquito Habitat Mapper (MHM) and GLOBE Observer Land Cover (LC) observation data resulting from the following processing steps.

    MHM
    - GeoJSON data was pulled from this GLOBE API URL: https://api.globe.gov/search/v1/measurement/protocol/measureddate/?protocols=mosquito_habitat_mapper&startdate=2017-05-01&enddate=2022-12-31&geojson=TRUE&sample=FALSE
    - Only device-reported measurements are kept: "DataSource" = "GLOBE Observer App"
    - As we are only interested in device measurements, latitude and longitude are determined from "MeasurementLatitude" and "MeasurementLongitude".
    - All instances of duplicate photos have been removed from the dataset.

    LC
    - GeoJSON data was pulled from this GLOBE API URL: https://api.globe.gov/search/v1/measurement/protocol/measureddate/?protocols=land_covers&startdate=2018-09-01&enddate=2022-12-31&geojson=TRUE&sample=FALSE
    - Only device-reported measurements are kept: "DataSource" = "GLOBE Observer App"
    - As we are only interested in device measurements, latitude and longitude are determined from "MeasurementLatitude" and "MeasurementLongitude".

    Concurrence
    These two layers were then combined using a spatiotemporal join with the following conditions:
    - Tool: Geoanalytics Desktop Tools -> Join Features
    - Target Layer: LC
    - Join Type: one to many
    - Join Layer: MHM
    - Coordinate fields used: MeasurementLatitude, MeasurementLongitude
    - Time fields used: MeasuredAt (UTC time)
    - Spatial Proximity: 100 meters (NEAR_GEODESIC)
    - Temporal Proximity: 60 minutes (NEAR)
    - Attribute match: UserID

    The result is a dataset consisting of all paired instances where the same observer (UserID) collected a Mosquito Habitat Mapper observation within 100 meters and 1 hour of collecting a Land Cover observation.

    Additional fields include:
    - 'lc_mhm_obsID_pair': a string identifying the two paired observations: "{lc_LandCoverId}_{mhm_MosquitoHabitatMapperId}"
    - 'lc_latlon': a string representing the coordinates of the LC observation: "({lc_MeasurementLatitude}, {lc_MeasurementLongitude})"
    - 'mhm_latlon': a string representing the coordinates of the MHM observation: "({mhm_MeasurementLatitude}, {mhm_MeasurementLongitude})"
    - 'spatialDistanceMeters': numeric value representing the distance between the two paired observations in meters
    - 'temporalDistanceMinutes': numeric value representing the time delta between the two paired observations in minutes
    - 'squareBuffer': a polygon string representing a 100 m square centered on the LC observation coordinates. This may be used in conjunction with additional map layers to evaluate the land cover types near the observation coordinates. (N.B. this is not the buffer used in calculating spatiotemporal concurrence.)

    For the purposes of this visualization, the geometry is a 100 m x 100 m square centered on the Land Cover observation coordinates.
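
    A sketch of pulling the MHM GeoJSON from the API URL above and applying the device filter; it assumes the endpoint returns a standard GeoJSON FeatureCollection and that the property is named "DataSource" as quoted in the description (the live API may use a prefixed name):

    import requests

    url = ("https://api.globe.gov/search/v1/measurement/protocol/measureddate/"
           "?protocols=mosquito_habitat_mapper&startdate=2017-05-01"
           "&enddate=2022-12-31&geojson=TRUE&sample=FALSE")
    resp = requests.get(url, timeout=300)
    resp.raise_for_status()
    collection = resp.json()

    # Keep only device-reported measurements, as in the processing steps above
    features = [f for f in collection["features"]
                if f["properties"].get("DataSource") == "GLOBE Observer App"]
    print(len(features))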

  16. Polygon Data | Marinas in US and Canada | Map & Geospatial Insights

    • datarade.ai
    Updated Mar 23, 2023
    Cite
    Xtract (2023). Polygon Data | Marinas in US and Canada | Map & Geospatial Insights [Dataset]. https://datarade.ai/data-products/xtract-io-geometry-data-marinas-in-us-and-canada-xtract
    Explore at:
    Available download formats: .json, .csv, .xls, .txt
    Dataset updated
    Mar 23, 2023
    Dataset authored and provided by
    Xtract
    Area covered
    United States, Canada
    Description

    This specialized location dataset delivers detailed information about marina establishments. Maritime industry professionals, coastal planners, and tourism researchers can leverage precise location insights to understand maritime infrastructure, analyze recreational boating landscapes, and develop targeted strategies.

    How Do We Create Polygons?
    - All our polygons are manually crafted using advanced GIS tools like QGIS, ArcGIS, and similar applications. This involves leveraging aerial imagery and street-level views to ensure precision.
    - Beyond visual data, our expert GIS data engineers integrate venue layout/elevation plans sourced from official company websites to construct detailed indoor polygons. This meticulous process ensures higher accuracy and consistency.
    - We verify our polygons through multiple quality checks, focusing on accuracy, relevance, and completeness.

    What's More?
    - Custom Polygon Creation: Our team can build polygons for any location or category based on your specific requirements. Whether it’s a new retail chain, transportation hub, or niche point of interest, we’ve got you covered.
    - Enhanced Customization: In addition to polygons, we capture critical details such as entry and exit points, parking areas, and adjacent pathways, adding greater context to your geospatial data.
    - Flexible Data Delivery Formats: We provide datasets in industry-standard formats like WKT, GeoJSON, Shapefile, and GDB, making them compatible with various systems and tools.
    - Regular Data Updates: Stay ahead with our customizable refresh schedules, ensuring your polygon data is always up to date for evolving business needs.

    Unlock the Power of POI and Geospatial Data
    With our robust polygon datasets and point-of-interest data, you can:
    - Perform detailed market analyses to identify growth opportunities.
    - Pinpoint the ideal location for your next store or business expansion.
    - Decode consumer behavior patterns using geospatial insights.
    - Execute targeted, location-driven marketing campaigns for better ROI.
    - Gain an edge over competitors by leveraging geofencing and spatial intelligence.

    Why Choose LocationsXYZ? LocationsXYZ is trusted by leading brands to unlock actionable business insights with our spatial data solutions. Join our growing network of successful clients who have scaled their operations with precise polygon and POI data. Request your free sample today and explore how we can help accelerate your business growth.

  17. Atlas of the Working Group I Contribution to the IPCC Sixth Assessment...

    • catalogue.ceda.ac.uk
    Updated Jun 19, 2023
    Cite
    Maialen Iturbide; José Manuel Gutiérrez; Joaquín Bedia; Ezequiel Cimadevilla; Javier Díez-Sierra; Rodrigo Manzanas; Ana Casanueva; Jorge Baño-Medina; Josipa Milovac; Sixto Milovac; Antonio S. Cofiño; Daniel San Martín; Markel García-Díez; Mathias Hauser; David Huard; Özge Yelekci; Jesús Fernández (2023). Atlas of the Working Group I Contribution to the IPCC Sixth Assessment Report - data for Figure Atlas.2 (v20221104) [Dataset]. https://catalogue.ceda.ac.uk/uuid/789ad030299342ea99534edfb62450d9
    Explore at:
    Dataset updated
    Jun 19, 2023
    Dataset provided by
    Centre for Environmental Data Analysis (http://www.ceda.ac.uk/)
    Authors
    Maialen Iturbide; José Manuel Gutiérrez; Joaquín Bedia; Ezequiel Cimadevilla; Javier Díez-Sierra; Rodrigo Manzanas; Ana Casanueva; Jorge Baño-Medina; Josipa Milovac; Sixto Milovac; Antonio S. Cofiño; Daniel San Martín; Markel García-Díez; Mathias Hauser; David Huard; Özge Yelekci; Jesús Fernández
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1850 - Dec 31, 2099
    Area covered
    Earth
    Description

    Data for Figure Atlas.2 from Atlas of the Working Group I (WGI) Contribution to the Intergovernmental Panel on Climate Change (IPCC) Sixth Assessment Report (AR6).

    Figure Atlas.2 shows WGI reference regions used in the (a) AR5 and (b) AR6 reports.

    How to cite this dataset

    When citing this dataset, please include both the data citation below (under 'Citable as') and the following citations:

    For the report component from which the figure originates:
    Gutiérrez, J.M., R.G. Jones, G.T. Narisma, L.M. Alves, M. Amjad, I.V. Gorodetskaya, M. Grose, N.A.B. Klutse, S. Krakovska, J. Li, D. Martínez-Castro, L.O. Mearns, S.H. Mernild, T. Ngo-Duc, B. van den Hurk, and J.-H. Yoon, 2021: Atlas. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [Masson-Delmotte, V., P. Zhai, A. Pirani, S.L. Connors, C. Péan, S. Berger, N. Caud, Y. Chen, L. Goldfarb, M.I. Gomis, M. Huang, K. Leitzell, E. Lonnoy, J.B.R. Matthews, T.K. Maycock, T. Waterfield, O. Yelekçi, R. Yu, and B. Zhou (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp. 1927–2058, doi:10.1017/9781009157896.021

    Iturbide, M. et al., 2021: Repository supporting the implementation of FAIR principles in the IPCC-WG1 Interactive Atlas. Zenodo. Retrieved from: http://doi.org/10.5281/zenodo.5171760

    Figure subpanels

    The figure has two panels, with data provided for both panels in the master GitHub repository linked in the documentation.

    Data provided in relation to figure

    This dataset contains the corner coordinates defining each reference region for the second panel of the figure, which contain coordinate information at a 0.44º resolution. The repository directory 'reference-regions' contains data provided for the reference regions as polygons in different formats (CSV with coordinates, R data, shapefile and geojson) together with R and Python notebooks illustrating the use of these regions with worked examples.
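
    For instance, the GeoJSON version of the reference regions can be inspected with geopandas (a sketch; the file name inside the 'reference-regions' directory is hypothetical):

    import geopandas as gpd

    # Hypothetical file name; check the 'reference-regions' directory for the actual one
    regions = gpd.read_file("reference-regions/IPCC-WGI-reference-regions.geojson")
    print(regions.head())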

    Data for reference regions for AR5 can be found here: https://catalogue.ceda.ac.uk/uuid/a3b6d7f93e5c4ea986f3622eeee2b96f

    CMIP5 and CMIP6 are the fifth and sixth phases of the Coupled Model Intercomparison Project. CORDEX is the Coordinated Regional Downscaling Experiment from the WCRP. AR5 and AR6 refer to the 5th and 6th Assessment Reports of the IPCC. WGI stands for Working Group I.

    Notes on reproducing the figure from the provided data

    Data and figures produced by the Jupyter Notebooks live inside the notebooks directory. The notebooks describe step by step the basic process followed to generate some key figures of the AR6 WGI Atlas and some products underpinning the Interactive Atlas, such as reference regions, global warming levels, aggregated datasets. They include comments and hints to extend the analysis, thus promoting reusability of the results. These notebooks are provided as guidance for practitioners, more user friendly than the code provided as scripts in the reproducibility folder.

    Some of the notebooks require access to large data volumes outside this repository. To speed up the execution of the notebooks, in addition to the full code to access the data, we provide a data-loading shortcut by storing intermediate results in the auxiliary-material folder of this repository. To test other parameter settings, the full data access instructions should be followed, which can involve long waiting times.

    Sources of additional information

    The following weblinks are provided in the Related Documents section of this catalogue record: - Link to the figure on the IPCC AR6 website - Link to the report component containing the figure (Atlas) - Link to the Supplementary Material for Atlas, which contains details on the input data used in Table Atlas.SM.15. - Link to the code for the figure, archived on Zenodo. - Link to the necessary notebooks for reproducing the figure from GitHub. - Link to IPCC AR5 reference regions dataset

  18. Hydroclimatic atlas 2022

    • open.canada.ca
    • catalogue.arctic-sdi.org
    • +1 more
    csv, geojson, html +3
    Updated May 1, 2025
    Cite
    Government and Municipalities of Québec (2025). Hydroclimatic atlas 2022 [Dataset]. https://open.canada.ca/data/dataset/8bc217ff-d25d-4f55-a9a7-ada3df4b29a7
    Explore at:
    Available download formats: csv, geojson, pdf, zip, html, shp
    Dataset updated
    May 1, 2025
    Dataset provided by
    Government and Municipalities of Québec
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Time period covered
    Jan 1, 1970 - Dec 31, 2100
    Description

    Data of the 2022 Hydroclimatic Atlas

    Description

    The Hydroclimatic Atlas describes the current and future water regime of southern Quebec in order to support the implementation of water management practices that are resilient to climate change. These data are from the most recent version of the Hydroclimatic Atlas.

    What's new

    - Improved spatial resolution of the hydrographic network;
    - Greater spatial coverage;
    - Addition of the ClimEx and CORDEX-NA ensembles, in addition to the scenarios of the CMIP5 ensemble;
    - Use of six hydrological platforms;
    - Addition of indicators, especially annual ones;
    - Etc.

    List of available data

    - Link to the new Hydroclimatic Atlas website.
    - Map of the 24,604 river sections of the Hydroclimatic Atlas with their attributes, available in GeoJSON and shapefile format. To facilitate download and display, the map is divided into 11 GeoJSON files: ABIT (Abitibi and Lac Abitibi region), CND west (North Shore regions A and B), CND east (North Shore regions C, D and E), GASP (Gaspésie), MONT (Montérégie), OUTM (Outaouais upstream), OUTV (Outaouais downstream), SAGU (Saguenay), SLNO (St-Laurent Nord-Ouest), SLSO (St-Laurent Sud-Ouest), and VAUD (Vaudreuil).
    - The CSV tables ("Magnitude...") for each of the 76 hydrological indicators, describing the magnitude, direction and dispersion for RCP 4.5 and RCP 8.5 for the three future horizons (see the documentation for details).
    - The CSV tables ("Projected indicator...") for each of the 76 hydrological indicators, detailing the flow values with their uncertainty for the historical period and the three future horizons (RCP 4.5 and 8.5). See the documentation for more details.
    - A PDF with the metadata and a more detailed description of the data.

    Note

    The 2018 version of the data is archived on Données Québec for reference, for example for old reports or analyses referring to that version. Any new study or analysis should use the most recent data available below or on the Atlas website.

    This third-party metadata element was translated using an automated translation tool (Amazon Translate).

  19. DataCite Public Data

    • redivis.com
    application/jsonl +7
    Updated Dec 12, 2024
    + more versions
    Cite
    Redivis Demo Organization (2024). DataCite Public Data [Dataset]. https://redivis.com/datasets/7wec-6vgw8qaaq
    Explore at:
    Available download formats: application/jsonl, arrow, spss, csv, stata, sas, avro, parquet
    Dataset updated
    Dec 12, 2024
    Dataset provided by
    Redivis Inc.
    Authors
    Redivis Demo Organization
    Description

    Abstract

    The DataCite Public Data File contains metadata records in JSON format for all DataCite DOIs in Findable state that were registered up to the end of 2023.

    This dataset represents a processed version of the Public Data File, where the data have been extracted and loaded into a Redivis dataset.

    Methodology


    Records have descriptive metadata for research outputs and resources structured according to the DataCite Metadata Schema and include links to other persistent identifiers (PIDs) for works (DOIs), people (ORCID iDs), and organizations (ROR IDs).

    Use of the DataCite Public Data File is subject to the DataCite Data File Use Policy.

    Usage

    This dataset is a processed version of the DataCite Public Data File, where the original file (a 23GB .tar.gz) has been extracted into 55,239 JSONL files, which were then concatenated into a single JSONL file.

    This JSONL file has been imported into a Redivis table to facilitate further exploration and analysis.

    A sample project demonstrating how to query the DataCite data file can be found here: https://redivis.com/projects/hx1e-a6w8vmwsx
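
    For local work with the raw Public Data File, a sketch of streaming the concatenated JSONL (one DataCite metadata record per line; the local file name is hypothetical):

    import json

    # Hypothetical local name for the concatenated file
    with open("datacite_public_data.jsonl") as f:
        for line in f:
            record = json.loads(line)  # one Findable DOI record per line
            print(record.get("id"))
            break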

  20. Data from: A Dataset of Bot and Human Activities in GitHub

    • zenodo.org
    json, txt
    Updated Jan 5, 2024
    + more versions
    Cite
    Natarajan Chidambaram; Alexandre Decan; Tom Mens (2024). A Dataset of Bot and Human Activities in GitHub [Dataset]. http://doi.org/10.5281/zenodo.8219470
    Explore at:
    Available download formats: json, txt
    Dataset updated
    Jan 5, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Natarajan Chidambaram; Alexandre Decan; Tom Mens
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A Dataset of Bot and Human Activities in GitHub

    This repository provides an updated version of a dataset of GitHub contributor activities, accompanied by a paper published at MSR 2023 in the Data and Tool Showcase Track. The paper is entitled A Dataset of Bot and Human Activities in GitHub and is co-authored by Natarajan Chidambaram, Alexandre Decan and Tom Mens (Software Engineering Lab, University of Mons, Belgium). DOI: https://www.doi.org/10.1109/MSR59073.2023.00070. This work was done as part of Natarajan Chidambaram's PhD research in the context of the DigitalWallonia4.AI research project ARIAC (grant number 2010235) and TRAIL.

    The dataset contains 1,015,422 high-level activities made by 350 bots and 620 human contributors on GitHub between 25 November 2022 and 15 April 2023. The activities were generated from 1,221,907 low-level events obtained from GitHub's Events API and cover 24 distinct activity types. This dataset facilitates the characterisation of bot and human behaviour in GitHub repositories by enabling the analysis of activity sequences and activity patterns of bot and human contributors. It could lead to better bot identification tools and to empirical studies of how bots participate in collaborative software development.

    Files description

    The following files are provided as part of the archive:

    • bot_activities.json - A JSON file containing 754,165 activities made by 350 bot contributors;
    • human_activities.json - A JSON file containing 261,258 activities made by 620 human contributors (anonymized);
    • JsonSchema.json - A JSON schema that validates the above datasets;
    • bots.txt - A TEXT file containing login names of all the 350 bots

    Example

    Below is an example of a Closing pull request activity:

    {
     "date": "2022-11-25T18:49:09+00:00",
     "activity": "Closing pull request",
     "contributor": "typescript-bot",
     "repository": "DefinitelyTyped/DefinitelyTyped",
     "comment": {
       "length": 249,
       "GH_node": "IC_kwDOAFz6BM5PJG7l"
     },
     "pull_request": {
       "id": 62328,
       "title": "[qunit] Add `test.each()`",
       "created_at": "2022-09-19T17:34:28+00:00",
       "status": "closed",
       "closed_at": "2022-11-25T18:49:08+00:00",
       "merged": false,
       "GH_node": "PR_kwDOAFz6BM4_N5ib"
     },
     "conversation": {
       "comments": 19
     },
     "payload": {
       "pr_commits": 1,
       "pr_changed_files": 5
     }
    }
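
    A sketch of loading the activity files and checking them against the provided schema with the jsonschema package (assuming each JSON file holds a list of activity objects):

    import json
    from jsonschema import validate

    with open("JsonSchema.json") as f:
        schema = json.load(f)
    with open("bot_activities.json") as f:
        activities = json.load(f)

    validate(instance=activities, schema=schema)  # raises ValidationError on mismatch
    print(len(activities), "bot activities loaded")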

    List of activity types

    In total, we have identified 24 different high-level activity types from 15 different low-level event types. They are Creating repository, Creating branch, Creating tag, Deleting tag, Deleting repository, Publishing a release, Making repository public, Adding collaborator to repository, Forking repository, Starring repository, Editing wiki page, Opening issue, Closing issue, Reopening issue, Transferring issue, Commenting issue, Opening pull request, Closing pull request, Reopening pull request, Commenting pull request, Commenting pull request changes, Reviewing code, Commenting commits, Pushing commits.

    List of fields

    Not only does the dataset contain a list of activities made by bot and human contributors, but it also contains some details about these activities. For example, commenting issue activities provide details about the author of the comment, the repository and issue in which the comment was created, and so on.

    For all activity types, we provide the date of the activity, the contributor that made the activity, and the repository in which the activity took place. Depending on the activity type, additional fields are provided. In this section, we describe, for each activity type, the different fields that are provided in the JSON file. It is worth mentioning that we also provide the corresponding JSON schema alongside the datasets.

    Properties

    • date
      • Date on which the activity is performed
      • Type: string
      • e.g., "2022-11-25T09:55:19+00:00"
      • String format must be a "date-time"

    • activity
      • The activity performed by the contributor
      • Type: string
      • e.g., "Commenting pull request"
    • contributor
      • The login name of the contributor who performed this activity
      • Type: string
      • e.g., "analysis-bot", "anonymised" in the case of a human contributor
    • repository
      • The repository in which the activity is performed
      • Type: string
      • e.g., "apache/spark", "anonymised" in the case of a human contributor
    • issue
      • Issue information - provided for Opening issue, Closing issue, Reopening issue, Transferring issue and Commenting issue
      • Type: object
      • Properties
        • id
          • Issue number
          • Type: integer
          • e.g., 35471
        • title
          • Issue title
          • Type: string
          • e.g., "error building handtracking gpu example with bazel", "anonymised" in the case of a human contributor
        • created_at
          • The date on which this issue is created
          • Type: string
          • e.g., "2022-11-10T13:07:23+00:00"
          • String format must be a "date-time"
        • status
          • Current state of the issue
          • Type: string
          • "open" or "closed"
        • closed_at
          • The date on which this issue is closed. "null" will be provided if the issue is open
          • Types: string, null
          • e.g., "2022-11-25T10:42:39+00:00"
          • String format must be a "date-time"
        • resolved
          • The issue is resolved or not_planned/still open
          • Type: boolean
          • true or false
        • GH_node
          • The GitHub node of this issue
          • Type: string
          • e.g., "IC_kwDOC27xRM5PHTBU", "anonymised" in the case of a human contributor
    • pull_request
      • Pull request information - provided for Opening pull request, Closing pull request, Reopening pull request, Commenting pull request changes and Reviewing code
      • Type: object
      • Properties
        • id
          • Pull request number
          • Type: integer
          • e.g., 35471
        • title
          • Pull request title
          • Type: string
          • e.g., "error building handtracking gpu example with bazel", "anonymised" in the case of a human contributor
        • created_at
          • The date on which this pull request is created
          • Type: string
          • e.g., "2022-11-10T13:07:23+00:00"
          • String format must be a "date-time"
        • status
          • Current state of the pull request
          • Type: string
          • "open" or "closed"
        • closed_at
          • The date on which this pull request is closed. "null" will be provided if the pull request is open
          • Types: string, null
          • e.g., "2022-11-25T10:42:39+00:00"
          • String format must be a "date-time"
        • merged
          • The PR is merged or rejected/still open
          • Type: boolean
          • true or false
        • GH_node
          • The GitHub node of this pull request
          • Type: string
          • e.g., "PR_kwDOC7Q2kM5Dsu3-", "anonymised" in the case of a human contributor
    • review
      • Pull request review information - provided for Reviewing code
      • Type: object
      • Properties
        • status
          • Status of the review
          • Type: string
          • "changes_requested" or "approved" or "dismissed"
        • GH_node
          • The GitHub node of this review
          • Type: string
          • e.g., "PRR_kwDOEBHXU85HLfIn", "anonymised" in the case of a human contributor
    • conversation
      • Comments information in issue or pull request - Provided for Opening issue, Closing issue, Reopening issue, Transferring issue, Commenting issue, Opening pull request, Closing pull request, Reopening pull request and Commenting pull request
      • Type: object
      • Properties
        • comments
          • Number of comments present in the corresponding issue or pull request
          • Type: integer
          • e.g.,
