Demo to save data from a Space to a Dataset. The goal is to provide reusable code snippets.
Documentation: https://huggingface.co/docs/huggingface_hub/main/en/guides/upload#scheduled-uploads
Space: https://huggingface.co/spaces/Wauplin/space_to_dataset_saver/
JSON dataset: https://huggingface.co/datasets/Wauplin/example-space-to-dataset-json
Image dataset: https://huggingface.co/datasets/Wauplin/example-space-to-dataset-image
Image (zipped) dataset: … See the full description on the dataset page: https://huggingface.co/datasets/Wauplin/example-space-to-dataset-json.
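A minimal sketch of the scheduled-upload pattern from the linked guide, using huggingface_hub's CommitScheduler; the repo and folder names below are placeholders, not the demo Space's actual configuration:

```python
import json
from pathlib import Path
from uuid import uuid4

from huggingface_hub import CommitScheduler

# Local folder whose contents get committed to the dataset repo on a schedule.
json_folder = Path("json_data")
json_folder.mkdir(exist_ok=True)
data_file = json_folder / f"data_{uuid4()}.json"

# Placeholder repo_id; pushes the folder to the Hub every 5 minutes.
scheduler = CommitScheduler(
    repo_id="your-username/example-space-to-dataset-json",
    repo_type="dataset",
    folder_path=json_folder,
    path_in_repo="data",
    every=5,  # minutes
)

def save_record(record: dict) -> None:
    # Hold the scheduler's lock so a commit never reads a half-written file.
    with scheduler.lock:
        with data_file.open("a") as f:
            f.write(json.dumps(record) + "\n")

save_record({"greeting": "hello", "count": 1})
```

Appending one JSON object per line keeps each write cheap and lets the scheduler upload the file incrementally between commits.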
Database Contents License (DbCL) v1.0: http://opendatacommons.org/licenses/dbcl/1.0/
This dataset contains more than 50,000 records of sales and order data from an online store.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Noura Aly
Released under Apache 2.0
Dhdb/example-space-to-dataset-json is a dataset hosted on Hugging Face and contributed by the HF Datasets community.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by TheWiseO
Released under MIT
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Gzipped JSON file of the output of the benchmarking pipeline. For each sample, it contains the resistance calls made by each tool. It is the input file needed to generate all the results in the publication.
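A minimal sketch for reading the file in Python; the filename is a placeholder and the per-sample structure is an assumption, not documented here:

```python
import gzip
import json

# Placeholder filename; substitute the actual gzipped JSON from this record.
with gzip.open("benchmark_output.json.gz", "rt", encoding="utf-8") as f:
    calls = json.load(f)

# Assumed structure: a mapping from sample to per-tool resistance calls.
sample, tool_calls = next(iter(calls.items()))
print(sample, tool_calls)
```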
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Blockchain data query: JSON example
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by hung hoang 31
Released under MIT
https://crawlfeeds.com/privacy_policy
We have extracted a comprehensive news dataset from CNBC, covering not only financial updates but also a wide range of news categories relevant to audiences in Europe, the US, and the UK. The dataset includes over 500,000 records, structured in JSON format for seamless integration and analysis.
This extraction spans multiple news segments.
Each record in the dataset is enriched with metadata tags, enabling precise filtering by region, sector, topic, and publication date.
The dataset provides real-time insights into global developments, corporate strategies, leadership changes, and sector-specific trends. Designed for media analysts, research firms, and businesses, it supports a wide range of media and market analyses.
Additionally, the JSON format ensures easy integration with analytics platforms for advanced processing.
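As a sketch of that metadata-based filtering in Python: every field name below (region, sector, published_at) and the filename are assumptions for illustration, not the published schema:

```python
import json

# Placeholder filename and field names; check the actual schema in the sample.
with open("cnbc_articles.json", encoding="utf-8") as f:
    articles = json.load(f)

eu_market_news = [
    a for a in articles
    if a.get("region") == "Europe"
    and a.get("sector") == "Markets"
    and a.get("published_at", "") >= "2024-01-01"
]
print(f"{len(eu_market_news)} matching articles")
```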
Looking for a rich repository of structured news data? Visit our news dataset collection to explore additional offerings tailored to your analysis needs.
To get a preview, check out the CSV sample of the CNBC economy articles dataset.
This dataset was created by Neal Magee
Database Contents License (DbCL) v1.0: http://opendatacommons.org/licenses/dbcl/1.0/
This dataset contains inventory data for a pharmacy e-commerce website in JSON format, designed for easy integration into MongoDB databases, making it ideal for MERN stack projects. It includes 10 fields.
This dataset is useful for developing pharmacy-related web applications, inventory management systems, or online medical stores using the MERN stack.
Do not use for production-level purposes; use for project development only. Feel free to contribute if you find any mistakes or have suggestions.
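A minimal import sketch, assuming the inventory is a JSON array of documents and a local MongoDB instance; the file, database, and collection names are placeholders:

```python
import json

from pymongo import MongoClient

# Placeholder file name; assumed to hold a JSON array of inventory documents.
with open("pharmacy_inventory.json", encoding="utf-8") as f:
    items = json.load(f)

# Placeholder database and collection names; adjust to your setup.
client = MongoClient("mongodb://localhost:27017")
collection = client["pharmacy_store"]["inventory"]
collection.insert_many(items)
print(f"Inserted {len(items)} documents")
```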
This dataset was created by Jeong Hoon Lee
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Interoperability in systems-of-systems is a difficult problem due to the abundance of data standards and formats. Current approaches to interoperability rely on hand-made adapters or methods using ontological metadata. This dataset was created to facilitate research on data-driven interoperability solutions. The data comes from a simulation of a building heating system, and the messages sent within control systems-of-systems. For more information see attached data documentation.
The data comes in two semicolon-separated (;) CSV files, training.csv and test.csv. The train/test split is not random; training data comes from the first 80% of simulated timesteps, and the test data is the last 20%. There is no specific validation dataset; validation data should instead be randomly selected from the training data. The simulation runs for as many time steps as there are outside temperature values available. The original SMHI data only samples once every hour, which we linearly interpolate to get one temperature sample every ten seconds. The data saved at each time step consists of 34 JSON messages (four per room and two temperature readings from the outside), 9 temperature values (one per room and outside), 8 setpoint values, and 8 actuator outputs. The data associated with each of those 34 JSON messages is stored as a single row in the tables. This means that much data is duplicated, a choice made to make the data easier to use.
The simulation data is not meant to be opened and analyzed in spreadsheet software; it is meant for training machine learning models. It is recommended to open the data with the pandas library for Python, available at https://pypi.org/project/pandas/.
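A minimal loading sketch following that recommendation; the semicolon separator, file names, and random validation split come from the description above, while the 10% validation fraction is an arbitrary choice:

```python
import pandas as pd

# The files are semicolon-separated, per the dataset description.
train = pd.read_csv("training.csv", sep=";")
test = pd.read_csv("test.csv", sep=";")

# No predefined validation set: sample it randomly from the training data,
# as the description recommends (10% is an arbitrary illustrative fraction).
val = train.sample(frac=0.1, random_state=42)
train = train.drop(val.index)

print(train.shape, val.shape, test.shape)
```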
The data file with temperatures (smhi-july-23-29-2018.csv) acts as input for the thermodynamic building simulation found on GitHub, where it is used to get the outside temperature and corresponding timestamps. Temperature data for Luleå in the summer of 2018 were downloaded from SMHI.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Blockchain data query: V2 Parse JSON String sample
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Ahmed Sleem
Released under Apache 2.0
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by prateek khandelwal
Released under Apache 2.0
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The Modified Swiss Dwellings (MSD) JSON dataset is an ML-ready dataset for floor plan generation and analysis at building scale. It is derived from the Modified Swiss Dwellings database (v6) and contains 4572 room-based geometries together with their topological dual graphs in JSON format, plus 250 sample colour-coded images for visual reference. The dataset (geometries and graphs) can be imported into TopologicPy for further analysis and for use in ML workflows. The original attributes are stored within the nodes and edges of the graphs as well as in the faces of the geometries.
- building_id (int)
- floor_id (int)
- plan_id (int)
- site_id (int)
- elevation (float)
- height (float)
- ml_type (str)
- unit_usage (str)

Each vertex represents a spatial unit (room, corridor, balcony, etc.).
Core classification & IDs
- entity_type (str) = "area"
- entity_subtype ∈ {ROOM, BATHROOM, CORRIDOR, KITCHEN, BALCONY, STAIRCASE, STOREROOM, LIVING_DINING, …}
- area_id (float|int)
- apartment_id (str|null)
- unit_id (float|int|null)

Geometry

- geom (WKT polygon string)
- geometry (array of [x, y] points)
- x, y, z; plus per-vertex height, elevation, area

Semantic & visual attributes

- roomtype
- node_name
- node_type
- unit_usage
- zoning
- zone_name
- zone_type
- node_color
- apartment_color
- zone_color

Edges encode connectivity between vertices (by vertex IDs).

- connectivity (str): e.g., "door", "entrance", "passage"
- edge_width (number): e.g., 4
- source (str), target (str): vertex IDs like "Vertex_0000"

A single JSON file encodes a floor plan as Topologic geometry. The file is an array of topology objects (Vertex, Edge, Wire, and Face). Each object carries a uuid, a type, a dictionary (metadata), and (optionally) an apertures array.
type: "Vertex" | "Edge" | "Wire" | "Face".uuid: globally unique identifier (string).dictionary: per-object metadata. Faces include rich room/zone attributes here (see Room/Zone attributes below). :contentReference[oaicite:1]apertures: array (often empty) reserved for openings/voids.coordinates: [x, y, z] (z is often 0.0 for floor plans).Example ```json { "type": "Vertex", "uuid": "4097fc7d-a38c-11f0-82e9-e8c8299204ae", "dictionary": { "toplevel": false, "uuid": "4097fc7d-a38c-11f0-82e9-e8c8299204ae" }, "apertures": [], "coordinates": [5.529276, 2.15043, 0.0] }
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
JSON file with a list of port calls from vessels arriving at the ports of Valencia. The data was used inside the INTER-IoT project as an example dataset provided by a legacy IoT platform.
*NOTE: Due to a bug in the system, it is not possible to upload files with a .json extension, so the file is uploaded with a ._json extension instead. Please rename it after download.
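For example, restoring the extension in Python (the filename is a placeholder):

```python
from pathlib import Path

# Placeholder filename; rename the downloaded ._json file back to .json.
Path("valencia_portcalls._json").rename("valencia_portcalls.json")
```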
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset containing features extracted from 211 DNS tunneling packet captures. The packet capture samples are classified by the protocols tunneled within the DNS tunnel. The features are stored in JSON files, one per packet capture. The features in each file include the IP Packet Length, the DNS Query Name Length, and the DNS Query Name entropy. In this "slightly unclean" version of the feature set, the DNS Query Name field values are also present, although they are not actually necessary.
This feature set may be used to apply machine learning techniques to DNS tunneling traffic and discover new insights without having to reconstruct and analyze the equivalent full packet captures.
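A sketch of loading the per-capture feature files into a single table for ML; the directory name and the internal JSON structure are assumptions:

```python
import json
from pathlib import Path

import pandas as pd

rows = []
# Assumed layout: one JSON feature file per packet capture, each holding
# either a single record or a list of per-packet records.
for path in Path("dns_tunneling_features").glob("*.json"):
    with path.open(encoding="utf-8") as f:
        data = json.load(f)
    records = data if isinstance(data, list) else [data]
    for record in records:
        record["capture"] = path.stem  # keep track of the source capture
        rows.append(record)

df = pd.DataFrame(rows)
print(df.head())
```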