License: CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
🛳️ Titanic Dataset (JSON Format)
📌 Overview
This is the classic Titanic: Machine Learning from Disaster dataset, converted into JSON format for easier use in APIs, data pipelines, and Python projects. It contains the same passenger details as the original CSV version, but stored as JSON for convenience.
📂 Dataset Contents
File: titanic.json
Columns: PassengerId, Survived, Pclass, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, Embarked
Use Cases: Exploratory Data Analysis (EDA), feature engineering, machine learning model training, web app backends, JSON parsing practice.
🛠️ How to Use
🔹 1. Load with kagglehub

import kagglehub

path = kagglehub.dataset_download("engrbasit62/titanic-json-format")
print("Path to dataset files:", path)
🔹 2. Load into Pandas

import pandas as pd
df = pd.read_json(f"{path}/titanic.json")
print(df.head())
💡 Notes
Preview truncation: Kaggle may show only part of the JSON in the preview panel because of its size. ✅ Don’t worry — the full dataset is available when loaded via code.
Benefits of JSON format: Ideal for web apps, APIs, or projects that work with structured data. Easily convertible back to CSV if needed.
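Since the notes mention converting back to CSV, here is a minimal sketch of that round trip; it re-downloads the dataset with kagglehub so it is self-contained, and the output file name is arbitrary:

import kagglehub
import pandas as pd

# Download (or reuse the cached copy of) the dataset, then write the JSON records back out as CSV.
path = kagglehub.dataset_download("engrbasit62/titanic-json-format")
df = pd.read_json(f"{path}/titanic.json")
df.to_csv("titanic.csv", index=False)
print(f"Wrote {len(df)} rows and {len(df.columns)} columns to titanic.csv")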

The bulk download facility provides the entire contents of each major API data set in a single ZIP file. A small JSON formatted manifest file lists the bulk files and the update date of each file. The manifest is generally updated daily and can be downloaded from http://api.eia.gov/bulk/manifest.txt. The manifest contains information about the bulk files, including all required common core attributes.
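A minimal sketch of fetching and inspecting the manifest with Python's standard library; it makes no assumptions about the manifest's internal field names and simply prints whatever top-level keys the JSON contains:

import json
import urllib.request

MANIFEST_URL = "http://api.eia.gov/bulk/manifest.txt"

# Fetch the daily-updated manifest and parse it as JSON.
with urllib.request.urlopen(MANIFEST_URL) as resp:
    manifest = json.loads(resp.read().decode("utf-8"))

# Show the top-level structure without assuming specific field names.
print("Top-level keys:", list(manifest))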

JSON Files
This dataset comprises a collection of JSON files designed for use in various Python projects. Each JSON file contains structured data, making it ideal for tasks such as data analysis, machine learning, and application development. The data within these files can be easily manipulated using Python's extensive libraries, such as json, pandas, and numpy.
Whether you are training a machine learning model, developing an API, or working on data transformation tasks, this dataset provides the flexibility and structure needed to work effectively with JSON data in Python.
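A minimal sketch of loading one of the files with the json and pandas libraries mentioned above; the file name example.json is hypothetical:

import json
import pandas as pd

# Hypothetical file name; substitute any JSON file from the dataset.
with open("example.json", encoding="utf-8") as f:
    data = json.load(f)

# Flat lists of records map directly onto a DataFrame; nested objects are
# flattened by pd.json_normalize.
df = pd.json_normalize(data)
print(df.head())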

TRAINING DATASET: Hands-On Formatting Data Part 1 (Download This File)

Download the complete MAC Address JSON database to integrate network data into your projects. Regularly updated and easy to use.

TRAINING DATASET: Hands-On Uploading Data (Download This File)

Automatically describing images using natural sentences is an essential task for the inclusion of visually impaired people on the Internet. Although there are many datasets in the literature, most of them contain only English captions, whereas datasets with captions in other languages are scarce.
PraCegoVer is a movement that arose on the Internet, encouraging social media users to publish images, tag them #PraCegoVer, and add a short description of their content. Inspired by this movement, we propose #PraCegoVer, a multi-modal dataset with Portuguese captions based on posts from Instagram. It is the first large dataset for image captioning in Portuguese with freely annotated images.
Dataset Structure
The dataset comprises the file dataset.json and a set of compressed archives (images.tar.gz.part*) containing the images. The file dataset.json contains a list of JSON objects with the attributes:
user: anonymized user that made the post;
filename: image file name;
raw_caption: raw caption;
caption: clean caption;
date: post date.
Each instance in dataset.json is associated with exactly one image in the images directory, whose file name is given by the attribute filename. We also provide a sample with five instances, so users can get an overview of the dataset before downloading it completely.
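A minimal sketch of pairing captions with their images, assuming dataset.json and the extracted images directory sit in the working directory:

import json
import os

# Load the list of annotation objects described above.
with open("dataset.json", encoding="utf-8") as f:
    instances = json.load(f)

# Each instance points to exactly one image via its filename attribute.
for instance in instances[:5]:
    image_path = os.path.join("images", instance["filename"])
    print(image_path, "->", instance["caption"])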
Download Instructions
If you just want an overview of the dataset structure, you can download sample.tar.gz. But if you want to use the dataset, or any of its subsets (63k and 173k), you must download all the files and run the following commands to join and uncompress them:
cat images.tar.gz.part* > images.tar.gz
tar -xzvf images.tar.gz
Alternatively, you can download the entire dataset from the terminal using the Python script download_dataset.py available in the PraCegoVer repository. In this case, first download the script and create an access token here. Then run the following command to download and uncompress the image files:
python download_dataset.py --access_token=

General information: The data sets contain information on how often materials of studies available through GESIS: Data Archive for the Social Sciences were downloaded and/or ordered through one of the archive's platforms/services between 2004 and 2017.
Sources and platforms: Study materials are accessible through various GESIS platforms and services: Data Catalogue (DBK), histat, datorium, data service (and others).
Years available:
- Data Catalogue: 2012-2017
- data service: 2006-2017
- datorium: 2014-2017
- histat: 2004-2017
Data sets: Data set ZA6899_Datasets_only_all_sources contains information on how often data files such as those with dta- (Stata) or sav- (SPSS) extensions have been downloaded. Identification of data files is handled semi-automatically (depending on the platform/service). Multiple downloads of one file by the same user (identified through IP address or username for registered users) on the same day are counted as a single download.
Data set ZA6899_Doc_and_Data_all_sources contains information on how often study materials have been downloaded. Multiple downloads of any file of the same study by the same user (identified through IP address or username for registered users) on the same day are counted as a single download.
Both data sets are available in three formats: csv (quoted, semicolon-separated), dta (Stata v13, labeled) and sav (SPSS, labeled). All formats contain identical information.
Variables: Variables/columns in both data sets are identical.
za_nr 'Archive study number'
version 'GESIS Archive Version'
doi 'Digital Object Identifier'
StudyNo 'Study number of respective study'
Title 'English study title'
Title_DE 'German study title'
Access 'Access category (0, A, B, C, D, E)'
PubYear 'Publication year of last version of the study'
inZACAT 'Study is currently also available via ZACAT'
inHISTAT 'Study is currently also available via HISTAT'
inDownloads 'There are currently data files available for download for this study in DBK or datorium'
Total 'All downloads combined'
downloads_2004 'downloads/orders from all sources combined in 2004' [up to ...] downloads_2017 'downloads/orders from all sources combined in 2017'
d_2004_dbk 'downloads from source dbk in 2004' [up to ...] d_2017_dbk 'downloads from source dbk in 2017'
d_2004_histat 'downloads from source histat in 2004' [up to ...] d_2017_histat 'downloads from source histat in 2017'
d_2004_dataservice 'downloads/orders from source dataservice in 2004' [up to ...] d_2017_dataservice 'downloads/orders from source dataservice in 2017'
More information is available within the codebook.
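A minimal sketch of loading the CSV version with pandas (semicolon-separated and quoted, as described above); the exact file name is assumed from the data set name:

import pandas as pd

# File name assumed from the data-set name given above.
df = pd.read_csv("ZA6899_Datasets_only_all_sources.csv", sep=";", quotechar='"')

# Sum downloads per year across all sources (columns downloads_2004 ... downloads_2017).
year_cols = [c for c in df.columns if c.startswith("downloads_")]
print(df[year_cols].sum())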

The Facility Registry System (FRS) identifies facilities, sites, or places subject to environmental regulation or of environmental interest to EPA programs or delegated states. Using vigorous verification and data management procedures, FRS integrates facility data from program national systems, state master facility records, tribal partners, and other federal agencies and provides the Agency with a centrally managed, single source of comprehensive and authoritative information on facilities.

Terms: https://webtechsurvey.com/terms
A complete list of live websites using the File Download technology, compiled through global website indexing conducted by WebTechSurvey.

A JSON file that can be imported into some XBRL-based financial report creation tools, which then convert the information into the XBRL global standard format. These tools support this format: Auditchain Suite (see Auditchain Suite); General Luca (see General Luca).
For more information about the PROOF, see this XBRL-based report model.

License: http://opendatacommons.org/licenses/dbcl/1.0/
The dataset contains more than 50,000 records of sales and order data related to an online store.

License: Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary
Geojson files used to visualize geospatial layers relevant to identifying and assessing trucking fleet decarbonization opportunities with the MIT Climate & Sustainability Consortium's Geospatial Trucking Industry Decarbonization Explorer (Geo-TIDE) tool.
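A minimal sketch of inspecting one of these layers locally with geopandas; trucking_energy_demand.geojson is one of the files listed in the table below, and the path assumes it has already been downloaded:

import geopandas as gpd

# Load one geospatial layer and show its coordinate reference system and first rows.
gdf = gpd.read_file("trucking_energy_demand.geojson")
print(gdf.crs)
print(gdf.head())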
Relevant Links
Link to the online version of the tool (requires creation of a free user account).
Link to GitHub repo with source code to produce this dataset and deploy the Geo-TIDE tool locally.
Funding
This dataset was produced with support from the MIT Climate & Sustainability Consortium.
Original Data Sources
These geojson files draw from and synthesize a number of different datasets and tools. The original data sources and tools are described below:
The table below lists, for each file or group of files: the filename(s), a description of the original data source(s), link(s) to download the original data, and the license and attribution for the original data source(s).
faf5_freight_flows/*.geojson
trucking_energy_demand.geojson
highway_assignment_links_*.geojson
infrastructure_pooling_thought_experiment/*.geojson
Regional and highway-level freight flow data obtained from the Freight Analysis Framework Version 5. Shapefiles for FAF5 region boundaries and highway links are obtained from the National Transportation Atlas Database. Emissions attributes are evaluated by incorporating data from the 2002 Vehicle Inventory and Use Survey and the GREET lifecycle emissions tool maintained by Argonne National Lab.
Shapefile for FAF5 Regions
Shapefile for FAF5 Highway Network Links
FAF5 2022 Origin-Destination Freight Flow database
FAF5 2022 Highway Assignment Results
Attribution for Shapefiles: United States Department of Transportation Bureau of Transportation Statistics National Transportation Atlas Database (NTAD). Available at: https://geodata.bts.gov/search?collection=Dataset.
License for Shapefiles: This NTAD dataset is a work of the United States government as defined in 17 U.S.C. § 101 and as such are not protected by any U.S. copyrights. This work is available for unrestricted public use.
Attribution for Origin-Destination Freight Flow database: National Transportation Research Center in the Oak Ridge National Laboratory with funding from the Bureau of Transportation Statistics and the Federal Highway Administration. Freight Analysis Framework Version 5: Origin-Destination Data. Available from: https://faf.ornl.gov/faf5/Default.aspx. Obtained on Aug 5, 2024. In the public domain.
Attribution for the 2002 Vehicle Inventory and Use Survey Data: United States Department of Transportation Bureau of Transportation Statistics. Vehicle Inventory and Use Survey (VIUS) 2002 [supporting datasets]. 2024. https://doi.org/10.21949/1506070
Attribution for the GREET tool (original publication): Argonne National Laboratory Energy Systems Division Center for Transportation Research. GREET Life-cycle Model. 2014. Available from this link.
Attribution for the GREET tool (2022 updates): Wang, Michael, et al. Summary of Expansions and Updates in GREET® 2022. United States. https://doi.org/10.2172/1891644
grid_emission_intensity/*.geojson
Emission intensity data is obtained from the eGRID database maintained by the United States Environmental Protection Agency.
eGRID subregion boundaries are obtained as a shapefile from the eGRID Mapping Files database.
eGRID database
Shapefile with eGRID subregion boundaries
Attribution for eGRID data: United States Environmental Protection Agency: eGRID with 2022 data. Available from https://www.epa.gov/egrid/download-data. In the public domain.
Attribution for shapefile: United States Environmental Protection Agency: eGRID Mapping Files. Available from https://www.epa.gov/egrid/egrid-mapping-files. In the public domain.
US_elec.geojson
US_hy.geojson
US_lng.geojson
US_cng.geojson
US_lpg.geojson
Locations of direct current fast chargers and refueling stations for alternative fuels along U.S. highways. Obtained directly from the Station Data for Alternative Fuel Corridors in the Alternative Fuels Data Center maintained by the United States Department of Energy Office of Energy Efficiency and Renewable Energy.
US_elec.geojson
US_hy.geojson
US_lng.geojson
US_cng.geojson
US_lpg.geojson
Attribution: U.S. Department of Energy, Energy Efficiency and Renewable Energy. Alternative Fueling Station Corridors. 2024. Available from: https://afdc.energy.gov/corridors. In the public domain.
These data and software code ("Data") are provided by the National Renewable Energy Laboratory ("NREL"), which is operated by the Alliance for Sustainable Energy, LLC ("Alliance"), for the U.S. Department of Energy ("DOE"), and may be used for any purpose whatsoever.
daily_grid_emission_profiles/*.geojson
Hourly emission intensity data obtained from ElectricityMaps.
Original data can be downloaded as csv files from the ElectricityMaps United States of America database
Shapefile with region boundaries used by ElectricityMaps
License: Open Database License (ODbL). Details here: https://www.electricitymaps.com/data-portal
Attribution for csv files: Electricity Maps (2024). United States of America 2022-23 Hourly Carbon Intensity Data (Version January 17, 2024). Electricity Maps Data Portal. https://www.electricitymaps.com/data-portal.
Attribution for shapefile with region boundaries: ElectricityMaps contributors (2024). electricitymaps-contrib (Version v1.155.0) [Computer software]. https://github.com/electricitymaps/electricitymaps-contrib.
gen_cap_2022_state_merged.geojson
trucking_energy_demand.geojson
Grid electricity generation and net summer power capacity data is obtained from the state-level electricity database maintained by the United States Energy Information Administration.
U.S. state boundaries obtained from this United States Department of the Interior U.S. Geological Survey ScienceBase-Catalog.
Annual electricity generation by state
Net summer capacity by state
Shapefile with U.S. state boundaries
Attribution for electricity generation and capacity data: U.S. Energy Information Administration (Aug 2024). Available from: https://www.eia.gov/electricity/data/state/. In the public domain.
electricity_rates_by_state_merged.geojson
Commercial electricity prices are obtained from the Electricity database maintained by the United States Energy Information Administration.
Electricity rate by state
Attribution: U.S. Energy Information Administration (Aug 2024). Available from: https://www.eia.gov/electricity/data.php. In the public domain.
demand_charges_merged.geojson
demand_charges_by_state.geojson
Maximum historical demand charges for each state and zip code are derived from a dataset compiled by the National Renewable Energy Laboratory in this Data Catalog.
Historical demand charge dataset
The original dataset is compiled by the National Renewable Energy Laboratory (NREL), the U.S. Department of Energy (DOE), and the Alliance for Sustainable Energy, LLC ('Alliance').
Attribution: McLaren, Joyce, Pieter Gagnon, Daniel Zimny-Schmitt, Michael DeMinco, and Eric Wilson. 2017. 'Maximum demand charge rates for commercial and industrial electricity tariffs in the United States.' NREL Data Catalog. Golden, CO: National Renewable Energy Laboratory. Last updated: July 24, 2024. DOI: 10.7799/1392982.
eastcoast.geojson
midwest.geojson
la_i710.geojson
h2la.geojson
bayarea.geojson
saltlake.geojson
northeast.geojson
Highway corridors and regions targeted for heavy duty vehicle infrastructure projects are derived from a public announcement on February 15, 2023 by the United States Department of Energy.
The shapefile with Bay area boundaries is obtained from this Berkeley Library dataset.
The shapefile with Utah county boundaries is obtained from this dataset from the Utah Geospatial Resource Center.
Shapefile for Bay Area county boundaries
Shapefile for counties in Utah
Attribution for public announcement: United States Department of Energy. Biden-Harris Administration Announces Funding for Zero-Emission Medium- and Heavy-Duty Vehicle Corridors, Expansion of EV Charging in Underserved Communities (2023). Available from https://www.energy.gov/articles/biden-harris-administration-announces-funding-zero-emission-medium-and-heavy-duty-vehicle.
Attribution for Bay area boundaries: San Francisco (Calif.). Department Of Telecommunications and Information Services. Bay Area Counties. 2006. In the public domain.
Attribution for Utah boundaries: Utah Geospatial Resource Center & Lieutenant Governor's Office. Utah County Boundaries (2023). Available from https://gis.utah.gov/products/sgid/boundaries/county/.
License for Utah boundaries: Creative Commons 4.0 International License.
incentives_and_regulations/*.geojson
State-level incentives and regulations targeting heavy duty vehicles are collected from the State Laws and Incentives database maintained by the United States Department of Energy's Alternative Fuels Data Center.
Data was collected manually from the State Laws and Incentives database.
Attribution: U.S. Department of Energy, Energy Efficiency and Renewable Energy, Alternative Fuels Data Center. State Laws and Incentives. Accessed on Aug 5, 2024 from: https://afdc.energy.gov/laws/state. In the public domain.
These data and software code ("Data") are provided by the National Renewable Energy Laboratory ("NREL"), which is operated by the Alliance for Sustainable Energy, LLC ("Alliance"), for the U.S. Department of Energy ("DOE"), and may be used for any purpose whatsoever.
costs_and_emissions/*.geojson
diesel_price_by_state.geojson
trucking_energy_demand.geojson
Lifecycle costs and emissions of electric and diesel trucking are evaluated by adapting the model developed by Moreno Sader et al., and calibrated to the Run on Less dataset for the Tesla Semi collected from the 2023 PepsiCo Semi pilot by the North American Council for Freight Efficiency.

License: https://www.nist.gov/open/license
ThermoML is an XML-based IUPAC standard for the storage and exchange of experimental thermophysical and thermochemical property data. The ThermoML archive is a subset of Thermodynamics Research Center (TRC) data holdings corresponding to cooperation between NIST TRC and five journals: Journal of Chemical Engineering and Data (ISSN: 1520-5134), The Journal of Chemical Thermodynamics (ISSN: 1096-3626), Fluid Phase Equilibria (ISSN: 0378-3812), Thermochimica Acta (ISSN: 0040-6031), and International Journal of Thermophysics (ISSN: 1572-9567). Data from initial cooperation (around 2003) through the 2019 calendar year are included.

The original scope of the archive has been expanded to include JSON files. The JSON files are structured according to the ThermoML.xsd (available below) and rendered from the same experimental thermophysical and thermochemical property data reported in the corresponding articles as the ThermoML files. In fact, the ThermoML files are generated from the JSON files to keep the information in sync. The JSON files may contain additional information not supported by the ThermoML schema. For example, each JSON file contains the md5 checksum on the ThermoML file (THERMOML_MD5_CHECKSUM) that may be used to validate the ThermoML download.

This data.nist.gov resource provides a .tgz file download containing the JSON and ThermoML files for each version of the archive. Data from initial cooperation (around 2003) through the 2019 calendar year are provided below (ThermoML.v2020-09.30.tgz). The date of the extraction from TRC databases, as specified in the dateCit field of the xml files, are 2020-09-29 and 2020-09-30. The .tgz file contains a directory tree that maps to the DOI prefix/suffix of the entries; e.g. unzipping the .tgz file creates a directory for each of the prefixes (10.1007, 10.1016, and 10.1021) that contains all the .json and .xml files.

The data and other information throughout this digital resource (including the website, API, JSON, and ThermoML files) have been carefully extracted from the original articles by NIST/TRC personnel. Neither the Journal publisher, nor its editors, nor NIST/TRC warrant or represent, expressly or implied, the correctness or accuracy of the content of information contained throughout this digital resource, nor its fitness for any use or for any purpose, nor can they, or will they, accept any liability or responsibility whatever for the consequences of its use or misuse by anyone. In any individual case of application, the respective user must check the correctness by consulting other relevant sources of information.
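A minimal sketch of the checksum validation described above, assuming a JSON/XML file pair extracted from the .tgz archive; the file names are hypothetical, and the snippet assumes THERMOML_MD5_CHECKSUM is a top-level key in the JSON file:

import hashlib
import json

# Hypothetical entry under the DOI-prefix directory tree described above.
with open("10.1021/example.json", encoding="utf-8") as f:
    record = json.load(f)

# Compute the md5 digest of the ThermoML XML file and compare it with the
# checksum stored in the companion JSON file.
with open("10.1021/example.xml", "rb") as f:
    digest = hashlib.md5(f.read()).hexdigest()

print("Checksum matches:", digest == record["THERMOML_MD5_CHECKSUM"])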

This dataset contains the metadata of the datasets published in 77 Dataverse installations, information about each installation's metadata blocks, and the list of standard licenses that dataset depositors can apply to the datasets they publish in the 36 installations running more recent versions of the Dataverse software. The data is useful for reporting on the quality of dataset and file-level metadata within and across Dataverse installations. Curators and other researchers can use this dataset to explore how well Dataverse software and the repositories using the software help depositors describe data.

How the metadata was downloaded

The dataset metadata and metadata block JSON files were downloaded from each installation on October 2 and October 3, 2022 using a Python script kept in a GitHub repo at https://github.com/jggautier/dataverse-scripts/blob/main/other_scripts/get_dataset_metadata_of_all_installations.py. In order to get the metadata from installations that require an installation account API token to use certain Dataverse software APIs, I created a CSV file with two columns: one column named "hostname" listing each installation URL in which I was able to create an account and another named "apikey" listing my accounts' API tokens. The Python script expects and uses the API tokens in this CSV file to get metadata and other information from installations that require API tokens.

How the files are organized

├── csv_files_with_metadata_from_most_known_dataverse_installations
│   ├── author(citation).csv
│   ├── basic.csv
│   ├── contributor(citation).csv
│   ├── ...
│   └── topic_classification(citation).csv
├── dataverse_json_metadata_from_each_known_dataverse_installation
│   ├── Abacus_2022.10.02_17.11.19.zip
│   ├── dataset_pids_Abacus_2022.10.02_17.11.19.csv
│   ├── Dataverse_JSON_metadata_2022.10.02_17.11.19
│   ├── hdl_11272.1_AB2_0AQZNT_v1.0.json
│   ├── ...
│   ├── metadatablocks_v5.6
│   ├── astrophysics_v5.6.json
│   ├── biomedical_v5.6.json
│   ├── citation_v5.6.json
│   ├── ...
│   ├── socialscience_v5.6.json
│   ├── ACSS_Dataverse_2022.10.02_17.26.19.zip
│   ├── ADA_Dataverse_2022.10.02_17.26.57.zip
│   ├── Arca_Dados_2022.10.02_17.44.35.zip
│   ├── ...
│   └── World_Agroforestry_-_Research_Data_Repository_2022.10.02_22.59.36.zip
├── dataset_pids_from_most_known_dataverse_installations.csv
├── licenses_used_by_dataverse_installations.csv
└── metadatablocks_from_most_known_dataverse_installations.csv

This dataset contains two directories and three CSV files not in a directory. One directory, "csv_files_with_metadata_from_most_known_dataverse_installations", contains 18 CSV files that contain the values from common metadata fields of all 77 Dataverse installations. For example, author(citation)_2022.10.02-2022.10.03.csv contains the "Author" metadata for all published, non-deaccessioned, versions of all datasets in the 77 installations, where there's a row for each author name, affiliation, identifier type and identifier. The other directory, "dataverse_json_metadata_from_each_known_dataverse_installation", contains 77 zipped files, one for each of the 77 Dataverse installations whose dataset metadata I was able to download using Dataverse APIs. Each zip file contains a CSV file and two sub-directories: the CSV file contains the persistent IDs and URLs of each published dataset in the Dataverse installation as well as a column to indicate whether or not the Python script was able to download the Dataverse JSON metadata for each dataset.
For Dataverse installations using Dataverse software versions whose Search APIs include each dataset's owning Dataverse collection name and alias, the CSV files also include which Dataverse collection (within the installation) that dataset was published in. One sub-directory contains a JSON file for each of the installation's published, non-deaccessioned dataset versions. The JSON files contain the metadata in the "Dataverse JSON" metadata schema. The other sub-directory contains information about the metadata models (the "metadata blocks" in JSON files) that the installation was using when the dataset metadata was downloaded. I saved them so that they can be used when extracting metadata from the Dataverse JSON files.

The dataset_pids_from_most_known_dataverse_installations.csv file contains the dataset PIDs of all published datasets in the 77 Dataverse installations, with a column to indicate if the Python script was able to download the dataset's metadata. It's a union of all of the "dataset_pids_..." files in each of the 77 zip files.

The licenses_used_by_dataverse_installations.csv file contains information about the licenses that a number of the installations let depositors choose when creating datasets. When I collected ... Visit https://dataone.org/datasets/sha256%3Ad27d528dae8cf01e3ea915f450426c38fd6320e8c11d3e901c43580f997a3146 for complete metadata about this dataset.
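A minimal sketch of inspecting the union CSV of dataset PIDs with pandas; the column names are not documented above, so the snippet only lists them and counts rows:

import pandas as pd

# Load the union of all per-installation "dataset_pids_..." files.
pids = pd.read_csv("dataset_pids_from_most_known_dataverse_installations.csv")

print("Columns:", pids.columns.tolist())
print("Datasets listed:", len(pids))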

License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Gzipped JSON file of the output of the benchmarking pipeline. For each sample, it contains the resistance calls made by each tool. It is the input file needed to generate all the results in the publication.
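A minimal sketch of reading the gzipped JSON with Python's standard library; the file name is hypothetical and the per-sample structure is an assumption based on the description above:

import gzip
import json

# Open the gzipped JSON in text mode and parse it.
with gzip.open("benchmark_output.json.gz", "rt", encoding="utf-8") as f:
    results = json.load(f)

# Assumes the top level is keyed (or listed) by sample.
print("Number of samples:", len(results))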

Terms: https://data.go.kr/ugs/selectPortalPolicyView.do
It provides the number of downloads and API utilization requests by year (2011-2023) for file data registered in the public data portal, and is useful for analyzing the growth in public data utilization. The file is provided in CSV format, and the metadata items are statistical year, registering agency, list name, data name, file downloads, and API utilization requests. You can download file data from the public data portal without logging in; to use the open API, you must register as a public data portal member, log in, and apply for utilization.
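A minimal sketch of summarising the statistics with pandas; the file name and the English column names are assumptions, since the actual CSV headers are not shown here:

import pandas as pd

# Hypothetical file name and column names; adjust to the actual CSV headers.
df = pd.read_csv("portal_download_stats.csv")

# Total file downloads and API utilization requests per statistical year.
totals = df.groupby("statistical year")[["file downloads", "API utilization requests"]].sum()
print(totals)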