The data presented in this data release represent observations of post-fire debris flows collected from publicly available datasets. Data originate from 13 countries: the United States, Australia, China, Italy, Greece, Portugal, Spain, the United Kingdom, Austria, Switzerland, Canada, South Korea, and Japan. The data are located in the file “PFDF_database_sortedbyReference.txt”, and a description of each column header can be found in both the file “column_headers.txt” and the metadata file (“Post-fire Debris-Flow Database (Literature Derived).xml”). The observations are derived from areas burned by wildfire and are global in extent. However, because this dataset is synthesized from information collected by many different researchers for different purposes, not all fields are available for every observation. Missing information is indicated by the value “-9999” in the “PFDF_database_sortedbyReference.txt” file. Note that the text file contains special characters and a mix of date-time formats that reflect the original data provided by the authors. The text may not display correctly when opened in proprietary software such as Microsoft Excel but will appear correctly when opened in a text editor.
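Because of the “-9999” sentinel, special characters, and mixed date-time formats, a little care is needed when loading the file programmatically. The sketch below shows one way to do this with pandas; the sample data, column names, and tab delimiter are assumptions for illustration, not taken from the actual release file.

```python
import io
import pandas as pd

# Hypothetical sample mimicking the release file; the real
# "PFDF_database_sortedbyReference.txt" may use a different delimiter
# and different column names.
sample = io.StringIO(
    "Country\tRainfall_mm\tDate\n"
    "United States\t25.4\t2018-01-09\n"
    "Japan\t-9999\t10/28/2019\n"
)

# Treat the sentinel "-9999" as missing, and read as plain text so that
# special characters and mixed date formats survive intact (dates can be
# parsed per-row later once their formats are known).
df = pd.read_csv(sample, sep="\t", na_values=["-9999"])
print(df["Rainfall_mm"].isna().sum())  # → 1
```

Opening the file this way avoids the display problems mentioned above for spreadsheet software, since no implicit type or date conversion is applied.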
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains data from the top-ranked 250 Korean Dramas as per the MyDramaList website. The data has been collected and uploaded in the form of a CSV file and can be used to work on various Data Science Projects.
The CSV file has 17 columns and 251 rows (including the header row), containing mostly textual data.
Most of the data were collected from the MyDramaList website (https://mydramalist.com), and the names of the production companies were collected from Wikipedia (https://www.wikipedia.org). I wasn't sure how to scrape data at the time, so I did it all manually, copying and pasting with the cursor. (Yes, it was very tedious to copy and paste the data by hand!)
I was working on a content-based recommender system for Korean dramas and needed data to work with. The datasets available on Kaggle contained at most 100 k-drama titles, and quite a few features deemed essential were missing: the synopsis, tags, director's name, cast names, production companies' names, and similar data weren't available in the pre-existing datasets.
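A content-based recommender of the kind described can be sketched in a few lines. The snippet below is a toy illustration only: it assumes (hypothetically) that each drama's synopsis and tags have been concatenated into one text field, and it uses simple Jaccard token overlap rather than any particular method from the project.

```python
# Toy content-based recommender over made-up drama descriptions.
def jaccard(a, b):
    """Jaccard similarity between two lower-cased token sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

# Illustrative stand-ins for the Synopsis + Tags text of each title.
dramas = {
    "Drama A": "revenge thriller lawyer",
    "Drama B": "romance office comedy",
    "Drama C": "legal thriller revenge",
}

def recommend(title):
    """Return the other title whose description is most similar."""
    query = dramas[title]
    ranked = sorted((t for t in dramas if t != title),
                    key=lambda t: jaccard(query, dramas[t]), reverse=True)
    return ranked[0]

print(recommend("Drama A"))  # → Drama C
```

In practice one would typically swap the Jaccard measure for TF-IDF vectors with cosine similarity, but the overall shape of the pipeline is the same.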
Original Data Source: Top 250 Korean Dramas (KDrama) Dataset
Open Database License (ODbL) v1.0https://www.opendatacommons.org/licenses/odbl/1.0/
This dataset provides detailed information on road surfaces from OpenStreetMap (OSM) data, distinguishing between paved and unpaved surfaces across the region. The surface information is based on road surface predictions derived from a hybrid deep-learning approach. For more information on the methods, refer to the paper.
Roughly 0.1183 million km of roads are mapped in OSM in this region. Based on AI-mapped estimates, the paved and unpaved shares are approximately 0.0015 and 0.0135 million km, corresponding to 1.2687% and 11.4095%, respectively, of the total road length in the dataset region. Road surface information is missing in OSM for 0.1033 million km, or 87.3218%, of the total. To fill this gap, the Mapillary-derived road surface dataset provides an additional 0.0 million km of information (corresponding to 0.0422% of the total missing information on road surface).
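The quoted percentages can be reproduced directly from the rounded lengths. The check below uses the figures as stated above; small discrepancies in the last decimal places are expected because the lengths themselves are rounded to four decimal places.

```python
# Sanity check of the road-surface shares quoted above (lengths in million km).
total = 0.1183
paved, unpaved, missing = 0.0015, 0.0135, 0.1033

paved_pct = 100 * paved / total      # ≈ 1.27 %
unpaved_pct = 100 * unpaved / total  # ≈ 11.41 %
missing_pct = 100 * missing / total  # ≈ 87.32 %

print(round(paved_pct, 2), round(unpaved_pct, 2), round(missing_pct, 2))
```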
It is intended for use in transportation planning, infrastructure analysis, climate emissions studies, and geographic information system (GIS) applications.
This dataset provides comprehensive information on road and urban area features, including location, surface quality, and classification metadata. This dataset includes attributes from OpenStreetMap (OSM) data, AI predictions for road surface, and urban classifications.
AI features:
pred_class: Model-predicted class for the road surface, with values "paved" or "unpaved."
pred_label: Binary label associated with pred_class (0 = paved, 1 = unpaved).
osm_surface_class: Classification of the surface type from OSM, categorized as "paved" or "unpaved."
combined_surface_osm_priority: Surface classification combining pred_label and surface (OSM) while prioritizing the OSM surface tag, classified as "paved" or "unpaved."
combined_surface_DL_priority: Surface classification combining pred_label and surface (OSM) while prioritizing the DL prediction pred_label, classified as "paved" or "unpaved."
n_of_predictions_used: Number of predictions used for the feature length estimation.
predicted_length: Predicted length based on the DL model’s estimations, in meters.
DL_mean_timestamp: Mean timestamp of the predictions used, for comparison.
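The two combined-surface attributes above differ only in which source wins when both a DL prediction and an OSM surface tag are present. The helper below is a hypothetical reconstruction of that fallback logic, inferred from the attribute descriptions; the dataset's actual implementation may differ.

```python
# Hypothetical reconstruction of the combined-surface logic described above.
def combine_surface(pred_label, osm_surface, prioritize_osm=True):
    """Return "paved"/"unpaved" from a DL prediction and an OSM tag.

    pred_label: 0 = paved, 1 = unpaved, or None if no prediction exists
    osm_surface: "paved", "unpaved", or None if the OSM tag is missing
    prioritize_osm: True -> combined_surface_osm_priority,
                    False -> combined_surface_DL_priority
    """
    dl = None if pred_label is None else ("unpaved" if pred_label else "paved")
    if prioritize_osm:
        return osm_surface if osm_surface is not None else dl
    return dl if dl is not None else osm_surface

print(combine_surface(1, "paved", prioritize_osm=True))    # → paved
print(combine_surface(1, "paved", prioritize_osm=False))   # → unpaved
print(combine_surface(None, "unpaved", prioritize_osm=False))  # → unpaved
```

Either way, when only one source is available, its classification is used unchanged; the priority only matters where the two disagree.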
OSM features may have these attributes (tag definitions are documented on the OSM wiki):
name: Name of the feature, if available in OSM.
name:en: Name of the feature in English, if available in OSM.
name:* (in local language): Name of the feature in the local official language, where available.
highway: Road classification based on OSM tags (e.g., residential, motorway, footway).
surface: Description of the surface material of the road (e.g., asphalt, gravel, dirt).
smoothness: Assessment of surface smoothness (e.g., excellent, good, intermediate, bad).
width: Width of the road, where available.
lanes: Number of lanes on the road.
oneway: Indicates if the road is one-way (yes or no).
bridge: Specifies if the feature is a bridge (yes or no).
layer: Indicates the layer of the feature in cases where multiple features are stacked (e.g., bridges, tunnels).
source: Source of the data, indicating the origin or authority of specific attributes.
Urban classification features may have these attributes:
continent: The continent where the data point is located (e.g., Europe, Asia).
country_iso_a2: The ISO Alpha-2 code representing the country (e.g., "US" for the United States).
urban: Binary indicator for urban areas based on the GHSU Urban Layer 2019 (0 = rural, 1 = urban).
urban_area: Name of the urban area or city where the data point is located.
osm_id: Unique identifier assigned by OpenStreetMap (OSM) to each feature.
osm_type: Type of OSM element (e.g., node, way, relation).
The data originates from OpenStreetMap (OSM) and is augmented with model predictions using images downloaded from Mapillary in combination with the GHSU Global Human Settlement Urban Layer 2019 and AFRICAPOLIS2020 urban layer.
This dataset is one of many HeiGIT exports on HDX. See the HeiGIT website for more information.
We are looking forward to hearing about your use-case! Feel free to reach out to us and tell us about your research at communications@heigit.org – we would be happy to amplify your work.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The USD/KRW exchange rate fell to 1,362.7300 on July 4, 2025, down 0.06% from the previous session. Over the past month, the South Korean Won has weakened 0.53%, but it is up 1.08% over the last 12 months. South Korean Won - values, historical data, forecasts and news - updated in July 2025.
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
- Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete due to incompleteness of source documents), and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
- Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-interval". Cumulative case count time series consist of overlapping case count intervals starting on the same date but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicate this with an attribute for each count value, named "PartOfCumulativeCountSeries".
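Both recommended steps are straightforward with pandas. The sketch below works on a tiny made-up extract; the column names PartOfCumulativeCountSeries, PeriodStartDate, and CountValue follow the description above, but the real files carry many more attribute columns, and the weekly frequency here is an assumption for the example.

```python
import pandas as pd

# Hypothetical weekly case-count extract in the spirit of a Tycho file.
df = pd.DataFrame({
    "PeriodStartDate": pd.to_datetime(
        ["2019-01-06", "2019-01-13", "2019-01-27", "2019-01-06"]),
    "CountValue": [3, 0, 5, 8],
    "PartOfCumulativeCountSeries": [0, 0, 0, 1],
})

# Step 2: keep only fixed-interval counts, setting cumulative series aside.
fixed = df[df["PartOfCumulativeCountSeries"] == 0]

# Step 1: insert the weeks for which no count was reported. Unreported
# weeks become NaN (not zero; reported zeros are already in the data).
weekly = (fixed.set_index("PeriodStartDate")["CountValue"]
          .reindex(pd.date_range("2019-01-06", "2019-01-27", freq="7D")))

print(weekly.isna().sum())  # → 1 (the unreported week of 2019-01-20)
```

Whether the NaN weeks should then be treated as zero, interpolated, or excluded depends on the analysis, which is why the datasets leave them out rather than deciding for the user.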
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Recommended citation
Gütschow, J.; Günther, A.; Pflüger, M. (2021): The PRIMAP-hist national historical emissions time series v2.3.1 (1750-2019). zenodo. doi:10.5281/zenodo.5494497.
Gütschow, J.; Jeffery, L.; Gieseke, R.; Gebel, R.; Stevens, D.; Krapp, M.; Rocha, M. (2016): The PRIMAP-hist national historical emissions time series, Earth Syst. Sci. Data, 8, 571-603, doi:10.5194/essd-8-571-2016
Content
Abstract
The PRIMAP-hist dataset combines several published datasets to create a comprehensive set of greenhouse gas emission pathways for every country and Kyoto gas, covering the years 1750 to 2019 and all UNFCCC (United Nations Framework Convention on Climate Change) member states as well as most non-UNFCCC territories. The data resolve the main IPCC (Intergovernmental Panel on Climate Change) 2006 categories. For CO2, CH4, and N2O, subsector data are available for Energy, Industrial Processes and Product Use (IPPU), and Agriculture. Due to data availability and methodological issues, version 2.3.1 of the PRIMAP-hist dataset does not include emissions from Land Use, Land-Use Change, and Forestry (LULUCF) in the main file. LULUCF data are included in the file with an increased number of significant digits and have to be used with care.
The PRIMAP-hist v2.3.1 dataset is an updated version of
Gütschow, J.; Günther, A.; Pflüger, M. (2021): The PRIMAP-hist national historical emissions time series v2.3 (1750-2019). zenodo. doi:10.5281/zenodo.5175154
The Changelog indicates the most important changes. You can also check the issue tracker on github.com/JGuetschow/PRIMAP-hist for additional information on issues found after the release of the dataset.
Use of the dataset and full description
Before using the dataset, please read this document and the article describing the methodology, especially the section on uncertainties and the section on limitations of the method and use of the dataset.
Gütschow, J.; Jeffery, L.; Gieseke, R.; Gebel, R.; Stevens, D.; Krapp, M.; Rocha, M. (2016): The PRIMAP-hist national historical emissions time series, Earth Syst. Sci. Data, 8, 571-603, doi:10.5194/essd-8-571-2016
Please notify us (johannes.guetschow@pik-potsdam.de) if you use the dataset so that we can keep track of how it is used and take that into consideration when updating and improving the dataset.
When using this dataset or one of its updates, please cite the DOI of the precise version of the dataset used and also the data description article which this dataset is supplement to (see above). Please consider also citing the relevant original sources when using the PRIMAP-hist dataset. See the full citations in the References section further below.
Since version 2.3 we use the data formats developed for the PRIMAP2 climate policy analysis suite: PRIMAP2 on GitHub. The data are published both in the interchange format, which consists of a CSV file with the data and a YAML file with additional metadata, and in the native NetCDF-based format. For a detailed description of the data format, we refer to the PRIMAP2 documentation.
We have also, for the first time, included files with more than three significant digits. These files are mainly aimed at people doing policy analysis using the country-reported data scenario (HISTCR). Using the high-precision data, they can avoid questions about discrepancies with the reported data. The uncertainties of emissions data do not justify the additional significant digits, and they might give a false sense of accuracy, so please use this version of the dataset with extra care.
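As a rough illustration of working with the interchange CSV, the sketch below melts year columns into a tidy long format with pandas. The two-row excerpt, its column names, and the emission values are invented for the example; consult the PRIMAP2 documentation for the authoritative column layout, and note that the PRIMAP2 package itself provides proper readers for these files.

```python
import io
import pandas as pd

# Hypothetical excerpt in the spirit of the PRIMAP2 interchange format:
# one row per country/entity, one column per year. Values are illustrative,
# not real PRIMAP-hist numbers.
csv = io.StringIO(
    "area (ISO3),entity,unit,category (IPCC2006_PRIMAP),2018,2019\n"
    "DEU,CO2,Gg CO2 / yr,M.0.EL,752000,702000\n"
    "FRA,CO2,Gg CO2 / yr,M.0.EL,331000,324000\n"
)
wide = pd.read_csv(csv)

# Melt the per-year columns into a tidy (country, year, value) long format.
long = wide.melt(
    id_vars=["area (ISO3)", "entity", "unit", "category (IPCC2006_PRIMAP)"],
    var_name="year", value_name="emissions")

print(len(long))  # → 4 (2 countries x 2 years)
```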
Support
If you encounter possible errors or other things that should be noted, please check our issue tracker at github.com/JGuetschow/PRIMAP-hist and report your findings there. Please use the tag “v2.3.1” in any issue you create regarding this dataset.
If you need support in using the dataset or have any other questions regarding the dataset, please contact johannes.guetschow@pik-potsdam.de.
Sources
Files included in the dataset
For each dataset we have three files: the .nc file contains the data and metadata in the native PRIMAP2 netCDF based format. The .csv file contains the data in a csv format following the specifications of the PRIMAP2 interchange format. The metadata for the interchange format file is included in the .yaml file.
Notes
MIT License https://opensource.org/licenses/MIT
Translated into Korean with DeepL
All texts are translated with DeepL (machine translated).
Known issue: some data items are missing because of the DeepL plan and the processing method used. I used a very cheap plan; all data were merged into a single file and then split again with a small amount of code and by hand. This is a sample/test run of dataset creation with DeepL.
Original Dataset: totally-not-an-llm/EverythingLM-data-V2
EverythingLM V2 Dataset
EverythingLM V2 is a diverse instruct dataset… See the full description on the dataset page: https://huggingface.co/datasets/ziozzang/EverythingLM-data-V2-Ko.