Csv Exports And Imports Company Export Import Records. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
https://crawlfeeds.com/privacy_policy
The Dog Food Data Extracted from Chewy (USA) dataset contains 4,500 detailed records of dog food products sourced from one of the leading pet supply platforms in the United States, Chewy. This dataset is ideal for businesses, researchers, and data analysts who want to explore and analyze the dog food market, including product offerings, pricing strategies, brand diversity, and customer preferences within the USA.
The dataset includes essential information such as product names, brands, prices, ingredient details, product descriptions, weight options, and availability. Organized in a CSV format for easy integration into analytics tools, this dataset provides valuable insights for those looking to study the pet food market, develop marketing strategies, or train machine learning models.
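As a quick illustration of loading the CSV for analysis, here is a minimal pandas sketch; the file name and column names (e.g., brand, price) are assumptions for illustration and may differ from the actual export.

```python
import pandas as pd

# Hypothetical file and column names -- adjust to match the actual CSV export.
df = pd.read_csv("chewy_dog_food.csv")

# Quick overview: number of products per brand and average listed price by brand.
print(df["brand"].value_counts().head(10))
print(df.groupby("brand")["price"].mean().sort_values(ascending=False).head(10))
```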
Key Features:
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This data set is a subset of the "Records of foreign capital" ("Registros de capitais estrangeiros", RCE) published by the Central Bank of Brazil (CBB) on their website. The data set consists of three data files and three corresponding metadata files. All files are in openly accessible .csv or .txt formats. See the detailed outline below for the data contained in each. Data files contain transaction-specific data such as unique identifier, currency, cancelled status and amount. Metadata files outline the variables in the corresponding data file.

- RCE_Unclean_full_dataset.csv - all transactions published to the Central Bank website from the four main categories outlined below
- Metadata_Unclean_full_dataset.csv
- RCE_Unclean_cancelled_dataset.csv - data extracted from the RCE_Unclean_full_dataset.csv where transactions were registered then cancelled
- Metadata_Unclean_cancelled_dataset.csv
- RCE_Clean_selection_dataset.csv - transaction data extracted from RCE_Unclean_full_dataset.csv and RCE_Unclean_cancelled_dataset.csv for the nine companies and criteria identified below
- Metadata_Clean_selection_dataset.csv

The data include the period between October 2000 and July 2011. This is the only time span for the data provided by the Central Bank of Brazil at this stage. The records were published monthly by the Central Bank of Brazil as required by Art. 66 in Decree nº 55.762 of 17 February 1965, modified by Decree nº 4.842 of 17 September 2003. The records were published on the bank's website starting October 2000, as per communique nº 011489 of 7 October 2003. This remained the case until August 2011, after which the amount of each transaction was no longer disclosed (and publication of these stopped altogether after October 2011). The disclosure of the records was suspended in order to review their legal and technical aspects, and ensure their suitability to the requirements of the rules governing the confidentiality of the information (Law nº 12.527 of 18 November 2011 and Decree nº 7724 of May 2012) (pers. comm. Central Bank of Brazil, 2016. Name of contact available upon request to Authors).

The records track transfers of foreign capital made from abroad to companies domiciled in Brazil, with information on the foreign company (name and country) transferring the money, and on the company receiving the capital (name and federative unit). For the purpose of this study, we consider the four categories of foreign capital transactions which are published with their amount and currency in the Central Bank's data, and which are all part of the "Register of financial transactions" (abbreviated RDE-ROF): loans, leasing, financed import and cash in advance (see below for a detailed description). Additional categories exist, such as foreign direct investment (RDE-IED) and External Investment in Portfolio (RDE-Portfólio), for which no amount is published and which are therefore not included.

We used the data posted online as PDFs on the bank's website, and created a script to extract the data automatically from these four categories into the RCE_Unclean_full_dataset.csv file. This data set has not been double-checked manually and may contain errors. We used a similar script to extract rows from the "cancelled transactions" sections of the PDFs into the RCE_Unclean_cancelled_dataset.csv file. This is useful to identify transactions that have been registered to the Central Bank but later cancelled.
This data set has not been double-checked manually and may contain errors.

From these raw data sets, we conducted the following selections and calculations in order to create the RCE_Clean_selection_dataset.csv file. This data set has been double-checked manually to secure that no errors were made in the extraction process.

We selected all transactions whose recipient company name corresponds to one of these nine companies, or to one of their known subsidiaries in Brazil, according to the list of subsidiaries recorded in the Orbis database, maintained by Bureau Van Dijk. Transactions are included if the recipient company name matches one of the following:
- the current or former name of one of the nine companies in our sample (former names are identified using Orbis, Bloomberg's company profiles or the company website);
- the name of a known subsidiary of one of the nine companies, if and only if we find evidence (in Orbis, Bloomberg's company profiles or on the company website) that this subsidiary was owned at some point during the period 2000-2011, and that it operated in a sector related to the soy or beef industry (including fertilizers and trading activities).

For each transaction, we extracted the name of the company sending capital and, when possible, attributed the transaction to the known ultimate owner. The names of the countries of origin sometimes come with typos or different denominations: we harmonized them.

A manual check of all the selected data unveiled that a few transactions (n=14) appear twice in the database while bearing the same unique identification number. According to the Central Bank of Brazil (pers. comm., November 2016), this is due to errors in their routine of data extraction. We therefore deleted duplicates in our database, keeping only the latest occurrence of each unique transaction. Six (6) transactions recorded with an amount of zero were also deleted. Two (2) transactions registered in August 2003 with incoherent currencies (Deutsche Mark and Dutch guilder, which were demonetised in early 2002) were also deleted.

To secure that the import of data from PDF to the database did not contain any systematic errors, for instance due to mistakes in coding, the data were checked in two ways. First, because the script identifies the end of a row in the PDF using the amount of the transaction, which can sometimes fail if the amount is not entered correctly, we went through the extracted raw data (2798 rows) and cleaned all rows whose end had not been correctly identified by the script. Next, we manually double-checked the 486 largest transactions representing 90% of the total amount of capital inflows, as well as 140 randomly selected additional rows representing 5% of the total rows, compared the extracted data to the original PDFs, and found no mistakes.

Transfers recorded in the database have been made in different currencies, including US dollars, euros, Japanese yen, Brazilian reais, and more. The conversion to US dollars of all amounts denominated in other currencies was done using the average monthly exchange rate as published by the International Monetary Fund (International Financial Statistics: Exchange rates, national currency per US dollar, period average).
Due to the limited time period, we have not corrected for inflation but aggregated nominal amounts in USD over the period 2000-2011.

The categories loans, cash in advance (anticipated payment for exports), financed import, and leasing/rental are those used by the Central Bank of Brazil in their published data. They are denominated respectively:
- "Loans" ("emprestimos" in original source): includes all loans, either contracted directly with creditors or indirectly through the issuance of securities, brokered by foreign agents.
- "Anticipated payment for exports" ("pagamento/renovacao pagamento antecipado de exportacao" in original source): defined as a type of loan (used in trade finance).
- "Financed import" ("importacao financiada" in original source): comprises all import financing transactions, either direct (contracted by the importer with a foreign bank or with a foreign supplier) or indirect (contracted by Brazilian banks with foreign banks on behalf of Brazilian importers). They must be declared to the Central Bank if their term of payment is superior to 360 days.
- "Leasing/rental" ("arrendamento mercantil, leasing e aluguel" in original source): concerns all types of external leasing operations consented by a Brazilian entity to a foreign one. They must be declared if the term of payment is superior to 360 days.

More information about the different categories can be found through the Central Bank online.

(Research Data Support provided by Springer Nature)
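To illustrate the currency-conversion step described above, here is a hedged pandas sketch. The column names (date, currency, amount, recipient_company) and the layout of the IMF monthly rate table are assumptions for illustration; the actual RCE files may use different headers, and the IMF rates would need to be obtained separately.

```python
import pandas as pd

# Hypothetical column names -- adjust to the actual RCE file headers.
rce = pd.read_csv("RCE_Clean_selection_dataset.csv", parse_dates=["date"])
rce["year"] = rce["date"].dt.year
rce["month"] = rce["date"].dt.month

# Monthly average exchange rates (national currency per US dollar), e.g. from the
# IMF International Financial Statistics, with columns: currency, year, month, per_usd.
rates = pd.read_csv("imf_monthly_rates.csv")

# Attach the matching monthly rate to each transaction and convert to USD.
merged = rce.merge(rates, on=["currency", "year", "month"], how="left")
merged["amount_usd"] = merged["amount"] / merged["per_usd"]

# Aggregate nominal USD amounts over 2000-2011, as done in the study.
print(merged.groupby("recipient_company")["amount_usd"].sum().sort_values(ascending=False))
```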
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A large data set of go-arounds (GAs), also referred to as missed approaches. The data set supports the paper presented at the OpenSky Symposium on November 10th.
If you use this data for a scientific publication, please consider citing our paper.
The data set contains landings from 176 (mostly) large airports in 44 different countries. The landings are labelled as performing a go-around (GA) or not. In total, the data set contains almost 9 million landings with more than 33,000 GAs. The data was collected from the OpenSky Network's historical database for the year 2019. The published data set contains multiple files:
go_arounds_minimal.csv.gz
Compressed CSV containing the minimal data set. It contains a row for each landing and a minimal amount of information about the landing, and if it was a GA. The data is structured in the following way:
| Column name | Type | Description |
|---|---|---|
| time | date time | UTC time of landing or first GA attempt |
| icao24 | string | Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned |
| callsign | string | Aircraft identifier in air-ground communications |
| airport | string | ICAO airport code where the aircraft is landing |
| runway | string | Runway designator on which the aircraft landed |
| has_ga | string | "True" if at least one GA was performed, otherwise "False" |
| n_approaches | integer | Number of approaches identified for this flight |
| n_rwy_approached | integer | Number of unique runways approached by this flight |
The last two columns, n_approaches and n_rwy_approached, are useful for filtering out training and calibration flights. These usually have a large number of approaches, so an easy way to exclude them is to filter out landings with n_approaches > 2.
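A minimal pandas sketch of that filter, using the published file name:

```python
import pandas as pd

df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)

# Drop likely training/calibration flights, which tend to have many approaches.
df = df[df["n_approaches"] <= 2]
```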
go_arounds_augmented.csv.gz
Compressed CSV containing the augmented data set. It contains a row for each landing and additional information about the landing, and if it was a GA. The data is structured in the following way:
| Column name | Type | Description |
|---|---|---|
| time | date time | UTC time of landing or first GA attempt |
| icao24 | string | Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned |
| callsign | string | Aircraft identifier in air-ground communications |
| airport | string | ICAO airport code where the aircraft is landing |
| runway | string | Runway designator on which the aircraft landed |
| has_ga | string | "True" if at least one GA was performed, otherwise "False" |
| n_approaches | integer | Number of approaches identified for this flight |
| n_rwy_approached | integer | Number of unique runways approached by this flight |
| registration | string | Aircraft registration |
| typecode | string | Aircraft ICAO typecode |
| icaoaircrafttype | string | ICAO aircraft type |
| wtc | string | ICAO wake turbulence category |
| glide_slope_angle | float | Angle of the ILS glide slope in degrees |
| has_intersection | string | Boolean that is true if the runway has another runway intersecting it, otherwise false |
| rwy_length | float | Length of the runway in kilometres |
| airport_country | string | ISO Alpha-3 country code of the airport |
| airport_region | string | Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania) |
| operator_country | string | ISO Alpha-3 country code of the operator |
| operator_region | string | Geographical region of the operator of the aircraft (either Europe, North America, South America, Asia, Africa, or Oceania) |
| wind_speed_knts | integer | METAR, surface wind speed in knots |
| wind_dir_deg | integer | METAR, surface wind direction in degrees |
| wind_gust_knts | integer | METAR, surface wind gust speed in knots |
| visibility_m | float | METAR, visibility in metres |
| temperature_deg | integer | METAR, temperature in degrees Celsius |
| press_sea_level_p | float | METAR, sea level pressure in hPa |
| press_p | float | METAR, QNH in hPa |
| weather_intensity | list | METAR, list of present weather codes: qualifier - intensity |
| weather_precipitation | list | METAR, list of present weather codes: weather phenomena - precipitation |
| weather_desc | list | METAR, list of present weather codes: qualifier - descriptor |
| weather_obscuration | list | METAR, list of present weather codes: weather phenomena - obscuration |
| weather_other | list | METAR, list of present weather codes: weather phenomena - other |
This data set is augmented with data from various public data sources. Aircraft-related data is mostly from the OpenSky Network's aircraft database, the METAR information is from Iowa State University, and the rest is mostly scraped from different websites. If you need help with the METAR information, you can consult the WMO's Aerodrome Reports and Forecasts handbook.
go_arounds_agg.csv.gz
Compressed CSV containing the aggregated data set. It contains a row for each airport-runway, i.e. every runway at every airport for which data is available. The data is structured in the following way:
| Column name | Type | Description |
|---|---|---|
| airport | string | ICAO airport code where the aircraft is landing |
| runway | string | Runway designator on which the aircraft landed |
| n_landings | integer | Total number of landings observed on this runway in 2019 |
| ga_rate | float | Go-around rate, per 1000 landings |
| glide_slope_angle | float | Angle of the ILS glide slope in degrees |
| has_intersection | string | Boolean that is true if the runway has another runway intersecting it, otherwise false |
| rwy_length | float | Length of the runway in kilometres |
| airport_country | string | ISO Alpha-3 country code of the airport |
| airport_region | string | Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania) |
This aggregated data set is used in the paper for the generalized linear regression model.
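For readers who want to explore the aggregated file themselves, below is a hedged sketch of one way such a generalized linear model could be set up with statsmodels. It is not necessarily the specification used in the paper, and the reconstruction of go-around counts from ga_rate is an approximation.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

agg = pd.read_csv("go_arounds_agg.csv.gz")

# Approximate GA counts from the published rate (per 1000 landings).
agg["n_ga"] = (agg["ga_rate"] * agg["n_landings"] / 1000).round()

# Illustrative Poisson GLM with the number of landings as exposure (log offset);
# the choice of predictors here is an assumption, not the paper's model.
model = smf.glm(
    "n_ga ~ glide_slope_angle + rwy_length + has_intersection",
    data=agg,
    family=sm.families.Poisson(),
    offset=np.log(agg["n_landings"]),
).fit()
print(model.summary())
```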
Downloading the trajectories
Users of this data set with access to the OpenSky Network's Impala shell can download the historical trajectories from the historical database with a few lines of Python code. Suppose, for example, that you want to get all the go-arounds on 4 January 2019 at London City Airport (EGLC). You can use the traffic library for easy access to the database:
```python
import datetime

import pandas as pd
from tqdm.auto import tqdm
from traffic.data import opensky
from traffic.core import Traffic

df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)
df["time"] = pd.to_datetime(df["time"])

airport = "EGLC"
start = datetime.datetime(year=2019, month=1, day=4).replace(
    tzinfo=datetime.timezone.utc
)
stop = datetime.datetime(year=2019, month=1, day=5).replace(
    tzinfo=datetime.timezone.utc
)

df_selection = df.query("airport==@airport & has_ga & (@start <= time <= @stop)")

flights = []
delta_time = pd.Timedelta(minutes=10)
for _, row in tqdm(df_selection.iterrows(), total=df_selection.shape[0]):
    # take at most 10 minutes before and 10 minutes after the landing or go-around
    start_time = row["time"] - delta_time
    stop_time = row["time"] + delta_time

    # fetch the data from OpenSky Network
    flights.append(
        opensky.history(
            start=start_time.strftime("%Y-%m-%d %H:%M:%S"),
            stop=stop_time.strftime("%Y-%m-%d %H:%M:%S"),
            callsign=row["callsign"],
            return_flight=True,
        )
    )

Traffic.from_flights(flights)
```
Additional files
Additional files are available to check the quality of the classification into GA/not GA and the selection of the landing runway. These are:
validation_table.xlsx: This Excel sheet was manually completed during the review of the samples for each runway in the data set. It provides an estimate of the false positive and false negative rate of the go-around classification. It also provides an estimate of the runway misclassification rate when the airport has two or more parallel runways. The columns with the headers highlighted in red were filled in manually, the rest is generated automatically.
validation_sample.zip: For each runway, 8 batches of 500 randomly selected trajectories (or as many as available, if fewer than 4000) classified as not having a GA and up to 8 batches of 10 random landings, classified as GA, are plotted. This allows the interested user to visually inspect a random sample of the landings and go-arounds easily.
Eximpedia Export Import trade data lets you search trade data and find active exporters, importers, buyers, suppliers, and manufacturers from over 209 countries.
https://data.gov.tw/license
Provide "Statistics of Import and Export Trade Volume of Each Park" to let the public understand the import and export and its growth trend of each park. In addition to updating this information every month, CSV file format is also provided for free download and use by the public.The dataset includes statistics on the import and export trade volume of parks such as Nanzih, Kaohsiung, Taichung, Zhonggang, Pingtung, and other parks (Lingguang, Chenggong, Gaoruan), with main fields including "Park, Import and Export (This Month, Year-to-Date)", "Export (This Month, Year-to-Date)", "Import (This Month, Year-to-Date)", and other important information.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic Materials
Background
This dataset contains data from monotonic and cyclic loading experiments on structural metallic materials. The materials are primarily structural steels; one iron-based shape memory alloy is also included. Summary files provide an overview of the database, and data from the individual experiments are also included.
The files included in the database are outlined below and the format of the files is briefly described. Additional information regarding the formatting can be found through the post-processing library (https://github.com/ahartloper/rlmtp/tree/master/protocols).
Usage
Included Files
File Format: Downsampled Data
These are the "LP_
These data files can be easily loaded using the pandas library in Python through:
```python
import pandas

data = pandas.read_csv(data_file, index_col=0)
```
The data is formatted so it can be used directly in RESSPyLab (https://github.com/AlbanoCastroSousa/RESSPyLab). Note that the column names "e_true" and "Sigma_true" were kept for backwards compatibility reasons with RESSPyLab.
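For example, a quick hedged sketch of plotting a stress-strain curve from one downsampled file; the column names e_true and Sigma_true are as stated above, while the file path is a placeholder.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder path to one of the downsampled "LP_..." CSV files.
data_file = "LP_example.csv"
data = pd.read_csv(data_file, index_col=0)

# True stress vs. true strain for this coupon test.
plt.plot(data["e_true"], data["Sigma_true"])
plt.xlabel("True strain, e_true")
plt.ylabel("True stress, Sigma_true")
plt.show()
```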
File Format: Unreduced Data
These are the "LP_
The data can be loaded and used similarly to the downsampled data.
File Format: Overall_Summary
The overall summary file provides data on all the test specimens in the database. The columns include:
File Format: Summarized_Mechanical_Props_Campaign
Meant to be loaded in Python as a pandas DataFrame with multi-indexing, e.g.,
```python
tab1 = pd.read_csv('Summarized_Mechanical_Props_Campaign_' + date + version + '.csv',
                   index_col=[0, 1, 2, 3], skipinitialspace=True, header=[0, 1],
                   keep_default_na=False, na_values='')
```
Caveats
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset compares FIXED-line broadband internet speeds across five cities:
- Melbourne, AU
- Bangkok, TH
- Shanghai, CN
- Los Angeles, US
- Alice Springs, AU
ERRATA: 1. Data is for Q3 2020, but some files are incorrectly labelled as 02-20 or June 20. They should all read Sept 20, or 09-20, i.e. Q3 20 rather than Q2. These will be renamed and reloaded. Amended in v7.
Lines of data for each geojson file (a line equates to a 600m^2 location, including total tests, devices used, and average upload and download speed):
- MEL: 16,181 locations/lines => 0.85M speedtests (16.7 tests per 100 people)
- SHG: 31,745 lines => 0.65M speedtests (2.5/100pp)
- BKK: 29,296 lines => 1.5M speedtests (14.3/100pp)
- LAX: 15,899 lines => 1.3M speedtests (10.4/100pp)
- ALC: 76 lines => 500 speedtests (2/100pp)
Geojsons of these 2° by 2° extracts for MEL, BKK and SHG are now added; LAX was added in v6 and Alice Springs in v15.
This dataset unpacks, geospatially, data summaries provided in Speedtest Global Index (linked below). See Jupyter Notebook (*.ipynb) to interrogate geo data. See link to install Jupyter.
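To interrogate one of the geojson tile files outside the notebook, a minimal geopandas sketch is shown below. The file name and the speed/test column names are assumptions based on the Speedtest open-data layout and may differ in these extracts.

```python
import geopandas as gpd

# Placeholder file name for one of the city extracts.
tiles = gpd.read_file("MEL_tiles.geojson")

# Assumed Speedtest open-data style columns: avg_d_kbps, tests, devices.
tiles["avg_d_mbps"] = tiles["avg_d_kbps"] / 1000
print(tiles[["avg_d_mbps", "tests", "devices"]].describe())

# Share of tiles at or above 100 Mbps download ("Superfast").
print((tiles["avg_d_mbps"] >= 100).mean())
```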
** To Do: Will add Google Map versions so everyone can see without installing Jupyter.
- Link to Google Map (BKK) added below. Key: Green > 100Mbps (Superfast), Black > 500Mbps (Ultrafast). CSV provided. Code in Speedtestv1.1.ipynb Jupyter Notebook.
- Community (Whirlpool) surprised [Link: https://whrl.pl/RgAPTl] that Melb has 20% at or above 100Mbps. Suggest plotting the Top 20% on a map for the community. Google Map link now added (and tweet).
** Python

```python
melb = au_tiles.cx[144:146, -39:-37]  # Lat/Lon extract
shg = tiles.cx[120:122, 30:32]        # Lat/Lon extract
bkk = tiles.cx[100:102, 13:15]        # Lat/Lon extract
lax = tiles.cx[-118:-120, 33:35]      # Lat/Lon extract
ALC = tiles.cx[132:134, -22:-24]      # Lat/Lon extract
```
Histograms (v9), and data visualisations (v3,5,9,11) will be provided. Data Sourced from - This is an extract of Speedtest Open data available at Amazon WS (link below - opendata.aws).
** VERSIONS
- v24: Add tweet and Google Map of Top 20% (over 100Mbps locations) in Mel Q3 22. Add v1.5 MEL-Superfast notebook, and CSV of results (now on Google Map; link below).
- v23: Add graph of 2022 broadband distribution, and compare 2020 - 2022. Updated v1.4 Jupyter notebook.
- v22: Add import ipynb; workflow-import-4cities.
- v21: Add Q3 2022 data; five cities inc ALC. Geojson files. (2020: 4.3M tests; 2022: 2.9M tests)
- v20: Speedtest - Five Cities inc ALC.
- v19: Add ALC2.ipynb.
- v18: Add ALC line graph.
- v17: Added ipynb for ALC. Added ALC to title.
- v16: Load Alice Springs data Q2 21 - csv. Added Google Map link of ALC.
- v15: Load Melb Q1 2021 data - csv.
- v14: Added Melb Q1 2021 data - geojson.
- v13: Added Twitter link to pics.
- v12: Add Line-Compare pic (fastest 1000 locations) inc Jupyter (nbn-intl-v1.2.ipynb).
- v11: Add Line-Compare pic, plotting four cities on a graph.
- v10: Add four histograms in one pic.
- v9: Add histogram for four cities. Add NBN-Intl.v1.1.ipynb (Jupyter Notebook).
- v8: Renamed LAX file to Q3, rather than 03.
- v7: Amended file names of BKK files to correctly label as Q3, not Q2 or 06.
- v6: Added LAX file.
- v5: Add screenshot of BKK Google Map.
- v4: Add BKK Google Map (link below), and BKK csv mapping files.
- v3: Replaced MEL map with big key version. Previous key was very tiny in top right corner.
- v2: Uploaded MEL, SHG, BKK data and Jupyter Notebook.
- v1: Metadata record.
** LICENCE: The AWS data licence on Speedtest data is "CC BY-NC-SA 4.0", so use of this data must be:
- non-commercial (NC)
- share-alike (SA) on reuse (i.e. add the same licence).
This restricts the standard CC-BY Figshare licence.
** Other uses of Speedtest Open Data; - see link at Speedtest below.
Csv Active S R L Company Export Import Records. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a local CSV file of WormJam (https://www.tandfonline.com/doi/full/10.1080/21624054.2017.1373939) for MetFrag (https://msbi.ipb-halle.de/MetFrag/).
The text file provided by Michael (also part of this dataset) was modified into CSV by adding identifiers and adjusting headers for MetFrag import.
This CSV file is for users wanting to integrate WormJam into offline MetFrag CL workflows. The file will also be integrated into MetFrag online; there, please use the file in the dropdown menu rather than uploading this one.
Update 10 Sept 2019: curated truncated InChIKey, InChI entries, added missing SMILES, added DTXSIDs by InChIKey match.
This dataset includes all the data and R code needed to reproduce the analyses in a forthcoming manuscript: Copes, W. E., Q. D. Read, and B. J. Smith. Environmental influences on drying rate of spray applied disinfestants from horticultural production services. PhytoFrontiers, DOI pending.

Study description: Instructions for disinfestants typically specify a dose and a contact time to kill plant pathogens on production surfaces. A problem occurs when disinfestants are applied to large production areas where the evaporation rate is affected by weather conditions. The common contact time recommendation of 10 min may not be achieved under hot, sunny conditions that promote fast drying. This study is an investigation into how the evaporation rates of six commercial disinfestants vary when applied to six types of substrate materials under cool to hot and cloudy to sunny weather conditions. Initially, disinfestants with low surface tension spread out to provide 100% coverage, and disinfestants with high surface tension beaded up to provide about 60% coverage, when applied to hard smooth surfaces. Disinfestants applied to porous materials, such as wood and concrete, were quickly absorbed into the body of the material. Even though disinfestants evaporated faster under hot sunny conditions than under cool cloudy conditions, coverage was reduced considerably in the first 2.5 min under most weather conditions and reduced to less than or equal to 50% coverage by 5 min.

Dataset contents: This dataset includes R code to import the data and fit Bayesian statistical models using the model fitting software CmdStan, interfaced with R using the packages brms and cmdstanr. The models (one for 2022 and one for 2023) compare how quickly different spray-applied disinfestants dry, depending on what chemical was sprayed, what surface material it was sprayed onto, and what the weather conditions were at the time. Next, the statistical models are used to generate predictions and compare mean drying rates between the disinfestants, surface materials, and weather conditions. Finally, tables and figures are created. These files are included:
- Drying2022.csv: drying rate data for the 2022 experimental run
- Weather2022.csv: weather data for the 2022 experimental run
- Drying2023.csv: drying rate data for the 2023 experimental run
- Weather2023.csv: weather data for the 2023 experimental run
- disinfestant_drying_analysis.Rmd: RMarkdown notebook with all data processing, analysis, and table creation code
- disinfestant_drying_analysis.html: rendered output of the notebook
- MS_figures.R: additional R code to create figures formatted for journal requirements
- fit2022_discretetime_weather_solar.rds: fitted brms model object for 2022. This allows users to reproduce the model prediction results without having to refit the model, which was originally fit on a high-performance computing cluster
- fit2023_discretetime_weather_solar.rds: fitted brms model object for 2023
- data_dictionary.xlsx: descriptions of each column in the CSV data files
The trak extension for CKAN enhances the platform's tracking capabilities by providing tools to import Google Analytics data and modify the presentation of page view statistics. It introduces a paster command for importing page view data from exported Google Analytics CSV files, enabling users to supplement CKAN's built-in tracking. The extension also includes template customizations to alter how page view counts are displayed on dataset and resource listing pages.

Key Features:
- Google Analytics Data Import: Imports page view data directly from a stripped-down CSV of Google Analytics data using a dedicated paster command (csv2table). The CSV should contain a list of page views, where each row starts with '/'. The PageViews column is expected to be the 3rd column.
- Customizable Page View Display: Changes the default presentation of page view statistics within CKAN, removing the minimum view count restriction (default is 10) so all views can be seen, and modifies UI elements.
- Altered Page Tracking Stats: Alters the placement of page tracking statistics, moving them below Package Data (on dataset list pages) and Resource Data (on resource list pages) for better integration of tracking data.
- UI/UX Enhancements: Replaces the flame icon typically used for page tracking with more subtle background styling to modernize the presentation of tracking data.
- Backend Data Manipulation: Uses a 'floor date' of 2011-01-01 for page view calculation. Entries are made in the trackingraw table for each view, with a unique UUID.

Integration with CKAN: The extension integrates into CKAN's core functionalities by introducing a new paster command and modifying existing templates for displaying page view statistics. It relies on CKAN's built-in tracking being enabled, but supplements its capabilities with imported data and presentation adjustments. After importing data using the csv2table paster command, the standard tracking update and search-index rebuild paster tasks need to be run to process the imported data and update the search index.

Benefits & Impact: By importing data from Google Analytics, the trak extension allows administrators to see a holistic view of page views. It changes the user experience to present tracking statistics in a more integrated fashion. This allows for a better understanding of the impact and utilization of resources within the CKAN instance, based on Google Analytics data.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The data was collected on 2024-04-05 containing 3492 problems.
Cleaned via the following script.
```python
import json
import csv
from io import TextIOWrapper


def clean(data: dict):
    questions = data['data']['problemsetQuestionList']['questions']
    for q in questions:
        yield {
            'id': q['frontendQuestionId'],
            'difficulty': q['difficulty'],
            'title': q['title'],
            'titleCn': q['titleCn'],
            'titleSlug': q['titleSlug'],
            'paidOnly': q['paidOnly'],
            'acRate': round(q['acRate'], 3),
            'topicTags': [t['name'] for t in q['topicTags']],
        }


def out_jsonl(f: TextIOWrapper):
    for id in range(0, 35):
        with open(f'data/{id}.json', encoding='u8') as f2:
            data = json.load(f2)
            for q in clean(data):
                f.write(json.dumps(q, ensure_ascii=False))
                f.write('\n')


def out_json(f: TextIOWrapper):
    l = []
    for id in range(0, 35):
        with open(f'data/{id}.json', encoding='u8') as f2:
            data = json.load(f2)
            for q in clean(data):
                l.append(q)
    json.dump(l, f, ensure_ascii=False)


def out_csv(f: TextIOWrapper):
    writer = csv.DictWriter(f, fieldnames=[
        'id', 'difficulty', 'title', 'titleCn', 'titleSlug', 'paidOnly', 'acRate', 'topicTags'
    ])
    writer.writeheader()
    for id in range(0, 35):
        with open(f'data/{id}.json', encoding='u8') as f2:
            data = json.load(f2)
            writer.writerows(clean(data))


with open('data.jsonl', 'w', encoding='u8') as f:
    out_jsonl(f)
with open('data.json', 'w', encoding='u8') as f:
    out_json(f)
with open('data.csv', 'w', encoding='u8', newline='') as f:
    out_csv(f)
```
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
# Crypto Price Monitoring Repository
This repository contains two CSV data files that were created to support the research titled "Price Arbitrage for DeFi Derivatives." This research is to be presented at the IEEE International Conference on Blockchain and Cryptocurrencies, taking place on 5th May 2023 in Dubai, UAE. The data files include monitoring prices for various crypto assets from several sources. The data files are structured with five columns, providing information about the symbol, unified symbol, time, price, and source of the price.
## Data Files
There are two CSV data files in this repository (one for each date):
1. `Pricemon_results_2023_01_13.csv`
2. `Pricemon_results_2023_01_14.csv`
## Data Format
Both data files have the same format and structure, with the following five columns:
1. `symbol`: The trading symbol for the crypto asset (e.g., BTC, ETH).
2. `unified_symbol`: A standardized symbol used across different platforms.
3. `time`: Timestamp for when the price data was recorded (in UTC format).
4. `price`: The price of the crypto asset at the given time (in USD).
5. `source`: The name of the price source for the data.
## Price Sources
The `source` column in the data files refers to the provider of the price data for each record. The sources include:
- `chainlink`: Chainlink Price Oracle
- `mycellium`: Built-in oracle of the Mycellium platform
- `bitfinex`: Bitfinex cryptocurrency exchange
- `ftx`: FTX cryptocurrency exchange
- `binance`: Binance cryptocurrency exchange
## Usage
You can use these data files for various purposes, such as analyzing price discrepancies across different sources, identifying trends, or developing trading algorithms. To use the data, simply import the CSV files into your preferred data processing or analysis tool.
### Example
Here's an example of how you can read and display the data using Python and the pandas library:
```python
import pandas as pd

# Read the data from CSV file
data = pd.read_csv('Pricemon_results_2023_01_13.csv')

# Display the first 5 rows of the data
print(data.head())
```
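Building on the read example above, here is a hedged sketch of one way to compare prices across sources for a single asset: it pivots on the `source` column and computes the spread between the highest and lowest quote at each timestamp. The choice of 'BTC' as a value in `unified_symbol` is an assumption for illustration.

```python
import pandas as pd

data = pd.read_csv('Pricemon_results_2023_01_13.csv')

# One asset, prices arranged as time x source (assumes 'BTC' appears in unified_symbol).
btc = data[data['unified_symbol'] == 'BTC']
wide = btc.pivot_table(index='time', columns='source', values='price')

# Cross-source spread at each timestamp (max minus min quote).
spread = wide.max(axis=1) - wide.min(axis=1)
print(spread.describe())
```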
## Acknowledgements
These datasets were recorded and supported by Datamint (a value-added on-chain data provider) and its team.
## Contributing
If you have any suggestions or find any issues with the data, please feel free to contact the authors.
Disclaimer: This is artificially generated data, created using a Python script based on the arbitrary assumptions listed below.
The data consists of 100,000 examples of training data and 10,000 examples of test data, each representing a user who may or may not buy a smart watch.
----- Version 1 -------
trainingDataV1.csv, testDataV1.csv (or trainingData.csv, testData.csv)

The data includes the following features for each user:
1. age: The age of the user (integer, 18-70)
2. income: The income of the user (integer, 25,000-200,000)
3. gender: The gender of the user (string, "male" or "female")
4. maritalStatus: The marital status of the user (string, "single", "married", or "divorced")
5. hour: The hour of the day (integer, 0-23)
6. weekend: A boolean indicating whether it is the weekend (True or False)

The data also includes a label for each user indicating whether they are likely to buy a smart watch or not (string, "yes" or "no"). The label is determined based on the following arbitrary conditions:
- If the user is divorced and a random number generated by the script is less than 0.4, the label is "no" (i.e., assuming 40% of divorcees are not likely to buy a smart watch).
- If it is the weekend and a random number generated by the script is less than 1.3, the label is "yes" (i.e., assuming sales are 30% more likely to occur on weekends).
- If the user is male and under 30 with an income over 75,000, the label is "yes".
- If the user is female and 30 or over with an income over 100,000, the label is "yes".
- Otherwise, the label is "no".
The training data is intended to be used to build and train a classification model, and the test data is intended to be used to evaluate the performance of the trained model.
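As a hedged illustration of that intended use (not part of the original dataset documentation), the following sketch trains a simple scikit-learn classifier on trainingData.csv and evaluates it on testData.csv, assuming the test file carries the same columns including the label.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

train = pd.read_csv("trainingData.csv")
test = pd.read_csv("testData.csv")

features = ["age", "income", "gender", "maritalStatus", "hour", "weekend"]
target = "buySmartWatch"

# One-hot encode the categorical columns, pass the numeric ones through unchanged.
pre = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), ["gender", "maritalStatus", "weekend"])],
    remainder="passthrough",
)
clf = Pipeline([
    ("pre", pre),
    ("model", RandomForestClassifier(n_estimators=100, random_state=0)),
])

clf.fit(train[features], train[target])
print(accuracy_score(test[target], clf.predict(test[features])))
```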
The following Python script was used to generate this dataset:
```python
import random
import csv

# Set the number of examples to generate
numExamples = 100000

# Generate the training data
with open("trainingData.csv", "w", newline="") as csvfile:
    fieldnames = ["age", "income", "gender", "maritalStatus", "hour", "weekend", "buySmartWatch"]
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()

    for i in range(numExamples):
        age = random.randint(18, 70)
        income = random.randint(25000, 200000)
        gender = random.choice(["male", "female"])
        maritalStatus = random.choice(["single", "married", "divorced"])
        hour = random.randint(0, 23)
        weekend = random.choice([True, False])

        # Randomly assign the label based on some arbitrary conditions
        # assuming 40% of divorcees won't buy a smart watch
        if maritalStatus == "divorced" and random.random() < 0.4:
            buySmartWatch = "no"
        # assuming sales are 30% more likely to occur on weekends.
        elif weekend == True and random.random() < 1.3:
            buySmartWatch = "yes"
        elif gender == "male" and age < 30 and income > 75000:
            buySmartWatch = "yes"
        elif gender == "female" and age >= 30 and income > 100000:
            buySmartWatch = "yes"
        else:
            buySmartWatch = "no"

        writer.writerow({
            "age": age,
            "income": income,
            "gender": gender,
            "maritalStatus": maritalStatus,
            "hour": hour,
            "weekend": weekend,
            "buySmartWatch": buySmartWatch
        })
```
----- Version 2 -------
trainingDataV2.csv, testDataV2.csv

The data includes the following features for each user:
1. age: The age of the user (integer, 18-70)
2. income: The income of the user (integer, 25,000-200,000)
3. gender: The gender of the user (string, "male" or "female")
4. maritalStatus: The marital status of the user (string, "single", "married", or "divorced")
5. educationLevel: The education level of the user (string, "high school", "associate's degree", "bachelor's degree", "master's degree", or "doctorate")
6. occupation: The occupation of the user (string, "tech worker", "manager", "executive", "sales", "customer service", "creative", "manual labor", "healthcare", "education", "government", "unemployed", or "student")
7. familySize: The number of people in the user's family (integer, 1-5)
8. fitnessInterest: A boolean indicating whether the user is interested in fitness (True or False)
9. priorSmartwatchOwnership: A boolean indicating whether the user has owned a smartwatch in the past (True or False)
10. hour: The hour of the day when the user was surveyed (integer, 0-23)
11. weekend: A boolean indicating whether the user was surveyed on a weekend (True or False)
12. buySmartWatch: A boolean indicating whether the user purchased a smartwatch (True or False)
Python script used to generate the data:
```python
import random
import csv

# Set the number of examples to generate
numExamples = 100000

with open("t...
```
U.S. consumers demand variety, quality, and convenience in the foods they consume. As Americans have become wealthier and more ethnically diverse, the American food basket reflects a growing share of tropical products, spices, and imported gourmet products. Seasonal and climatic factors drive U.S. imports of popular types of fruits and vegetables and tropical products, such as cocoa and coffee. In addition, a growing share of U.S. imports can be attributed to intra-industry trade, whereby agricultural-processing industries based in the United States carry out certain processing steps offshore and import products at different levels of processing from their subsidiaries in foreign markets. This data set provides import values of edible products (food and beverages) entering U.S. ports and their origin of shipment. Data are from the U.S. Department of Commerce, U.S. Census Bureau. Food and beverage import values are compiled by calendar year into food groups corresponding to major commodities or level of processing. At least 10 years of annual data are included, enabling users to track long-term growth patterns.
Dataset Card for "letter_recognition"
Images in this dataset were generated using the script defined below. The original dataset in CSV format, and more information about it, is available at A-Z Handwritten Alphabets in .csv format.

```python
import os
import pandas as pd
import matplotlib.pyplot as plt

CHARACTER_COUNT = 26

data = pd.read_csv('./A_Z Handwritten Data.csv')
mapping = {str(i): chr(i+65) for i in range(26)}

def generate_dataset(folder, end, start=0):
    if not…
```

See the full description on the dataset page: https://huggingface.co/datasets/pittawat/letter_recognition.
ODC Public Domain Dedication and Licence (PDDL) v1.0http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Dataset Overview: This dataset pertains to the examination results of students who participated in a series of academic assessments at a fictitious educational institution named "University of Exampleville." The assessments were administered across various courses and academic levels, with a focus on evaluating students' performance in general management and domain-specific topics.
Columns: The dataset comprises 12 columns, each representing specific attributes and performance indicators of the students. These columns encompass information such as the students' names (which have been anonymized), their respective universities, academic program names (including BBA and MBA), specializations, the semester of the assessment, the type of examination domain (general management or domain-specific), general management scores (out of 50), domain-specific scores (out of 50), total scores (out of 100), student ranks, and percentiles.
Data Collection: The examination data was collected during a standardized assessment process conducted by the University of Exampleville. The exams were designed to assess students' knowledge and skills in general management and their chosen domain-specific subjects. It involved students from both BBA and MBA programs who were in their final year of study.
Data Format: The dataset is available in a structured format, typically as a CSV file. Each row represents a unique student's performance in the examination, while columns contain specific information about their results and academic details.
Data Usage: This dataset is valuable for analyzing and gaining insights into the academic performance of students pursuing BBA and MBA degrees. It can be used for various purposes, including statistical analysis, performance trend identification, program assessment, and comparison of scores across domains and specializations. Furthermore, it can be employed in predictive modeling or decision-making related to curriculum development and student support.
Data Quality: The dataset has undergone preprocessing and anonymization to protect the privacy of individual students. Nevertheless, it is essential to use the data responsibly and in compliance with relevant data protection regulations when conducting any analysis or research.
Data Format: The exam data is typically provided in a structured format, commonly as a CSV (Comma-Separated Values) file. Each row in the dataset represents a unique student's examination performance, and each column contains specific attributes and scores related to the examination. The CSV format allows for easy import and analysis using various data analysis tools and programming languages like Python, R, or spreadsheet software like Microsoft Excel.
Here's a column-wise description of the dataset:
Name OF THE STUDENT: The full name of the student who took the exam. (Anonymized)
UNIVERSITY: The university where the student is enrolled.
PROGRAM NAME: The name of the academic program in which the student is enrolled (BBA or MBA).
Specialization: If applicable, the specific area of specialization or major that the student has chosen within their program.
Semester: The semester or academic term in which the student took the exam.
Domain: Indicates the examination domain for the score, i.e., whether it refers to the general management part or the domain-specific part of the exam.
GENERAL MANAGEMENT SCORE (OUT of 50): The score obtained by the student in the general management part of the exam, out of a maximum possible score of 50.
Domain-Specific Score (Out of 50): The score obtained by the student in the domain-specific part of the exam, also out of a maximum possible score of 50.
TOTAL SCORE (OUT of 100): The total score obtained by adding the scores from the general management and domain-specific parts, out of a maximum possible score of 100.
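A minimal pandas sketch of loading and summarizing such a file is shown below; the file name and exact CSV headers are assumptions based on the column descriptions above and may need adjusting.

```python
import pandas as pd

# Hypothetical file name and headers -- adjust to the actual CSV.
df = pd.read_csv("exam_results.csv")

# Average total score by program (BBA vs MBA) and specialization.
summary = (
    df.groupby(["PROGRAM NAME", "Specialization"])["TOTAL SCORE (OUT of 100)"]
    .agg(["mean", "count"])
    .sort_values("mean", ascending=False)
)
print(summary.head(10))
```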
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
OMOP2OBO Mappings - N3C OMOP to OBO Working group
This repository stores OMOP2OBO mappings which have been processed for use within the National COVID Cohort Collaborative (N3C) Enclave. The version of the mappings stored in this repository have been specifically formatted for use within the N3C Enclave.
N3C OMOP to OBO Working Group: https://covid.cd2h.org/ontology
Accessing the N3C-Formatted Mappings
You can access the three OMOP2OBO HPO mapping files in the Enclave from the Knowledge store using the following link: https://unite.nih.gov/workspace/compass/view/ri.compass.main.folder.1719efcf-9a87-484f-9a67-be6a29598567.
The mapping set includes three files, but you only need to merge the following two files with existing data in the Enclave in order to be able to create the concept sets:
OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv
OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv
The first file, OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv, contains columns for the OMOP concept ids and codes, as well as information such as whether or not the OMOP concept's descendants should be included when deriving the concept sets (defaults to FALSE). The other file, OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv, contains details on the mapping's label (i.e., the HPO curie and label in the concept_set_id field) and its provenance/evidence (the specific column to access for this information is called intention).
Creating Concept Sets
Merge these files together on the column named codeset_id and then join them with existing Enclave tables like concept and condition_occurrence to populate the actual concept sets. The name of the concept set can be obtained from the OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv file and is stored as a string in the column called concept_set_id. Although not ideal, obtaining the HPO CURIE and label currently requires applying a regex to this column; this is the best available approach given the fields exposed in the Enclave.
An example mapping is shown below (highlighting some of the most useful columns):
codeset_id: 900000000
concept_set_id: [OMOP2OBO] hp_0002031-abnormal_esophagus_morphology
concept: 23868
code: 69771008
codeSystem: SNOMED
includeDescendants: False
intention: Mixed - This mapping was created using the OMOP2OBO mapping algorithm (https://github.com/callahantiff/OMOP2OBO). The Mapping Category and Evidence supporting the mappings are provided below, by OMOP concept:
- 23868: OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:snomed_69771008 | OBO_DbXref-OMOP_CONCEPT_SOURCE_CODE:snomed_69771008 | CONCEPT_SIMILARITY:HP_0002031_0.713
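A hedged pandas sketch of the merge and regex steps described above; the regex assumes the concept_set_id format shown in the example (e.g. "[OMOP2OBO] hp_0002031-abnormal_esophagus_morphology"), and the column names are taken from the file descriptions in this record.

```python
import pandas as pd

items = pd.read_csv("OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv")
versions = pd.read_csv("OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv")

# Merge the expression items with the version metadata on codeset_id.
mappings = items.merge(versions, on="codeset_id", how="left")

# Extract the HPO CURIE and label from the concept_set_id string,
# e.g. "[OMOP2OBO] hp_0002031-abnormal_esophagus_morphology".
extracted = mappings["concept_set_id"].str.extract(r"\[OMOP2OBO\]\s*(hp_\d+)-(.+)")
mappings["hpo_curie"] = extracted[0].str.upper().str.replace("_", ":", n=1)
mappings["hpo_label"] = extracted[1].str.replace("_", " ")

print(mappings[["codeset_id", "hpo_curie", "hpo_label", "concept_id", "code", "codeSystem"]].head())
```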
Release Notes - v2.0.0
Preparation
In order to import data into the Enclave, the following items are needed:
Obtain API Token, which will be included in the authorization header (stored as GitHub Secret)
Obtain username hash from the Enclave
OMOP2OBO Mappings (v1.5.0)
Data
Concept Set Container (concept_set_container): CreateNewConceptSet
Concept Set Version (code_sets): CreateNewDraftOMOPConceptSetVersion
Concept Set Expression Items (concept_set_version_item): addCodeAsVersionExpression
Script
n3c_mapping_conversion.py
Generated Output
The codeset_id needs to be filled from self-generation (ideally, from a conserved range) prior to beginning any of the API steps. The current list of assigned identifiers is stored in the file named omop2obo_enclave_codeset_id_dict_v2.0.0.json. Note that in order to accommodate the 1:Many mappings, the codeset ids were re-generated and, rather than being mapped to HPO concepts, they are mapped to SNOMED-CT concepts. This creates a cleaner mapping and will easily scale to future mapping builds.
To be consistent with OMOP tools, specifically Atlas, we have also created Atlas-formatted JSON files for each mapping, which are stored in the zipped directory named atlas_json_files_v2.0.0.zip. Note that, as mentioned above, to enable the representation of 1:Many mappings the files are no longer named after HPO concepts; they are now named with the OMOP concept_id and label, and additional fields have been added within the JSON files that include the HPO ids, labels, mapping category, mapping logic, and mapping evidence.
File 1: concept_set_container
Generated Data: OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_container.csv
Columns:
concept_set_id
concept_set_name
intention
assigned_informatician
assigned_sme
project_id
status
stage
n3c_reviewer
alias
archived
created_by
created_at
File 2: concept_set_expression_items
Generated Data: OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv
Columns:
codeset_id
concept_id
code
codeSystem
ontology_id
ontology_label
mapping_category
mapping_logic
mapping_evidence
isExcluded
includeDescendants
includeMapped
item_id
annotation
created_by
created_at
File 3: concept_set_version
Generated Data: OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv
Columns:
codeset_id
concept_set_id
concept_set_version_title
project
source_application
source_application_version
created_at
atlas_json
most_recent_version
comments
intention
limitations
issues
update_message
status
has_review
reviewed_by
created_by
provenance
atlas_json_resource_url
parent_version_id
is_draft
Generated Output:
OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_container.csv
OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv
OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv
atlas_json_files_v2.0.0.zip
omop2obo_enclave_codeset_id_dict_v2.0.0.json
By Noah Rippner [source]
This dataset provides comprehensive information on county-level cancer death and incidence rates, as well as various related variables. It includes data on age-adjusted death rates, average deaths per year, recent trends in cancer death rates, recent 5-year trends in death rates, and average annual counts of cancer deaths or incidence. The dataset also includes the federal information processing standards (FIPS) codes for each county.
Additionally, the dataset indicates whether each county met the objective of a targeted death rate of 45.5. The recent trend in cancer deaths or incidence is also captured for analysis purposes.
The purpose of the death.csv file within this dataset is to offer detailed information specifically concerning county-level cancer death rates and related variables. On the other hand, the incd.csv file contains data on county-level cancer incidence rates and additional relevant variables.
To provide more context and understanding about the included data points, there is a separate file named cancer_data_notes.csv. This file serves to provide informative notes and explanations regarding the various aspects of the cancer data used in this dataset.
Please note that this particular description provides an overview of a linear regression walkthrough using this dataset, based on the Python programming language. It highlights how to source and import the data properly before moving into data preparation steps such as exploratory analysis. The walkthrough further covers model selection and important model diagnostic measures.
It's essential to bear in mind that this example serves as an initial attempt at creating a multivariate Ordinary Least Squares regression model using these datasets from various sources like cancer.gov along with US Census American Community Survey data. This baseline model allows easy comparisons with future iterations intended for improvements or refinements.
Important columns found within this extensively documented Kaggle dataset include the county names along with their corresponding FIPS codes, a standardized coding system defined by the Federal Information Processing Standards (FIPS). Moreover, the Met Objective of 45.5? (1) column denotes whether a specific county achieved the targeted objective of a death rate of 45.5 or not.
Overall, this dataset aims to offer valuable insights into county-level cancer death and incidence rates across various regions, providing policymakers, researchers, and healthcare professionals with essential information for analysis and decision-making purposes
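As a minimal sketch of the import and baseline-model steps described in the walkthrough, the snippet below loads death.csv with pandas and fits an illustrative OLS model with statsmodels. The column renames follow the descriptions in the next section; the exact CSV headers and any non-numeric entries may require additional cleaning, and this is not the manuscript's final model.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Load the county-level death data (exact headers may differ).
death = pd.read_csv("death.csv")

# Simplify long column names for use in a formula.
death = death.rename(columns={
    "Age-Adjusted Death Rate": "death_rate",
    "Average Deaths per Year": "avg_deaths_per_year",
    "Met Objective of 45.5? (1)": "met_objective",
})

# Illustrative baseline OLS; real data may need coercion to numeric first.
model = smf.ols("death_rate ~ avg_deaths_per_year", data=death).fit()
print(model.summary())
```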
Familiarize Yourself with the Columns:
- County: The name of the county.
- FIPS: The Federal Information Processing Standards code for the county.
- Met Objective of 45.5? (1): Indicates whether the county met the objective of a death rate of 45.5 (Boolean).
- Age-Adjusted Death Rate: The age-adjusted death rate for cancer in the county.
- Average Deaths per Year: The average number of deaths per year due to cancer in the county.
- Recent Trend (2): The recent trend in cancer death rates/incidence in the county.
- Recent 5-Year Trend (2) in Death Rates: The recent 5-year trend in cancer death rates/incidence in the county.
- Average Annual Count: The average annual count of cancer deaths/incidence in the county.
Determine Counties Meeting Objective: Use this dataset to identify counties that have met or not met the objective death rate threshold of 45.5. Look for entries where Met Objective of 45.5? (1) is marked as True or False.
Analyze Age-Adjusted Death Rates: Study and compare age-adjusted death rates across different counties using Age-Adjusted Death Rate values provided as floats.
Explore Average Deaths per Year: Examine and compare average annual counts and trends regarding deaths caused by cancer, using Average Deaths per Year as a reference point.
Investigate Recent Trends: Assess recent trends related to cancer deaths or incidence by analyzing data under columns such as Recent Trend, Recent Trend (2), and Recent 5-Year Trend (2) in Death Rates. These columns provide information on how cancer death rates/incidence have changed over time.
Compare Counties: Utilize this dataset to compare counties based on their cancer death rates and related variables. Identify counties with lower or higher average annual counts, age-adjusted death rates, or recent trends to analyze and understand the factors contributing ...