94 datasets found
  1. Csv Exports And Imports | See Full Import/Export Data | Eximpedia

    • eximpedia.app
    Updated Feb 7, 2025
    Cite
    Seair Exim (2025). Csv Exports And Imports | See Full Import/Export Data | Eximpedia [Dataset]. https://www.eximpedia.app/
    Explore at:
    .bin, .xml, .csv, .xls (available download formats)
    Dataset updated
    Feb 7, 2025
    Dataset provided by
    Eximpedia Export Import Trade Data
    Eximpedia PTE LTD
    Authors
    Seair Exim
    Area covered
    Aruba, United Republic of, Albania, Georgia, Seychelles, Guam, Monaco, Antigua and Barbuda, United Arab Emirates, Northern Mariana Islands
    Description

    Csv Exports And Imports Company Export Import Records. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.

  2. Dog Food Data Extracted from Chewy (USA) - 4,500 Records in CSV Format

    • crawlfeeds.com
    csv, zip
    Updated Apr 22, 2025
    Cite
    Dog Food Data Extracted from Chewy (USA) - 4,500 Records in CSV Format [Dataset]. https://crawlfeeds.com/datasets/dog-food-data-extracted-from-chewy-usa-4-500-records-in-csv-format
    Explore at:
    zip, csv (available download formats)
    Dataset updated
    Apr 22, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    The Dog Food Data Extracted from Chewy (USA) dataset contains 4,500 detailed records of dog food products sourced from one of the leading pet supply platforms in the United States, Chewy. This dataset is ideal for businesses, researchers, and data analysts who want to explore and analyze the dog food market, including product offerings, pricing strategies, brand diversity, and customer preferences within the USA.

    The dataset includes essential information such as product names, brands, prices, ingredient details, product descriptions, weight options, and availability. Organized in a CSV format for easy integration into analytics tools, this dataset provides valuable insights for those looking to study the pet food market, develop marketing strategies, or train machine learning models.

    Key Features:

    • Record Count: 4,500 dog food product records.
    • Data Fields: Product names, brands, prices, descriptions, ingredients, and more; see the data points section for the full list of fields.
    • Format: CSV, easy to import into databases and data analysis tools.
    • Source: Extracted from Chewy’s official USA platform.
    • Geography: Focused on the USA dog food market.

    Use Cases:

    • Market Research: Analyze trends and preferences in the USA dog food market, including popular brands, price ranges, and product availability.
    • E-commerce Analysis: Understand how Chewy presents and prices dog food products, helping businesses compare their own product offerings.
    • Competitor Analysis: Compare different brands and products to develop competitive strategies for dog food businesses.
    • Machine Learning Models: Use the dataset for machine learning tasks such as product recommendation systems, demand forecasting, and price optimization.
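
    For quick exploration, the CSV can be loaded with pandas; a minimal sketch, assuming the file is saved locally as chewy_dog_food.csv (the file name and exact column headers are assumptions, so inspect them before analysis):

    import pandas as pd

    # file name and column names are assumptions; check the actual download and adjust
    df = pd.read_csv("chewy_dog_food.csv")
    print(df.shape)         # expect roughly 4,500 rows
    print(df.columns.tolist())  # inspect the real field names before analysis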

  3. Central Bank of Brazil data of foreign capital transfers, 2000-2011

    • su.figshare.com
    • researchdata.se
    txt
    Updated May 30, 2023
    Cite
    Alice Dauriach; Emma Sundström; Beatrice Crona; Victor Galaz (2023). Central Bank of Brazil data of foreign capital transfers, 2000-2011 [Dataset]. http://doi.org/10.17045/sthlmuni.5857716.v4
    Explore at:
    txt (available download formats)
    Dataset updated
    May 30, 2023
    Dataset provided by
    Stockholm University
    Authors
    Alice Dauriach; Emma Sundström; Beatrice Crona; Victor Galaz
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This data set is a subset of the "Records of foreign capital" ("Registros de capitais estrangeiros", RCE) published by the Central Bank of Brazil (CBB) on their website. The data set consists of three data files and three corresponding metadata files. All files are in openly accessible .csv or .txt formats. See the detailed outline below for the data contained in each. Data files contain transaction-specific data such as unique identifier, currency, cancelled status and amount. Metadata files outline the variables in the corresponding data file.

    • RCE_Unclean_full_dataset.csv - all transactions published to the Central Bank website from the four main categories outlined below
    • Metadata_Unclean_full_dataset.csv
    • RCE_Unclean_cancelled_dataset.csv - data extracted from RCE_Unclean_full_dataset.csv where transactions were registered then cancelled
    • Metadata_Unclean_cancelled_dataset.csv
    • RCE_Clean_selection_dataset.csv - transaction data extracted from RCE_Unclean_full_dataset.csv and RCE_Unclean_cancelled_dataset.csv for the nine companies and criteria identified below
    • Metadata_Clean_selection_dataset.csv

    The data cover the period between October 2000 and July 2011. This is the only time span for the data provided by the Central Bank of Brazil at this stage. The records were published monthly by the Central Bank of Brazil as required by Art. 66 in Decree nº 55.762 of 17 February 1965, modified by Decree nº 4.842 of 17 September 2003. The records were published on the bank's website starting October 2000, as per communique nº 011489 of 7 October 2003. This remained the case until August 2011, after which the amount of each transaction was no longer disclosed (and publication of these stopped altogether after October 2011). The disclosure of the records was suspended in order to review their legal and technical aspects, and ensure their suitability to the requirements of the rules governing the confidentiality of the information (Law nº 12.527 of 18 November 2011 and Decree nº 7724 of May 2012) (pers. comm. Central Bank of Brazil, 2016; name of contact available upon request to Authors).

    The records track transfers of foreign capital made from abroad to companies domiciled in Brazil, with information on the foreign company (name and country) transferring the money, and on the company receiving the capital (name and federative unit). For the purpose of this study, we consider the four categories of foreign capital transactions which are published with their amount and currency in the Central Bank's data, and which are all part of the "Register of financial transactions" (abbreviated RDE-ROF): loans, leasing, financed import and cash in advance (see below for a detailed description). Additional categories exist, such as foreign direct investment (RDE-IED) and External Investment in Portfolio (RDE-Portfólio), for which no amount is published and which are therefore not included.

    We used the data posted online as PDFs on the bank's website, and created a script to extract the data automatically from these four categories into the RCE_Unclean_full_dataset.csv file. This data set has not been double-checked manually and may contain errors. We used a similar script to extract rows from the "cancelled transactions" sections of the PDFs into the RCE_Unclean_cancelled_dataset.csv file. This is useful to identify transactions that have been registered with the Central Bank but later cancelled. This data set has not been double-checked manually and may contain errors.

    From these raw data sets, we conducted the following selections and calculations in order to create the RCE_Clean_selection_dataset.csv file. This data set has been double-checked manually to secure that no errors have been made in the extraction process. We selected all transactions whose recipient company name corresponds to one of these nine companies, or to one of their known subsidiaries in Brazil, according to the list of subsidiaries recorded in the Orbis database, maintained by Bureau Van Dijk. Transactions are included if the recipient company name matches one of the following:
    - the current or former name of one of the nine companies in our sample (former names are identified using Orbis, Bloomberg's company profiles or the company website);
    - the name of a known subsidiary of one of the nine companies, if and only if we find evidence (in Orbis, Bloomberg's company profiles or on the company website) that this subsidiary was owned at some point during the period 2000-2011, and that it operated in a sector related to the soy or beef industry (including fertilizers and trading activities).

    For each transaction, we extracted the name of the company sending capital and, when possible, attributed the transaction to the known ultimate owner. The names of the countries of origin sometimes come with typos or different denominations: we harmonized them. A manual check of all the selected data unveiled that a few transactions (n=14) appear twice in the database while bearing the same unique identification number. According to the Central Bank of Brazil (pers. comm., November 2016), this is due to errors in their routine of data extraction. We therefore deleted duplicates in our database, keeping only the latest occurrence of each unique transaction. Six (6) transactions recorded with an amount of zero were also deleted. Two (2) transactions registered in August 2003 with incoherent currencies (Deutsche Mark and Dutch guilder, which were demonetised in early 2002) were also deleted.

    To secure that the import of data from PDF to the database did not contain any systematic errors, for instance due to mistakes in coding, data were checked in two ways. First, because the script identifies the end of a row in the PDF using the amount of the transaction, which can sometimes fail if the amount is not entered correctly, we went through the extracted raw data (2798 rows) and cleaned all rows whose end had not been correctly identified by the script. Next, we manually double-checked the 486 largest transactions representing 90% of the total amount of capital inflows, as well as 140 randomly selected additional rows representing 5% of the total rows, compared the extracted data to the original PDFs, and found no mistakes.

    Transfers recorded in the database have been made in different currencies, including US dollars, Euros, Japanese Yen, Brazilian Reais, and more. The conversion to US dollars of all amounts denominated in other currencies was done using the average monthly exchange rate as published by the International Monetary Fund (International Financial Statistics: Exchange rates, national currency per US dollar, period average). Due to the limited time period, we have not corrected for inflation but aggregated nominal amounts in USD over the period 2000-2011.

    The categories loans, cash in advance (anticipated payment for exports), financed import, and leasing/rental are those used by the Central Bank of Brazil in their published data. They are denominated respectively:
    - "Loans" ("emprestimos" in original source): includes all loans, either contracted directly with creditors or indirectly through the issuance of securities, brokered by foreign agents.
    - "Anticipated payment for exports" ("pagamento/renovacao pagamento antecipado de exportacao" in original source): defined as a type of loan (used in trade finance).
    - "Financed import" ("importacao financiada" in original source): comprises all import financing transactions either direct (contracted by the importer with a foreign bank or with a foreign supplier), or indirect (contracted by Brazilian banks with foreign banks on behalf of Brazilian importers). They must be declared to the Central Bank if their term of payment is superior to 360 days.
    - "Leasing/rental" ("arrendamento mercantil, leasing e aluguel" in original source): concerns all types of external leasing operations consented by a Brazilian entity to a foreign one. They must be declared if the term of payment is superior to 360 days.

    More information about the different categories can be found through the Central Bank online. (Research Data Support provided by Springer Nature)
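
    As a starting point, the cleaned selection file can be inspected with pandas; a minimal sketch (the variable names are not listed here, so the snippet only prints them - consult Metadata_Clean_selection_dataset.csv for the actual definitions):

    import pandas as pd

    rce = pd.read_csv("RCE_Clean_selection_dataset.csv")
    # see Metadata_Clean_selection_dataset.csv for what each variable means
    print(rce.columns.tolist())
    print(rce.head())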

  4. Data from: Large Landing Trajectory Data Set for Go-Around Analysis

    • data.niaid.nih.gov
    • zenodo.org
    Updated Dec 16, 2022
    Cite
    Marcel Dettling (2022). Large Landing Trajectory Data Set for Go-Around Analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7148116
    Explore at:
    Dataset updated
    Dec 16, 2022
    Dataset provided by
    Raphael Monstein
    Manuel Waltert
    Timothé Krauth
    Benoit Figuet
    Marcel Dettling
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Large go-around (also referred to as missed approach) data set. The data set is in support of the paper presented at the OpenSky Symposium on November 10th.

    If you use this data for a scientific publication, please consider citing our paper.

    The data set contains landings from 176 (mostly) large airports from 44 different countries. The landings are labelled as performing a go-around (GA) or not. In total, the data set contains almost 9 million landings with more than 33,000 GAs. The data was collected from the OpenSky Network's historical database for the year 2019. The published data set contains multiple files:

    go_arounds_minimal.csv.gz

    Compressed CSV containing the minimal data set. It contains a row for each landing with a minimal amount of information about the landing and whether it was a GA. The data is structured in the following way:

    • time (date time): UTC time of landing or first GA attempt
    • icao24 (string): Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    • callsign (string): Aircraft identifier in air-ground communications
    • airport (string): ICAO airport code where the aircraft is landing
    • runway (string): Runway designator on which the aircraft landed
    • has_ga (string): "True" if at least one GA was performed, otherwise "False"
    • n_approaches (integer): Number of approaches identified for this flight
    • n_rwy_approached (integer): Number of unique runways approached by this flight

    The last two columns, n_approaches and n_rwy_approached, are useful to filter out training and calibration flights. These usually have a large number of approaches, so an easy way to exclude them is to filter by n_approaches > 2, as sketched below.
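
    A minimal pandas sketch of that filter, using only the columns documented above (note that has_ga is stored as the strings "True"/"False"):

    import pandas as pd

    df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)
    # exclude likely training/calibration flights
    df = df[df["n_approaches"] <= 2]
    # keep only landings with at least one go-around
    gas = df[df["has_ga"].astype(str) == "True"]
    print(len(df), len(gas))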

    go_arounds_augmented.csv.gz

    Compressed CSV containing the augmented data set. It contains a row for each landing with additional information about the landing and whether it was a GA. The data is structured in the following way:

    • time (date time): UTC time of landing or first GA attempt
    • icao24 (string): Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    • callsign (string): Aircraft identifier in air-ground communications
    • airport (string): ICAO airport code where the aircraft is landing
    • runway (string): Runway designator on which the aircraft landed
    • has_ga (string): "True" if at least one GA was performed, otherwise "False"
    • n_approaches (integer): Number of approaches identified for this flight
    • n_rwy_approached (integer): Number of unique runways approached by this flight
    • registration (string): Aircraft registration
    • typecode (string): Aircraft ICAO typecode
    • icaoaircrafttype (string): ICAO aircraft type
    • wtc (string): ICAO wake turbulence category
    • glide_slope_angle (float): Angle of the ILS glide slope in degrees
    • has_intersection (string): Boolean that is true if the runway has another runway intersecting it, otherwise false
    • rwy_length (float): Length of the runway in kilometres
    • airport_country (string): ISO Alpha-3 country code of the airport
    • airport_region (string): Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)
    • operator_country (string): ISO Alpha-3 country code of the operator
    • operator_region (string): Geographical region of the operator of the aircraft (either Europe, North America, South America, Asia, Africa, or Oceania)
    • wind_speed_knts (integer): METAR, surface wind speed in knots
    • wind_dir_deg (integer): METAR, surface wind direction in degrees
    • wind_gust_knts (integer): METAR, surface wind gust speed in knots
    • visibility_m (float): METAR, visibility in m
    • temperature_deg (integer): METAR, temperature in degrees Celsius
    • press_sea_level_p (float): METAR, sea level pressure in hPa
    • press_p (float): METAR, QNH in hPa
    • weather_intensity (list): METAR, list of present weather codes: qualifier - intensity
    • weather_precipitation (list): METAR, list of present weather codes: weather phenomena - precipitation
    • weather_desc (list): METAR, list of present weather codes: qualifier - descriptor
    • weather_obscuration (list): METAR, list of present weather codes: weather phenomena - obscuration
    • weather_other (list): METAR, list of present weather codes: weather phenomena - other

    This data set is augmented with data from various public data sources. Aircraft-related data is mostly from the OpenSky Network's aircraft database, the METAR information is from Iowa State University, and the rest is mostly scraped from different web sites. If you need help with the METAR information, you can consult the WMO's Aerodrome Reports and Forecasts handbook.

    go_arounds_agg.csv.gz

    Compressed CSV containing the aggregated data set. It contains a row for each airport-runway, i.e. every runway at every airport for which data is available. The data is structured in the following way:

    • airport (string): ICAO airport code where the aircraft is landing
    • runway (string): Runway designator on which the aircraft landed
    • n_landings (integer): Total number of landings observed on this runway in 2019
    • ga_rate (float): Go-around rate, per 1000 landings
    • glide_slope_angle (float): Angle of the ILS glide slope in degrees
    • has_intersection (string): Boolean that is true if the runway has another runway intersecting it, otherwise false
    • rwy_length (float): Length of the runway in kilometres
    • airport_country (string): ISO Alpha-3 country code of the airport
    • airport_region (string): Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)

    This aggregated data set is used in the paper for the generalized linear regression model.
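
    For illustration only (this is not the model specification used in the paper), a simple regression on the aggregated file could be set up along these lines, using the documented columns:

    import pandas as pd
    import statsmodels.formula.api as smf

    agg = pd.read_csv("go_arounds_agg.csv.gz")
    # illustrative model of the go-around rate against a few runway attributes;
    # the paper's generalized linear regression model may differ in family and terms
    fit = smf.glm("ga_rate ~ glide_slope_angle + rwy_length + has_intersection", data=agg).fit()
    print(fit.summary())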

    Downloading the trajectories

    Users of this data set with access to OpenSky Network's Impala shell can download the historical trajectories from the historical database with a few lines of Python code. For example, suppose you want to get all the go-arounds of the 4th of January 2019 at London City Airport (EGLC). You can use the Traffic library for easy access to the database:

    import datetime
    from tqdm.auto import tqdm
    import pandas as pd
    from traffic.data import opensky
    from traffic.core import Traffic

    # load minimal data set
    df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)
    df["time"] = pd.to_datetime(df["time"])

    # select London City Airport, go-arounds, and 2019-01-04
    airport = "EGLC"
    start = datetime.datetime(year=2019, month=1, day=4).replace(tzinfo=datetime.timezone.utc)
    stop = datetime.datetime(year=2019, month=1, day=5).replace(tzinfo=datetime.timezone.utc)

    df_selection = df.query("airport==@airport & has_ga & (@start <= time <= @stop)")

    # iterate over flights and pull the data from OpenSky Network
    flights = []
    delta_time = pd.Timedelta(minutes=10)
    for _, row in tqdm(df_selection.iterrows(), total=df_selection.shape[0]):
        # take at most 10 minutes before and 10 minutes after the landing or go-around
        start_time = row["time"] - delta_time
        stop_time = row["time"] + delta_time

        # fetch the data from OpenSky Network
        flights.append(
            opensky.history(
                start=start_time.strftime("%Y-%m-%d %H:%M:%S"),
                stop=stop_time.strftime("%Y-%m-%d %H:%M:%S"),
                callsign=row["callsign"],
                return_flight=True,
            )
        )

    # the flights can be converted into a Traffic object
    Traffic.from_flights(flights)

    Additional files

    Additional files are available to check the quality of the classification into GA/not GA and the selection of the landing runway. These are:

    validation_table.xlsx: This Excel sheet was manually completed during the review of the samples for each runway in the data set. It provides an estimate of the false positive and false negative rate of the go-around classification. It also provides an estimate of the runway misclassification rate when the airport has two or more parallel runways. The columns with the headers highlighted in red were filled in manually, the rest is generated automatically.

    validation_sample.zip: For each runway, 8 batches of 500 randomly selected trajectories (or as many as available, if fewer than 4000) classified as not having a GA and up to 8 batches of 10 random landings, classified as GA, are plotted. This allows the interested user to visually inspect a random sample of the landings and go-arounds easily.

  5. Csv Pharmaceuticals India Private Limited | See Full Import/Export Data |...

    • eximpedia.app
    Updated Jan 19, 2024
    Cite
    Seair Exim (2024). Csv Pharmaceuticals India Private Limited | See Full Import/Export Data | Eximpedia [Dataset]. https://www.eximpedia.app/
    Explore at:
    .bin, .xml, .csv, .xls (available download formats)
    Dataset updated
    Jan 19, 2024
    Dataset provided by
    Eximpedia Export Import Trade Data
    Eximpedia PTE LTD
    Authors
    Seair Exim
    Area covered
    India
    Description

    Eximpedia export-import trade data lets you search trade data and active exporters, importers, buyers, suppliers, and manufacturer-exporters from over 209 countries.

  6. Industrial Park Management Bureau of the Ministry of Economic...

    • data.gov.tw
    csv
    Updated Jun 2, 2025
    Cite
    Bureau of Industrial Parks, Ministry of Economic Affairs (2025). Industrial Park Management Bureau of the Ministry of Economic Affairs_Statistics on Import and Export Trade Volume of Science and Technology Industrial Parks [Dataset]. https://data.gov.tw/en/datasets/25792
    Explore at:
    csv (available download formats)
    Dataset updated
    Jun 2, 2025
    Dataset authored and provided by
    Bureau of Industrial Parks, Ministry of Economic Affairs
    License

    https://data.gov.tw/license

    Description

    Provides "Statistics of Import and Export Trade Volume of Each Park" to let the public understand the import and export volume and growth trend of each park. In addition to updating this information every month, a CSV file is also provided for free download and use by the public. The dataset includes statistics on the import and export trade volume of parks such as Nanzih, Kaohsiung, Taichung, Zhonggang, Pingtung, and other parks (Lingguang, Chenggong, Gaoruan), with main fields including "Park", "Import and Export (This Month, Year-to-Date)", "Export (This Month, Year-to-Date)", "Import (This Month, Year-to-Date)", and other important information.

  7. Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic...

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv, zip
    Updated Dec 24, 2022
    Cite
    Alexander R. Hartloper; Alexander R. Hartloper; Selimcan Ozden; Albano de Castro e Sousa; Dimitrios G. Lignos; Dimitrios G. Lignos; Selimcan Ozden; Albano de Castro e Sousa (2022). Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic Materials [Dataset]. http://doi.org/10.5281/zenodo.6965147
    Explore at:
    bin, zip, csv (available download formats)
    Dataset updated
    Dec 24, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alexander R. Hartloper; Alexander R. Hartloper; Selimcan Ozden; Albano de Castro e Sousa; Dimitrios G. Lignos; Dimitrios G. Lignos; Selimcan Ozden; Albano de Castro e Sousa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic Materials

    Background

    This dataset contains data from monotonic and cyclic loading experiments on structural metallic materials. The materials are primarily structural steels; one iron-based shape memory alloy is also included. Summary files are included that provide an overview of the database, and data from the individual experiments are also included.

    The files included in the database are outlined below and the format of the files is briefly described. Additional information regarding the formatting can be found through the post-processing library (https://github.com/ahartloper/rlmtp/tree/master/protocols).

    Usage

    • The data is licensed through the Creative Commons Attribution 4.0 International.
    • If you have used our data and are publishing your work, we ask that you please reference both:
      1. this database through its DOI, and
      2. any publication that is associated with the experiments. See the Overall_Summary and Database_References files for the associated publication references.

    Included Files

    • Overall_Summary_2022-08-25_v1-0-0.csv: summarises the specimen information for all experiments in the database.
    • Summarized_Mechanical_Props_Campaign_2022-08-25_v1-0-0.csv: summarises the average initial yield stress and average initial elastic modulus per campaign.
    • Unreduced_Data-#_v1-0-0.zip: contains the original (not downsampled) data
      • Where # is one of: 1, 2, 3, 4, 5, 6. The unreduced data is broken into separate archives because of upload limitations to Zenodo. Together they provide all the experimental data.
      • We recommend you un-zip all the folders and place them in one "Unreduced_Data" directory, similar to the "Clean_Data" directory
      • The experimental data is provided through .csv files for each test that contain the processed data. The experiments are organised by experimental campaign and named by load protocol and specimen. A .pdf file accompanies each test showing the stress-strain graph.
      • There is a "db_tag_clean_data_map.csv" file that is used to map the database summary with the unreduced data.
      • The computed yield stresses and elastic moduli are stored in the "yield_stress" directory.
    • Clean_Data_v1-0-0.zip: contains all the downsampled data
      • The experimental data is provided through .csv files for each test that contain the processed data. The experiments are organised by experimental campaign and named by load protocol and specimen. A .pdf file accompanies each test showing the stress-strain graph.
      • There is a "db_tag_clean_data_map.csv" file that is used to map the database summary with the clean data.
      • The computed yield stresses and elastic moduli are stored in the "yield_stress" directory.
    • Database_References_v1-0-0.bib
      • Contains a bibtex reference for many of the experiments in the database. Corresponds to the "citekey" entry in the summary files.

    File Format: Downsampled Data

    These are the "LP_

    • The header of the first column is empty: the first column corresponds to the index of the sample point in the original (unreduced) data
    • Time[s]: time in seconds since the start of the test
    • e_true: true strain
    • Sigma_true: true stress in MPa
    • (optional) Temperature[C]: the surface temperature in degC

    These data files can be easily loaded using the pandas library in Python through:

    import pandas
    data = pandas.read_csv(data_file, index_col=0)

    The data is formatted so it can be used directly in RESSPyLab (https://github.com/AlbanoCastroSousa/RESSPyLab). Note that the column names "e_true" and "Sigma_true" were kept for backwards compatibility reasons with RESSPyLab.

    File Format: Unreduced Data

    These are the "LP_

    • The first column is the index of each data point
    • S/No: sample number recorded by the DAQ
    • System Date: Date and time of sample
    • Time[s]: time in seconds since the start of the test
    • C_1_Force[kN]: load cell force
    • C_1_Déform1[mm]: extensometer displacement
    • C_1_Déplacement[mm]: cross-head displacement
    • Eng_Stress[MPa]: engineering stress
    • Eng_Strain[]: engineering strain
    • e_true: true strain
    • Sigma_true: true stress in MPa
    • (optional) Temperature[C]: specimen surface temperature in degC

    The data can be loaded and used similarly to the downsampled data.

    File Format: Overall_Summary

    The overall summary file provides data on all the test specimens in the database. The columns include:

    • hidden_index: internal reference ID
    • grade: material grade
    • spec: specifications for the material
    • source: base material for the test specimen
    • id: internal name for the specimen
    • lp: load protocol
    • size: type of specimen (M8, M12, M20)
    • gage_length_mm_: unreduced section length in mm
    • avg_reduced_dia_mm_: average measured diameter for the reduced section in mm
    • avg_fractured_dia_top_mm_: average measured diameter of the top fracture surface in mm
    • avg_fractured_dia_bot_mm_: average measured diameter of the bottom fracture surface in mm
    • fy_n_mpa_: nominal yield stress
    • fu_n_mpa_: nominal ultimate stress
    • t_a_deg_c_: ambient temperature in degC
    • date: date of test
    • investigator: person(s) who conducted the test
    • location: laboratory where test was conducted
    • machine: setup used to conduct test
    • pid_force_k_p, pid_force_t_i, pid_force_t_d: PID parameters for force control
    • pid_disp_k_p, pid_disp_t_i, pid_disp_t_d: PID parameters for displacement control
    • pid_extenso_k_p, pid_extenso_t_i, pid_extenso_t_d: PID parameters for extensometer control
    • citekey: reference corresponding to the Database_References.bib file
    • yield_stress_mpa_: computed yield stress in MPa
    • elastic_modulus_mpa_: computed elastic modulus in MPa
    • fracture_strain: computed average true strain across the fracture surface
    • c,si,mn,p,s,n,cu,mo,ni,cr,v,nb,ti,al,b,zr,sn,ca,h,fe: chemical compositions in units of %mass
    • file: file name of corresponding clean (downsampled) stress-strain data
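
    A minimal sketch tying the summary to the per-test files (the relative path of the clean data files is an assumption - adjust it to wherever Clean_Data_v1-0-0.zip was extracted):

    import pandas

    summary = pandas.read_csv("Overall_Summary_2022-08-25_v1-0-0.csv")
    row = summary.iloc[0]
    # the "file" column names the clean (downsampled) stress-strain data for each test
    data = pandas.read_csv("Clean_Data/" + row["file"], index_col=0)
    print(row["grade"], data[["e_true", "Sigma_true"]].head())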

    File Format: Summarized_Mechanical_Props_Campaign

    Meant to be loaded in Python as a pandas DataFrame with multi-indexing, e.g.,

    import pandas as pd

    # for the file listed above, date = '2022-08-25_' and version = 'v1-0-0'
    tab1 = pd.read_csv('Summarized_Mechanical_Props_Campaign_' + date + version + '.csv',
              index_col=[0, 1, 2, 3], skipinitialspace=True, header=[0, 1],
              keep_default_na=False, na_values='')
    • citekey: reference in "Campaign_References.bib".
    • Grade: material grade.
    • Spec.: specifications (e.g., J2+N).
    • Yield Stress [MPa]: initial yield stress in MPa
      • size, count, mean, coefvar: number of experiments in campaign, number of experiments in mean, mean value for campaign, coefficient of variation for campaign
    • Elastic Modulus [MPa]: initial elastic modulus in MPa
      • size, count, mean, coefvar: number of experiments in campaign, number of experiments in mean, mean value for campaign, coefficient of variation for campaign

    Caveats

    • The files in the following directories were tested before the protocol was established. Therefore, only the true stress-strain is available for each:
      • A500
      • A992_Gr50
      • BCP325
      • BCR295
      • HYP400
      • S460NL
      • S690QL/25mm
      • S355J2_Plates/S355J2_N_25mm and S355J2_N_50mm
  8. Speedtest Open Data - Four International cities - MEL, BKK, SHG, LAX plus...

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    Richard Ferrers; Speedtest Global Index (2023). Speedtest Open Data - Four International cities - MEL, BKK, SHG, LAX plus ALC - 2020, 2022 [Dataset]. http://doi.org/10.6084/m9.figshare.13621169.v24
    Explore at:
    txt (available download formats)
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Richard Ferrers; Speedtest Global Index
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset compares FIXED-line broadband internet speeds across four cities, plus Alice Springs:
    - Melbourne, AU
    - Bangkok, TH
    - Shanghai, CN
    - Los Angeles, US
    - Alice Springs, AU

    ERRATA:
    1. Data is for Q3 2020, but some files are labelled incorrectly as 02-20 or June 20. They should all read Sept 20, or 09-20 as Q3 20, rather than Q2. Will rename and reload. Amended in v7.
    2. LAX file named 0320, when it should be Q320. Amended in v8.

    Lines of data for each geojson file; a line equates to a 600m^2 location, inc total tests, devices used, and average upload and download speed:
    - MEL 16181 locations/lines => 0.85M speedtests (16.7 tests per 100 people)
    - SHG 31745 lines => 0.65M speedtests (2.5/100pp)
    - BKK 29296 lines => 1.5M speedtests (14.3/100pp)
    - LAX 15899 lines => 1.3M speedtests (10.4/100pp)
    - ALC 76 lines => 500 speedtests (2/100pp)

    Geojsons of these 2° by 2° extracts for MEL, BKK, SHG now added, and LAX added v6. Alice Springs added v15.

    This dataset unpacks, geospatially, data summaries provided in Speedtest Global Index (linked below). See Jupyter Notebook (*.ipynb) to interrogate geo data. See link to install Jupyter.
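
    To poke at one of the geojson extracts without the notebook, something like the following can work; a rough sketch (the file name is hypothetical and the property names are assumed to follow the Ookla open-data convention, e.g. avg_d_kbps - check the actual geojson properties first):

    import geopandas as gpd

    tiles = gpd.read_file("MEL.geojson")  # hypothetical file name; use the actual extract
    # avg_d_kbps is an assumed column name (download speed in kbps); verify before use
    share_superfast = (tiles["avg_d_kbps"] >= 100_000).mean()
    print(f"{share_superfast:.1%} of locations at or above 100 Mbps")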

    ** To Do
    Will add Google Map versions so everyone can see without installing Jupyter.
    - Link to Google Map (BKK) added below. Key: Green > 100Mbps (Superfast). Black > 500Mbps (Ultrafast). CSV provided. Code in Speedtestv1.1.ipynb Jupyter Notebook.
    - Community (Whirlpool) surprised [Link: https://whrl.pl/RgAPTl] that Melb has 20% at or above 100Mbps. Suggest plot Top 20% on map for community. Google Map link - now added (and tweet).

    ** Python
    melb = au_tiles.cx[144:146, -39:-37]  # Lat/Lon extract
    shg = tiles.cx[120:122, 30:32]     # Lat/Lon extract
    bkk = tiles.cx[100:102, 13:15]     # Lat/Lon extract
    lax = tiles.cx[-118:-120, 33:35]    # Lat/Lon extract
    ALC = tiles.cx[132:134, -22:-24]    # Lat/Lon extract

    Histograms (v9) and data visualisations (v3, 5, 9, 11) will be provided. Data sourced from: this is an extract of Speedtest Open Data available at Amazon Web Services (link below - opendata.aws).

    ** VERSIONS
    v24. Add tweet and google map of Top 20% (over 100Mbps locations) in Mel Q322. Add v.1.5 MEL-Superfast notebook, and CSV of results (now on Google Map; link below).
    v23. Add graph of 2022 Broadband distribution, and compare 2020 - 2022. Updated v1.4 Jupyter notebook.
    v22. Add Import ipynb; workflow-import-4cities.
    v21. Add Q3 2022 data; five cities inc ALC. Geojson files. (2020: 4.3M tests; 2022: 2.9M tests)

    Melb 14784 lines Avg download speed 69.4M Tests 0.39M

    SHG 31207 lines Avg 233.7M Tests 0.56M

    ALC 113 lines Avg 51.5M Test 1092

    BKK 29684 lines Avg 215.9M Tests 1.2M

    LAX 15505 lines Avg 218.5M Tests 0.74M

    v20. Speedtest - Five Cities inc ALC.
    v19. Add ALC2.ipynb.
    v18. Add ALC line graph.
    v17. Added ipynb for ALC. Added ALC to title.
    v16. Load Alice Springs Data Q221 - csv. Added Google Map link of ALC.
    v15. Load Melb Q1 2021 data - csv.
    v14. Added Melb Q1 2021 data - geojson.
    v13. Added Twitter link to pics.
    v12. Add Line-Compare pic (fastest 1000 locations) inc Jupyter (nbn-intl-v1.2.ipynb).
    v11. Add Line-Compare pic, plotting Four Cities on a graph.
    v10. Add Four Histograms in one pic.
    v9. Add Histogram for Four Cities. Add NBN-Intl.v1.1.ipynb (Jupyter Notebook).
    v8. Renamed LAX file to Q3, rather than 03.
    v7. Amended file names of BKK files to correctly label as Q3, not Q2 or 06.
    v6. Added LAX file.
    v5. Add screenshot of BKK Google Map.
    v4. Add BKK Google map (link below), and BKK csv mapping files.
    v3. Replaced MEL map with big key version. Prev key was very tiny in top right corner.
    v2. Uploaded MEL, SHG, BKK data and Jupyter Notebook.
    v1. Metadata record.

    ** LICENCE
    The AWS data licence on Speedtest data is "CC BY-NC-SA 4.0", so use of this data must be:
    - non-commercial (NC)
    - share-alike (SA) on reuse (add the same licence).
    This restricts the standard CC-BY Figshare licence.

    ** Other uses of Speedtest Open Data; - see link at Speedtest below.

  9. Csv Active S R L | See Full Import/Export Data | Eximpedia

    • eximpedia.app
    Updated Feb 6, 2025
    Cite
    Seair Exim (2025). Csv Active S R L | See Full Import/Export Data | Eximpedia [Dataset]. https://www.eximpedia.app/
    Explore at:
    .bin, .xml, .csv, .xls (available download formats)
    Dataset updated
    Feb 6, 2025
    Dataset provided by
    Eximpedia Export Import Trade Data
    Eximpedia PTE LTD
    Authors
    Seair Exim
    Area covered
    Poland, India, Israel, Moldova (Republic of), Equatorial Guinea, Paraguay, Suriname, Morocco, Guinea-Bissau, Timor-Leste
    Description

    Csv Active S R L Company Export Import Records. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.

  10. Data from: WormJam Metabolites Local CSV for MetFrag

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Witting, Michael (2020). WormJam Metabolites Local CSV for MetFrag [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3403364
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Schymanski, Emma
    Witting, Michael
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a local CSV file of WormJam (https://www.tandfonline.com/doi/full/10.1080/21624054.2017.1373939) for MetFrag (https://msbi.ipb-halle.de/MetFrag/).

    The text file provided by Michael (also part of this dataset) was modified into CSV by adding identifiers and adjusting headers for MetFrag import.

    This CSV file is for users wanting to integrate WormJam into MetFrag CL workflows (offline). This file will be integrated into MetFrag online; please use the file in the dropdown menu rather than uploading this one.

    Update 10 Sept 2019: curated truncated InChIKey, InChI entries, added missing SMILES, added DTXSIDs by InChIKey match.

  11. Data from: Data and code from: Environmental influences on drying rate of...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Data and code from: Environmental influences on drying rate of spray applied disinfestants from horticultural production services [Dataset]. https://catalog.data.gov/dataset/data-and-code-from-environmental-influences-on-drying-rate-of-spray-applied-disinfestants-
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    This dataset includes all the data and R code needed to reproduce the analyses in a forthcoming manuscript: Copes, W. E., Q. D. Read, and B. J. Smith. Environmental influences on drying rate of spray applied disinfestants from horticultural production services. PhytoFrontiers, DOI pending.

    Study description: Instructions for disinfestants typically specify a dose and a contact time to kill plant pathogens on production surfaces. A problem occurs when disinfestants are applied to large production areas where the evaporation rate is affected by weather conditions. The common contact time recommendation of 10 min may not be achieved under hot, sunny conditions that promote fast drying. This study is an investigation into how the evaporation rates of six commercial disinfestants vary when applied to six types of substrate materials under cool to hot and cloudy to sunny weather conditions. Initially, disinfestants with low surface tension spread out to provide 100% coverage and disinfestants with high surface tension beaded up to provide about 60% coverage when applied to hard smooth surfaces. Disinfestants applied to porous materials were quickly absorbed into the body of the material, such as wood and concrete. Even though disinfestants evaporated faster under hot sunny conditions than under cool cloudy conditions, coverage was reduced considerably in the first 2.5 min under most weather conditions and reduced to less than or equal to 50% coverage by 5 min.

    Dataset contents: This dataset includes R code to import the data and fit Bayesian statistical models using the model fitting software CmdStan, interfaced with R using the packages brms and cmdstanr. The models (one for 2022 and one for 2023) compare how quickly different spray-applied disinfestants dry, depending on what chemical was sprayed, what surface material it was sprayed onto, and what the weather conditions were at the time. Next, the statistical models are used to generate predictions and compare mean drying rates between the disinfestants, surface materials, and weather conditions. Finally, tables and figures are created.

    These files are included:
    • Drying2022.csv: drying rate data for the 2022 experimental run
    • Weather2022.csv: weather data for the 2022 experimental run
    • Drying2023.csv: drying rate data for the 2023 experimental run
    • Weather2023.csv: weather data for the 2023 experimental run
    • disinfestant_drying_analysis.Rmd: RMarkdown notebook with all data processing, analysis, and table creation code
    • disinfestant_drying_analysis.html: rendered output of notebook
    • MS_figures.R: additional R code to create figures formatted for journal requirements
    • fit2022_discretetime_weather_solar.rds: fitted brms model object for 2022. This will allow users to reproduce the model prediction results without having to refit the model, which was originally fit on a high-performance computing cluster
    • fit2023_discretetime_weather_solar.rds: fitted brms model object for 2023
    • data_dictionary.xlsx: descriptions of each column in the CSV data files

  12. ckanext-trak

    • catalog.civicdataecosystem.org
    Updated Jun 4, 2025
    Cite
    (2025). ckanext-trak [Dataset]. https://catalog.civicdataecosystem.org/dataset/ckanext-trak
    Explore at:
    Dataset updated
    Jun 4, 2025
    Description

    The trak extension for CKAN enhances the platform's tracking capabilities by providing tools to import Google Analytics data and modify the presentation of page view statistics. It introduces a paster command for importing page view data from exported Google Analytics CSV files, enabling users to supplement CKAN's built-in tracking. The extension also includes template customizations to alter how page view counts are displayed on dataset and resource listing pages.

    Key Features:
    • Google Analytics Data Import: Imports page view data directly from a stripped-down CSV of Google Analytics data using a dedicated paster command (csv2table). The CSV should contain a list of page views, where each row starts with '/'. The PageViews column is expected to be the 3rd column.
    • Customizable Page View Display: Changes the default presentation of page view statistics within CKAN, removing the minimum view count restriction (default is 10) so all views can be seen, and modifies UI elements.
    • Altered Page Tracking Stats: Alters the placement of page tracking statistics, moving them below Package Data (on dataset list pages) and Resource Data (on resource list pages) for better integration of tracking data.
    • UI/UX Enhancements: Replaces the flame icon typically used for page tracking and substitutes it with more subtle background styling to modernize the presentation of tracking data.
    • Backend Data Manipulation: Uses a 'floor date' of 2011-01-01 for page view calculation. Entries are made in the trackingraw table for each view, with a unique UUID.

    Integration with CKAN: The extension integrates into CKAN's core functionalities by introducing a new paster command and modifying existing templates for displaying page view statistics. It relies on CKAN's built-in tracking to be enabled, but supplements its capabilities with imported data and presentation adjustments. After importing data using the csv2table paster command, the standard tracking update and search-index rebuild paster tasks need to be run to process the imported data and update the search index.

    Benefits & Impact: By importing data from Google Analytics, the trak extension allows administrators to see a holistic view of page views. It changes the user experience to facilitate tracking statistics in a more integrated fashion. This allows for a better understanding of the impact and utilization of resources within the CKAN instance, based on Google Analytics data.

  13. LeetCode CN Problems

    • kaggle.com
    Updated Apr 5, 2024
    Cite
    imba-tjd (2024). LeetCode CN Problems [Dataset]. https://www.kaggle.com/datasets/imbatjd/leetcode-cn-problems/data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 5, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    imba-tjd
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The data was collected on 2024-04-05 and contains 3492 problems.

    Cleaned via the following script.

    import json
    import csv
    from io import TextIOWrapper
    
    
    def clean(data: dict):
      questions = data['data']['problemsetQuestionList']['questions']
      for q in questions:
        yield {
          'id': q['frontendQuestionId'],
          'difficulty': q['difficulty'],
          'title': q['title'],
          'titleCn': q['titleCn'],
          'titleSlug': q['titleSlug'],
          'paidOnly': q['paidOnly'],
          'acRate': round(q['acRate'], 3),
          'topicTags': [t['name'] for t in q['topicTags']],
        }
    
    
    def out_jsonl(f: TextIOWrapper):
      for id in range(0, 35):
        with open(f'data/{id}.json', encoding='u8') as f2:
          data = json.load(f2)
    
        for q in clean(data):
          f.write(json.dumps(q, ensure_ascii=False))
          f.write('\n')
    
    
    def out_json(f: TextIOWrapper):
      l = []
      for id in range(0, 35):
        with open(f'data/{id}.json', encoding='u8') as f2:
          data = json.load(f2)
    
        for q in clean(data):
          l.append(q)
    
      json.dump(l, f, ensure_ascii=False)
    
    
    def out_csv(f: TextIOWrapper):
      writer = csv.DictWriter(f, fieldnames=[
        'id', 'difficulty', 'title', 'titleCn', 'titleSlug', 'paidOnly', 'acRate', 'topicTags'
      ])
      writer.writeheader()
    
      for id in range(0, 35):
        with open(f'data/{id}.json', encoding='u8') as f2:
          data = json.load(f2)
    
        writer.writerows(clean(data))
    
    
    with open('data.jsonl', 'w', encoding='u8') as f:
      out_jsonl(f)
    
    with open('data.json', 'w', encoding='u8') as f:
      out_json(f)
    
    with open('data.csv', 'w', encoding='u8', newline='') as f:
      out_csv(f)
    
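    Once the script has run, the generated files can be loaded back for analysis; a small sketch using pandas (not part of the original script):

    import pandas as pd

    # data.jsonl is written one JSON object per line by out_jsonl above
    problems = pd.read_json("data.jsonl", lines=True)
    print(problems["difficulty"].value_counts())
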
  14. Crypto Price Monitoring Dataset for On-chain Derivatives Research

    • zenodo.org
    csv
    Updated Mar 19, 2023
    Cite
    Ivan Vakhmyanin; Yana Volkovich; Ivan Vakhmyanin; Yana Volkovich (2023). Crypto Price Monitoring Dataset for On-chain Derivatives Research [Dataset]. http://doi.org/10.5281/zenodo.7749133
    Explore at:
    csv (available download formats)
    Dataset updated
    Mar 19, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ivan Vakhmyanin; Yana Volkovich; Ivan Vakhmyanin; Yana Volkovich
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    # Crypto Price Monitoring Repository

    This repository contains two CSV data files that were created to support the research titled "Price Arbitrage for DeFi Derivatives." This research is to be presented at the IEEE International Conference on Blockchain and Cryptocurrencies, taking place on 5th May 2023 in Dubai, UAE. The data files include monitoring prices for various crypto assets from several sources. The data files are structured with five columns, providing information about the symbol, unified symbol, time, price, and source of the price.

    ## Data Files

    There are two CSV data files in this repository (one for each date):

    1. `Pricemon_results_2023_01_13.csv`
    2. `Pricemon_results_2023_01_14.csv`

    ## Data Format

    Both data files have the same format and structure, with the following five columns:

    1. `symbol`: The trading symbol for the crypto asset (e.g., BTC, ETH).
    2. `unified_symbol`: A standardized symbol used across different platforms.
    3. `time`: Timestamp for when the price data was recorded (in UTC format).
    4. `price`: The price of the crypto asset at the given time (in USD).
    5. `source`: The name of the price source for the data.

    ## Price Sources

    The `source` column in the data files refers to the provider of the price data for each record. The sources include:

    - `chainlink`: Chainlink Price Oracle
    - `mycellium`: Built-in oracle of the Mycellium platform
    - `bitfinex`: Bitfinex cryptocurrency exchange
    - `ftx`: FTX cryptocurrency exchange
    - `binance`: Binance cryptocurrency exchange

    ## Usage

    You can use these data files for various purposes, such as analyzing price discrepancies across different sources, identifying trends, or developing trading algorithms. To use the data, simply import the CSV files into your preferred data processing or analysis tool.

    ### Example

    Here's an example of how you can read and display the data using Python and the pandas library:

    import pandas as pd

    # Read the data from CSV file
    data = pd.read_csv('Pricemon_results_2023_01_13.csv')

    # Display the first 5 rows of the data
    print(data.head())
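
    Building on that, prices can be compared across sources by pivoting on the source column; a sketch (the exact unified_symbol values depend on the files, so the filter value below is an assumption):

    # compare one asset's price across sources over time
    btc = data[data["unified_symbol"] == "BTC"]   # adjust to a value present in the file
    by_source = btc.pivot_table(index="time", columns="source", values="price")
    print(by_source.head())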

    ## Acknowledgements

    These datasets were recorded and supported by Datamint company (value-added on-chain data provider) and its team.


    ## Contributing

    If you have any suggestions or find any issues with the data, please feel free to contact authors.

  15. Smartwatch Purchase Data

    • kaggle.com
    Updated Dec 30, 2022
    Cite
    Aayush Chourasiya (2022). Smartwatch Purchase Data [Dataset]. https://www.kaggle.com/datasets/albedo0/smartwatch-purchase-data/versions/2
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 30, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Aayush Chourasiya
    Description

    Disclaimer: This is artificially generated data, created using a Python script based on the arbitrary assumptions listed below.

    The data consists of 100,000 examples of training data and 10,000 examples of test data, each representing a user who may or may not buy a smart watch.

    ----- Version 1 -------

    trainingDataV1.csv, testDataV1.csv (or trainingData.csv, testData.csv). The data includes the following features for each user:
    1. age: The age of the user (integer, 18-70)
    2. income: The income of the user (integer, 25,000-200,000)
    3. gender: The gender of the user (string, "male" or "female")
    4. maritalStatus: The marital status of the user (string, "single", "married", or "divorced")
    5. hour: The hour of the day (integer, 0-23)
    6. weekend: A boolean indicating whether it is the weekend (True or False)

    The data also includes a label for each user indicating whether they are likely to buy a smart watch or not (string, "yes" or "no"). The label is determined based on the following arbitrary conditions:
    - If the user is divorced and a random number generated by the script is less than 0.4, the label is "no" (i.e., assuming 40% of divorcees are not likely to buy a smart watch).
    - If it is the weekend and a random number generated by the script is less than 1.3, the label is "yes" (i.e., assuming sales are 30% more likely to occur on weekends).
    - If the user is male and under 30 with an income over 75,000, the label is "yes".
    - If the user is female and 30 or over with an income over 100,000, the label is "yes".
    - Otherwise, the label is "no".

    The training data is intended to be used to build and train a classification model, and the test data is intended to be used to evaluate the performance of the trained model.
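
    As one possible baseline (not part of the dataset itself), a classifier can be trained on the documented columns with scikit-learn; a minimal sketch:

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score

    train = pd.read_csv("trainingData.csv")
    test = pd.read_csv("testData.csv")

    features = ["age", "income", "gender", "maritalStatus", "hour", "weekend"]
    X_train = pd.get_dummies(train[features])
    X_test = pd.get_dummies(test[features]).reindex(columns=X_train.columns, fill_value=0)

    clf = RandomForestClassifier(random_state=0)
    clf.fit(X_train, train["buySmartWatch"])
    print(accuracy_score(test["buySmartWatch"], clf.predict(X_test)))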

    The following Python script was used to generate this dataset:

    import random
    import csv
    
    # Set the number of examples to generate
    numExamples = 100000
    
    # Generate the training data
    with open("trainingData.csv", "w", newline="") as csvfile:
      fieldnames = ["age", "income", "gender", "maritalStatus", "hour", "weekend", "buySmartWatch"]
      writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    
      writer.writeheader()
    
      for i in range(numExamples):
        age = random.randint(18, 70)
        income = random.randint(25000, 200000)
        gender = random.choice(["male", "female"])
        maritalStatus = random.choice(["single", "married", "divorced"])
        hour = random.randint(0, 23)
        weekend = random.choice([True, False])
    
        # Randomly assign the label based on some arbitrary conditions
        # assuming 40% of divorcees won't buy a smart watch
        if maritalStatus == "divorced" and random.random() < 0.4:
          buySmartWatch = "no"
        # assuming sales are 30% more likely to occur on weekends
        # (note: random.random() < 1.3 is always True, so every weekend row that
        # reaches this branch is labeled "yes")
        elif weekend == True and random.random() < 1.3:
          buySmartWatch = "yes"
        elif gender == "male" and age < 30 and income > 75000:
          buySmartWatch = "yes"
        elif gender == "female" and age >= 30 and income > 100000:
          buySmartWatch = "yes"
        else:
          buySmartWatch = "no"
    
        writer.writerow({
          "age": age,
          "income": income,
          "gender": gender,
          "maritalStatus": maritalStatus,
          "hour": hour,
          "weekend": weekend,
          "buySmartWatch": buySmartWatch
        })
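    As context for the stated purpose of the files, here is a minimal, hypothetical sketch of the train/evaluate step (assuming pandas and scikit-learn are installed and that both trainingData.csv and testData.csv have been generated with the script above; it is not part of the original dataset description):

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score

    # Load the generated training and test files described above.
    train = pd.read_csv("trainingData.csv")
    test = pd.read_csv("testData.csv")

    features = ["age", "income", "gender", "maritalStatus", "hour", "weekend"]
    target = "buySmartWatch"

    # One-hot encode the categorical columns so the classifier can use them.
    X_train = pd.get_dummies(train[features])
    X_test = pd.get_dummies(test[features]).reindex(columns=X_train.columns, fill_value=0)

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train, train[target])

    print("test accuracy:", accuracy_score(test[target], clf.predict(X_test)))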
    

    ----- Version 2 -------

    trainingDataV2.csv, testDataV2.csv. The data includes the following features for each user:

    1. age: The age of the user (integer, 18-70)
    2. income: The income of the user (integer, 25,000-200,000)
    3. gender: The gender of the user (string, "male" or "female")
    4. maritalStatus: The marital status of the user (string, "single", "married", or "divorced")
    5. educationLevel: The education level of the user (string, "high school", "associate's degree", "bachelor's degree", "master's degree", or "doctorate")
    6. occupation: The occupation of the user (string, "tech worker", "manager", "executive", "sales", "customer service", "creative", "manual labor", "healthcare", "education", "government", "unemployed", or "student")
    7. familySize: The number of people in the user's family (integer, 1-5)
    8. fitnessInterest: A boolean indicating whether the user is interested in fitness (True or False)
    9. priorSmartwatchOwnership: A boolean indicating whether the user has owned a smartwatch in the past (True or False)
    10. hour: The hour of the day when the user was surveyed (integer, 0-23)
    11. weekend: A boolean indicating whether the user was surveyed on a weekend (True or False)
    12. buySmartWatch: A boolean indicating whether the user purchased a smartwatch (True or False)

    Python script used to generate the data:

    import random
    import csv
    
    # Set the number of examples to generate
    numExamples = 100000
    
    with open("t...
    
  16. U.S. Food Imports

    • catalog.data.gov
    • cloud.csiss.gmu.edu
    • +5more
    Updated Apr 21, 2025
    + more versions
    Cite
    Economic Research Service, Department of Agriculture (2025). U.S. Food Imports [Dataset]. https://catalog.data.gov/dataset/u-s-food-imports
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Economic Research Service (http://www.ers.usda.gov/)
    Area covered
    United States
    Description

    U.S. consumers demand variety, quality, and convenience in the foods they consume. As Americans have become wealthier and more ethnically diverse, the American food basket reflects a growing share of tropical products, spices, and imported gourmet products. Seasonal and climatic factors drive U.S. imports of popular types of fruits and vegetables and tropical products, such as cocoa and coffee. In addition, a growing share of U.S. imports can be attributed to intra-industry trade, whereby agricultural-processing industries based in the United States carry out certain processing steps offshore and import products at different levels of processing from their subsidiaries in foreign markets. This data set provides import values of edible products (food and beverages) entering U.S. ports and their origin of shipment. Data are from the U.S. Department of Commerce, U.S. Census Bureau. Food and beverage import values are compiled by calendar year into food groups corresponding to major commodities or level of processing. At least 10 years of annual data are included, enabling users to track long-term growth patterns.
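    As a rough illustration of how the annual, food-group-level import values described above might be explored (the file and column names here are hypothetical placeholders, since the exact layout of the downloaded CSV is not specified in this description):

    import pandas as pd

    # Hypothetical file and column names for the ERS food-import series.
    imports = pd.read_csv("us_food_imports.csv")  # assumed columns: FoodGroup, Year, ImportValue

    # Total import value per food group and calendar year.
    annual = imports.groupby(["FoodGroup", "Year"], as_index=False)["ImportValue"].sum()

    # Year-over-year growth within each food group, to track long-term patterns.
    annual["YoYGrowth"] = annual.groupby("FoodGroup")["ImportValue"].pct_change()

    print(annual.head())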

  17. letter_recognition

    • huggingface.co
    Updated Sep 17, 2023
    Cite
    Pittawat Taveekitworachai (2023). letter_recognition [Dataset]. https://huggingface.co/datasets/pittawat/letter_recognition
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 17, 2023
    Authors
    Pittawat Taveekitworachai
    Description

    Dataset Card for "letter_recognition"

    Images in this dataset were generated using the script excerpted below. The original dataset in CSV format, along with more information about it, is available at A-Z Handwritten Alphabets in .csv format.

    import os
    import pandas as pd
    import matplotlib.pyplot as plt

    CHARACTER_COUNT = 26

    data = pd.read_csv('./A_Z Handwritten Data.csv')
    mapping = {str(i): chr(i+65) for i in range(26)}

    def generate_dataset(folder, end, start=0): if not… See the full description on the dataset page: https://huggingface.co/datasets/pittawat/letter_recognition.
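    For convenience, a minimal sketch of loading the published images with the Hugging Face datasets library (assuming the package is installed; the split name "train" is an assumption and may differ on the dataset page):

    from datasets import load_dataset

    # Download the image dataset from the Hugging Face Hub.
    ds = load_dataset("pittawat/letter_recognition")

    print(ds)                 # lists the available splits and features
    example = ds["train"][0]  # assumes a "train" split exists
    print(example.keys())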

  18. Students Test Data

    • kaggle.com
    Updated Sep 12, 2023
    Cite
    ATHARV BHARASKAR (2023). Students Test Data [Dataset]. https://www.kaggle.com/datasets/atharvbharaskar/students-test-data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 12, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ATHARV BHARASKAR
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    Dataset Overview: This dataset pertains to the examination results of students who participated in a series of academic assessments at a fictitious educational institution named "University of Exampleville." The assessments were administered across various courses and academic levels, with a focus on evaluating students' performance in general management and domain-specific topics.

    Columns: The dataset comprises 12 columns, each representing specific attributes and performance indicators of the students. These columns encompass information such as the students' names (which have been anonymized), their respective universities, academic program names (including BBA and MBA), specializations, the semester of the assessment, the type of examination domain (general management or domain-specific), general management scores (out of 50), domain-specific scores (out of 50), total scores (out of 100), student ranks, and percentiles.

    Data Collection: The examination data was collected during a standardized assessment process conducted by the University of Exampleville. The exams were designed to assess students' knowledge and skills in general management and their chosen domain-specific subjects. It involved students from both BBA and MBA programs who were in their final year of study.

    Data Format: The dataset is available in a structured format, typically as a CSV file. Each row represents a unique student's performance in the examination, while columns contain specific information about their results and academic details.

    Data Usage: This dataset is valuable for analyzing and gaining insights into the academic performance of students pursuing BBA and MBA degrees. It can be used for various purposes, including statistical analysis, performance trend identification, program assessment, and comparison of scores across domains and specializations. Furthermore, it can be employed in predictive modeling or decision-making related to curriculum development and student support.

    Data Quality: The dataset has undergone preprocessing and anonymization to protect the privacy of individual students. Nevertheless, it is essential to use the data responsibly and in compliance with relevant data protection regulations when conducting any analysis or research.

    The CSV format allows for easy import and analysis using various data analysis tools and programming languages like Python or R, or spreadsheet software like Microsoft Excel; a minimal loading sketch follows the column descriptions below.

    Here's a column-wise description of the dataset:

    Name OF THE STUDENT: The full name of the student who took the exam. (Anonymized)

    UNIVERSITY: The university where the student is enrolled.

    PROGRAM NAME: The name of the academic program in which the student is enrolled (BBA or MBA).

    Specialization: If applicable, the specific area of specialization or major that the student has chosen within their program.

    Semester: The semester or academic term in which the student took the exam.

    Domain: The type of examination domain (general management or domain-specific).

    GENERAL MANAGEMENT SCORE (OUT of 50): The score obtained by the student in the general management part of the exam, out of a maximum possible score of 50.

    Domain-Specific Score (Out of 50): The score obtained by the student in the domain-specific part of the exam, also out of a maximum possible score of 50.

    TOTAL SCORE (OUT of 100): The total score obtained by adding the scores from the general management and domain-specific parts, out of a maximum possible score of 100.
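    A minimal loading sketch for the CSV layout described above (assuming pandas; the file name is an illustrative placeholder and the exact column header strings may differ in the actual Kaggle file):

    import pandas as pd

    # Illustrative file name; the actual Kaggle file name may differ.
    scores = pd.read_csv("students_test_data.csv")

    # Sanity check: the total should equal the sum of the two 50-point components.
    components = (scores["GENERAL MANAGEMENT SCORE (OUT of 50)"]
                  + scores["Domain-Specific Score (Out of 50)"])
    print((components == scores["TOTAL SCORE (OUT of 100)"]).all())

    # Average total score by program (BBA vs MBA).
    print(scores.groupby("PROGRAM NAME")["TOTAL SCORE (OUT of 100)"].mean())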

  19. N3C-Formatted OMOP2OBO Mappings

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1more
    Updated Oct 27, 2022
    Cite
    N3C OMOP to OBO Working Group (2022). N3C-Formatted OMOP2OBO Mappings [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7249165
    Explore at:
    Dataset updated
    Oct 27, 2022
    Dataset provided by
    Callahan, Tiffany J
    N3C OMOP to OBO Working Group
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    OMOP2OBO Mappings - N3C OMOP to OBO Working group

    This repository stores OMOP2OBO mappings that have been processed for use within the National COVID Cohort Collaborative (N3C) Enclave; the version stored here is formatted specifically for the N3C Enclave.

    N3C OMOP to OBO Working Group: https://covid.cd2h.org/ontology

    Accessing the N3C-Formatted Mappings

    You can access the three OMOP2OBO HPO mapping files in the Enclave from the Knowledge store using the following link: https://unite.nih.gov/workspace/compass/view/ri.compass.main.folder.1719efcf-9a87-484f-9a67-be6a29598567.

    The mapping set includes three files, but only the following two need to be merged with existing data in the Enclave to create the concept sets:

    OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv

    OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv

    The first file, OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv, contains columns for the OMOP concept ids and codes, and specifies details such as whether the OMOP concept’s descendants should be included when deriving the concept sets (defaults to FALSE). The second file, OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv, contains details on the mapping’s label (i.e., the HPO CURIE and label in the concept_set_id field) and its provenance/evidence (stored in the column called intention).

    Creating Concept Sets

    Merge these files together on the column named codeset_id and then join them with existing Enclave tables such as concept and condition_occurrence to populate the actual concept sets. The name of the concept set comes from the OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv file, where it is stored as a string in the column called concept_set_id. Although not ideal, given the fields currently available in the Enclave, extracting the HPO CURIE and label requires applying a regex to this column.
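    A minimal sketch of that merge and extraction step (assuming pandas and local copies of the two CSV files named above; the regex pattern is inferred from the example mapping shown below and is not an official specification):

    import re
    import pandas as pd

    # Load the two N3C-formatted mapping files (columns per the release notes below).
    items = pd.read_csv("OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv")
    versions = pd.read_csv("OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv")

    # Merge the expression items with the version metadata on the shared codeset_id column.
    mappings = items.merge(versions, on="codeset_id", how="inner")

    # concept_set_id looks like "[OMOP2OBO] hp_0002031-abnormal_esophagus_morphology";
    # a regex recovers the HPO id and its human-readable label.
    pattern = re.compile(r"\[OMOP2OBO\]\s*(hp_\d+)-(.+)")
    extracted = mappings["concept_set_id"].str.extract(pattern)

    mappings["hpo_id"] = extracted[0].str.upper().str.replace("_", ":", n=1)  # e.g. HP:0002031
    mappings["hpo_label"] = extracted[1].str.replace("_", " ")                # e.g. abnormal esophagus morphology

    # The result can then be joined with Enclave tables such as concept and
    # condition_occurrence to populate the actual concept sets.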

    An example mapping is shown below (highlighting some of the most useful columns):

    codeset_id: 900000000
    concept_set_id: [OMOP2OBO] hp_0002031-abnormal_esophagus_morphology
    concept: 23868
    code: 69771008
    codeSystem: SNOMED
    includeDescendants: False
    intention:

    Mixed - This mapping was created using the OMOP2OBO mapping algorithm (https://github.com/callahantiff/OMOP2OBO).

    The Mapping Category and Evidence supporting the mappings are provided below, by OMOP concept:

    23868

    Mapping Category: Automatic Exact - Concept

    Mapping Provenance

    OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:snomed_69771008 | OBO_DbXref-OMOP_CONCEPT_SOURCE_CODE:snomed_69771008 | CONCEPT_SIMILARITY:HP_0002031_0.713

    Release Notes - v2.0.0

    Preparation

    In order to import data into the Enclave, the following items are needed:

    Obtain an API token, which will be included in the authorization header (stored as a GitHub Secret)

    Obtain the username hash from the Enclave

    OMOP2OBO Mappings (v1.5.0)

    Data

    Concept Set Container (concept_set_container): CreateNewConceptSet

    Concept Set Version (code_sets): CreateNewDraftOMOPConceptSetVersion

    Concept Set Expression Items (concept_set_version_item): addCodeAsVersionExpression

    Script

    n3c_mapping_conversion.py

    Generated Output

    The codeset_id values need to be filled in by self-generation (ideally from a conserved range) prior to beginning any of the API steps. The current list of assigned identifiers is stored in the file named omop2obo_enclave_codeset_id_dict_v2.0.0.json. Note that, in order to accommodate the 1:many mappings, the codeset ids were re-generated and, rather than being mapped to HPO concepts, they are now mapped to SNOMED-CT concepts. This creates a cleaner mapping and will easily scale to future mapping builds.

    To be consistent with OMOP tools, specifically Atlas, we have also created Atlas-formatted JSON files for each mapping, which are stored in the zipped directory named atlas_json_files_v2.0.0.zip. Note that, as mentioned above, to enable the representation of 1:many mappings the files are no longer named after HPO concepts; they are now named with the OMOP concept_id and label, and additional fields have been added within the JSON files that include the HPO ids, labels, mapping category, mapping logic, and mapping evidence.

    File 1: concept_set_container

    Generated Data: OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_container.csv

    Columns: concept_set_id, concept_set_name, intention, assigned_informatician, assigned_sme, project_id, status, stage, n3c_reviewer, alias, archived, created_by, created_at

    File 2: concept_set_expression_items

    Generated Data: OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv

    Columns: codeset_id, concept_id, code, codeSystem, ontology_id, ontology_label, mapping_category, mapping_logic, mapping_evidence, isExcluded, includeDescendants, includeMapped, item_id, annotation, created_by, created_at

    File 3: concept_set_version

    Generated Data: OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv

    Columns: codeset_id, concept_set_id, concept_set_version_title, project, source_application, source_application_version, created_at, atlas_json, most_recent_version, comments, intention, limitations, issues, update_message, status, has_review, reviewed_by, created_by, provenance, atlas_json_resource_url, parent_version_id, is_draft

    Generated Output:

    OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_container.csv

    OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv

    OMOP2OBO_v2.0.0_N3C_Enclave_CSV_concept_set_version.csv

    atlas_json_files_v2.0.0.zip

    omop2obo_enclave_codeset_id_dict_v2.0.0.json

  20. County Cancer Death Rates

    • kaggle.com
    Updated Dec 3, 2023
    Cite
    The Devastator (2023). County Cancer Death Rates [Dataset]. https://www.kaggle.com/datasets/thedevastator/county-cancer-death-rates
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 3, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    County Cancer Death Rates

    County-level cancer death rates with related variables

    By Noah Rippner [source]

    About this dataset

    This dataset provides comprehensive information on county-level cancer death and incidence rates, as well as various related variables. It includes data on age-adjusted death rates, average deaths per year, recent trends in cancer death rates, recent 5-year trends in death rates, and average annual counts of cancer deaths or incidence. The dataset also includes the federal information processing standards (FIPS) codes for each county.

    Additionally, the dataset indicates whether each county met the objective of a targeted death rate of 45.5. The recent trend in cancer deaths or incidence is also captured for analysis purposes.

    The purpose of the death.csv file within this dataset is to offer detailed information specifically concerning county-level cancer death rates and related variables. On the other hand, the incd.csv file contains data on county-level cancer incidence rates and additional relevant variables.

    To provide more context and understanding about the included data points, there is a separate file named cancer_data_notes.csv. This file serves to provide informative notes and explanations regarding the various aspects of the cancer data used in this dataset.

    Please note that this particular description provides an overview for a linear regression walkthrough using this dataset in the Python programming language. It highlights how to source and import the data properly before moving into data preparation steps such as exploratory analysis. The walkthrough further covers model selection and important model diagnostic measures.

    It's essential to bear in mind that this example serves as an initial attempt at creating a multivariate Ordinary Least Squares regression model using these datasets from various sources like cancer.gov along with US Census American Community Survey data. This baseline model allows easy comparisons with future iterations intended for improvements or refinements.
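    As a rough, hypothetical illustration of such a baseline (assuming pandas, statsmodels, and the death.csv file described above; the predictor columns chosen here are illustrative rather than the walkthrough's exact specification):

    import pandas as pd
    import statsmodels.api as sm

    # Load the county-level death-rate file described above.
    deaths = pd.read_csv("death.csv")

    # Illustrative model: regress the age-adjusted death rate on two numeric
    # columns from the file (column names follow the descriptions below).
    predictors = ["Average Deaths per Year", "Average Annual Count"]
    model_data = (deaths[["Age-Adjusted Death Rate"] + predictors]
                  .apply(pd.to_numeric, errors="coerce")
                  .dropna())

    X = sm.add_constant(model_data[predictors])
    y = model_data["Age-Adjusted Death Rate"]

    ols_fit = sm.OLS(y, X).fit()
    print(ols_fit.summary())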

    Important columns found within this extensively documented Kaggle dataset include the county names along with their corresponding FIPS codes, a standardized coding system defined by the Federal Information Processing Standards (FIPS). Moreover, the Met Objective of 45.5? (1) column denotes whether a specific county achieved the targeted death rate objective of 45.5.

    Overall, this dataset aims to offer valuable insights into county-level cancer death and incidence rates across various regions, providing policymakers, researchers, and healthcare professionals with essential information for analysis and decision-making purposes

    How to use the dataset

    • Familiarize Yourself with the Columns:

      • County: The name of the county.
      • FIPS: The Federal Information Processing Standards code for the county.
      • Met Objective of 45.5? (1): Indicates whether the county met the objective of a death rate of 45.5 (Boolean).
      • Age-Adjusted Death Rate: The age-adjusted death rate for cancer in the county.
      • Average Deaths per Year: The average number of deaths per year due to cancer in the county.
      • Recent Trend (2): The recent trend in cancer death rates/incidence in the county.
      • Recent 5-Year Trend (2) in Death Rates: The recent 5-year trend in cancer death rates/incidence in the county.
      • Average Annual Count: The average annual count of cancer deaths/incidence in the county.
    • Determine Counties Meeting Objective: Use this dataset to identify counties that have or have not met the objective death rate threshold of 45.5. Look for entries where Met Objective of 45.5? (1) is marked as True or False.

    • Analyze Age-Adjusted Death Rates: Study and compare age-adjusted death rates across different counties using Age-Adjusted Death Rate values provided as floats.

    • Explore Average Deaths per Year: Examine and compare average annual counts and trends regarding deaths caused by cancer, using Average Deaths per Year as a reference point.

    • Investigate Recent Trends: Assess recent trends related to cancer deaths or incidence by analyzing data under columns such as Recent Trend, Recent Trend (2), and Recent 5-Year Trend (2) in Death Rates. These columns provide information on how cancer death rates/incidence have changed over time.

    • Compare Counties: Utilize this dataset to compare counties based on their cancer death rates and related variables. Identify counties with lower or higher average annual counts, age-adjusted death rates, or recent trends to analyze and understand the factors contributing ...
