100+ datasets found
  1. Python Import Data India – Buyers & Importers List

    • seair.co.in
    Updated Nov 17, 2016
    Cite
    Seair Exim (2016). Python Import Data India – Buyers & Importers List [Dataset]. https://www.seair.co.in
    Explore at:
    .bin, .xml, .csv, .xls (available download formats)
    Dataset updated
    Nov 17, 2016
    Dataset provided by
    Seair Info Solutions PVT LTD
    Authors
    Seair Exim
    Area covered
    India
    Description

    Subscribers can find out export and import data of 23 countries by HS code or product’s name. This demo is helpful for market analysis.

  2. Data from: A large synthetic dataset for machine learning applications in...

    • zenodo.org
    csv, json, png, zip
    Updated Mar 25, 2025
    Cite
    Marc Gillioz; Marc Gillioz; Guillaume Dubuis; Philippe Jacquod; Philippe Jacquod; Guillaume Dubuis (2025). A large synthetic dataset for machine learning applications in power transmission grids [Dataset]. http://doi.org/10.5281/zenodo.13378476
    Explore at:
    zip, png, csv, json (available download formats)
    Dataset updated
    Mar 25, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Marc Gillioz; Marc Gillioz; Guillaume Dubuis; Philippe Jacquod; Philippe Jacquod; Guillaume Dubuis
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    With the ongoing energy transition, power grids are evolving fast. They operate more and more often close to their technical limit, under more and more volatile conditions. Fast, essentially real-time computational approaches to evaluate their operational safety, stability and reliability are therefore highly desirable. Machine Learning methods have been advocated to solve this challenge, however they are heavy consumers of training and testing data, while historical operational data for real-world power grids are hard if not impossible to access.

    This dataset contains long time series for production, consumption, and line flows, amounting to 20 years of data with a time resolution of one hour, for several thousands of loads and several hundreds of generators of various types representing the ultra-high-voltage transmission grid of continental Europe. The synthetic time series have been statistically validated against real-world data.

    Data generation algorithm

    The algorithm is described in a Nature Scientific Data paper. It relies on the PanTaGruEl model of the European transmission network -- the admittance of its lines as well as the location, type and capacity of its power generators -- and aggregated data gathered from the ENTSO-E transparency platform, such as power consumption aggregated at the national level.

    Network

    The network information is encoded in the file europe_network.json. It is given in PowerModels format, which is itself derived from MatPower and compatible with PandaPower. The network features 7822 power lines and 553 transformers connecting 4097 buses, to which are attached 815 generators of various types.

    Time series

    The time series forming the core of this dataset are given in CSV format. Each CSV file is a table with 8736 rows, one for each hourly time step of a 364-day year. All years are truncated to exactly 52 weeks of 7 days, and start on a Monday (the load profiles are typically different during weekdays and weekends). The number of columns depends on the type of table: there are 4097 columns in load files, 815 for generators, and 8375 for lines (including transformers). Each column is described by a header corresponding to the element identifier in the network file. All values are given in per-unit, both in the model file and in the tables, i.e. they are multiples of a base unit taken to be 100 MW.
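
    As an illustrative sketch (the file name lines_2016_1.csv is assumed here from the labeling scheme described in the next paragraph; any table works the same way), converting a table from per-unit back to MW is a single multiplication:

     import pandas as pd

     lines_pu = pd.read_csv('lines_2016_1.csv')  # per-unit line flows
     lines_mw = lines_pu * 100.0                 # values are multiples of the 100 MW base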

    There are 20 tables of each type, labeled with a reference year (2016 to 2020) and an index (1 to 4), zipped into archive files arranged by year. This amounts to a total of 20 years of synthetic data. When using loads, generators, and lines profiles together, it is important to use the same label: for instance, the files loads_2020_1.csv, gens_2020_1.csv, and lines_2020_1.csv represent the same year of the dataset, whereas gens_2020_2.csv is unrelated (it actually shares some features, such as nuclear profiles, but it is based on a dispatch with distinct loads).

    Usage

    The time series can be used without a reference to the network file, simply using all or a selection of columns of the CSV files, depending on the needs. We show below how to select series from a particular country, or how to aggregate hourly time steps into days or weeks. These examples use Python and the data analysis library pandas, but other frameworks can be used as well (Matlab, Julia). Since all the yearly time series are periodic, it is always possible to define a coherent time window modulo the length of the series.

    Selecting a particular country

    This example illustrates how to select generation data for Switzerland in Python. This can be done without parsing the network file, but using instead gens_by_country.csv, which contains a list of all generators for any country in the network. We start by importing the pandas library, and read the column of the file corresponding to Switzerland (country code CH):

    import pandas as pd
    CH_gens = pd.read_csv('gens_by_country.csv', usecols=['CH'], dtype=str)

    The object created in this way is a DataFrame with some null values (not all countries have the same number of generators). It can be turned into a list with:

    CH_gens_list = CH_gens.dropna().squeeze().to_list()

    Finally, we can import all the time series of Swiss generators from a given data table with

    pd.read_csv('gens_2016_1.csv', usecols=CH_gens_list)

    The same procedure can be applied to loads using the list contained in the file loads_by_country.csv.
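
    For instance, a minimal sketch of the same steps for Swiss loads (assuming the table loads_2016_1.csv from the yearly archives):

     CH_loads_list = pd.read_csv('loads_by_country.csv', usecols=['CH'], dtype=str).dropna().squeeze().to_list()
     CH_loads = pd.read_csv('loads_2016_1.csv', usecols=CH_loads_list)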

    Averaging over time

    This second example shows how to change the time resolution of the series. Suppose that we are interested in all the loads from a given table, which are given by default with a one-hour resolution:

    hourly_loads = pd.read_csv('loads_2018_3.csv')

    To get a daily average of the loads, we can use:

    daily_loads = hourly_loads.groupby([t // 24 for t in range(24 * 364)]).mean()

    This results in series of length 364. To average further over entire weeks and get series of length 52, we use:

    weekly_loads = hourly_loads.groupby([t // (24 * 7) for t in range(24 * 364)]).mean()

    Source code

    The code used to generate the dataset is freely available at https://github.com/GeeeHesso/PowerData. It consists of two packages and several documentation notebooks. The first package, written in Python, provides functions to handle the data and to generate synthetic series based on historical data. The second package, written in Julia, is used to perform the optimal power flow. The documentation in the form of Jupyter notebooks contains numerous examples on how to use both packages. The entire workflow used to create this dataset is also provided, starting from raw ENTSO-E data files and ending with the synthetic dataset given in the repository.

    Funding

    This work was supported by the Cyber-Defence Campus of armasuisse and by an internal research grant of the Engineering and Architecture domain of HES-SO.

  3. Storage and Transit Time Data and Code

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jun 12, 2024
    + more versions
    Cite
    Andrew Felton; Andrew Felton (2024). Storage and Transit Time Data and Code [Dataset]. http://doi.org/10.5281/zenodo.11582455
    Explore at:
    zip (available download formats)
    Dataset updated
    Jun 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andrew Felton; Andrew Felton
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description


    Author: Andrew J. Felton
    Date: 5/5/2024

    This R project contains the primary code and data (following pre-processing in Python) used for data production, manipulation, visualization, analysis, and figure production for the study entitled:

    "Global estimates of the storage and transit time of water through vegetation"

    Please note that 'turnover' and 'transit' are used interchangeably in this project.

    Data information:

    The data folder contains key data sets used for analysis. In particular:

    "data/turnover_from_python/updated/annual/multi_year_average/average_annual_turnover.nc" contains a global array summarizing five year (2016-2020) averages of annual transit, storage, canopy transpiration, and number of months of data. This is the core dataset for the analysis; however, each folder has much more data, including a dataset for each year of the analysis. Data are also available is separate .csv files for each land cover type. Oterh data can be found for the minimum, monthly, and seasonal transit time found in their respective folders. These data were produced using the python code found in the "supporting_code" folder given the ease of working with .nc and EASE grid in the xarray python module. R was used primarily for data visualization purposes. The remaining files in the "data" and "data/supporting_data"" folder primarily contain ground-based estimates of storage and transit found in public databases or through a literature search, but have been extensively processed and filtered here.

    Code information:

    Python scripts can be found in the "supporting_code" folder.

    Each R script in this project has a particular function:

    01_start.R: This script loads the R packages used in the analysis, sets the
    directory, and imports custom functions for the project. You can also load in the
    main transit time (turnover) datasets here using the `source()` function.

    02_functions.R: This script contains the custom function for this analysis,
    primarily to work with importing the seasonal transit data. Load this using the
    `source()` function in the 01_start.R script.

    03_generate_data.R: This script is not necessary to run and is primarily
    for documentation. The main role of this code was to import and wrangle
    the data needed to calculate ground-based estimates of aboveground water storage.

    04_annual_turnover_storage_import.R: This script imports the annual turnover and
    storage data for each landcover type. You load in these data from the 01_start.R script
    using the `source()` function.

    05_minimum_turnover_storage_import.R: This script imports the minimum turnover and
    storage data for each landcover type. Minimum is defined as the lowest monthly
    estimate. You load in these data from the 01_start.R script
    using the `source()` function.

    06_figures_tables.R: This is the main workhorse for figure/table production and
    supporting analyses. This script generates the key figures and summary statistics
    used in the study that then get saved in the manuscript_figures folder. Note that all
    maps were produced using Python code found in the "supporting_code" folder.

  4. Event Data and Queries for Multi-Dimensional Event Data in the Neo4j Graph...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 22, 2021
    Cite
    Fahland, Dirk (2021). Event Data and Queries for Multi-Dimensional Event Data in the Neo4j Graph Database [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3865221
    Explore at:
    Dataset updated
    Apr 22, 2021
    Dataset provided by
    Esser, Stefan
    Fahland, Dirk
    Description

    Data model and generic query templates for translating and integrating a set of related CSV event logs into a single event graph, as used in https://dx.doi.org/10.1007/s13740-021-00122-1

    Provides input data for 5 datasets (BPIC14, BPIC15, BPIC16, BPIC17, BPIC19)

    Provides Python scripts to prepare and import each dataset into a Neo4j database instance through Cypher queries, representing behavioral information not globally (as in an event log), but locally per entity and per relation between entities.

    Provides Python scripts to retrieve event data from a Neo4j database instance and render it using Graphviz dot.
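
    The scripts themselves are in the repository linked below; purely as an illustrative sketch (the connection settings and the Cypher query are placeholders, not taken from this dataset), running a Cypher query against a local Neo4j instance from Python looks like:

     from neo4j import GraphDatabase

     # Placeholder connection settings -- adjust to your Neo4j instance
     driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
     with driver.session() as session:
       # Hypothetical query: count event nodes after an import
       result = session.run("MATCH (e:Event) RETURN count(e) AS n")
       print(result.single()["n"])
     driver.close()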

    The data model and queries are described in detail in: Stefan Esser, Dirk Fahland: Multi-Dimensional Event Data in Graph Databases (2020) https://arxiv.org/abs/2005.14552 and https://dx.doi.org/10.1007/s13740-021-00122-1

    Fork the query code from Github: https://github.com/multi-dimensional-process-mining/graphdb-eventlogs

  5. Python Import Data in January - Seair.co.in

    • seair.co.in
    Updated Jan 29, 2016
    Cite
    Seair Exim (2016). Python Import Data in January - Seair.co.in [Dataset]. https://www.seair.co.in
    Explore at:
    .bin, .xml, .csv, .xls (available download formats)
    Dataset updated
    Jan 29, 2016
    Dataset provided by
    Authors
    Seair Exim
    Area covered
    Marshall Islands, Iceland, Equatorial Guinea, Ecuador, Bosnia and Herzegovina, Congo, Indonesia, Chile, Bahrain, Vietnam
    Description

    Subscribers can find out export and import data of 23 countries by HS code or product’s name. This demo is helpful for market analysis.

  6. Infra Red Recordings of Opposed-Flow Flame Propagation on Discrete PMMA...

    • zenodo.org
    bin, csv
    Updated Mar 4, 2025
    Cite
    Hans-Christoph Ries; Hans-Christoph Ries; Keno Krieger; Keno Krieger; Florian Meyer; Florian Meyer; Christian Eigenbrod; Christian Eigenbrod (2025). Infra Red Recordings of Opposed-Flow Flame Propagation on Discrete PMMA Sheets Under Exploration Atmospheres in Microgravity [Dataset]. http://doi.org/10.5281/zenodo.14959942
    Explore at:
    bin, csv (available download formats)
    Dataset updated
    Mar 4, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Hans-Christoph Ries; Hans-Christoph Ries; Keno Krieger; Keno Krieger; Florian Meyer; Florian Meyer; Christian Eigenbrod; Christian Eigenbrod
    License

    https://www.gnu.org/licenses/gpl-3.0-standalone.html

    Description

    About

    This dataset contains infrared recordings from a series of drop tower experiments conducted at ZARM. Thin PMMA samples were combusted under controlled atmospheric conditions and forced opposed flow. An overview of all experiment conditions and sample types is given in the index.csv file. Two kinds of samples were investigated: "full samples", i.e. continuous fuel strips, and "gap samples", i.e. fuel strips separated by air gaps of different widths. Consequently, if the `Gaps [Sequence]` or `Gap Material [Material]` fields are empty, a "full sample" was tested. For the "gap samples" the sequence of gaps is given as the individual widths of each gap in mm separated by slashes, e.g. '3/4/5' to refer to a sample with 3, 4 and 5 mm gaps.

    The radial lens distortion of the individual frames was corrected, as was the temperature. The frames are stored as 3-dimensional arrays with the first two dimensions being the spatial and the third the time (or frame count) dimension. The value of each pixel corresponds to the measured temperature in °C at the pixel's location.

    Usage

    Experiment Overview

    The experiment overview is given in the index.csv file. Here, the metadata for each experiment is given. The corresponding IR data can be found in the `Data Location [File]` column.
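
    As a minimal sketch (assuming only the column names quoted above and that index.csv sits in the working directory), the overview can be filtered with pandas, e.g. to list the data files of all gap samples:

     import pandas as pd

     index = pd.read_csv("index.csv")
     # "Full samples" have an empty gap sequence; gap samples carry e.g. '3/4/5'
     gap_samples = index[index["Gaps [Sequence]"].notna()]
     print(gap_samples["Data Location [File]"].tolist())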

    Loading the frames

    The frames are stored in hdf5 files as the 'data' dataset. The corresponding metadata is given as serialized JSON as the 'metadata' dataset. An example for loading the files in Python is given below:

    import json
    import h5py
    import numpy as np
    data_file = "exp0.hdf5"
    with h5py.File(data_file, 'r') as f:
      metadata = json.loads(f["metadata"][()]) # dictionary with date, sample and experiment conditions
      data = np.array(f["data"]) # individual IR frames as a 3 dimensional array of shape (x, y, t)

    Acknowledgments

    The authors thank Jan Heißmeier and Michael Peters for technical support during the design phase and the experimental campaigns. This research has been funded by the German Federal Ministry for Economic Affairs and Climate Action through the German Space Agency at DLR in the framework of FLARE-G II & III (grants 50WM2160 & 50WM2456).

  7. Stage One Experiment - Datasets

    • figshare.com
    bin
    Updated Jan 21, 2025
    Cite
    Luke Yerbury (2025). Stage One Experiment - Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.27427155.v1
    Explore at:
    bin (available download formats)
    Dataset updated
    Jan 21, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Luke Yerbury
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data used in the stage one 1NN classification experiment in: "Comparing Clustering Approaches for Smart Meter Time Series: Investigating the Influence of Dataset Properties on Performance". All datasets are stored in a dict with tuples of (time series array, class labels). To access data in Python:

     import pickle

     filename = "dataset.txt"
     with open(filename, 'rb') as f:
       data = pickle.load(f)
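
    Each entry of the loaded dict can then be iterated over; a minimal sketch, assuming the dict maps dataset names to (time series array, class labels) tuples as described above:

     for name, (series, labels) in data.items():
       print(name, len(series), len(labels))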

  8. Data from: Russian Financial Statements Database: A firm-level collection of...

    • data.niaid.nih.gov
    Updated Mar 14, 2025
    + more versions
    Cite
    Ledenev, Victor (2025). Russian Financial Statements Database: A firm-level collection of the universe of financial statements [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_14622208
    Explore at:
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    Bondarkov, Sergey
    Ledenev, Victor
    Skougarevskiy, Dmitriy
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Area covered
    Russia
    Description

    The Russian Financial Statements Database (RFSD) is an open, harmonized collection of annual unconsolidated financial statements of the universe of Russian firms:

    • 🔓 First open data set with information on every active firm in Russia.

    • 🗂️ First open financial statements data set that includes non-filing firms.

    • 🏛️ Sourced from two official data providers: the Rosstat and the Federal Tax Service.

    • 📅 Covers 2011-2023 initially, will be continuously updated.

    • 🏗️ Restores as much data as possible through non-invasive data imputation, statement articulation, and harmonization.

    The RFSD is hosted on 🤗 Hugging Face and Zenodo and is stored in a structured, column-oriented, compressed binary format Apache Parquet with yearly partitioning scheme, enabling end-users to query only variables of interest at scale.

    The accompanying paper provides internal and external validation of the data: http://arxiv.org/abs/2501.05841.

    Here we present the instructions for importing the data in R or Python environment. Please consult with the project repository for more information: http://github.com/irlcode/RFSD.

    Importing The Data

    You have two options to ingest the data: download the .parquet files manually from Hugging Face or Zenodo or rely on 🤗 Hugging Face Datasets library.

    Python

    🤗 Hugging Face Datasets

    It is as easy as:

     from datasets import load_dataset
     import polars as pl

     # This line will download 6.6GB+ of all RFSD data and store it in a 🤗 cache folder
     RFSD = load_dataset('irlspbru/RFSD')

     # Alternatively, this will download ~540MB with all financial statements for 2023
     # to a Polars DataFrame (requires about 8GB of RAM)
     RFSD_2023 = pl.read_parquet('hf://datasets/irlspbru/RFSD/RFSD/year=2023/*.parquet')

    Please note that the data is not shuffled within year, meaning that streaming first n rows will not yield a random sample.
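
    If a random subsample is needed, one option (a sketch using the Polars frame loaded above; the sample size is arbitrary) is to sample rows explicitly:

     RFSD_2023_sample = RFSD_2023.sample(n=10_000, seed=42)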

    Local File Import

    Importing in Python requires the pyarrow package to be installed.

     import pyarrow.dataset as ds
     import polars as pl

     # Read RFSD metadata from local file
     RFSD = ds.dataset("local/path/to/RFSD")

     # Use RFSD.schema to glimpse the data structure and columns' classes
     print(RFSD.schema)

     # Load full dataset into memory
     RFSD_full = pl.from_arrow(RFSD.to_table())

     # Load only 2019 data into memory
     RFSD_2019 = pl.from_arrow(RFSD.to_table(filter=ds.field('year') == 2019))

     # Load only revenue for firms in 2019, identified by taxpayer id
     RFSD_2019_revenue = pl.from_arrow(
       RFSD.to_table(
         filter=ds.field('year') == 2019,
         columns=['inn', 'line_2110']
       )
     )

     # Give suggested descriptive names to variables
     renaming_df = pl.read_csv('local/path/to/descriptive_names_dict.csv')
     RFSD_full = RFSD_full.rename({item[0]: item[1] for item in zip(renaming_df['original'], renaming_df['descriptive'])})

    R

    Local File Import

    Importing in R requires the arrow package to be installed.

     library(arrow)
     library(data.table)

     # Read RFSD metadata from local file
     RFSD <- open_dataset("local/path/to/RFSD")

     # Use schema() to glimpse into the data structure and column classes
     schema(RFSD)

     # Load full dataset into memory
     scanner <- Scanner$create(RFSD)
     RFSD_full <- as.data.table(scanner$ToTable())

     # Load only 2019 data into memory
     scan_builder <- RFSD$NewScan()
     scan_builder$Filter(Expression$field_ref("year") == 2019)
     scanner <- scan_builder$Finish()
     RFSD_2019 <- as.data.table(scanner$ToTable())

     # Load only revenue for firms in 2019, identified by taxpayer id
     scan_builder <- RFSD$NewScan()
     scan_builder$Filter(Expression$field_ref("year") == 2019)
     scan_builder$Project(cols = c("inn", "line_2110"))
     scanner <- scan_builder$Finish()
     RFSD_2019_revenue <- as.data.table(scanner$ToTable())

     # Give suggested descriptive names to variables
     renaming_dt <- fread("local/path/to/descriptive_names_dict.csv")
     setnames(RFSD_full, old = renaming_dt$original, new = renaming_dt$descriptive)

    Use Cases

    🌍 For macroeconomists: Replication of a Bank of Russia study of the cost channel of monetary policy in Russia by Mogiliat et al. (2024) — interest_payments.md

    🏭 For IO: Replication of the total factor productivity estimation by Kaukin and Zhemkova (2023) — tfp.md

    🗺️ For economic geographers: A novel model-less house-level GDP spatialization that capitalizes on geocoding of firm addresses — spatialization.md

    FAQ

    Why should I use this data instead of Interfax's SPARK, Moody's Ruslana, or Kontur's Focus?

    To the best of our knowledge, the RFSD is the only open data set with up-to-date financial statements of Russian companies published under a permissive licence. Apart from being free-to-use, the RFSD benefits from data harmonization and error detection procedures unavailable in commercial sources. Finally, the data can be easily ingested in any statistical package with minimal effort.

    What is the data period?

    We provide financials for Russian firms in 2011-2023. We will add the data for 2024 by July, 2025 (see Version and Update Policy below).

    Why are there no data for firm X in year Y?

    Although the RFSD strives to be an all-encompassing database of financial statements, end users will encounter data gaps:

    We do not include financials for firms that we considered ineligible to submit financial statements to the Rosstat/Federal Tax Service by law: financial, religious, or state organizations (state-owned commercial firms are still in the data).

    Eligible firms may enjoy the right not to disclose under certain conditions. For instance, Gazprom did not file in 2022 and we had to impute its 2022 data from 2023 filings. Sibur filed only in 2023, Novatek — in 2020 and 2021. Commercial data providers such as Interfax's SPARK enjoy dedicated access to the Federal Tax Service data and are therefore able to source this information elsewhere.

    A firm may have submitted its annual statement but, according to the Uniform State Register of Legal Entities (EGRUL), it was not active in that year. We remove those filings.

    Why is the geolocation of firm X incorrect?

    We use Nominatim to geocode structured addresses of incorporation of legal entities from the EGRUL. There may be errors in the original addresses that prevent us from geocoding firms to a particular house. Gazprom, for instance, is geocoded up to the house level in 2014 and 2021-2023, but only at the street level for 2015-2020 due to improper handling of the house number by Nominatim. In that case we have fallen back to street-level geocoding. Additionally, streets in different districts of one city may share identical names. We have ignored those problems in our geocoding and invite your submissions. Finally, the address of incorporation may not correspond to plant locations. For instance, Rosneft has 62 field offices in addition to the central office in Moscow. We ignore the location of such offices in our geocoding, but subsidiaries set up as separate legal entities are still geocoded.

    Why is the data for firm X different from https://bo.nalog.ru/?

    Many firms submit correcting statements after the initial filing. While we downloaded the data well past the April 2024 deadline for 2023 filings, firms may have kept submitting correcting statements. We will capture them in future releases.

    Why is the data for firm X unrealistic?

    We provide the source data as is, with minimal changes. Consider a relatively unknown LLC Banknota. It reported 3.7 trillion rubles in revenue in 2023, or 2% of Russia's GDP. This is obviously an outlier firm with unrealistic financials. We manually reviewed the data and flagged such firms for user consideration (variable outlier), keeping the source data intact.

    Why is the data for groups of companies different from their IFRS statements?

    We should stress that we provide unconsolidated financial statements filed according to the Russian accounting standards, meaning that it would be wrong to infer financials for corporate groups with this data. Gazprom, for instance, had over 800 affiliated entities and to study this corporate group in its entirety it is not enough to consider financials of the parent company.

    Why is the data not in CSV?

    The data is provided in Apache Parquet format. This is a structured, column-oriented, compressed binary format allowing for conditional subsetting of columns and rows. In other words, you can easily query financials of companies of interest, keeping only variables of interest in memory, greatly reducing data footprint.

    Version and Update Policy

    Version (SemVer): 1.0.0.

    We intend to update the RFSD annually as the data becomes available, in other words when most of the firms have their statements filed with the Federal Tax Service. The official deadline for filing of previous-year statements is April 1. However, every year a portion of firms either fails to meet the deadline or submits corrections afterwards. Filing continues up to the very end of the year, but after the end of April this stream quickly thins out. Nevertheless, there is obviously a trade-off between data completeness and version availability. We find it a reasonable compromise to query new data in early June, since on average by the end of May 96.7% of statements are already filed, including 86.4% of all the correcting filings. We plan to make a new version of RFSD available by July.

    Licence

    Creative Commons License Attribution 4.0 International (CC BY 4.0).

    Copyright © the respective contributors.

    Citation

    Please cite as:

    @unpublished{bondarkov2025rfsd,
     title={{R}ussian {F}inancial {S}tatements {D}atabase},
     author={Bondarkov, Sergey and Ledenev, Victor and Skougarevskiy, Dmitriy},
     note={arXiv preprint arXiv:2501.05841},
     doi={https://doi.org/10.48550/arXiv.2501.05841},
     year={2025}
    }

    Acknowledgments and Contacts

    Data collection and processing: Sergey Bondarkov, sbondarkov@eu.spb.ru, Viktor Ledenev, vledenev@eu.spb.ru

    Project conception, data validation, and use cases: Dmitriy Skougarevskiy, Ph.D.,

  9. Turkish_Basketball_Super_League_Dataset

    • huggingface.co
    Updated Aug 10, 2025
    Cite
    Muhammed Onur ulu (2025). Turkish_Basketball_Super_League_Dataset [Dataset]. https://huggingface.co/datasets/onurulu17/Turkish_Basketball_Super_League_Dataset
    Explore at:
    Dataset updated
    Aug 10, 2025
    Authors
    Muhammed Onur ulu
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    📥 Load Dataset in Python

    To load this dataset in Google Colab or any Python environment:
    !pip install huggingface_hub pandas openpyxl

     from huggingface_hub import hf_hub_download
     import pandas as pd

     repo_id = "onurulu17/Turkish_Basketball_Super_League_Dataset"

     files = [
       "leaderboard.xlsx",
       "player_data.xlsx",
       "team_data.xlsx",
       "team_matches.xlsx",
       "player_statistics.xlsx",
       "technic_roster.xlsx"
     ]

     datasets = {}

    for f in files: path =… See the full description on the dataset page: https://huggingface.co/datasets/onurulu17/Turkish_Basketball_Super_League_Dataset.
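
    The loop body is truncated in this catalog snippet; a hypothetical completion (assuming each Excel file is simply downloaded and read with pandas; the dataset page linked above has the authoritative version) could look like:

     for f in files:
       path = hf_hub_download(repo_id=repo_id, filename=f, repo_type="dataset")  # hypothetical completion
       datasets[f] = pd.read_excel(path)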

  10. CF-MS_Homo_sapiens_PPI

    • huggingface.co
    Updated Nov 30, 2024
    Cite
    Miles Woodcock-Girard (2024). CF-MS_Homo_sapiens_PPI [Dataset]. https://huggingface.co/datasets/viridono/CF-MS_Homo_sapiens_PPI
    Explore at:
    Dataset updated
    Nov 30, 2024
    Authors
    Miles Woodcock-Girard
    Description

    Quickstart Usage

    This dataset can be loaded into Python using the Huggingface datasets library. First, install the datasets library via the command line:

     $ pip install datasets

    With datasets installed, the user should then import it into their Python script / environment:

    import datasets

    The user can then load the CF-MS_Homo_sapiens_PPI dataset using datasets.load_dataset(...). There are two configurations, or 'views' for the set. The user can choose between them via the name… See the full description on the dataset page: https://huggingface.co/datasets/viridono/CF-MS_Homo_sapiens_PPI.

  11. mnist

    • tensorflow.org
    • universe.roboflow.com
    • +3more
    Updated Jun 1, 2024
    Cite
    (2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    The MNIST database of handwritten digits.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('mnist', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png

  12. Python Import Data in August - Seair.co.in

    • seair.co.in
    Updated Aug 20, 2016
    Cite
    Seair Exim (2016). Python Import Data in August - Seair.co.in [Dataset]. https://www.seair.co.in
    Explore at:
    .bin, .xml, .csv, .xls (available download formats)
    Dataset updated
    Aug 20, 2016
    Dataset provided by
    Seair Info Solutions PVT LTD
    Authors
    Seair Exim
    Area covered
    Belgium, Virgin Islands (U.S.), Nepal, Saint Pierre and Miquelon, Falkland Islands (Malvinas), Lebanon, Gambia, Christmas Island, South Africa, Ecuador
    Description

    Subscribers can find out export and import data of 23 countries by HS code or product’s name. This demo is helpful for market analysis.

  13. AttentiveSkin

    • huggingface.co
    Updated Jul 7, 2024
    Cite
    Maom Lab (2024). AttentiveSkin [Dataset]. https://huggingface.co/datasets/maomlab/AttentiveSkin
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 7, 2024
    Dataset authored and provided by
    Maom Lab
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Attentive Skin

    To Predict Skin Corrosion/Irritation Potentials of Chemicals via Explainable Machine Learning Methods. Download: https://github.com/BeeBeeWong/AttentiveSkin/releases/tag/v1.0

    Quickstart Usage

    Load a dataset in Python

    Each subset can be loaded into Python using the Huggingface datasets library. First, install the datasets library from the command line:

     $ pip install datasets

    then, from within Python, load the datasets library

    import datasets… See the full description on the dataset page: https://huggingface.co/datasets/maomlab/AttentiveSkin.

  14. Magicoder-Evol-Instruct-110K-python

    • huggingface.co
    Updated Nov 17, 2024
    Cite
    pxy (2024). Magicoder-Evol-Instruct-110K-python [Dataset]. https://huggingface.co/datasets/pxyyy/Magicoder-Evol-Instruct-110K-python
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 17, 2024
    Authors
    pxy
    Description

    Dataset Card for "Magicoder-Evol-Instruct-110K-python"

     from datasets import load_dataset

     # Load your dataset
     dataset = load_dataset("pxyyy/Magicoder-Evol-Instruct-110K", split="train")  # Replace with your dataset and split

     # Define a filter function
     def contains_python(entry):
       for c in entry["messages"]:
         if "python" in c['content'].lower():
           return True
       # return "python" in entry["messages"].lower()  # Replace 'column_name' with the column to search

    … See the full description on the dataset page: https://huggingface.co/datasets/pxyyy/Magicoder-Evol-Instruct-110K-python.
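
    The snippet is truncated in this catalog view; presumably the filter is then applied with the datasets library's filter method (a hedged guess, not shown in the excerpt above):

     python_only = dataset.filter(contains_python)  # keep only conversations that mention Python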

  15. IndoorOutdoorNet-20K

    • kaggle.com
    • huggingface.co
    Updated Apr 23, 2025
    Cite
    PRITHIV SAKTHI U R (2025). IndoorOutdoorNet-20K [Dataset]. http://doi.org/10.34740/kaggle/dsv/11530480
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 23, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    PRITHIV SAKTHI U R
    License

    Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description


    IndoorOutdoorNet-20K

    IndoorOutdoorNet-20K is a labeled image dataset designed for the task of image classification, particularly focused on distinguishing between indoor and outdoor scenes. The dataset is publicly available on Hugging Face Datasets and is useful for scene understanding, transfer learning, and model benchmarking.

    Dataset Summary

    • Task: Image Classification
    • Modalities: Image
    • Labels: Indoor, Outdoor (2 classes)
    • Total Images: 19,998
    • Split: Train (100%)
    • Languages: English (metadata)
    • Size: ~451 MB
    • License: Apache-2.0

    Features

    Column | Type  | Description
    image  | Image | Input image file
    label  | Class | Scene label: Indoor or Outdoor

    Example

    (Image thumbnails omitted here; each example pairs an image with its Indoor or Outdoor label.)

    Note: For full visualization, visit the dataset viewer on Hugging Face.

    Usage

    You can use this dataset directly with the datasets library:

    from datasets import load_dataset
    
    dataset = load_dataset("prithivMLmods/IndoorOutdoorNet-20K")
    

    To visualize a sample:

    import matplotlib.pyplot as plt
    
    sample = dataset['train'][0]
    plt.imshow(sample['image'])
    plt.title(sample['label'])
    plt.axis('off')
    plt.show()
    

    Applications

    • Scene classification
    • Image context recognition
    • Smart surveillance
    • Autonomous navigation
    • Indoor-outdoor transition detection in robotics

    Citation

    If you use this dataset in your research or project, please cite it appropriately. (You can include a BibTeX entry here if available.)

    License

    This dataset is licensed under the Apache 2.0 License.

    Curated & Maintained by @prithivMLmods.

  16. Stage Two Experiments - Datasets

    • figshare.com
    bin
    Updated Jan 21, 2025
    Cite
    Luke Yerbury (2025). Stage Two Experiments - Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.27427629.v1
    Explore at:
    bin (available download formats)
    Dataset updated
    Jan 21, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Luke Yerbury
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data used in the various stage two experiments in: "Comparing Clustering Approaches for Smart Meter Time Series: Investigating the Influence of Dataset Properties on Performance". This includes datasets with varied characteristics. All datasets are stored in a dict with tuples of (time series array, class labels). To access data in Python:

     import pickle

     filename = "dataset.txt"
     with open(filename, 'rb') as f:
       data = pickle.load(f)

  17. Arabic_IFEval

    • huggingface.co
    Updated Apr 5, 2025
    Cite
    Inception (2025). Arabic_IFEval [Dataset]. https://huggingface.co/datasets/inceptionai/Arabic_IFEval
    Explore at:
    Dataset updated
    Apr 5, 2025
    Dataset authored and provided by
    Inception
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    IFEval is the first publicly available benchmark dataset specifically designed to evaluate Arabic Large Language Models (LLMs) on instruction-following capabilities in Arabic. The dataset includes 404 high-quality, manually verified samples covering various constraints such as linguistic patterns, punctuation rules, and formatting guidelines.

    Loading the Dataset

    To load this dataset in Python using the 🤗 Datasets library, run the following:

     from datasets import load_dataset

    ifeval… See the full description on the dataset page: https://huggingface.co/datasets/inceptionai/Arabic_IFEval.

  18. 3xM 10 10 (RGB-D Instance Seg. for bin-picking)

    • kaggle.com
    Updated Nov 12, 2024
    Cite
    Tobia Ippolito (2024). 3xM 10 10 (RGB-D Instance Seg. for bin-picking) [Dataset]. https://www.kaggle.com/datasets/tobiaippolito/3xm-10-10/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 12, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Tobia Ippolito
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    In short

    This dataset was used to investigate the influence of the number of unique 3D models (shapes) and materials (textures) on the shape-texture bias, performance, and generalization of deep neural network instance segmentation in my bachelor's exam.

    • one of nine datasets created in Unreal Engine 5 with an NVIDIA RTX A4500
    • uses 160 unique shapes and 80 unique textures
    • RGB, depth and solution masks are available
    • 20,000 scenes
    • ready-to-use dataloader, training and inference -> see next section

    Usage

    You can load the images like:

    import cv2
    
    image = cv2.imread(img_path)
    if image is None:
      raise FileNotFoundError(f"Error during data loading: there is no '{img_path}'")
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        
    depth = cv2.imread(depth_path, cv2.IMREAD_UNCHANGED)
    if len(depth.shape) > 2:
      _, depth, _, _ = cv2.split(depth)
          
    mask = cv2.imread(mask_path, cv2.IMREAD_UNCHANGED)  # cv2.IMREAD_GRAYSCALE)
    

    For easy use I recommend using my own code. You can directly use it to train Mask R-CNN or just use the dataloader. Both are shown now:

    First: Clone my torch GitHub project into your project
    ```terminal
    cd ./path/to/your/project
    git clone https://github.com/xXAI-botXx/torch-mask-rcnn-instance-segmentation.git
    ```
    Second: Install the anaconda env (optional)
    ```terminal
    cd ./path/to/your/project
    cd ./torch-mask-rcnn-instance-segmentation
    conda env create -f conda_env.yml
    ```
    Third: You are ready to use it.

    Using only the dataloader for your custom project:
    ```python
    import os
    import numpy as np
    import matplotlib.pyplot as plt
    import cv2
    from torch.utils.data import DataLoader

    import sys
    sys.path.append("./torch-mask-rcnn-instance-segmentation")

    from maskrcnn_toolkit import DATA_LOADING_MODE, Dual_Dir_Dataset, collate_fn, extract_and_visualize_mask

    data_mode = DATA_LOADING_MODE.ALL

    dataset = Dual_Dir_Dataset(img_dir="/path/to/rgb-folder", depth_dir="/path/to/depth-folder",
                  mask_dir="/path/to/mask-folder", transform=None,
                  amount=1, start_idx=0, end_idx=0, image_name="...",
                  data_mode=data_mode, use_mask=True, use_depth=False,
                  log_path="./logs", width=1920, height=1080,
                  should_log=True, should_print=True, should_verify=False)
    data_loader = DataLoader(dataset, batch_size=5, shuffle=True, num_workers=4, collate_fn=collate_fn)

    # plot
    for data in data_loader:
      for batch_idx in range(len(data[0])):
        if len(data) == 3:
          image = data[0][batch_idx].cpu().unsqueeze(0)
          masks = data[1][batch_idx]["masks"]
          masks = masks.cpu()
          name = data[2][batch_idx]
        else:
          image = data[0][batch_idx].cpu().unsqueeze(0)
          name = data[1][batch_idx]

        image = image.cpu().numpy().squeeze(0)
        image = np.transpose(image, (1, 2, 0))  # Convert to HWC

        # Remove 4th channel if existing
        if image.shape[2] == 4:
          depth = image[:, :, 3]
          image = image[:, :, :3]
        else:
          depth = None

        masks_gt = masks.cpu().numpy()
        masks_gt = np.transpose(masks_gt, (1, 2, 0))
        mask = extract_and_visualize_mask(masks_gt, image=None, ax=None, visualize=False, color_map=None, soft_join=False)

        # plot
        cols = 1
        if depth is not None:
          cols += 1
        if mask is not None:
          cols += 1

        fig, ax = plt.subplots(nrows=1, ncols=cols, figsize=(20, 15*cols))
        fig.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.05, hspace=0.05)

        plot_idx = 0
        ax[plot_idx].imshow(image)
        ax[plot_idx].set_title("RGB Input Image")
        ax[plot_idx].axis("off")

        if depth is not None:
          plot_idx += 1
          ax[plot_idx].imshow(depth, cmap="gray")
          ax[plot_idx].set_title("Depth Input Image")
          ax[plot_idx].axis("off")

        if mask is not None:
          plot_idx += 1
          ax[plot_idx].imshow(mask)
          ax[plot_idx].set_title("Mask Ground Truth")
          ax[plot_idx].axis("off")

        plt.show()
    ```

    Using the whole Mask R-CNN training pipeline:
    ```python
    import sys
    sys.path.append("./torch-mask-rcnn-instance-segmentation")
    
    from maskrcnn_toolkit import DATA_LOADING_MODE, train
    
    
    # set the vars as you need
    
    WEIGHTS_PATH = None   # Path to the model weights file
    USE_DEPTH = False      # Whether to include depth information -> as rgb and depth on green channel
    VERIFY_DATA = False     # True is recommended
    
    GROUND_PATH = "D:/3xM"  
    DATASET_NAME = "3xM_Dataset_10_10"
    IMG_DIR = os.path.join(GR...
    
  19. Data from: LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive...

    • zenodo.org
    • data.europa.eu
    zip
    Updated Oct 20, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sofia Yfantidou; Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Stefanos Efstathiou; Athena Vakali; Athena Vakali; Joao Palotti; Joao Palotti; Dimitrios Panteleimon Giakatos; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas; Šarūnas Girdzijauskas; Christina Karagianni; Andrei Kazlouski; Elena Ferrari (2022). LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive snapshots of our lives in the wild [Dataset]. http://doi.org/10.5281/zenodo.6832242
    Explore at:
    zip (available download formats)
    Dataset updated
    Oct 20, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sofia Yfantidou; Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Stefanos Efstathiou; Athena Vakali; Athena Vakali; Joao Palotti; Joao Palotti; Dimitrios Panteleimon Giakatos; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas; Šarūnas Girdzijauskas; Christina Karagianni; Andrei Kazlouski; Elena Ferrari
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LifeSnaps Dataset Documentation

    Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in the wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data, will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.

    The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.

    Data Import: Reading CSV

    For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.
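
    For example, a minimal sketch with a placeholder file name (the exact CSV names are not listed here):

     import pandas as pd

     daily_df = pd.read_csv("lifesnaps_daily.csv")  # placeholder -- substitute the daily or hourly CSV of interest
     print(daily_df.head())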

    Data Import: Setting up a MongoDB (Recommended)

    To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.

    To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have MongoDB Database Tools installed from here.

    For the Fitbit data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c fitbit 

    For the SEMA data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c sema 

    For surveys data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c surveys 

    If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.
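
    Once restored, the collections can also be queried from Python with pymongo; a minimal sketch, assuming the default local instance used in the commands above:

     from pymongo import MongoClient

     client = MongoClient("localhost", 27017)
     db = client["rais_anonymized"]
     print(db.fitbit.count_documents({}))  # number of Fitbit documents
     print(db.fitbit.find_one())           # peek at one record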

    Data Availability

    The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain information related to these collections. Each document in any collection follows the format shown below:

    {
      _id: 

  20. Custom Yolov7 On Kaggle On Custom Dataset

    • universe.roboflow.com
    zip
    Updated Jan 29, 2023
    Cite
    Owais Ahmad (2023). Custom Yolov7 On Kaggle On Custom Dataset [Dataset]. https://universe.roboflow.com/owais-ahmad/custom-yolov7-on-kaggle-on-custom-dataset-rakiq/dataset/1
    Explore at:
    zip (available download formats)
    Dataset updated
    Jan 29, 2023
    Dataset authored and provided by
    Owais Ahmad
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Person Car Bounding Boxes
    Description

    Custom Training with YOLOv7 🔥

    Some Important links

    Contact Information

    Objective

    To showcase custom object detection on the given dataset by training and running inference with the newly launched YOLOv7.

    Data Acquisition

    The goal of this task is to train a model that can localize and classify each instance of Person and Car as accurately as possible.

    from IPython.display import Markdown, display
    
    display(Markdown("../input/Car-Person-v2-Roboflow/README.roboflow.txt"))
    

    Custom Training with YOLOv7 🔥

    In this notebook, I have processed the images with Roboflow because the COCO-formatted dataset had images of different dimensions and was not split into the required format. To train a custom YOLOv7 model we need to recognize the objects in the dataset. To do so I have taken the following steps:

    • Export the dataset to YOLOv7
    • Train YOLOv7 to recognize the objects in our dataset
    • Evaluate our YOLOv7 model's performance
    • Run test inference to view performance of YOLOv7 model at work

    📦 YOLOv7

    Sample image: https://raw.githubusercontent.com/Owaiskhan9654/Yolo-V7-Custom-Dataset-Train-on-Kaggle/main/car-person-2.PNG

    Image Credit - jinfagang

    Step 1: Install Requirements

    !git clone https://github.com/WongKinYiu/yolov7 # Downloading YOLOv7 repository and installing requirements
    %cd yolov7
    !pip install -qr requirements.txt
    !pip install -q roboflow
    

    Downloading YOLOV7 starting checkpoint

    !wget "https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt"
    
    import os
    import glob
    import wandb
    import torch
    from roboflow import Roboflow
    from kaggle_secrets import UserSecretsClient
    from IPython.display import Image, clear_output, display # to display images
    
    
    
    print(f"Setup complete. Using torch {torch._version_} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})")
    


    I will be integrating W&B for visualizations and logging artifacts and comparisons of different models!

    YOLOv7-Car-Person-Custom

    try:
      user_secrets = UserSecretsClient()
      wandb_api_key = user_secrets.get_secret("wandb_api")
      wandb.login(key=wandb_api_key)
      anonymous = None
    except:
      wandb.login(anonymous='must')
      print('To use your W&B account,
    Go to Add-ons -> Secrets and provide your W&B access token. Use the Label name as WANDB. 
    Get your W&B access token from here: https://wandb.ai/authorize')
      
      
      
    wandb.init(project="YOLOvR",name=f"7. YOLOv7-Car-Person-Custom-Run-7")
    

    Step 2: Assemble Our Dataset


    In order to train our custom model, we need to assemble a dataset of representative images with bounding box annotations around the objects that we want to detect. And we need our dataset to be in YOLOv7 format.

    In Roboflow, We can choose between two paths:

    Version v2 (Aug 12, 2022) looks like this:

    Screenshot: https://raw.githubusercontent.com/Owaiskhan9654/Yolo-V7-Custom-Dataset-Train-on-Kaggle/main/Roboflow.PNG

    user_secrets = UserSecretsClient()
    roboflow_api_key = user_secrets.get_secret("roboflow_api")
    
    rf = Roboflow(api_key=roboflow_api_key)
    project = rf.workspace("owais-ahmad").project("custom-yolov7-on-kaggle-on-custom-dataset-rakiq")
    dataset = project.version(2).download("yolov7")
    

    Step 3: Training Custom pretrained YOLOv7 model

    Here, I am able to pass a number of arguments:
    - img: define input image size
    - batch: determine
