100+ datasets found
  1. Python Import Data India – Buyers & Importers List

    • seair.co.in
    Cite
    Seair Exim, Python Import Data India – Buyers & Importers List [Dataset]. https://www.seair.co.in
    Explore at:
    Available download formats: .bin, .xml, .csv, .xls
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    India
    Description

    Subscribers can find export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  2. Python Import Data in March - Seair.co.in

    • seair.co.in
    Updated Mar 30, 2016
    Cite
    Seair Exim (2016). Python Import Data in March - Seair.co.in [Dataset]. https://www.seair.co.in
    Explore at:
    Available download formats: .bin, .xml, .csv, .xls
    Dataset updated
    Mar 30, 2016
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    Cyprus, Tuvalu, French Polynesia, Lao People's Democratic Republic, Tanzania, Chad, Israel, Germany, Bermuda, Maldives
    Description

    Subscribers can find export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  3. Python Import Data in December - Seair.co.in

    • seair.co.in
    Updated Dec 31, 2015
    Cite
    Seair Exim (2015). Python Import Data in December - Seair.co.in [Dataset]. https://www.seair.co.in
    Explore at:
    Available download formats: .bin, .xml, .csv, .xls
    Dataset updated
    Dec 31, 2015
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    Korea (Democratic People's Republic of), Guinea, Pitcairn, Mauritius, Bulgaria, French Guiana, Tonga, Bhutan, Nicaragua, Palau
    Description

    Subscribers can find export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  4. Open Context Database SQL Dump

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 23, 2025
    Cite
    Kansa, Sarah Whitcher (2025). Open Context Database SQL Dump [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_14728228
    Explore at:
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    Kansa, Sarah Whitcher
    Kansa, Eric
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Open Context (https://opencontext.org) publishes free and open access research data for archaeology and related disciplines. An open source (but bespoke) Django (Python) application supports these data publishing services. The software repository is here: https://github.com/ekansa/open-context-py

    The Open Context team runs ETL (extract, transform, load) workflows to import data contributed by researchers from various source relational databases and spreadsheets. Open Context uses PostgreSQL (https://www.postgresql.org) relational database to manage these imported data in a graph style schema. The Open Context Python application interacts with the PostgreSQL database via the Django Object-Relational-Model (ORM).

    This database dump includes all published structured data organized and used by Open Context (table names that start with 'oc_all_'). The binary media files referenced by these structured data records are stored elsewhere. Binary media files for some projects, still in preparation, are not yet archived with long-term digital repositories.

    These data comprehensively reflect the structured data currently published and publicly available on Open Context. Other data (such as user and group information) used to run the Website are not included.
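
    As a rough illustration (not part of the archive), once the dump has been restored into a local PostgreSQL database, the published tables can be listed from Python. This is a minimal sketch: the psycopg2 dependency, connection parameters, and database name "opencontext" are assumptions, not details given by the dataset.

    import psycopg2  # assumption: psycopg2 installed and the dump restored locally as "opencontext"

    conn = psycopg2.connect(dbname="opencontext", user="postgres", host="localhost")
    with conn, conn.cursor() as cur:
      # the published structured data live in tables whose names start with 'oc_all_'
      cur.execute(
        "SELECT table_name FROM information_schema.tables "
        "WHERE table_name LIKE 'oc_all_%' ORDER BY table_name"
      )
      for (name,) in cur.fetchall():
        print(name)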

    IMPORTANT

    This database dump contains data from roughly 190+ different projects. Each project dataset has its own metadata and citation expectations. If you use these data, you must cite each data contributor appropriately, not just this Zenodo archived database dump.

  5. Storage and Transit Time Data and Code

    • zenodo.org
    zip
    Updated Oct 29, 2024
    Cite
    Andrew Felton; Andrew Felton (2024). Storage and Transit Time Data and Code [Dataset]. http://doi.org/10.5281/zenodo.14009758
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 29, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andrew Felton; Andrew Felton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Author: Andrew J. Felton
    Date: 10/29/2024

    This R project contains the primary code and data (following pre-processing in Python) used for data production, manipulation, visualization, analysis, and figure production for the study entitled:

    "Global estimates of the storage and transit time of water through vegetation"

    Please note that 'turnover' and 'transit' are used interchangeably. Also please note that this R project has been updated multiple times as the analysis has been updated.

    Data information:

    The data folder contains key data sets used for analysis. In particular:

    "data/turnover_from_python/updated/august_2024_lc/" contains the core datasets used in this study including global arrays summarizing five year (2016-2020) averages of mean (annual) and minimum (monthly) transit time, storage, canopy transpiration, and number of months of data able as both an array (.nc) or data table (.csv). These data were produced in python using the python scripts found in the "supporting_code" folder. The remaining files in the "data" and "data/supporting_data"" folder primarily contain ground-based estimates of storage and transit found in public databases or through a literature search, but have been extensively processed and filtered here. The "supporting_data"" folder also contains annual (2016-2020) MODIS land cover data used in the analysis and contains separate filters containing the original data (.hdf) and then the final process (filtered) data in .nc format. The resulting annual land cover distributions were used in the pre-processing of data in python.

    Code information:

    Python scripts can be found in the "supporting_code" folder.

    Each R script in this project has a role:

    "01_start.R": This script sets the working directory, loads in the tidyverse package (the remaining packages in this project are called using the `::` operator), and can run two other scripts: one that loads the customized functions (02_functions.R) and one for importing and processing the key dataset for this analysis (03_import_data.R).

    "02_functions.R": This script contains custom functions. Load this using the
    `source()` function in the 01_start.R script.

    "03_import_data.R": This script imports and processes the .csv transit data. It joins the mean (annual) transit time data with the minimum (monthly) transit data to generate one dataset for analysis: annual_turnover_2. Load this using the
    `source()` function in the 01_start.R script.

    "04_figures_tables.R": This is the main workhouse for figure/table production and
    supporting analyses. This script generates the key figures and summary statistics
    used in the study that then get saved in the manuscript_figures folder. Note that all
    maps were produced using Python code found in the "supporting_code"" folder.

    "supporting_generate_data.R": This script processes supporting data used in the analysis, primarily the varying ground-based datasets of leaf water content.

    "supporting_process_land_cover.R": This takes annual MODIS land cover distributions and processes them through a multi-step filtering process so that they can be used in preprocessing of datasets in python.

  6. Python package Datatable

    • kaggle.com
    Updated Oct 22, 2020
    Cite
    Kaihua Zhang (2020). Python package Datatable [Dataset]. https://www.kaggle.com/zhangkaihua88/python-datatable/activity
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 22, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kaihua Zhang
    Description

    Context

    This is a Python package for manipulating 2-dimensional tabular data structures (aka data frames). It is close in spirit to pandas or SFrame; however we put specific emphasis on speed and big data support. As the name suggests, the package is closely related to R's data.table and attempts to mimic its core algorithms and API.

    Content

    The wheel file for installing datatable v0.11.0

    Installation

    !pip install ../input/python-datatable/datatable-0.11.0-cp37-cp37m-manylinux2010_x86_64.whl > /dev/null
    

    Using

    import datatable as dt
    data = dt.fread("filename").to_pandas()
    

    Acknowledgements

    https://github.com/h2oai/datatable

    Documentation

    https://datatable.readthedocs.io/en/latest/index.html

    License

    https://github.com/h2oai/datatable/blob/main/LICENSE

  7. Ballroom Python South | See Full Import/Export Data | Eximpedia

    • eximpedia.app
    Updated Jan 8, 2025
    Cite
    Seair Exim (2025). Ballroom Python South | See Full Import/Export Data | Eximpedia [Dataset]. https://www.eximpedia.app/
    Explore at:
    Available download formats: .bin, .xml, .csv, .xls
    Dataset updated
    Jan 8, 2025
    Dataset provided by
    Eximpedia Export Import Trade Data
    Eximpedia PTE LTD
    Authors
    Seair Exim
    Area covered
    Croatia, Guyana, Luxembourg, Myanmar, Zambia, El Salvador, Eritrea, Iceland, Mayotte, State of
    Description

    Ballroom Python South Company Export Import Records. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.

  8. Python-DPO

    • huggingface.co
    Updated Jul 18, 2024
    Cite
    NextWealth Entrepreneurs Private Limited (2024). Python-DPO [Dataset]. https://huggingface.co/datasets/NextWealth/Python-DPO
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 18, 2024
    Dataset authored and provided by
    NextWealth Entrepreneurs Private Limited
    Description

    Dataset Card for Python-DPO

    This dataset is the smaller version of the Python-DPO-Large dataset and has been created using Argilla.

      Load with datasets
    

    To load this dataset with datasets, you'll just need to install datasets (pip install datasets --upgrade) and then use the following code:

    from datasets import load_dataset
    ds = load_dataset("NextWealth/Python-DPO")

      Data Fields
    

    Each data instance contains:

    instruction: The problem description/requirements… See the full description on the dataset page: https://huggingface.co/datasets/NextWealth/Python-DPO.

  9. Smartwatch Purchase Data

    • kaggle.com
    Updated Dec 30, 2022
    Cite
    Aayush Chourasiya (2022). Smartwatch Purchase Data [Dataset]. https://www.kaggle.com/datasets/albedo0/smartwatch-purchase-data/versions/2
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 30, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Aayush Chourasiya
    Description

    Disclaimer: This is artificially generated data, created with a Python script based on the arbitrary assumptions listed below.

    The data consists of 100,000 examples of training data and 10,000 examples of test data, each representing a user who may or may not buy a smart watch.

    ----- Version 1 -------

    trainingDataV1.csv, testDataV1.csv (or trainingData.csv, testData.csv)

    The data includes the following features for each user:
    1. age: The age of the user (integer, 18-70)
    2. income: The income of the user (integer, 25,000-200,000)
    3. gender: The gender of the user (string, "male" or "female")
    4. maritalStatus: The marital status of the user (string, "single", "married", or "divorced")
    5. hour: The hour of the day (integer, 0-23)
    6. weekend: A boolean indicating whether it is the weekend (True or False)

    The data also includes a label for each user indicating whether they are likely to buy a smart watch or not (string, "yes" or "no"). The label is determined based on the following arbitrary conditions:
    - If the user is divorced and a random number generated by the script is less than 0.4, the label is "no" (i.e., assuming 40% of divorcees are not likely to buy a smart watch).
    - If it is the weekend and a random number generated by the script is less than 1.3, the label is "yes" (i.e., assuming sales are 30% more likely to occur on weekends).
    - If the user is male and under 30 with an income over 75,000, the label is "yes".
    - If the user is female and 30 or over with an income over 100,000, the label is "yes".
    - Otherwise, the label is "no".

    The training data is intended to be used to build and train a classification model, and the test data is intended to be used to evaluate the performance of the trained model.

    The following Python script was used to generate this dataset:

    import random
    import csv
    
    # Set the number of examples to generate
    numExamples = 100000
    
    # Generate the training data
    with open("trainingData.csv", "w", newline="") as csvfile:
      fieldnames = ["age", "income", "gender", "maritalStatus", "hour", "weekend", "buySmartWatch"]
      writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    
      writer.writeheader()
    
      for i in range(numExamples):
        age = random.randint(18, 70)
        income = random.randint(25000, 200000)
        gender = random.choice(["male", "female"])
        maritalStatus = random.choice(["single", "married", "divorced"])
        hour = random.randint(0, 23)
        weekend = random.choice([True, False])
    
        # Randomly assign the label based on some arbitrary conditions
        # assuming 40% of divorcees won't buy a smart watch
        if maritalStatus == "divorced" and random.random() < 0.4:
          buySmartWatch = "no"
        # assuming sales are 30% more likely to occur on weekends.
        elif weekend == True and random.random() < 1.3:
          buySmartWatch = "yes"
        elif gender == "male" and age < 30 and income > 75000:
          buySmartWatch = "yes"
        elif gender == "female" and age >= 30 and income > 100000:
          buySmartWatch = "yes"
        else:
          buySmartWatch = "no"
    
        writer.writerow({
          "age": age,
          "income": income,
          "gender": gender,
          "maritalStatus": maritalStatus,
          "hour": hour,
          "weekend": weekend,
          "buySmartWatch": buySmartWatch
        })
    

    ----- Version 2 -------

    trainingDataV2.csv, testDataV2.csv

    The data includes the following features for each user:
    1. age: The age of the user (integer, 18-70)
    2. income: The income of the user (integer, 25,000-200,000)
    3. gender: The gender of the user (string, "male" or "female")
    4. maritalStatus: The marital status of the user (string, "single", "married", or "divorced")
    5. educationLevel: The education level of the user (string, "high school", "associate's degree", "bachelor's degree", "master's degree", or "doctorate")
    6. occupation: The occupation of the user (string, "tech worker", "manager", "executive", "sales", "customer service", "creative", "manual labor", "healthcare", "education", "government", "unemployed", or "student")
    7. familySize: The number of people in the user's family (integer, 1-5)
    8. fitnessInterest: A boolean indicating whether the user is interested in fitness (True or False)
    9. priorSmartwatchOwnership: A boolean indicating whether the user has owned a smartwatch in the past (True or False)
    10. hour: The hour of the day when the user was surveyed (integer, 0-23)
    11. weekend: A boolean indicating whether the user was surveyed on a weekend (True or False)
    12. buySmartWatch: A boolean indicating whether the user purchased a smartwatch (True or False)

    Python script used to generate the data:

    import random
    import csv
    
    # Set the number of examples to generate
    numExamples = 100000
    
    with open("t...
    
  10. Eximpedia Export Import Trade

    • eximpedia.app
    Updated Feb 7, 2025
    Cite
    Seair Exim (2025). Eximpedia Export Import Trade [Dataset]. https://www.eximpedia.app/
    Explore at:
    Available download formats: .bin, .xml, .csv, .xls
    Dataset updated
    Feb 7, 2025
    Dataset provided by
    Eximpedia Export Import Trade Data
    Eximpedia PTE LTD
    Authors
    Seair Exim
    Area covered
    Curaçao, Bosnia and Herzegovina, United States Minor Outlying Islands, Romania, Germany, Djibouti, Argentina, Eritrea, Antarctica, Malta
    Description

    Eximpedia export and import trade data lets you search trade data and active exporters, importers, buyers, suppliers, and manufacturer exporters from over 209 countries.

  11. Python Import Data in August - Seair.co.in

    • seair.co.in
    Updated Aug 20, 2016
    Cite
    Seair Exim (2016). Python Import Data in August - Seair.co.in [Dataset]. https://www.seair.co.in
    Explore at:
    Available download formats: .bin, .xml, .csv, .xls
    Dataset updated
    Aug 20, 2016
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    Lebanon, Belgium, Saint Pierre and Miquelon, Falkland Islands (Malvinas), Gambia, Christmas Island, Nepal, Virgin Islands (U.S.), South Africa, Ecuador
    Description

    Subscribers can find export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  12. Hydroinformatics Instruction Module Example Code: Programmatic Data Access...

    • hydroshare.org
    • beta.hydroshare.org
    zip
    Updated Mar 3, 2022
    Cite
    Amber Spackman Jones; Jeffery S. Horsburgh (2022). Hydroinformatics Instruction Module Example Code: Programmatic Data Access with USGS Data Retrieval [Dataset]. https://www.hydroshare.org/resource/a58b5d522d7f4ab08c15cd05f3fd2ad3
    Explore at:
    Available download formats: zip (34.5 KB)
    Dataset updated
    Mar 3, 2022
    Dataset provided by
    HydroShare
    Authors
    Amber Spackman Jones; Jeffery S. Horsburgh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This resource contains Jupyter Notebooks with examples for accessing USGS NWIS data via web services and performing subsequent analysis related to drought, with particular focus on sites in Utah and the southwestern United States (it could be modified for any USGS sites). The code uses the Python DataRetrieval package. The resource is part of a set of materials for hydroinformatics and water data science instruction. Complete learning module materials are found in HydroLearn: Jones, A.S., Horsburgh, J.S., Bastidas Pacheco, C.J. (2022). Hydroinformatics and Water Data Science. HydroLearn. https://edx.hydrolearn.org/courses/course-v1:USU+CEE6110+2022/about.

    This resource consists of 6 example notebooks:
    1. Example 1: Import and plot daily flow data
    2. Example 2: Import and plot instantaneous flow data for multiple sites
    3. Example 3: Perform analyses with USGS annual statistics data
    4. Example 4: Retrieve data and find daily flow percentiles
    5. Example 5: Further examination of drought year flows
    6. Coding challenge: Assess drought severity
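
    As a minimal sketch of the kind of request the notebooks make (not their exact code), daily values can be pulled with the Python dataretrieval package; the site number, dates, and parameter code below are placeholders:

    import dataretrieval.nwis as nwis

    site = "09380000"   # hypothetical USGS site number
    df = nwis.get_record(sites=site, service="dv",
               start="2020-01-01", end="2020-12-31",
               parameterCd="00060")  # 00060 = discharge (cfs); assumed parameter of interest
    print(df.head())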

  13. Customers order for a Printing Company (2D Bin Packing and Scheduling)

    • data.mendeley.com
    Updated May 26, 2019
    Cite
    mahdi mostajabdaveh (2019). Customers order for a Printing Company (2D Bin Packing and Scheduling) [Dataset]. http://doi.org/10.17632/bxh46tps75.4
    Explore at:
    Dataset updated
    May 26, 2019
    Authors
    mahdi mostajabdaveh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These data belong to an actual printing company. Each record in the Excel file Raw Data/Big_Data represents an order from a customer. In the column "ColorMode", 4+0 means the order is one-sided and 4+4 means it is two-sided. Files in the Instances folder correspond to the instances used for the computational tests in the article. Each of these instances has two related files with the same characteristics: one with a gdx suffix and one without any file extension. These files are used to import data into the Python implementation. The code and the relevant description can be found in the Read_input.py file.
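
    A minimal sketch of reading the order records with pandas; the file extension and sheet layout are assumptions, and the dataset's own import logic lives in Read_input.py:

    import pandas as pd

    orders = pd.read_excel("Raw Data/Big_Data.xlsx")   # hypothetical extension; each row is one customer order
    two_sided = orders[orders["ColorMode"] == "4+4"]   # "4+0" = one-sided, "4+4" = two-sided
    print(len(orders), "orders in total,", len(two_sided), "two-sided")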

  14. experiment_evaluation

    • radar.kit.edu
    • radar-service.eu
    tar
    Updated Jun 21, 2023
    Cite
    Mervin Seiberlich (2023). experiment_evaluation [Dataset]. http://doi.org/10.35097/1426
    Explore at:
    Available download formats: tar (89088 bytes)
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    Karlsruhe Institute of Technology
    Authors
    Mervin Seiberlich
    Description

    🔬️ Experiment Evaluation

    Python module for the evaluation of lab experiments. The module implements functions to import meta-data of measurements, filters to search for subsets of them, and routines to import and plot data based on this meta-data. It works well in its original context but is currently in open alpha, since it will be restructured to be compatible with new lab environments. Examples of its usage in scientific works, which can be used to reference it, will soon be published by the author. Feel free to use it for your own projects and to ask questions. For now you can cite this repository as the source.

    💻️ Installation

    You need a running python3 installation on your OS. The module was written on Debian/GNU-Linux, was tested on Windows, and should also run on other OSes. It is recommended to work in a virtual environment (see the official Python documentation) or a conda installation. From bash:

    python3 -m venv exp_env
    source exp_env/bin/activate

    Dependencies

    Dependencies are the usual scientific modules like numpy, matplotlib and pandas, but also astropy. See requirements.txt, from which you should be able to install the library with:

    pip install pip -U   # update pip itself
    pip install -r /path/to/requirements.txt

    Alternatively, you can install the required modules individually from the shell. The author recommends to also install jupyter, which includes the interactive ipython:

    # Example via pip
    pip install jupyter
    pip install numpy
    pip install matplotlib
    pip install scipy
    pip install pandas
    pip install astropy
    pip install mplcursors
    pip install pynufft
    pip install python-slugify  # make sure this version of slugify is installed and not 'slugify'

    The module itself

    Inside your virtual environment there is a folder exp_env/lib/python3.../site-packages. Place the file experiment_evaluation.py inside this folder (or a new sub-folder with all your personal scientific code) to make it accessible. From within your code (try it from an interactive ipython session) you should now be able to import it via:
    

    import experiment_evaluation as ee

    or from subfolder: import my_scientific_modules.experiment_evaluation as ee

    Matplotlib style

    In order to use the fancy custom styles (for example for consistent-looking graphs throughout your publication) it is advised to use matplotlib styles. For the provided styles, copy the custom styles "thesis_default.mplstyle" etc. from the folder stylelib into your matplotlib library folder: lib/python3.9/site-packages/matplotlib/mpl-data/stylelib/*.mplstyle

    🧑‍💻 Usage

    A good way to learn its usage is to have a look at the example file (examples/example_experiment_evaluation.ipynb). But since the module is work in progress, we first explain some concepts.

    ✨ Why meta-data?

    The module automates several steps of experiment evaluations. But the highlight is its capability to handle experimental meta-data. This enables the user to automatically choose and plot data with a question in mind (example: plot all EQE curves at -2V and 173Hz) instead of repeatedly choosing files manually. For calculations that need more than one measurement this becomes extremely useful, as it does for implementing statistics.

    Meta-data includes things like experimental settings (applied voltage on a diode, time of the measurement, temperature etc.), the experimentalist, and technical information (file format, manufacturer of the experimental device).

    The module includes some generic functions, but to use it for your specific lab environment you might need to add experiment- and plot-specific functions.

    💾️ How to save your experiment files?

    In general, lab measurement files stem from different devices and export routines, so frankly speaking lab data is often a mess! But to use automatic evaluation tools, some sort of system to recognize the measurement type and store the meta-data is needed. In an ideal world a lab would decide on one file format for all measurements and label them systematically. To include different data types and their meta-data within one file type there exists the *.asdf format (Advanced Scientific Data Format; see its documentation at https://asdf.readthedocs.io/en/stable/index.html for further insight). So if you are just starting with your PhD, try to use this file format everywhere ;).

    Also, to make experiments distinguishable, every experiment needs a unique identifier. So you should number every new experiment with an increasing number and the type of the experiment. Example of a useful file name for EQE measurements: Nr783_EQE.asdf

    In the case of my PhD I decided to use what I found: store the different file formats, store them in folders with the name of the experiment, and include meta-data in the file names (bad example: EQE/Nr783_3volt_pix1.csv). This was not the best idea (so learn from what I learned :P).

    To handle that mess, this module therefore also implements some regular expressions to extract meta-data from file names (ee.meta_from_filename()), but in general it is advised to store all meta-data in the file header (with the exception of the unique identifier and experiment type). Like this you could store your files in whatever folder structure you like and still find them from within the script. The module then imports meta-data from the files into a database and you can do fancy data science with your data!
    📑️ Database

    For calculations and filtering of datasets, the meta-data and data need to be accessible in a machine-readable form. For the time being the module imports all meta-data into a pandas DataFrame that represents our database (for very large datasets this would possibly need to be changed). For this we have to name the root folder that includes all experiment files/folders.

    Hint: If you did not follow the unique labeling/numbering for all your experiments, you can still use this module by choosing a root folder that only includes the current experiment.

    from pathlib import Path
    measurement_root_folder = Path("/home/PhD/Data/")

    # We can specify some pre-filtering for the specific experiment we want to evaluate;
    # make use of the '/' operator to build OS-independent paths
    measurement_folder = measurement_root_folder / "LaserLab" / "proximity-sensor" / "OPD-Lens" / "OPD-Lens_v2"

    # Define some pre-filters
    devices = [nr for nr in range(1035, 1043)]  # unique sample numbers of the experiment, listed by list comprehension
    explst = "Mervin Seiberlich"

    Then we import the metadata into the pandas DataFrame database via ee.list_measurements() and call it the meta-table:

    meta_table = ee.list_measurements(measurement_root_folder, devices, experimentalist=explst,
                       sort_by=["measurement_type", "nr", "pix", "v"])

    💡️ Advanced note:

    Internally, ee.list_measurements() uses custom functions to import the experiment-specific meta-data. Have a look into the source code and search for read_meta for an example of how this works in detail. With the *.asdf file format, only the generalized import function would be needed.

    Import data and meta-data

    To import some measurement data for plotting, we use the information inside meta_table with custom import routines and Python dictionaries implementing our filters:

    # Distinguish between reference and other measurements
    lens = {"nr": devices[:5]}
    ref = {"nr": devices[5:]}

    # Select by bias and compare reference samples with lens
    # (**dict unpacks the values to combine two or more dictionaries)
    eqe_lens_0V = ee.import_eqe(meta_table, mask_dict={**lens, **{"v": 0}})
    eqe_ref_0V = ee.import_eqe(meta_table, mask_dict={**ref, **{"v": 0}})

    This yields Python lists eqe_lens_0V = [table1, table2, ... tableN] with the selected data ready for plotting (lists are maybe not smart for huge datasets, and some N-dimensional object may replace this in the future). Note: the tables inside the list are astropy.QTable() objects including the data and meta-data, as well as units! So with these few lines of code you already did some advanced data filtering and import!

    🌡️ Physical units

    The module astropy includes a submodule astropy.units. Since we deal with real-world data, it is a good idea to also include units in calculations:

    import astropy.units as u

    # Radius of one microlens:
    r = 98 * u.um

    📝️ Calculations

    If you have to repeatedly do some advanced calculations or fits for some plots, include them as functions in the source-code. An example would be ee.pink_noise()

    📊️ Plots

    For plotting there exist many modules in Python. Due to its great power we use matplotlib. This comes with the cost of some complexity (definitely have a look at its documentation!). But this enables us, for example, to have a consistent color style, figure size and text size in large projects like a PhD thesis:

    # We use style-sheets to set things like figure-size and text-size, see
    # https://matplotlib.org/stable/tutorials/introductory/customizing.html#composing-styles
    mpl.style.use(["thesis_default", "thesis_talk"])
    w, h = plt.rcParams['figure.figsize']  # get the default size for figures to scale plots accordingly

    In order to not reinvent the wheel over and over again, it makes sense to wrap some plotting routines for each experiment inside custom functions. For further detail see the documentation/recommended function signature for matplotlib-specialized functions. This enables easy experiment-type-specific plotting (even with statistics) once all functions are set up:

    # %% plot eqe statistics
    fig, ax = plt.subplots(1, 1, figsize=(w, h), layout="constrained")
    ee.plot_eqe(ax, eqe_lens_0V, statistics=True, color="tab:green", plot_type="EQE", marker=True, ncol=2)
    ee.plot_eqe(ax, eqe_ref_0V, statistics=True, color="tab:blue", plot_type="EQE", marker=True,

  15. Demographic Analysis Workflow using Census API in Jupyter Notebook:...

    • openicpsr.org
    delimited
    Updated Jul 23, 2020
    Cite
    Donghwan Gu; Nathanael Rosenheim (2020). Demographic Analysis Workflow using Census API in Jupyter Notebook: 1990-2000 Population Size and Change [Dataset]. http://doi.org/10.3886/E120381V1
    Explore at:
    Available download formats: delimited
    Dataset updated
    Jul 23, 2020
    Dataset provided by
    Texas A&M University
    Authors
    Donghwan Gu; Nathanael Rosenheim
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Boone County, Kentucky, US Counties
    Description

    This archive reproduces a table titled "Table 3.1 Boone county population size, 1990 and 2000" from Wang and vom Hofe (2007, p. 58). The archive provides a Jupyter Notebook that uses Python and can be run in Google Colaboratory. The workflow uses the Census API to retrieve data, reproduce the table, and ensure reproducibility for anyone accessing this archive.

    The Python code was developed in Google Colaboratory (Google Colab for short), an Integrated Development Environment (IDE) of JupyterLab that streamlines package installation, code collaboration, and management. The Census API is used to obtain population counts from the 1990 and 2000 Decennial Census (Summary File 1, 100% data). All downloaded data are maintained in the notebook's temporary working directory while in use. The data are also stored separately with this archive.

    The notebook features extensive explanations, comments, code snippets, and code output. It can be viewed in PDF format or downloaded and opened in Google Colab. References to external resources are also provided for the various functional components. The notebook includes code to perform the following functions:
    • install/import necessary Python packages
    • introduce a Census API query
    • download Census data via the Census API
    • manipulate Census tabular data
    • calculate absolute change and percent change
    • format numbers
    • export the table to csv

    The notebook can be modified to perform the same operations for any county in the United States by changing the State and County FIPS code parameters for the Census API downloads. The notebook could also be adapted for use in other environments (i.e., Jupyter Notebook), as well as for reading and writing files to a local or shared drive, or a cloud drive (i.e., Google Drive).
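
    For illustration, a minimal sketch of the kind of Census API request the notebook makes; the endpoints and variable names below (P0010001 for the 1990 SF1 and P001001 for the 2000 SF1), and running without an API key, are assumptions rather than details taken from the archived notebook:

    import requests

    BASE = "https://api.census.gov/data"
    params_2000 = {"get": "P001001", "for": "county:015", "in": "state:21"}   # Boone County, KY (FIPS 21015)
    params_1990 = {"get": "P0010001", "for": "county:015", "in": "state:21"}

    pop_2000 = int(requests.get(f"{BASE}/2000/dec/sf1", params=params_2000).json()[1][0])
    pop_1990 = int(requests.get(f"{BASE}/1990/dec/sf1", params=params_1990).json()[1][0])

    print("absolute change:", pop_2000 - pop_1990)
    print("percent change: %.1f%%" % (100 * (pop_2000 - pop_1990) / pop_1990))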

  16. Python Import Data in September - Seair.co.in

    • seair.co.in
    Updated Sep 28, 2016
    Cite
    Seair Exim (2016). Python Import Data in September - Seair.co.in [Dataset]. https://www.seair.co.in
    Explore at:
    Available download formats: .bin, .xml, .csv, .xls
    Dataset updated
    Sep 28, 2016
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    Slovenia, Grenada, Djibouti, Seychelles, Luxembourg, Turks and Caicos Islands, Saint Martin (French part), Christmas Island, Gibraltar, Holy See
    Description

    Subscribers can find export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  17. Optiver Precomputed Features Numpy Array

    • kaggle.com
    Updated Aug 14, 2021
    Cite
    Tal Perry (2021). Optiver Precomputed Features Numpy Array [Dataset]. https://www.kaggle.com/lighttag/optiver-precomputed-features-numpy-array
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 14, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Tal Perry
    Description

    What's In This

    This is a single numpy array with all the Optiver data joined together. It also has some of the features from this notebook. It's designed to be mmapped so that you can read small pieces at once.

    This is one big array with the trade and book data joined together, plus some pre-computed features. The dtype of the array is fp16. The array's shape is (n_times, n_stocks, 600, 27), where 600 is the max seconds_in_bucket and 27 is the number of columns.

    How To Use It

    Add the dataset to your notebook and then:

    import numpy as np

    ntimeids = 3830
    nstocks = 112
    ncolumns = 27
    nseq = 600
    arr = np.memmap('../input/optiver-precomputed-features-numpy-array/data.array',
             mode='r', dtype=np.float16, shape=(ntimeids, nstocks, nseq, ncolumns))

    Caveats

    Handling Varying Sequence Sizes

    There are gaps in the stock ids and time ids, which doesn't work great with an array format. So we have time and stock indexes as well (_ix suffix instead of _id). To calculate these:

    import pandas as pd
    import numpy as np

    targets = pd.read_csv('/kaggle/input/optiver-realized-volatility-prediction/train.csv')
    ntimeids = targets.time_id.nunique()
    stock_ids = list(sorted(targets.stock_id.unique()))
    timeids = sorted(targets.time_id.unique())
    timeid_to_ix = {time_id: i for i, time_id in enumerate(timeids)}
    stock_id_to_ix = {stock_id: i for i, stock_id in enumerate(stock_ids)}
    
    

    Getting data For a particular stock id / time id

    So to get the data for stock_id 13 on time_id 146 you'd do:

    stock_ix = stock_id_to_ix[13]
    time_ix = timeid_to_ix[146]
    arr[time_ix, stock_ix]

    Notice that the third dimension is of size 600 (the max number of points for a given time_ix, stock_ix); some of these entries will be empty. To truncate a single stock's data:

    max_seq_ix = (arr[time_ix, stock_ix, :, -1] > 0).cumsum().max()
    arr[time_ix, stock_ix, :max_seq_ix]

    Column Mappings

    There are 27 columns in the last dimension; these are:

    ['time_id', 'seconds_in_bucket', 'bid_price1', 'ask_price1', 'bid_price2', 'ask_price2', 'bid_size1', 'ask_size1', 'bid_size2', 'ask_size2', 'stock_id', 'wap1', 'wap2', 'log_return1', 'log_return2', 'wap_balance', 'price_spread', 'bid_spread', 'ask_spread', 'total_volume', 'volume_imbalance', 'price', 'size', 'order_count', 'stock_id_y', 'log_return_trade', 'target']

  18. Data from: Large Landing Trajectory Data Set for Go-Around Analysis

    • data.niaid.nih.gov
    • zenodo.org
    Updated Dec 16, 2022
    Cite
    Marcel Dettling (2022). Large Landing Trajectory Data Set for Go-Around Analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7148116
    Explore at:
    Dataset updated
    Dec 16, 2022
    Dataset provided by
    Marcel Dettling
    Manuel Waltert
    Raphael Monstein
    Benoit Figuet
    Timothé Krauth
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Large go-around, also referred to as missed approach, data set. The data set is in support of the paper presented at the OpenSky Symposium on November the 10th.

    If you use this data for a scientific publication, please consider citing our paper.

    The data set contains landings from 176 (mostly) large airports from 44 different countries. The landings are labelled as performing a go-around (GA) or not. In total, the data set contains almost 9 million landings with more than 33,000 GAs. The data was collected from the OpenSky Network's historical database for the year 2019. The published data set contains multiple files:

    go_arounds_minimal.csv.gz

    Compressed CSV containing the minimal data set. It contains a row for each landing, a minimal amount of information about the landing, and whether it was a GA. The data is structured in the following way:

    • time (date time): UTC time of landing or first GA attempt
    • icao24 (string): Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    • callsign (string): Aircraft identifier in air-ground communications
    • airport (string): ICAO airport code where the aircraft is landing
    • runway (string): Runway designator on which the aircraft landed
    • has_ga (string): "True" if at least one GA was performed, otherwise "False"
    • n_approaches (integer): Number of approaches identified for this flight
    • n_rwy_approached (integer): Number of unique runways approached by this flight

    The last two columns, n_approaches and n_rwy_approached, are useful for filtering out training and calibration flights. These usually have a large number of n_approaches, so an easy way to exclude them is to filter by n_approaches > 2.

    go_arounds_augmented.csv.gz

    Compressed CSV containing the augmented data set. It contains a row for each landing, additional information about the landing, and whether it was a GA. The data is structured in the following way:

    • time (date time): UTC time of landing or first GA attempt
    • icao24 (string): Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    • callsign (string): Aircraft identifier in air-ground communications
    • airport (string): ICAO airport code where the aircraft is landing
    • runway (string): Runway designator on which the aircraft landed
    • has_ga (string): "True" if at least one GA was performed, otherwise "False"
    • n_approaches (integer): Number of approaches identified for this flight
    • n_rwy_approached (integer): Number of unique runways approached by this flight
    • registration (string): Aircraft registration
    • typecode (string): Aircraft ICAO typecode
    • icaoaircrafttype (string): ICAO aircraft type
    • wtc (string): ICAO wake turbulence category
    • glide_slope_angle (float): Angle of the ILS glide slope in degrees
    • has_intersection (string): Boolean that is true if the runway has another runway intersecting it, otherwise false
    • rwy_length (float): Length of the runway in kilometres
    • airport_country (string): ISO Alpha-3 country code of the airport
    • airport_region (string): Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)
    • operator_country (string): ISO Alpha-3 country code of the operator
    • operator_region (string): Geographical region of the operator of the aircraft (either Europe, North America, South America, Asia, Africa, or Oceania)
    • wind_speed_knts (integer): METAR, surface wind speed in knots
    • wind_dir_deg (integer): METAR, surface wind direction in degrees
    • wind_gust_knts (integer): METAR, surface wind gust speed in knots
    • visibility_m (float): METAR, visibility in m
    • temperature_deg (integer): METAR, temperature in degrees Celsius
    • press_sea_level_p (float): METAR, sea level pressure in hPa
    • press_p (float): METAR, QNH in hPa
    • weather_intensity (list): METAR, list of present weather codes: qualifier - intensity
    • weather_precipitation (list): METAR, list of present weather codes: weather phenomena - precipitation
    • weather_desc (list): METAR, list of present weather codes: qualifier - descriptor
    • weather_obscuration (list): METAR, list of present weather codes: weather phenomena - obscuration
    • weather_other (list): METAR, list of present weather codes: weather phenomena - other

    This data set is augmented with data from various public data sources. Aircraft-related data is mostly from the OpenSky Network's aircraft database, the METAR information is from Iowa State University, and the rest is mostly scraped from different web sites. If you need help with the METAR information, you can consult the WMO's Aerodrome Reports and Forecasts handbook.

    go_arounds_agg.csv.gz

    Compressed CSV containing the aggregated data set. It contains a row for each airport-runway, i.e. every runway at every airport for which data is available. The data is structured in the following way:

    • airport (string): ICAO airport code where the aircraft is landing
    • runway (string): Runway designator on which the aircraft landed
    • n_landings (integer): Total number of landings observed on this runway in 2019
    • ga_rate (float): Go-around rate, per 1000 landings
    • glide_slope_angle (float): Angle of the ILS glide slope in degrees
    • has_intersection (string): Boolean that is true if the runway has another runway intersecting it, otherwise false
    • rwy_length (float): Length of the runway in kilometres
    • airport_country (string): ISO Alpha-3 country code of the airport
    • airport_region (string): Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)

    This aggregated data set is used in the paper for the generalized linear regression model.

    Downloading the trajectories

    Users of this data set with access to the OpenSky Network's Impala shell can download the historical trajectories from the historical database with a few lines of Python code. For example, suppose you want to get all the go-arounds of 4 January 2019 at London City Airport (EGLC). You can use the Traffic library for easy access to the database:

    import datetime

    from tqdm.auto import tqdm
    import pandas as pd
    from traffic.data import opensky
    from traffic.core import Traffic

    # load minimal data set
    df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)
    df["time"] = pd.to_datetime(df["time"])

    # select London City Airport, go-arounds, and 2019-01-04
    airport = "EGLC"
    start = datetime.datetime(year=2019, month=1, day=4).replace(
      tzinfo=datetime.timezone.utc
    )
    stop = datetime.datetime(year=2019, month=1, day=5).replace(
      tzinfo=datetime.timezone.utc
    )

    df_selection = df.query("airport==@airport & has_ga & (@start <= time <= @stop)")

    # iterate over flights and pull the data from OpenSky Network
    flights = []
    delta_time = pd.Timedelta(minutes=10)
    for _, row in tqdm(df_selection.iterrows(), total=df_selection.shape[0]):
      # take at most 10 minutes before and 10 minutes after the landing or go-around
      start_time = row["time"] - delta_time
      stop_time = row["time"] + delta_time

      # fetch the data from OpenSky Network
      flights.append(
        opensky.history(
          start=start_time.strftime("%Y-%m-%d %H:%M:%S"),
          stop=stop_time.strftime("%Y-%m-%d %H:%M:%S"),
          callsign=row["callsign"],
          return_flight=True,
        )
      )

    The flights can be converted into a Traffic object

    Traffic.from_flights(flights)

    Additional files

    Additional files are available to check the quality of the classification into GA/not GA and the selection of the landing runway. These are:

    validation_table.xlsx: This Excel sheet was manually completed during the review of the samples for each runway in the data set. It provides an estimate of the false positive and false negative rate of the go-around classification. It also provides an estimate of the runway misclassification rate when the airport has two or more parallel runways. The columns with the headers highlighted in red were filled in manually, the rest is generated automatically.

    validation_sample.zip: For each runway, 8 batches of 500 randomly selected trajectories (or as many as available, if fewer than 4000) classified as not having a GA and up to 8 batches of 10 random landings, classified as GA, are plotted. This allows the interested user to visually inspect a random sample of the landings and go-arounds easily.

  19. Digitisation of Weather Records of Seungjeongwon Ilgi: A Historical Weather...

    • zenodo.org
    bin, csv, json, txt
    Updated Sep 27, 2023
    Cite
    Zeyu Lyu; Zeyu Lyu; Kohei Ichikawa; Kohei Ichikawa; Yongchao Cheng; Yongchao Cheng; Hisashi Hayakawa; Hisashi Hayakawa; Yukiko Kawamoto; Yukiko Kawamoto (2023). Digitisation of Weather Records of Seungjeongwon Ilgi: A Historical Weather Dynamics Dataset of the Korean Peninsula (1623-1910) [Dataset]. http://doi.org/10.5281/zenodo.7453644
    Explore at:
    Available download formats: csv, json, bin, txt
    Dataset updated
    Sep 27, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Zeyu Lyu; Zeyu Lyu; Kohei Ichikawa; Kohei Ichikawa; Yongchao Cheng; Yongchao Cheng; Hisashi Hayakawa; Hisashi Hayakawa; Yukiko Kawamoto; Yukiko Kawamoto
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Korea
    Description

    Introduction

    This study has exploited the daily weather records of Seungjeongwon Ilgi from the NIKH database. Seungjeongwon Ilgi (http://sjw.history.go.kr/main.do) is a daily record of the Seungjeongwon, the Royal Secretariat of the Joseon Dynasty of Korea. These diaries span from 1623 to 1910 and generally include daily weather records in the entry header. The observational site was located in Seoul (N37°35′, E126°59′). We have exploited the weather records from the NIKH database and classified the daily weather using a text-mining method. We have also converted the report dates from the traditional lunisolar calendar to the Gregorian calendar, to better contextualise our data with contemporary daily measurements.

    Data

    We provide different formats (csv, xlsx, json) to facilitate the usage of data. The main contents of data are listed as below.

    • ID: The unique identifier of a specific record in the metadata, which can also serve as the identifier to merge with external data in the NIKH digital database.
    • Traditional calendar: The original lunar dates in the NIKH digital database, which are listed in data format "YYYY-MM-DD". More specifically, "L0" implies the leap year and "L1" implies the common year.
    • Leap: The identifier of a leap year.
    • Gregorian calendar: The Gregorian calendar date that converted by the traditional calendar date.
    • Weather Text: The text that describe the weather conditions. Specifically, multiple weather descriptions of the same day have been put together.
    • Flag: The computed value that indicates different combinations of weather conditions.
    • Volume: The volume of text in the original record.
    • Herbal Volume: The volume of text in the herbal record.
    • Sunny: A dummy variable that represents whether the weather description contains the expression of sunny.
    • Cloudy: A dummy variable that represents whether the weather description contains the expression of cloudy.
    • Rainy: A dummy variable that represents whether the weather description contains the expression of rainy.
    • Snow: A dummy variable that represents whether the weather description contains the expression of snow.
    • Wind: A dummy variable that represents whether the weather description contains the expression of wind.

    Import Data

    # Python
    # CSV file
    import pandas as pd
    data=pd.read_csv('~/SJWilgi_Seoul_Weather_YR1623_1910.csv',encoding="utf-8") 
    # JSON file
    data=pd.read_json('~/SJWilgi_Seoul_Weather_YR1623_1910.json',encoding="utf-8")
    # Excel file
    data=pd.read_excel('~/SJWilgi_Seoul_Weather_YR1623_1910.xlsx') # Excel file
    # R
    # CSV file
    library(readr)
    data<- read_csv("~/SJWilgi_Seoul_Weather_YR1623_1910.csv")
    # Excel file
    library(readxl)
    data <- read_excel("~/SJWilgi_Seoul_Weather_YR1623_1910.xlsx")

  20. FLUNT simulated trapezoidal PCHE flow and heat transfer data set

    • scidb.cn
    Updated Sep 16, 2024
    Cite
    zhao zi yan (2024). FLUNT simulated trapezoidal PCHE flow and heat transfer data set [Dataset]. http://doi.org/10.57760/sciencedb.hjs.00106
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 16, 2024
    Dataset provided by
    Science Data Bank
    Authors
    zhao zi yan
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    A trapezoidal PCHE two-channel model was built in FLUNT software as the processing unit. Simulations were run under different input conditions, and the corresponding results were exported as CSV files. Python was used for data processing: unnecessary information columns were removed, and selected information from each file was combined to form a snapshot matrix CSV file. After processing the snapshot matrix CSV file in Python, it was imported into MATLAB for prediction, and finally the MATLAB results were exported as a result CSV file.
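
    A minimal sketch of the snapshot-matrix assembly step in Python; the file pattern, dropped columns, and value column below are assumptions, not taken from this dataset:

    import glob
    import pandas as pd

    drop_cols = ["nodenumber", "x-coordinate", "y-coordinate"]   # hypothetical metadata columns to remove
    snapshots = []
    for path in sorted(glob.glob("exports/case_*.csv")):         # one exported CSV per input condition
      df = pd.read_csv(path).drop(columns=drop_cols, errors="ignore")
      snapshots.append(df["temperature"].reset_index(drop=True)) # hypothetical value column of interest

    snapshot_matrix = pd.concat(snapshots, axis=1)               # rows: cell values, columns: simulated cases
    snapshot_matrix.to_csv("snapshot_matrix.csv", index=False)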
