100+ datasets found
  1. Exploratory Data Analysis on Automobile Dataset

    • kaggle.com
    zip
    Updated Sep 12, 2022
    Cite
    Monis Ahmad (2022). Exploratory Data Analysis on Automobile Dataset [Dataset]. https://www.kaggle.com/datasets/monisahmad/automobile
    Explore at:
    Available download formats: zip (4915 bytes)
    Dataset updated
    Sep 12, 2022
    Authors
    Monis Ahmad
    Description

    Dataset

    This dataset was created by Monis Ahmad


  2. COVID-19 Data Visualization Using Python

    • kaggle.com
    zip
    Updated Apr 21, 2023
    Cite
    Adithya Wijesinghe (2023). COVID-19 Data Visualization Using Python [Dataset]. https://www.kaggle.com/datasets/adithyawijesinghe/covid-19-data
    Explore at:
    Available download formats: zip (1291081 bytes)
    Dataset updated
    Apr 21, 2023
    Authors
    Adithya Wijesinghe
    License

    https://www.usa.gov/government-works/

    Description

    Data visualization using Python (Pandas, Plotly).

    The data were used to visualize the infection rate and the death rate from 01/20 to 04/22.

    The data were made available on GitHub: https://raw.githubusercontent.com/datasets/covid-19/master/data/countries-aggregated.csv
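A minimal sketch of the processing the description implies, using pandas on the column layout of the countries-aggregated.csv file linked above (Date, Country, Confirmed, Deaths); the sample rows here are invented for illustration, and in practice the GitHub raw URL can be passed straight to pandas.read_csv:

```python
import pandas as pd

# In practice:
#   df = pd.read_csv("https://raw.githubusercontent.com/datasets/covid-19/master/data/countries-aggregated.csv")
# Illustrative sample rows with the same column layout (values are made up):
df = pd.DataFrame({
    "Date": ["2020-01-20", "2020-01-21", "2020-01-20", "2020-01-21"],
    "Country": ["A", "A", "B", "B"],
    "Confirmed": [10, 25, 5, 9],
    "Deaths": [0, 1, 0, 0],
})
df["Date"] = pd.to_datetime(df["Date"])

# Daily new cases/deaths per country (the infection and death rates over time)
df = df.sort_values(["Country", "Date"])
df["NewCases"] = df.groupby("Country")["Confirmed"].diff().fillna(0)
df["NewDeaths"] = df.groupby("Country")["Deaths"].diff().fillna(0)

# Plotly would then visualize these, e.g.:
#   import plotly.express as px
#   px.line(df, x="Date", y="NewCases", color="Country").show()
print(df[["Country", "Date", "NewCases"]])
```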

  3. Pandas Practice Dataset

    • kaggle.com
    zip
    Updated Jan 27, 2023
    Cite
    Mrityunjay Pathak (2023). Pandas Practice Dataset [Dataset]. https://www.kaggle.com/datasets/themrityunjaypathak/pandas-practice-dataset/discussion
    Explore at:
    Available download formats: zip (493 bytes)
    Dataset updated
    Jan 27, 2023
    Authors
    Mrityunjay Pathak
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    What is Pandas?

    Pandas is a Python library used for working with data sets.

    It has functions for analyzing, cleaning, exploring, and manipulating data.

    The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was created by Wes McKinney in 2008.

    Why Use Pandas?

    Pandas allows us to analyze big data and make conclusions based on statistical theories.

    Pandas can clean messy data sets, and make them readable and relevant.

    Relevant data is very important in data science.

    What Can Pandas Do?

    Pandas can answer questions about your data, such as:

    Is there a correlation between two or more columns?

    What is the average value?

    What is the maximum value?

    What is the minimum value?
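The questions above map directly onto pandas one-liners; a small sketch with made-up data:

```python
import pandas as pd

# A toy dataset to illustrate the questions above (values are made up)
df = pd.DataFrame({"duration": [60, 45, 30, 60],
                   "calories": [410, 300, 210, 404]})

corr = df["duration"].corr(df["calories"])  # correlation between two columns
avg = df["calories"].mean()                 # average value
hi = df["calories"].max()                   # maximum value
lo = df["calories"].min()                   # minimum value
print(corr, avg, hi, lo)
```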

  4. Python code used to download gridMET climate data for public-supply water...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Nov 12, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Python code used to download gridMET climate data for public-supply water service areas [Dataset]. https://catalog.data.gov/dataset/python-code-used-to-download-gridmet-climate-data-for-public-supply-water-service-areas
    Explore at:
    Dataset updated
    Nov 12, 2025
    Dataset provided by
    U.S. Geological Survey
    Description

    This child item describes Python code used to retrieve gridMET climate data for a specific area and time period. Climate data were retrieved for public-supply water service areas, but the climate data collector could be used to retrieve data for other areas of interest. This dataset is part of a larger data release using machine learning to predict public supply water use for 12-digit hydrologic units from 2000-2020. Data retrieved by the climate data collector code were used as input feature variables in the public supply delivery and water use machine learning models. This page includes the following file: climate_data_collector.zip - a zip file containing the climate data collector Python code used to retrieve climate data and a README file.

  5. Datasets for manuscript "A data engineering framework for chemical flow...

    • catalog.data.gov
    • gimi9.com
    Updated Nov 7, 2021
    Cite
    U.S. EPA Office of Research and Development (ORD) (2021). Datasets for manuscript "A data engineering framework for chemical flow analysis of industrial pollution abatement operations" [Dataset]. https://catalog.data.gov/dataset/datasets-for-manuscript-a-data-engineering-framework-for-chemical-flow-analysis-of-industr
    Explore at:
    Dataset updated
    Nov 7, 2021
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The EPA GitHub repository PAU4Chem, as described in the README.md file, contains Python scripts written to build the PAU dataset modules (technologies, capital and operating costs, and chemical prices) for tracking chemical flow transfers, estimating releases, and identifying potential occupational exposure scenarios in pollution abatement units (PAUs). These PAUs are employed for on-site chemical end-of-life management. The folder datasets contains the outputs for each framework step. The Chemicals_in_categories.csv file contains the chemicals for the TRI chemical categories. The EPA GitHub repository PAU_case_study, as described in its readme.md entry, contains the Python scripts to run the manuscript case study for designing the PAUs, the data-driven models, and the decision-making module for chemicals of concern and tracking flow transfers at the end-of-life stage. The data were obtained by means of data engineering using different publicly available databases. The properties of chemicals were obtained using the GitHub repository Properties_Scraper, while the PAU dataset was built using the repository PAU4Chem. Finally, the EPA GitHub repository Properties_Scraper contains a Python script to massively gather information about exposure limits and physical properties from different publicly available sources: EPA, NOAA, OSHA, and the Institute for Occupational Safety and Health of the German Social Accident Insurance (IFA). All GitHub repositories also describe the Python libraries required for running their code, how to use them, the output files obtained after running the Python script modules, and the corresponding EPA Disclaimer. This dataset is associated with the following publication: Hernandez-Betancur, J.D., M. Martin, and G.J. Ruiz-Mercado. A data engineering framework for on-site end-of-life industrial operations. JOURNAL OF CLEANER PRODUCTION. Elsevier Science Ltd, New York, NY, USA, 327: 129514, (2021).

  6. Storage and Transit Time Data and Code

    • zenodo.org
    zip
    Updated Oct 29, 2024
    + more versions
    Cite
    Andrew Felton; Andrew Felton (2024). Storage and Transit Time Data and Code [Dataset]. http://doi.org/10.5281/zenodo.14009758
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 29, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andrew Felton; Andrew Felton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Author: Andrew J. Felton
    Date: 10/29/2024

    This R project contains the primary code and data (following pre-processing in Python) used for data production, manipulation, visualization, analysis, and figure production for the study entitled:

    "Global estimates of the storage and transit time of water through vegetation"

    Please note that 'turnover' and 'transit' are used interchangeably. Also please note that this R project has been updated multiple times as the analysis has evolved.

    Data information:

    The data folder contains key data sets used for analysis. In particular:

    "data/turnover_from_python/updated/august_2024_lc/" contains the core datasets used in this study, including global arrays summarizing five-year (2016-2020) averages of mean (annual) and minimum (monthly) transit time, storage, canopy transpiration, and the number of months of data available, as both an array (.nc) and a data table (.csv). These data were produced in Python using the Python scripts found in the "supporting_code" folder. The remaining files in the "data" and "data/supporting_data" folders primarily contain ground-based estimates of storage and transit found in public databases or through a literature search, but have been extensively processed and filtered here. The "supporting_data" folder also contains the annual (2016-2020) MODIS land cover data used in the analysis, with both the original data (.hdf) and the final processed (filtered) data in .nc format. The resulting annual land cover distributions were used in the pre-processing of data in Python.

    Code information:

    Python scripts can be found in the "supporting_code" folder.

    Each R script in this project has a role:

    "01_start.R": This script sets the working directory, loads in the tidyverse package (the remaining packages in this project are called using the `::` operator), and can run two other scripts: one that loads the customized functions (02_functions.R) and one for importing and processing the key dataset for this analysis (03_import_data.R).

    "02_functions.R": This script contains custom functions. Load this using the
    `source()` function in the 01_start.R script.

    "03_import_data.R": This script imports and processes the .csv transit data. It joins the mean (annual) transit time data with the minimum (monthly) transit data to generate one dataset for analysis: annual_turnover_2. Load this using the
    `source()` function in the 01_start.R script.

    "04_figures_tables.R": This is the main workhorse for figure/table production and
    supporting analyses. This script generates the key figures and summary statistics
    used in the study, which are then saved in the manuscript_figures folder. Note that all
    maps were produced using Python code found in the "supporting_code" folder.

    "supporting_generate_data.R": This script processes supporting data used in the analysis, primarily the varying ground-based datasets of leaf water content.

    "supporting_process_land_cover.R": This takes annual MODIS land cover distributions and processes them through a multi-step filtering process so that they can be used in preprocessing of datasets in python.

  7. Python code used to download U.S. Census Bureau data for public-supply water...

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 19, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Python code used to download U.S. Census Bureau data for public-supply water service areas [Dataset]. https://catalog.data.gov/dataset/python-code-used-to-download-u-s-census-bureau-data-for-public-supply-water-service-areas
    Explore at:
    Dataset updated
    Nov 19, 2025
    Dataset provided by
    U.S. Geological Survey
    Description

    This child item describes Python code used to query census data from the TigerWeb Representational State Transfer (REST) services and the U.S. Census Bureau Application Programming Interface (API). These data were needed as input feature variables for a machine learning model to predict public supply water use for the conterminous United States. Census data were retrieved for public-supply water service areas, but the census data collector could be used to retrieve data for other areas of interest. This dataset is part of a larger data release using machine learning to predict public supply water use for 12-digit hydrologic units from 2000-2020. Data retrieved by the census data collector code were used as input features in the public supply delivery and water use machine learning models. This page includes the following file: census_data_collector.zip - a zip file containing the census data collector Python code used to retrieve data from the U.S. Census Bureau and a README file.
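The description names two access routes; as a rough illustration of the second, a Census Bureau API request is just a URL built from a dataset path plus get/for query parameters. The dataset path and variable code below are illustrative placeholders, not necessarily the ones this USGS code uses:

```python
from urllib.parse import urlencode

# Sketch of a Census API request URL; dataset path and variable code are
# illustrative placeholders (B01003_001E: total population in ACS tables).
base = "https://api.census.gov/data/2020/acs/acs5"
params = {
    "get": "NAME,B01003_001E",  # variables to retrieve
    "for": "county:*",          # one row per county
}
url = f"{base}?{urlencode(params, safe='*,:')}"
print(url)
# The JSON response (first row = header) maps naturally onto a pandas DataFrame.
```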

  8. “Python code for accessing Numina and analyzing data.”-Using Computer Vision...

    • dataverse.harvard.edu
    Updated Jan 3, 2025
    Cite
    Michael Lowry; Don MacKenzie (2025). “Python code for accessing Numina and analyzing data.”-Using Computer Vision Data to Evaluate Bicycle and Pedestrian Improvements: [Dataset]. http://doi.org/10.7910/DVN/QIPCPS
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant.
    Dataset updated
    Jan 3, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Michael Lowry; Don MacKenzie
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The file called analysis.pdf provides the code we used to analyze data. The file called module.pdf is the module we created to access Numina data. The module requires the Python packages gql and pandas. The code requires a password from Numina.
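Since the Numina API is accessed through the gql package, the underlying request is a GraphQL POST. A sketch of that request shape using only the standard library; the query fields and sensor serial below are invented placeholders, not Numina's actual schema:

```python
import json

# A GraphQL HTTP request is a POST whose JSON body carries the query text.
# The field names and serial below are illustrative placeholders only.
query = """
query ($serial: String!) {
  device(serial: $serial) { counts { pedestrians bicycles } }
}
"""
payload = json.dumps({"query": query, "variables": {"serial": "SENSOR-001"}})

# With gql, the same request would be built roughly as:
#   from gql import gql, Client
#   from gql.transport.requests import RequestsHTTPTransport
#   client = Client(transport=RequestsHTTPTransport(url=..., headers=...))
#   client.execute(gql(query), variable_values={"serial": "SENSOR-001"})
print(payload)
```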

  9. Vector datasets for workshop "Introduction to Geospatial Raster and Vector...

    • figshare.com
    Updated Oct 5, 2022
    Cite
    Ryan Avery (2022). Vector datasets for workshop "Introduction to Geospatial Raster and Vector Data with Python" [Dataset]. http://doi.org/10.6084/m9.figshare.21273837.v1
    Explore at:
    Available download formats: application/x-sqlite3
    Dataset updated
    Oct 5, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ryan Avery
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cadastral data from PDOK, used to illustrate the use of geopandas and shapely, geospatial Python packages for manipulating vector data. The brpgewaspercelen_definitief_2020.gpkg file has been subsetted in order to make the download manageable for workshops. Other datasets are copies of those available from PDOK.
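As a rough taste of the vector operations such a workshop covers, a shapely-only sketch (geopandas adds the file I/O, e.g. reading the .gpkg file above); the parcel coordinates are made up:

```python
from shapely.geometry import Polygon

# Two toy geometries (coordinates are made up); geopandas would load real
# parcels with gpd.read_file("brpgewaspercelen_definitief_2020.gpkg")
parcel = Polygon([(0, 0), (4, 0), (4, 3), (0, 3)])
zone = Polygon([(2, 1), (6, 1), (6, 5), (2, 5)])

overlap = parcel.intersection(zone)  # part of the parcel inside the zone
print(parcel.area, overlap.area)
```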

  10. Using Python Packages and HydroShare to Advance Open Data Science and...

    • hydroshare.org
    • beta.hydroshare.org
    zip
    Updated Sep 28, 2023
    Cite
    Jeffery S. Horsburgh; Amber Spackman Jones; Anthony M. Castronova; Scott Black (2023). Using Python Packages and HydroShare to Advance Open Data Science and Analytics for Water [Dataset]. https://www.hydroshare.org/resource/4f4acbab5a8c4c55aa06c52a62a1d1fb
    Explore at:
    Available download formats: zip (31.0 MB)
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    HydroShare
    Authors
    Jeffery S. Horsburgh; Amber Spackman Jones; Anthony M. Castronova; Scott Black
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Scientific and management challenges in the water domain require synthesis of diverse data. Many data analysis tasks are difficult because datasets are large and complex; standard data formats are not always agreed upon or mapped to efficient structures for analysis; scientists may lack training for tackling large and complex datasets; and it can be difficult to share, collaborate around, and reproduce scientific work. Overcoming barriers to accessing, organizing, and preparing datasets for analyses can transform the way water scientists work. Building on the HydroShare repository’s cyberinfrastructure, we have advanced two Python packages that make data loading, organization, and curation for analysis easier, reducing time spent in choosing appropriate data structures and writing code to ingest data. These packages enable automated retrieval of data from HydroShare and the USGS’s National Water Information System (NWIS) (i.e., a Python equivalent of USGS’ R dataRetrieval package), loading data into performant structures that integrate with existing visualization, analysis, and data science capabilities available in Python, and writing analysis results back to HydroShare for sharing and publication. While these Python packages can be installed for use within any Python environment, we will demonstrate how the technical burden for scientists associated with creating a computational environment for executing analyses can be reduced and how sharing and reproducibility of analyses can be enhanced through the use of these packages within CUAHSI’s HydroShare-linked JupyterHub server.

    This HydroShare resource includes all of the materials presented in a workshop at the 2023 CUAHSI Biennial Colloquium.

  11. Using Python and Jupyter Notebook to Retrieve and Visualize the Water...

    • hydroshare.org
    zip
    Updated Apr 21, 2022
    Cite
    Ali Farshid (2022). Using Python and Jupyter Notebook to Retrieve and Visualize the Water Temperature Data of the Logan River, Utah [Dataset]. https://www.hydroshare.org/resource/8c565dc2f9244182a575f91515e83d1d
    Explore at:
    Available download formats: zip (358.6 MB)
    Dataset updated
    Apr 21, 2022
    Dataset provided by
    HydroShare
    Authors
    Ali Farshid
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2014 - Dec 18, 2021
    Area covered
    Description

    The mainstem Logan River is a suitable habitat for cold-water fishes such as native populations of cutthroat trout (Budy & Gaeta, 2018). On the other hand, high water temperatures can harm cold-water fish populations by creating physiological stresses, intensifying metabolic demands, and limiting suitable habitats (Williams et al., 2015). In this regard, the State of Utah Department of Environmental Quality (UDEQ) has identified the Logan River as a suitable habitat for cold-water species, which can become unsuitable when the water temperature rises higher than 20 degrees Celsius (Rule R317-2, 2022). However, the UDEQ does not provide any details on how to evaluate violations of the standard. One way to evaluate violations is to look at water temperature distributions (i.e., histograms) along the river from high elevations to low elevations at different locations. In this report, I used three different Python libraries to manipulate, extract, and explore the water temperature data of the Logan River from 2014 to 2021 obtained from the Logan River Observatory website. The results (i.e., the histograms generated by executing the Jupyter Notebook in the HydroShare environment) show that the Logan River tends to experience higher water temperatures as its elevation drops, regardless of the season. This can provide some insights for the UDEQ to simultaneously consider space and time in assessing violations of the standard.
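The exceedance check described above can be sketched with pandas; the readings and site names below are synthetic stand-ins for the Logan River Observatory data:

```python
import pandas as pd

# Synthetic water-temperature readings (deg C) at two hypothetical sites;
# the real data come from the Logan River Observatory.
temps = pd.DataFrame({
    "site": ["upper"] * 4 + ["lower"] * 4,
    "temp_c": [8.5, 12.0, 14.9, 16.3, 15.2, 19.8, 21.4, 22.7],
})

# Count exceedances of Utah's 20 deg C cold-water criterion per site
violations = temps[temps["temp_c"] > 20].groupby("site").size()
print(violations)
# temps.hist(column="temp_c", by="site") would draw the per-site histograms.
```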

  12. Using Python to Access and Plot Streamflow data from NWIS

    • hydroshare.org
    • hydroshare.cuahsi.org
    zip
    Updated Apr 20, 2022
    + more versions
    Cite
    Aaron Sigman (2022). Using Python to Access and Plot Streamflow data from NWIS [Dataset]. https://www.hydroshare.org/resource/8553e1b0b1cd44b6885a5c6033b41038
    Explore at:
    Available download formats: zip (154.5 KB)
    Dataset updated
    Apr 20, 2022
    Dataset provided by
    HydroShare
    Authors
    Aaron Sigman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1970 - Dec 31, 2021
    Description

    This code retrieves streamflow information from NWIS using the dataretrieval package in Python. You can input the site, parameters, and dates at the top. The code pulls daily measurements, annual statistics, and daily statistics. We calculate 30-year normals, and plot annual average flows; annual minimum, maximum, and mean flows; and percentile flows. This resource pulls only from the USGS FTP site and doesn't have or require any local storage.
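The 30-year-normals step can be sketched as a day-of-year aggregation in pandas; the flow series below is synthetic, standing in for an NWIS daily-values record retrieved with dataretrieval:

```python
import numpy as np
import pandas as pd

# Synthetic daily flows standing in for an NWIS daily-values record
dates = pd.date_range("1991-01-01", "2020-12-31", freq="D")
rng = np.random.default_rng(0)
flow = pd.Series(
    100 + 50 * np.sin(2 * np.pi * dates.dayofyear / 365)
    + rng.normal(0, 5, len(dates)),
    index=dates,
)

# 30-year normal: average flow for each day of year across all years
normals = flow.groupby(flow.index.dayofyear).mean()
print(normals.head())
```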

  13. CatCrops_identification: A Python Project for Early Crop Type Classification...

    • dataverse.csuc.cat
    txt, zip
    Updated Jun 11, 2025
    Cite
    Jordi Gené-Mola; Jordi Gené-Mola; Magí Pàmies Sans; Magí Pàmies Sans; César Minuesa; César Minuesa; Jaume Casadesus; Jaume Casadesus; Joaquim Bellvert; Joaquim Bellvert (2025). CatCrops_identification: A Python Project for Early Crop Type Classification Using Remote Sensing and Ancillary Data [Dataset]. http://doi.org/10.34810/data2322
    Explore at:
    Available download formats: zip (1258857 bytes), txt (21471 bytes)
    Dataset updated
    Jun 11, 2025
    Dataset provided by
    CORA.Repositori de Dades de Recerca
    Authors
    Jordi Gené-Mola; Jordi Gené-Mola; Magí Pàmies Sans; Magí Pàmies Sans; César Minuesa; César Minuesa; Jaume Casadesus; Jaume Casadesus; Joaquim Bellvert; Joaquim Bellvert
    License

    https://dataverse.csuc.cat/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.34810/data2322

    Dataset funded by
    Agència per la Competitivitat de l’Empresa (ACCIÓ)
    Agencia Estatal de Investigación
    European Commission
    Description

    CatCrops_identification is a Python library developed for the early classification of crop types using remote sensing data (Sentinel-2) and ancillary information. It is based on a Transformer model adapted for the analysis of spectral time series with variable length, and it allows the integration of auxiliary data such as the previous year’s crop, irrigation system, cloud cover, elevation, and other geographic features. The library provides tools to download and prepare datasets, train deep learning models, and generate vector maps with plot-level classification. CatCrops_identification includes scripts to automate the entire workflow and offers a public dataset that combines declared and inspected information on crop types in the Lleida region. This approach improves classification accuracy in the early stages of the agricultural season, offering a robust and efficient tool for agricultural planning and water resource management.

  14. Workflow Optimization Using Python Programming, a Tool Kit for Every...

    • data.amerigeoss.org
    • data.wu.ac.at
    pdf
    Updated Aug 9, 2019
    Cite
    Energy Data Exchange (2019). Workflow Optimization Using Python Programming, a Tool Kit for Every Geoscientist [Dataset]. https://data.amerigeoss.org/fi/dataset/workflow-python-programming
    Explore at:
    Available download formats: pdf (2693899 bytes)
    Dataset updated
    Aug 9, 2019
    Dataset provided by
    Energy Data Exchange
    License

    Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    This presentation gives an overview of using Python programming to optimize CO2 storage simulations conducted with Computer Modelling Group (CMG) software. Solutions to two problems are discussed. First, spatially representing data from CMG simulation results (e.g., plume outlines) is addressed. Second, a streamlined process for optimizing well placement in the simulation model is given. Presented at the 2014 Rocky Mountain Section AAPG Annual Meeting.

  15. mumpcepy: A Python implementation of the Method of Uncertainty Minimization...

    • catalog.data.gov
    • datasets.ai
    Updated Sep 30, 2025
    Cite
    National Institute of Standards and Technology (2025). mumpcepy: A Python implementation of the Method of Uncertainty Minimization using Polynomial Chaos Expansions [Dataset]. https://catalog.data.gov/dataset/mumpcepy-a-python-implementation-of-the-method-of-uncertainty-minimization-using-polynomia
    Explore at:
    Dataset updated
    Sep 30, 2025
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    The Method of Uncertainty Minimization using Polynomial Chaos Expansions (MUM-PCE) was developed as a software tool to constrain physical models against experimental measurements. These models contain parameters that cannot be easily determined from first principles and so must be measured, and some which cannot even be easily measured. In such cases, the models are validated and tuned against a set of global experiments which may depend on the underlying physical parameters in a complex way. The measurement uncertainty will affect the uncertainty in the parameter values.
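As an illustration of the core idea behind polynomial chaos expansions (not of MUM-PCE itself), a one-parameter toy: expand a model response in probabilists' Hermite polynomials of a standard normal parameter, then read the response mean and variance off the coefficients. The model function here is invented:

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 2000)  # samples of an uncertain parameter, x ~ N(0, 1)
y = x**2 + 2 * x                # toy "model evaluations" at those values

# Least-squares fit of a degree-3 expansion in probabilists' Hermite polynomials
coef = He.hermefit(x, y, 3)

# Under N(0, 1), He_k are orthogonal with E[He_k^2] = k!, so the coefficients
# give the response statistics directly:
mean = coef[0]  # E[y] = c_0
var = sum(math.factorial(k) * coef[k] ** 2 for k in range(1, len(coef)))
print(mean, var)
```

For this toy model, y = He_2(x) + 2 He_1(x) + 1, so the fit recovers mean 1 and variance 6 exactly (up to floating-point error).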

  16. (HS 2) Automate Workflows using Jupyter notebook to create Large Extent...

    • hydroshare.org
    • search.dataone.org
    zip
    Updated Oct 15, 2024
    + more versions
    Cite
    Young-Don Choi (2024). (HS 2) Automate Workflows using Jupyter notebook to create Large Extent Spatial Datasets [Dataset]. http://doi.org/10.4211/hs.a52df87347ef47c388d9633925cde9ad
    Explore at:
    Available download formats: zip (2.4 MB)
    Dataset updated
    Oct 15, 2024
    Dataset provided by
    HydroShare
    Authors
    Young-Don Choi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We implemented automated workflows using Jupyter notebooks for each state. The GIS processing, crucial for merging, extracting, and projecting GeoTIFF data, was performed using ArcPy, a Python package for geographic data analysis, conversion, and management within ArcGIS (Toms, 2015). After generating state-scale LES (large extent spatial) datasets in GeoTIFF format, we utilized the xarray and rioxarray Python packages to convert GeoTIFF to NetCDF. Xarray is a Python package for working with multi-dimensional arrays, and rioxarray is the rasterio extension for xarray; rasterio is a Python library to read and write GeoTIFF and other raster formats. Xarray facilitated data manipulation and metadata addition in the NetCDF file, while rioxarray was used to save GeoTIFF as NetCDF. These procedures resulted in the creation of three HydroShare resources (HS 3, HS 4, and HS 5) for sharing state-scale LES datasets. Notably, due to licensing constraints with ArcGIS Pro, a commercial GIS software, the Jupyter notebook development was undertaken on a Windows OS.
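The GeoTIFF-to-NetCDF step can be sketched with xarray alone; the array, coordinates, and attributes below are invented, and with rioxarray the real array would come from opening the GeoTIFF instead:

```python
import numpy as np
import xarray as xr

# Synthetic 2-D grid standing in for a GeoTIFF band; with rioxarray the real
# array would come from rioxarray.open_rasterio("state_les.tif") (hypothetical name)
da = xr.DataArray(
    np.arange(12.0).reshape(3, 4),
    dims=("y", "x"),
    coords={"y": [2.0, 1.0, 0.0], "x": [0.0, 1.0, 2.0, 3.0]},
    name="les",
    attrs={"units": "class", "source": "illustrative example"},  # metadata via xarray
)
ds = da.to_dataset()
# ds.to_netcdf("state_les.nc") would then write the NetCDF file.
print(ds)
```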

  17. Creating Curve Number Grid using PyQGIS through Jupyter Notebook in mygeohub...

    • hydroshare.org
    • beta.hydroshare.org
    • +1more
    zip
    Updated Apr 28, 2020
    Cite
    Sayan Dey; Shizhang Wang; Venkatesh Merwade (2020). Creating Curve Number Grid using PyQGIS through Jupyter Notebook in mygeohub [Dataset]. http://doi.org/10.4211/hs.abf67aad0eb64a53bf787d369afdcc84
    Explore at:
    Available download formats: zip (105.5 MB)
    Dataset updated
    Apr 28, 2020
    Dataset provided by
    HydroShare
    Authors
    Sayan Dey; Shizhang Wang; Venkatesh Merwade
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description

    This resource serves as a template for creating a curve number grid raster file, which could be used to create corresponding maps or for further analysis. Soil data and reclassified land-use raster files are created along the way. The user has to provide or connect to a set of shapefiles, including the boundary of the watershed, soil data and land use covering this watershed, a land-use reclassification, and a curve number lookup table. The script contained in this resource mainly uses PyQGIS through Jupyter Notebook for the majority of the processing, with a touch of Pandas for data manipulation. A detailed description of the procedure is given in comments in the script.
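The lookup at the heart of a curve-number workflow can be sketched with the Pandas part alone (PyQGIS handles the raster and shapefile processing); the curve numbers below are illustrative, not taken from this resource's lookup table:

```python
import pandas as pd

# Per-cell (or per-polygon) land use and hydrologic soil group
cells = pd.DataFrame({
    "landuse": ["forest", "row_crop", "forest", "urban"],
    "soil_group": ["B", "C", "C", "D"],
})

# Curve number lookup table (illustrative values in the spirit of SCS tables)
cn_lookup = pd.DataFrame({
    "landuse": ["forest", "forest", "row_crop", "urban"],
    "soil_group": ["B", "C", "C", "D"],
    "cn": [55, 70, 85, 92],
})

# Join each cell to its curve number by land use + soil group
grid = cells.merge(cn_lookup, on=["landuse", "soil_group"], how="left")
print(grid)
```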

  18. Data from: Assignment of Regioisomers Using Infrared Spectroscopy: A Python...

    • acs.figshare.com
    zip
    Updated Jun 20, 2024
    Cite
    Samuel T. Cahill; Joseph E. B. Young; Max Howe; Ryan Clark; Andrew F. Worrall; Malcolm I. Stewart (2024). Assignment of Regioisomers Using Infrared Spectroscopy: A Python Coding Exercise in Data Processing and Machine Learning [Dataset]. http://doi.org/10.1021/acs.jchemed.4c00295.s003
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 20, 2024
    Dataset provided by
    ACS Publications
    Authors
    Samuel T. Cahill; Joseph E. B. Young; Max Howe; Ryan Clark; Andrew F. Worrall; Malcolm I. Stewart
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Machine learning is a set of tools that are increasingly used in the field of chemistry. The introduction of potential uses of machine learning to undergraduate chemistry students should help to increase their comprehension of and interest in machine learning processes and can help support them in their transition into graduate research and industrial environments that use such tools. Herein we present an exercise aimed at introducing machine learning alongside improving students’ general Python coding abilities. The exercise aims to identify the regioisomerism of disubstituted benzene systems solely from infrared spectra, a simple and ubiquitous undergraduate technique. The exercise culminates in students collecting their own spectra of compounds with unknown regioisomerism and predicting the results, allowing them to take ownership of their results and creating a larger database of information to draw upon for machine learning in the future.
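As a numpy-only toy of the classification idea (spectrum in, regioisomer label out), a nearest-centroid classifier on synthetic "spectra"; the exercise itself uses real IR spectra and its own machine learning pipeline:

```python
import numpy as np

# Toy "IR spectra": intensity at 4 wavenumber bins per spectrum (synthetic)
train = {
    "ortho": np.array([[0.9, 0.1, 0.2, 0.1], [0.8, 0.2, 0.1, 0.1]]),
    "para":  np.array([[0.1, 0.2, 0.1, 0.9], [0.2, 0.1, 0.2, 0.8]]),
}

# Nearest-centroid: average the training spectra per class
centroids = {label: spectra.mean(axis=0) for label, spectra in train.items()}

def classify(spectrum):
    """Assign the class whose mean spectrum is closest (Euclidean distance)."""
    return min(centroids, key=lambda lbl: np.linalg.norm(spectrum - centroids[lbl]))

unknown = np.array([0.85, 0.15, 0.15, 0.1])
print(classify(unknown))
```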

  19. MOESM3 of Scoria: a Python module for manipulating 3D molecular data

    • springernature.figshare.com
    zip
    Updated May 30, 2023
    + more versions
    Cite
    Patrick Ropp; Aaron Friedman; Jacob Durrant (2023). MOESM3 of Scoria: a Python module for manipulating 3D molecular data [Dataset]. http://doi.org/10.6084/m9.figshare.c.3882832_D3.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Patrick Ropp; Aaron Friedman; Jacob Durrant
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 3. An archived version of Scoria, derived from the main Scoria branch, that includes MDAnalysis support.

  20. Python and R Basics for Environmental Data Sciences

    • hydroshare.org
    zip
    Updated Nov 1, 2020
    Cite
    Tao Wen (2020). Python and R Basics for Environmental Data Sciences [Dataset]. https://www.hydroshare.org/resource/114e5092ab684bd9beb9fc845a25a087
    Explore at:
    Available download formats: zip (282.7 MB)
    Dataset updated
    Nov 1, 2020
    Dataset provided by
    HydroShare
    Authors
    Tao Wen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description

    This resource collects teaching materials originally created for the in-person course 'GEOSC/GEOG 497 – Data Mining in Environmental Sciences' at Penn State University (co-taught by Tao Wen, Susan Brantley, and Alan Taylor) and then refined/revised by Tao Wen for use in the online teaching module 'Data Science in Earth and Environmental Sciences' hosted on the NSF-sponsored HydroLearn platform.

    This resource includes both R Notebooks and Python Jupyter Notebooks to teach the basics of R and Python coding, data analysis and data visualization, as well as building machine learning models in both programming languages by using authentic research data and questions. All of these R/Python scripts can be executed either on the CUAHSI JupyterHub or on your local machine.

    This resource is shared under the CC-BY license. Please contact the creator Tao Wen at Syracuse University (twen08@syr.edu) for any questions you have about this resource. If you identify any errors in the files, please contact the creator.
