51 datasets found
  1. BrainBench_Human_v0.1.csv

    • huggingface.co
    Updated Aug 18, 2024
    Cite
    BrainGPT (2024). BrainBench_Human_v0.1.csv [Dataset]. https://huggingface.co/datasets/BrainGPT/BrainBench_Human_v0.1.csv
    Explore at:
    Croissant (a format for machine-learning datasets; learn more about this at mlcommons.org/croissant)
    Dataset updated
    Aug 18, 2024
    Dataset authored and provided by
    BrainGPT
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    What is BrainBench?

    BrainBench is a forward-looking benchmark for neuroscience. BrainBench evaluates test-takers' ability to predict neuroscience results.

      What is BrainBench made of?
    

    BrainBench's test cases were sourced from recent Journal of Neuroscience abstracts across five neuroscience domains: Behavioral/Cognitive, Systems/Circuits, Neurobiology of Disease, Cellular/Molecular, and Developmental/Plasticity/Repair. Test-takers chose between the original abstract and… See the full description on the dataset page: https://huggingface.co/datasets/BrainGPT/BrainBench_Human_v0.1.csv.

  2. Database with raw data (CSV file).

    • figshare.com
    txt
    Updated Jun 3, 2018
    Cite
    Bartosz Symonides (2018). Database with raw data (CSV file). [Dataset]. http://doi.org/10.6084/m9.figshare.6411002.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 3, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Bartosz Symonides
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Survival after open versus endovascular repair of abdominal aortic aneurysm. Polish population analysis. (in press)

  3. The dataset of the paper titled "Context-Aware Code Change Embedding for...

    • data.niaid.nih.gov
    Updated Jan 20, 2021
    Cite
    Anonymous (2021). The dataset of the paper titled "Context-Aware Code Change Embedding for Better Patch Correctness Assessment" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4128943
    Explore at:
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    Anonymous
    Description

    This is the online repository of the paper "Context-Aware Code Change Embedding for Better Patch Correctness Assessment" under review by SANER2021. We release the source code of Cache, the patches used in our evaluation, as well as the experiment results.

    Patches: Three patch benchmarks included in our study.

    Tian: The patches from Tian's ASE20 paper.

    Wang: The patches from Wang's ASE20 paper.

    Cache: The patches collected by ourselves, which consist of 17,377 deduplicated overfitting patches from RepairThemAll and 17,377 instances from ManySStuBs (used as correct patches).

    Results:

    RQ1: The detailed result files in RQ1, which are named in the format [model]_[classifier].csv.

    For example, the file named BERT_DT.csv in the folder Tian's_dataset means that this file is the result of patches from Tian's study embedded by BERT and classified by Decision Tree.

    Tian's_dataset : The detailed result files on Tian's dataset.

    Cache_dataset : The detailed result files on our own dataset.

    Cross_dataset : The detailed result files of representation learning techniques when training on our own dataset and testing on Tian's dataset.

    RQ2: The detailed result files in RQ2.

    Wang_Cache.csv: The detailed result of Cache on the dataset from Wang's ASE20.

    ODS_Cache.csv: The detailed result of Cache on the dataset from Xiong's ICSE18 paper. We directly compare against the results reported by the authors of ODS on 139 patches from Xiong's paper since the data and source code of ODS are unavailable.

    Source: The source code and lib for running Cache is available at https://github.com/APR-Study/Cache.

  4. Seville Car Repair Shops

    • flashmapy.com
    csv
    Updated Aug 22, 2024
    Cite
    (2024). Seville Car Repair Shops [Dataset]. https://www.flashmapy.com/lists/car-repair-shop-seville
    Explore at:
    Available download formats: csv
    Dataset updated
    Aug 22, 2024
    Area covered
    Seville
    Variables measured
    X, City, Email, Phone, State, Images, Country, Reviews, Youtube, Category, and 7 more
    Description

    A downloadable CSV file containing 150 Car Repair Shops in Seville with details like contact information, price range, reviews, and opening hours.

  5. Water balance - stream discharge from Långbäcken, Catchment 13

    • meta.fieldsites.se
    Updated Mar 21, 2024
    Cite
    Svartberget Research Station (2024). Water balance - stream discharge from Långbäcken, Catchment 13 [Dataset]. https://meta.fieldsites.se/objects/XjOoaOBQPMWb3db_RQJm-fIX
    Explore at:
    Dataset updated
    Mar 21, 2024
    Dataset provided by
    Svartberget Research Station
    SITES data portal
    Authors
    Svartberget Research Station
    License

    https://meta.fieldsites.se/ontologies/sites/sitesLicence

    Time period covered
    Apr 27, 2009 - Dec 20, 2023
    Area covered
    Variables measured
    Q, TIMESTAMP
    Description

    Discharge calculations based on stream level measurements and a constantly validated stream discharge relation curve. For detailed information on calculations and installation, read the COMMENT in the header of the data set, which points to a related information document. Svartberget Research Station (2024). Water balance - stream discharge from Långbäcken, Catchment 13, 2009-04-28–2023-12-20 [Data set]. Swedish Infrastructure for Ecosystem Science (SITES). https://hdl.handle.net/11676.1/XjOoaOBQPMWb3db_RQJm-fIX

  6. Data from: ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1more
    Updated Jan 27, 2022
    Cite
    Nagappan, Meiyappan (2022). ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5907001
    Explore at:
    Dataset updated
    Jan 27, 2022
    Dataset provided by
    Keshavarz, Hossein
    Nagappan, Meiyappan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction

    This archive contains the ApacheJIT dataset presented in the paper "ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction" as well as the replication package. The paper was submitted to the MSR 2022 Data Showcase Track.

    The datasets are available under the directory dataset. There are 4 datasets in this directory; a minimal loading sketch follows the list.

    1. apachejit_total.csv: This file contains the entire dataset. Commits are specified by their identifier and a set of commit metrics that are explained in the paper are provided as features. Column buggy specifies whether or not the commit introduced any bug into the system.
    2. apachejit_train.csv: This file is a subset of the entire dataset. It provides a balanced set that we recommend for models that are sensitive to class imbalance. This set is obtained from the first 14 years of data (2003 to 2016).
    3. apachejit_test_large.csv: This file is a subset of the entire dataset. The commits in this file are the commits from the last 3 years of data. This set is left unbalanced to represent a real-life scenario of JIT model evaluation, where the model is trained on historical data and applied to future data without any modification.
    4. apachejit_test_small.csv: This file is a subset of the test file explained above. Since the test file has more than 30,000 commits, we also provide a smaller test set which is still unbalanced and from the last 3 years of data.
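
    A minimal loading sketch (assuming the archive has been extracted so the CSV files sit under the dataset directory and that pandas is available; buggy is the label column described above):

    import pandas as pd

    # Load the balanced training split and the large unbalanced test split.
    train = pd.read_csv("dataset/apachejit_train.csv")
    test = pd.read_csv("dataset/apachejit_test_large.csv")

    # `buggy` marks whether a commit introduced a bug; compare the class balance.
    print(train["buggy"].value_counts(normalize=True))  # roughly balanced by construction
    print(test["buggy"].value_counts(normalize=True))   # unbalanced, real-life distribution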

    In addition to the dataset, we also provide the scripts we used to build the dataset. These scripts are written in Python 3.8; therefore, Python 3.8 or above is required. To set up the environment, we have provided a list of required packages in the file requirements.txt. Additionally, one filtering step requires GumTree [1]. For Java, GumTree requires Java 11. For other languages, external tools are needed. An installation guide and more details can be found here.

    The scripts comprise Python scripts under the directory src and Python notebooks under the directory notebooks. The Python scripts are mainly responsible for conducting the GitHub search via the GitHub search API and collecting commits through the PyDriller package [2]. The notebooks link the fixed issue reports with their corresponding fixing commits and apply some filtering steps. The bug-inducing candidates are then filtered again using the gumtree.py script, which utilizes the GumTree package. Finally, the remaining bug-inducing candidates are combined with the clean commits in the dataset_construction notebook to form the entire dataset.

    More specifically, git_token.py handles the GitHub API token that is necessary for requests to the GitHub API. Script collector.py performs the GitHub search. Tracing changed lines and git annotate is done in gitminer.py using PyDriller. Finally, gumtree.py applies 4 filtering steps (number of lines, number of files, language, and change significance).

    References:

    1. GumTree

    Jean-Rémy Falleri, Floréal Morandat, Xavier Blanc, Matias Martinez, and Martin Monperrus. 2014. Fine-grained and accurate source code differencing. In ACM/IEEE International Conference on Automated Software Engineering, ASE '14, Vasteras, Sweden, September 15-19, 2014, 313–324.

    2. PyDriller
    • https://pydriller.readthedocs.io/en/latest/

    • Davide Spadini, Maurício Aniche, and Alberto Bacchelli. 2018. PyDriller: Python Framework for Mining Software Repositories. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Lake Buena Vista, FL, USA) (ESEC/FSE 2018). Association for Computing Machinery, New York, NY, USA, 908–911.

  7. Bordeaux Car Repair Shops

    • flashmapy.com
    csv
    Updated Aug 22, 2024
    Cite
    (2024). Bordeaux Car Repair Shops [Dataset]. https://www.flashmapy.com/lists/car-repair-shop-bordeaux
    Explore at:
    Available download formats: csv
    Dataset updated
    Aug 22, 2024
    Variables measured
    X, City, Email, Phone, State, Images, Country, Reviews, Youtube, Category, and 7 more
    Description

    A downloadable CSV file containing 227 Car Repair Shops in Bordeaux with details like contact information, price range, reviews, and opening hours.

  8. Fix auto locations in USA

    • agenty.com
    csv
    Updated May 25, 2025
    Cite
    Agenty (2025). Fix auto locations in USA [Dataset]. https://agenty.com/marketplace/stores/fix-auto-locations-in-usa
    Explore at:
    Available download formats: csv
    Dataset updated
    May 25, 2025
    Dataset provided by
    Agenty
    Time period covered
    2025
    Area covered
    United States
    Description

    Complete list of all 211 Fix auto POI locations in the USA with name, geo-coded address, city, email, phone number, etc., for download in CSV format or via the API.

  9. DataSheet1_Applicability of Anticancer Drugs for the Triple-Negative Breast...

    • frontiersin.figshare.com
    txt
    Updated Jun 16, 2023
    Cite
    Gaoming Liao; Yiran Yang; Aimin Xie; Zedong Jiang; Jianlong Liao; Min Yan; Yao Zhou; Jiali Zhu; Jing Hu; Yunpeng Zhang; Yun Xiao; Xia Li (2023). DataSheet1_Applicability of Anticancer Drugs for the Triple-Negative Breast Cancer Based on Homologous Recombination Repair Deficiency.CSV [Dataset]. http://doi.org/10.3389/fcell.2022.845950.s001
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    Frontiers
    Authors
    Gaoming Liao; Yiran Yang; Aimin Xie; Zedong Jiang; Jianlong Liao; Min Yan; Yao Zhou; Jiali Zhu; Jing Hu; Yunpeng Zhang; Yun Xiao; Xia Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Triple-negative breast cancer (TNBC) is a highly aggressive disease with historically poor outcomes, primarily due to the lack of effective targeted therapies. Here, we established a drug sensitivity prediction model based on the homologous recombination deficiency (HRD) using 83 TNBC patients from TCGA. Through analyzing the effect of HRD status on response efficacy of anticancer drugs and elucidating its related mechanisms of action, we found rucaparib (PARP inhibitor) and doxorubicin (anthracycline) sensitive in HR-deficient patients, while paclitaxel sensitive in the HR-proficient. Further, we identified a HRD signature based on gene expression data and constructed a transcriptomic HRD score, for analyzing the functional association between anticancer drug perturbation and HRD. The results revealed that CHIR99021 (GSK3 inhibitor) and doxorubicin have similar expression perturbation patterns with HRD, and talazoparib (PARP inhibitor) could kill tumor cells by reversing the HRD activity. Genomic characteristics indicated that doxorubicin inhibited tumor cells growth by hindering the process of DNA damage repair, while the resistance of cisplatin was related to the activation of angiogenesis and epithelial-mesenchymal transition. The negative correlation of HRD signature score could interpret the association of doxorubicin pIC50 with worse chemotherapy response and shorter survival of TNBC patients. In summary, these findings explain the applicability of anticancer drugs in TNBC and underscore the importance of HRD in promoting personalized treatment development.

  10. Auto Value Repair Center locations in the USA

    • agenty.com
    csv
    Updated Feb 27, 2025
    Cite
    Agenty (2025). Auto Value Repair Center locations in the USA [Dataset]. https://agenty.com/marketplace/stores/auto-value-repair-center-locations-in-the-usa
    Explore at:
    Available download formats: csv
    Dataset updated
    Feb 27, 2025
    Dataset provided by
    Agenty
    Time period covered
    2025
    Area covered
    United States
    Description

    Complete list of all 1082 Auto Value Repair Center POI locations in the USA with name, geo-coded address, city, email, phone number, etc., for download in CSV format or via the API.

  11. Bumper To Bumper Repair Center locations in the USA

    • agenty.com
    csv
    Updated Mar 6, 2025
    Cite
    Agenty (2025). Bumper To Bumper Repair Center locations in the USA [Dataset]. https://agenty.com/marketplace/stores/bumper-to-bumper-repair-center-locations-in-the-usa
    Explore at:
    Available download formats: csv
    Dataset updated
    Mar 6, 2025
    Dataset provided by
    Agenty
    Time period covered
    2025
    Area covered
    United States
    Description

    Complete list of all 538 Bumper To Bumper Repair Center POI locations in the USA with name, geo-coded address, city, email, phone number, etc., for download in CSV format or via the API.

  12. Data from: A dataset of GitHub Actions workflow histories

    • data.niaid.nih.gov
    Updated Oct 25, 2024
    Cite
    Cardoen, Guillaume (2024). A dataset of GitHub Actions workflow histories [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10259013
    Explore at:
    Dataset updated
    Oct 25, 2024
    Dataset authored and provided by
    Cardoen, Guillaume
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This replication package accompanies the dataset and exploratory empirical analysis reported in the paper "A dataset of GitHub Actions workflow histories", published at the IEEE MSR 2024 conference. (The Jupyter notebook can be found in a previous version of this dataset.)

    Important notice: Zenodo appears to be compressing gzipped files a second time without notice, so they are "double compressed". When you download them, they will be named x.gz.gz instead of x.gz. Note that the provided MD5 refers to the original file.

    2024-10-25 update: updated the repositories list and observation period. The filters relying on dates were also updated.

    2024-07-09 update: fixed a sometimes-invalid valid_yaml flag.

    The dataset was created as follows:

    First, we used GitHub SEART (on October 7th, 2024) to get a list of every non-fork repository created before January 1st, 2024, having at least 300 commits and at least 100 stars, where at least one commit was made after January 1st, 2024. (The goal of these filters is to exclude experimental and personal repositories.)

    We checked if a folder .github/workflows existed. We filtered out the repositories that did not contain this folder and pulled the others (between the 9th and 10th of October 2024).

    We applied the tool gigawork (version 1.4.2) to extract every file from this folder. The exact command used is python batch.py -d /ourDataFolder/repositories -e /ourDataFolder/errors -o /ourDataFolder/output -r /ourDataFolder/repositories_everything.csv.gz -- -w /ourDataFolder/workflows_auxiliaries. (The script batch.py can be found on GitHub.)

    We concatenated every file in /ourDataFolder/output into a CSV (using cat headers.csv output/*.csv > workflows_auxiliaries.csv in /ourDataFolder) and compressed it.

    We added the column uid via a script available on GitHub.

    Finally, we archived the folder /ourDataFolder/workflows with pigz (tar -c --use-compress-program=pigz -f workflows_auxiliaries.tar.gz /ourDataFolder/workflows).

    Using the extracted data, the following files were created:

    workflows.tar.gz contains the dataset of GitHub Actions workflow file histories.

    workflows_auxiliaries.tar.gz is a similar file containing also auxiliary files.

    workflows.csv.gz contains the metadata for the extracted workflow files.

    workflows_auxiliaries.csv.gz is a similar file containing also metadata for auxiliary files.

    repositories.csv.gz contains metadata about the GitHub repositories containing the workflow files. These metadata were extracted using the SEART Search tool.

    The metadata is separated into the following columns:

    repository: The repository (author and repository name) from which the workflow was extracted. The separator "/" distinguishes the author from the repository name

    commit_hash: The commit hash returned by git

    author_name: The name of the author that changed this file

    author_email: The email of the author that changed this file

    committer_name: The name of the committer

    committer_email: The email of the committer

    committed_date: The committed date of the commit

    authored_date: The authored date of the commit

    file_path: The path to this file in the repository

    previous_file_path: The path to this file before it has been touched

    file_hash: The name of the related workflow file in the dataset

    previous_file_hash: The name of the related workflow file in the dataset, before it has been touched

    git_change_type: A single letter (A, D, M or R) representing the type of change made to the workflow (Added, Deleted, Modified or Renamed). This letter is given by gitpython and provided as is.

    valid_yaml: A boolean indicating if the file is a valid YAML file.

    probably_workflow: A boolean indicating whether the file contains the YAML keys on and jobs. (Note that it can still be an invalid YAML file.)

    valid_workflow: A boolean indicating whether the file respects the syntax of GitHub Actions workflows. A freely available JSON Schema (used by gigawork) was used for this purpose.

    uid: Unique identifier for a given file, surviving modifications and renames. It is generated when the file is added and stays the same until the file is deleted. Renames do not change the identifier.

    Both workflows.csv.gz and workflows_auxiliaries.csv.gz follow this format.
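
    As a minimal illustration (assuming pandas is available and that the metadata file has been downloaded and, if needed, un-gzipped once to undo the double compression noted above):

    import pandas as pd

    # Read the workflow metadata and keep only syntactically valid workflow files.
    meta = pd.read_csv("workflows.csv.gz")
    valid = meta[meta["valid_workflow"]]

    # Group by uid to follow one workflow file across commits, renames included.
    history_lengths = (valid.sort_values("committed_date")
                            .groupby("uid")
                            .size()
                            .sort_values(ascending=False))
    print(history_lengths.head())  # uids with the longest histories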

  13. Data_Sheet_1_Interactions between halotolerant nitrogen-fixing bacteria and...

    • frontiersin.figshare.com
    application/csv
    Updated Mar 13, 2024
    Cite
    Chao Ji; Yuhan Ge; Hua Zhang; Yingxiang Zhang; Zhiwen Xin; Jian Li; Jinghe Zheng; Zengwen Liang; Hui Cao; Kun Li (2024). Data_Sheet_1_Interactions between halotolerant nitrogen-fixing bacteria and arbuscular mycorrhizal fungi under saline stress.csv [Dataset]. http://doi.org/10.3389/fmicb.2024.1288865.s001
    Explore at:
    Available download formats: application/csv
    Dataset updated
    Mar 13, 2024
    Dataset provided by
    Frontiers
    Authors
    Chao Ji; Yuhan Ge; Hua Zhang; Yingxiang Zhang; Zhiwen Xin; Jian Li; Jinghe Zheng; Zengwen Liang; Hui Cao; Kun Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background and aims: Soil salinity negatively affects crop development. Halotolerant nitrogen-fixing bacteria (HNFB) and arbuscular mycorrhizal fungi (AMF) are essential microorganisms that enhance crop nutrient availability and salt tolerance in saline soils. Studying the impact of HNFB on AMF communities and using HNFB in biofertilizers can help in selecting the optimal HNFB-AMF combinations to improve crop productivity in saline soils.

    Methods: We established three experimental groups comprising apple plants treated with low-nitrogen (0 mg N/kg, N0), normal-nitrogen (200 mg N/kg, N1), and high-nitrogen (300 mg N/kg, N2) fertilizer under salt stress without bacteria (CK, with the addition of 1,500 mL sterile water + 2 g sterile diatomite), or with bacteria [BIO, with the addition of 1,500 mL sterile water + 2 g mixed bacterial preparation (including Bacillus subtilis HG-15 and Bacillus velezensis JC-K3)].

    Results: HNFB inoculation significantly increased microbial biomass and the relative abundance of beta-glucosidase-related genes in the rhizosphere soil under identical nitrogen application levels (p

  14. Data from: Data and scripts from: “Denoising autoencoder for reconstructing...

    • osti.gov
    Updated Jan 1, 2025
    Cite
    Bi, Xiangyu; Chou, Chunwei; Johnsen, Timothy; Ramakrishnan, Lavanya; Skone, Jonathan; Varadharajan, Charuleka; Wu, Yuxin (2025). Data and scripts from: “Denoising autoencoder for reconstructing sensor observation data and predicting evapotranspiration: noisy and missing values repair and uncertainty quantification” [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/2561511
    Explore at:
    Dataset updated
    Jan 1, 2025
    Dataset provided by
    United States Department of Energy (http://energy.gov/)
    Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) (United States); Watershed Function SFA
    38.92,-106.95|38.92,-106.95|38.92,-106.95|38.92,-106.95|38.92,-106.95
    39.033,-106.88|38.82,-106.88|38.82,-107.12|39.033,-107.12|39.033,-106.88
    38.92625,-106.98|38.92625,-106.98|38.92625,-106.98|38.92625,-106.98|38.92625,-106.98
    37.878607,-122.241423|37.878607,-122.241423|37.878607,-122.241423|37.878607,-122.241423|37.878607,-122.241423
    38.922583,-106.947288|38.922583,-106.947288|38.922583,-106.947288|38.922583,-106.947288|38.922583,-106.947288
    Authors
    Bi, Xiangyu; Chou, Chunwei; Johnsen, Timothy; Ramakrishnan, Lavanya; Skone, Jonathan; Varadharajan, Charuleka; Wu, Yuxin
    Description

    This data package includes data and scripts from the manuscript "Denoising autoencoder for reconstructing sensor observation data and predicting evapotranspiration: noisy and missing values repair and uncertainty quantification". The study addressed common challenges faced in environmental sensing and modeling, including uncertain input data, missing sensor observations, and high-dimensional datasets with interrelated but redundant variables. Point-scaled meteorological and soil sensor observations were perturbed with noises and missing values, and denoising autoencoder (DAE) neural networks were developed to reconstruct the perturbed data and further predict evapotranspiration. This study concluded that (1) the reconstruction quality of each variable depends on its cross-correlation and alignment to the underlying data structure, (2) uncertainties from the models were overall stronger than those from the data corruption, and (3) there was a tradeoff between reducing bias and reducing variance when evaluating the uncertainty of the machine learning models.

    This package includes:

    (1) Four ipython scripts (.ipynb): "DAE_train.ipynb" trains and evaluates DAE neural networks, "DAE_predict.ipynb" makes predictions from the trained DAE models, "ET_train.ipynb" trains and evaluates ET prediction neural networks, and "ET_predict.ipynb" makes predictions from trained ET models.

    (2) One python file (.py): "methods.py" includes all user-defined functions and python codes used in the ipython scripts.

    (3) A "sub_models" folder that includes five trained DAE neural networks (in pytorch format, .pt), which could be used to ingest input data before being fed to the downstream ET models in "ET_train.ipynb" or "ET_predict.ipynb".

    (4) Two data files (.csv). Daily meteorological, vegetation, and soil data is in "df_data.csv", while "df_meta.csv" contains the location and time information of "df_data.csv". Each row (index) in "df_meta.csv" corresponds to each row in "df_data.csv". These data files are formatted to follow the data structure requirements and be directly used in the ipython scripts, and they have been shuffled chronologically to train machine learning models. The meteorological and soil data was collected using point sensors between 2019-2023 at (4.a) three shrub-dominated field sites in East River, Colorado (named "ph1", "ph2" and "sg5" in "df_meta.csv", where "ph1" and "ph2" were located at PumpHouse Hillslopes, and "sg5" was at Snodgrass Mountain meadow) and (4.b) one outdoor, mesoscale, herbaceous-dominated experiment in Berkeley, California (named "tb" in "df_meta.csv", short for Smartsoils Testbed at Lawrence Berkeley National Lab).

    See "df_data_dd.csv" and "df_meta_dd.csv" for variable descriptions and the Methods section for additional data processing steps. See "flmd.csv" and "README.txt" for brief file descriptions. All ipython scripts and python files are written in and require PYTHON language software.
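
    A minimal sketch of reading the two CSVs (assuming pandas; only the row-for-row correspondence described above is relied on, no specific column names):

    import pandas as pd

    data = pd.read_csv("df_data.csv")   # daily meteorological, vegetation, and soil data
    meta = pd.read_csv("df_meta.csv")   # location and time information, row-aligned with df_data.csv

    # Rows correspond positionally, so a side-by-side concatenation attaches
    # site/time context to every observation record.
    combined = pd.concat([meta.reset_index(drop=True), data.reset_index(drop=True)], axis=1)
    print(combined.shape)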

  15. Selkie GIS Techno-Economic Tool input datasets

    • data.niaid.nih.gov
    Updated Nov 8, 2023
    Cite
    Cullinane, Margaret (2023). Selkie GIS Techno-Economic Tool input datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10083960
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset authored and provided by
    Cullinane, Margaret
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data was prepared as input for the Selkie GIS-TE tool. This GIS tool aids site selection, logistics optimization and financial analysis of wave or tidal farms in the Irish and Welsh maritime areas. Read more here: https://www.selkie-project.eu/selkie-tools-gis-technoeconomic-model/

    This research was funded by the Science Foundation Ireland (SFI) through MaREI, the SFI Research Centre for Energy, Climate and the Marine and by the Sustainable Energy Authority of Ireland (SEAI). Support was also received from the European Union's European Regional Development Fund through the Ireland Wales Cooperation Programme as part of the Selkie project.

    File Formats

    Results are presented in three file formats:

    • tif: Can be imported into a GIS software (such as ARC GIS)

    • csv: Human-readable text format, which can also be opened in Excel

    • png: Image files that can be viewed in standard desktop software and give a spatial view of results

    Input Data

    All calculations use open-source data from the Copernicus store and the open-source software Python. The Python xarray library is used to read the data.

    Hourly Data from 2000 to 2019

    • Wind: Copernicus ERA5 dataset, 17 by 27.5 km grid, 10 m wind speed

    • Wave: Copernicus Atlantic - Iberian Biscay Irish - Ocean Wave Reanalysis dataset, 3 by 5 km grid

    Accessibility

    The maximum limits for Hs and wind speed are applied when mapping the accessibility of a site.
    The Accessibility layer shows the percentage of time the Hs (Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis) and wind speed (ERA5) are below these limits for the month.

    Input data is 20 years of hourly wave and wind data from 2000 to 2019, partitioned by month. At each timestep, the accessibility of the site was determined by checking if
    the Hs and wind speed were below their respective limits. The percentage accessibility is the number of hours within limits divided by the total number of hours for the month.
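
    A minimal sketch of that calculation (using xarray, which the description names for reading the data; the hourly Hs and wind series below are synthetic stand-ins, and 2 m / 15 m/s are the example limits quoted later in this description):

    import numpy as np
    import pandas as pd
    import xarray as xr

    # Synthetic hourly series standing in for the Copernicus wave (Hs) and ERA5 wind data.
    time = pd.date_range("2000-01-01", "2000-12-31 23:00", freq="h")
    rng = np.random.default_rng(0)
    hs = xr.DataArray(rng.gamma(2.0, 1.0, time.size), coords={"time": time}, dims="time")
    wind = xr.DataArray(rng.gamma(3.0, 3.0, time.size), coords={"time": time}, dims="time")

    HS_LIMIT, WIND_LIMIT = 2.0, 15.0  # example operational limits (m, m/s)

    # Boolean check at each timestep, then percentage of in-limit hours per month.
    accessible = (hs < HS_LIMIT) & (wind < WIND_LIMIT)
    accessibility = accessible.groupby("time.month").mean() * 100
    print(accessibility.values)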

    Environmental data is from the Copernicus data store (https://cds.climate.copernicus.eu/). Wave hourly data is from the 'Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis' dataset.
    Wind hourly data is from the ERA 5 dataset.

    Availability

    A device's availability to produce electricity depends on the device's reliability and the time to repair any failures. The repair time depends on weather windows and other logistical factors (for example, the availability of repair vessels and personnel). A 2013 study by O'Connor et al. determined the relationship between the accessibility and availability of a wave energy device. The resulting graph (see Fig. 1 of their paper) shows the correlation between accessibility at Hs of 2 m and wind speed of 15.0 m/s and availability. This graph is used to calculate the availability layer from the accessibility layer.

    The input value, accessibility, measures how accessible a site is for installation or operation and maintenance activities. It is the percentage of time the environmental conditions, i.e. the Hs (Atlantic - Iberian Biscay Irish - Ocean Wave Reanalysis) and wind speed (ERA5), are below operational limits. Input data is 20 years of hourly wave and wind data from 2000 to 2019, partitioned by month. At each timestep, the accessibility of the site was determined by checking if the Hs and wind speed were below their respective limits. The percentage accessibility is the number of hours within limits divided by the total number of hours for the month. Once the accessibility was known, the percentage availability was calculated using the O'Connor et al. graph of the relationship between the two. Mature technology reliability was assumed.

    Weather Window

    The weather window availability is the percentage of possible x-duration windows where weather conditions (Hs, wind speed) are below maximum limits for the
    given duration for the month.

    The resolution of the wave dataset (0.05° × 0.05°) is higher than that of the wind dataset
    (0.25° x 0.25°), so the nearest wind value is used for each wave data point. The weather window layer is at the resolution of the wave layer.

    The first step in calculating the weather window for a particular set of inputs (Hs, wind speed and duration) is to calculate the accessibility at each timestep.
    The accessibility is based on a simple boolean evaluation: are the wave and wind conditions within the required limits at the given timestep?

    Once the time series of accessibility is calculated, the next step is to look for periods of sustained favourable environmental conditions, i.e. the weather windows. Here all possible operating periods with a duration matching the required weather-window value are assessed to see if the weather conditions remain suitable for the entire period. The percentage availability of the weather window is calculated based on the percentage of x-duration windows with suitable weather conditions for their entire duration. The weather window availability can be considered as the probability of having the required weather window available at any given point in the month.
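
    A self-contained sketch of that check (same synthetic accessibility series as in the earlier snippet; the 12-hour window duration is an assumed example value):

    import numpy as np
    import pandas as pd
    import xarray as xr

    # Rebuild the synthetic hourly accessibility series from the previous snippet.
    time = pd.date_range("2000-01-01", "2000-12-31 23:00", freq="h")
    rng = np.random.default_rng(0)
    hs = xr.DataArray(rng.gamma(2.0, 1.0, time.size), coords={"time": time}, dims="time")
    wind = xr.DataArray(rng.gamma(3.0, 3.0, time.size), coords={"time": time}, dims="time")
    accessible = (hs < 2.0) & (wind < 15.0)

    WINDOW_HOURS = 12  # assumed example duration

    # A window is usable only if every hour within it is accessible.
    window_ok = accessible.astype(int).rolling(time=WINDOW_HOURS).sum() == WINDOW_HOURS
    weather_window = window_ok.groupby("time.month").mean() * 100  # % of usable windows per month
    print(weather_window.values)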

    Extreme Wind and Wave

    The Extreme wave layers show the highest significant wave height expected to occur during the given return period. The Extreme wind layers show the highest wind speed expected to occur during the given return period.

    To predict extreme values, we use Extreme Value Analysis (EVA). EVA focuses on the extreme part of the data and seeks to determine a model to fit this reduced
    portion accurately. EVA consists of three main stages. The first stage is the selection of extreme values from a time series. The next step is to fit a model
    that best approximates the selected extremes by determining the shape parameters for a suitable probability distribution. The model then predicts extreme values
    for the selected return period. All calculations use the python pyextremes library. Two methods are used - Block Maxima and Peaks over threshold.

    The Block Maxima method selects the annual maxima and fits a GEVD probability distribution.

    The peaks_over_threshold method has two variable calculation parameters. The first is the percentile above which values must lie to be selected as extreme (0.9 or 0.998). The second input is the time difference between extreme values for them to be considered independent (3 days). A Generalised Pareto Distribution is fitted to the selected extremes and used to calculate the extreme value for the selected return period.
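
    A minimal Block Maxima sketch with the pyextremes library named above (the hourly Hs series is a synthetic stand-in for the wave reanalysis data):

    import numpy as np
    import pandas as pd
    from pyextremes import EVA

    # Synthetic 20-year hourly Hs series standing in for the wave reanalysis data.
    time = pd.date_range("2000-01-01", "2019-12-31 23:00", freq="h")
    hs = pd.Series(np.random.default_rng(1).gumbel(2.0, 0.5, time.size), index=time, name="Hs")

    model = EVA(hs)
    model.get_extremes(method="BM", block_size="365.2425D")  # annual maxima
    model.fit_model()                                        # GEVD fit by default
    print(model.get_summary(return_period=[10, 50, 100]))    # extreme Hs per return period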

  16. Data from: Correction of preferred-orientation induced distortion in...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1more
    Updated Jun 20, 2024
    Cite
    Weili Cao; Dongjie Zhu; Xinzheng Zhang (2024). Correction of preferred-orientation induced distortion in cryo-electron microscopy maps [Dataset]. http://doi.org/10.5061/dryad.73n5tb354
    Explore at:
    Dataset updated
    Jun 20, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Weili Cao; Dongjie Zhu; Xinzheng Zhang
    Description

    Reconstruction maps of cryo-electron microscopy (cryo-EM) exhibit distortion when the cryo-EM dataset is incomplete, usually caused by unevenly distributed orientations. Prior efforts had been attempted to address this preferred orientation problem using tilt-collection strategy, modifications to grids or to air-water-interfaces. However, these approaches often require time-consuming experiments and the effect was always protein dependent. Here, we developed a procedure containing removing mis-aligned particles and an iterative reconstruction method based on signal-to-noise ratio of Fourier component to correct such distortion by recovering missing data using a purely computational algorithm. This procedure called Signal-to-Noise Ratio Iterative Reconstruction Method (SIRM) was applied on incomplete datasets of various proteins to fix distortion in cryo-EM maps and to a more isotropic resolution. In addition, SIRM provides a better reference map for further reconstruction refinements, r...

    SIRM: Open Source Data

    We have submitted the original chart files (.csv) and density maps (.mrc) related to the images in the article "Correction of preferred-orientation induced distortion in cryo-electron microscopy maps"

    Descriptions

    SIRM_Fig1_csv.rar

    • Fig1A.csv: The CSV file corresponding to the histogram of Euler angle deviations in particle sets with different degrees of missing wedge under conventional refinement.
    • Fig1B.csv: The CSV file corresponding to the histogram of Euler angle deviations in particle sets with different degrees of missing wedge after the Cross-Validation process.

    SIRM_Fig3_MRC_map.rar

    • groundTruth.mrc: The ground-truth density map of Apo-ferritin without missing cone, corresponding to Fig.3A.
    • loss70_sameParticleNum.mrc: The MC-35 density map of Apo-ferritin, corresponding to Fig.3D.
    • loss70_sameParticleNum_SIRM.mrc: The MC-35 density map of Apo-ferritin after SIRM processing, corresponding to Fig.3G.
    • *loss80_samePa...

  17. AAA Approved Auto Repair Facilities locations in the USA

    • agenty.com
    csv
    Cite
    Agenty, AAA Approved Auto Repair Facilities locations in the USA [Dataset]. https://agenty.com/marketplace/stores/aaa-approved-auto-repair-facilties-locations-in-the-usa
    Explore at:
    Available download formats: csv
    Dataset provided by
    Agenty
    Time period covered
    2025
    Area covered
    United States
    Description

    Complete list of all 2951 AAA Approved Auto Repair Facilities POI locations in the USA with name, geo-coded address, city, email, phone number, etc., for download in CSV format or via the API.

  18. NRCS FY2018 Soil Properties and Interpretations, Derived Using gSSURGO Data...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). NRCS FY2018 Soil Properties and Interpretations, Derived Using gSSURGO Data and Tools [Dataset]. https://catalog.data.gov/dataset/nrcs-fy2018-soil-properties-and-interpretations-derived-using-gssurgo-data-and-tools
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    These data depict the western United States Map Unit areas as defined by the USDA NRCS. Each Map Unit area contains information on a variety of soil properties and interpretations. The raster is to be joined to the .csv file by the field "mukey." We keep the raster and csv separate to preserve the full attribute names in the csv that would be truncated if attached to the raster. Once joined, the raster can be classified or analyzed by the columns which depict the properties and interpretations.

    It is important to note that each property has a corresponding component percent column to indicate how much of the map unit has the dominant property provided. For example, if the property "AASHTO Group Classification (Surface) 0 to 1cm" is recorded as "A-1" for a map unit, a user should also refer to the component percent field for this property (in this case 75). This means that an estimated 75% of the map unit has an "A-1" AASHTO group classification and that "A-1" is the dominant group. The property in the column is the dominant component, and so the other 25% of this map unit is comprised of other AASHTO Group Classifications.

    This raster attribute table was generated from the "Map Soil Properties and Interpretations" tool within the gSSURGO Mapping Toolset in the Soil Data Management Toolbox for ArcGIS™ User Guide Version 4.0 (https://www.nrcs.usda.gov/wps/PA_NRCSConsumption/download?cid=nrcseprd362255&ext=pdf) from gSSURGO that used their Map Unit Raster as the input feature (https://gdg.sc.egov.usda.gov/). The FY2018 Gridded SSURGO Map Unit Raster was created for use in national, regional, and state-wide resource planning and analysis of soils data. These data were created with guidance from the USDA NRCS.

    The fields named "*COMPPCT_R" can exceed 100% for some map units. The NRCS personnel are aware of and working on fixing this issue. Take caution when interpreting these areas, as they are the result of some data duplication in the master gSSURGO database. The data are considered valuable and required for timely science needs, and thus are released with this known error. The USDA NRCS are developing a data release which will replace this item when it is available.

    For the most up to date SSURGO releases that do not include the custom fields as this release does, see https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/home/?cid=nrcs142p2_053628#tools For additional definitions, see https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/geo/?cid=nrcs142p2_053627.
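
    Outside of ArcGIS, the mukey join described above can be sketched in Python (file and property-column names here are hypothetical placeholders; rasterio and pandas are assumed to be installed):

    import numpy as np
    import pandas as pd
    import rasterio

    # Hypothetical file names: the gSSURGO Map Unit Raster and the exported property CSV.
    props = pd.read_csv("soil_properties.csv")                 # one row per map unit, keyed by mukey
    lookup = dict(zip(props["mukey"], props["aashto_group"]))  # hypothetical property column

    with rasterio.open("mapunit_raster.tif") as src:
        mukeys = src.read(1)

    # Map each cell's mukey to the chosen property value; unknown keys become None.
    values = np.vectorize(lookup.get, otypes=[object])(mukeys)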

  19. General Motors Maintenance And Repair locations in the USA

    • agenty.com
    csv
    Updated May 25, 2025
    Cite
    Agenty (2025). General Motors Maintenance And Repair locations in the USA [Dataset]. https://agenty.com/marketplace/stores/general-motors-maintenance-and-repair-locations-in-the-usa
    Explore at:
    Available download formats: csv
    Dataset updated
    May 25, 2025
    Dataset provided by
    Agenty
    Time period covered
    2025
    Area covered
    United States
    Description

    Complete list of all 13205 General Motors Maintenance And Repair POI locations in the USA with name, geo-coded address, city, email, phone number, etc., for download in CSV format or via the API.

  20. Data_Sheet_1_Successful Treatment of Noise-Induced Hearing Loss by...

    • frontiersin.figshare.com
    txt
    Updated Jun 4, 2023
    Cite
    Athanasia Warnecke; Jennifer Harre; Matthew Shew; Adam J. Mellott; Igor Majewski; Martin Durisin; Hinrich Staecker (2023). Data_Sheet_1_Successful Treatment of Noise-Induced Hearing Loss by Mesenchymal Stromal Cells: An RNAseq Analysis of Protective/Repair Pathways.CSV [Dataset]. http://doi.org/10.3389/fncel.2021.656930.s001
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Frontiers
    Authors
    Athanasia Warnecke; Jennifer Harre; Matthew Shew; Adam J. Mellott; Igor Majewski; Martin Durisin; Hinrich Staecker
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Mesenchymal stromal cells (MSCs) are an adult derived stem cell-like population that has been shown to mediate repair in a wide range of degenerative disorders. The protective effects of MSCs are mainly mediated by the release of growth factors and cytokines thereby modulating the diseased environment and the immune system. Within the inner ear, MSCs have been shown protective against tissue damage induced by sound and a variety of ototoxins. To better understand the mechanism of action of MSCs in the inner ear, mice were exposed to narrow band noise. After exposure, MSCs derived from human umbilical cord Wharton’s jelly were injected into the perilymph. Controls consisted of mice exposed to sound trauma only. Forty-eight hours post-cell delivery, total RNA was extracted from the cochlea and RNAseq performed to evaluate the gene expression induced by the cell therapy. Changes in gene expression were grouped together based on gene ontology classification. A separate cohort of animals was treated in a similar fashion and allowed to survive for 2 weeks post-cell therapy and hearing outcomes determined. Treatment with MSCs after severe sound trauma induced a moderate hearing protective effect. MSC treatment resulted in an up-regulation of genes related to immune modulation, hypoxia response, mitochondrial function and regulation of apoptosis. There was a down-regulation of genes related to synaptic remodeling, calcium homeostasis and the extracellular matrix. Application of MSCs may provide a novel approach to treating sound trauma induced hearing loss and may aid in the identification of novel strategies to protect hearing.
