100+ datasets found
  1. Data from: Nursing Home Compare

    • catalog.data.gov
    • data.va.gov
    • +2 more
    Updated May 1, 2021
    Cite
    Department of Veterans Affairs (2021). Nursing Home Compare [Dataset]. https://catalog.data.gov/dataset/nursing-home-compare-ed7b0
    Explore at:
    Dataset updated
    May 1, 2021
    Dataset provided by
    United States Department of Veterans Affairs http://va.gov/
    Description

    Nursing Home Compare has detailed information about every Medicare and Medicaid nursing home in the country. A nursing home is a place for people who can’t be cared for at home and need 24-hour nursing care. These are the official datasets used on the Medicare.gov Nursing Home Compare Website provided by the Centers for Medicare & Medicaid Services. These data allow you to compare the quality of care at every Medicare and Medicaid-certified nursing home in the country, including over 15,000 nationwide.

  2. Comparison of OR tables between two datasets for one CD interaction.

    • figshare.com
    xls
    Updated May 31, 2023
    Cite
    Yang Liu; Haiming Xu; Suchao Chen; Xianfeng Chen; Zhenguo Zhang; Zhihong Zhu; Xueying Qin; Landian Hu; Jun Zhu; Guo-Ping Zhao; Xiangyin Kong (2023). Comparison of OR tables between two datasets for one CD interaction. [Dataset]. http://doi.org/10.1371/journal.pgen.1001338.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS Genetics
    Authors
    Yang Liu; Haiming Xu; Suchao Chen; Xianfeng Chen; Zhenguo Zhang; Zhihong Zhu; Xueying Qin; Landian Hu; Jun Zhu; Guo-Ping Zhao; Xiangyin Kong
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of OR tables between the interaction of rs7522462 and rs11945978 in the WTCCC data with the shared controls (left) and the interaction of the proxy SNPs, rs296533 and rs2089509 in the IBDGC data (right). The legend to this table is the same as that of Table 3.

  3. Data from: Different Stages Dataset

    • universe.roboflow.com
    zip
    Updated Sep 9, 2024
    + more versions
    Cite
    Liam McArdle (2024). Different Stages Dataset [Dataset]. https://universe.roboflow.com/liam-mcardle/different-stages
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 9, 2024
    Dataset authored and provided by
    Liam McArdle
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Dancer Bounding Boxes
    Description

    Different Stages

    ## Overview
    
    Different Stages is a dataset for object detection tasks - it contains Dancer annotations for 3,843 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  4. A dataset from a survey investigating disciplinary differences in data...

    • zenodo.org
    • explore.openaire.eu
    • +1 more
    bin, csv, pdf, txt
    Updated Jul 12, 2024
    Cite
    Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein (2024). A dataset from a survey investigating disciplinary differences in data citation [Dataset]. http://doi.org/10.5281/zenodo.7853477
    Explore at:
    Available download formats: txt, pdf, bin, csv
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    GENERAL INFORMATION

    Title of Dataset: A dataset from a survey investigating disciplinary differences in data citation

    Date of data collection: January to March 2022

    Collection instrument: SurveyMonkey

    Funding: Alfred P. Sloan Foundation


    SHARING/ACCESS INFORMATION

    Licenses/restrictions placed on the data: These data are available under a CC BY 4.0 license

    Links to publications that cite or use the data:

    Gregory, K., Ninkov, A., Ripp, C., Peters, I., & Haustein, S. (2022). Surveying practices of data citation and reuse across disciplines. Proceedings of the 26th International Conference on Science and Technology Indicators. International Conference on Science and Technology Indicators, Granada, Spain. https://doi.org/10.5281/ZENODO.6951437

    Gregory, K., Ninkov, A., Ripp, C., Roblin, E., Peters, I., & Haustein, S. (2023). Tracing data: A survey investigating disciplinary differences in data citation. Zenodo. https://doi.org/10.5281/zenodo.7555266


    DATA & FILE OVERVIEW

    File List

    • Filename: MDCDatacitationReuse2021Codebookv2.pdf
      Codebook
    • Filename: MDCDataCitationReuse2021surveydatav2.csv
      Dataset format in csv
    • Filename: MDCDataCitationReuse2021surveydatav2.sav
      Dataset format in SPSS
    • Filename: MDCDataCitationReuseSurvey2021QNR.pdf
      Questionnaire

    Additional related data collected that was not included in the current data package: Open ended questions asked to respondents


    METHODOLOGICAL INFORMATION

    Description of methods used for collection/generation of data:

    The development of the questionnaire (Gregory et al., 2022) was centered around the creation of two main branches of questions for the primary groups of interest in our study: researchers that reuse data (33 questions in total) and researchers that do not reuse data (16 questions in total). The population of interest for this survey consists of researchers from all disciplines and countries, sampled from the corresponding authors of papers indexed in the Web of Science (WoS) between 2016 and 2020.

    The survey received 3,632 responses, 2,509 of which were completed, representing a completion rate of 68.6%. Incomplete responses were excluded from the dataset. The final total contains 2,492 complete responses and an uncorrected response rate of 1.57%. Controlling for invalid emails, bounced emails and opt-outs (n=5,201) produced a response rate of 1.62%, similar to surveys using comparable recruitment methods (Gregory et al., 2020).

    Methods for processing the data:

    Results were downloaded from SurveyMonkey in CSV format and were prepared for analysis using Excel and SPSS by recoding ordinal and multiple choice questions and by removing missing values.

    Instrument- or software-specific information needed to interpret the data:

    The dataset is provided in SPSS format, which requires IBM SPSS Statistics. The dataset is also available in a coded format in CSV. The Codebook is required to interpret the values.


    DATA-SPECIFIC INFORMATION FOR: MDCDataCitationReuse2021surveydata

    Number of variables: 95

    Number of cases/rows: 2,492

    Missing data codes: 999 Not asked

    Refer to MDCDatacitationReuse2021Codebook.pdf for detailed variable information.
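
    For readers working with the CSV version, the missing data code listed above can be handled at load time. The following is a minimal sketch using only the Python standard library. It assumes the code 999 appears as a plain cell value; the column names `resp_id` and `q1` are hypothetical placeholders, not names from the codebook.

```python
import csv
import io

MISSING_CODE = "999"  # "Not asked", per the missing data codes listed above


def load_rows(handle):
    """Read survey rows and map the 999 'Not asked' code to None."""
    reader = csv.DictReader(handle)
    return [
        {name: (None if value == MISSING_CODE else value) for name, value in row.items()}
        for row in reader
    ]


# Hypothetical two-column excerpt; the real file has 95 variables.
sample = io.StringIO("resp_id,q1\n1,3\n2,999\n")
rows = load_rows(sample)
print(rows)
```

    Values are kept as strings here because the codebook, not this sketch, defines how each variable is coded.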

  5. Meta-Dataset Dataset

    • paperswithcode.com
    + more versions
    Cite
    Eleni Triantafillou; Tyler Zhu; Vincent Dumoulin; Pascal Lamblin; Utku Evci; Kelvin Xu; Ross Goroshin; Carles Gelada; Kevin Swersky; Pierre-Antoine Manzagol; Hugo Larochelle, Meta-Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/meta-dataset
    Explore at:
    Authors
    Eleni Triantafillou; Tyler Zhu; Vincent Dumoulin; Pascal Lamblin; Utku Evci; Kelvin Xu; Ross Goroshin; Carles Gelada; Kevin Swersky; Pierre-Antoine Manzagol; Hugo Larochelle
    Description

    The Meta-Dataset benchmark is a large few-shot learning benchmark and consists of multiple datasets of different data distributions. It does not restrict few-shot tasks to have fixed ways and shots, thus representing a more realistic scenario. It consists of 10 datasets from diverse domains:

    • ILSVRC-2012 (the ImageNet dataset, consisting of natural images with 1000 categories)
    • Omniglot (hand-written characters, 1623 classes)
    • Aircraft (dataset of aircraft images, 100 classes)
    • CUB-200-2011 (dataset of Birds, 200 classes)
    • Describable Textures (different kinds of texture images with 43 categories)
    • Quick Draw (black and white sketches of 345 different categories)
    • Fungi (a large dataset of mushrooms with 1500 categories)
    • VGG Flower (dataset of flower images with 102 categories)
    • Traffic Signs (German traffic sign images with 43 classes)
    • MSCOCO (images collected from Flickr, 80 classes)

    All datasets except Traffic Signs and MSCOCO have a training, validation and test split (proportioned roughly into 70%, 15%, 15%). The datasets Traffic Signs and MSCOCO are reserved for testing only.
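
    As an illustration of the roughly 70%/15%/15% proportions mentioned above, the sketch below partitions a list of class names into three disjoint groups. It assumes the split is made over classes and uses a seeded shuffle; the function name `split_classes` and the synthetic class names are hypothetical, and the benchmark's actual split files may be built differently.

```python
import random


def split_classes(class_names, seed=0, train_frac=0.70, valid_frac=0.15):
    """Partition class names into disjoint train/validation/test groups."""
    names = sorted(class_names)
    random.Random(seed).shuffle(names)  # seeded so the split is reproducible
    n_train = round(train_frac * len(names))
    n_valid = round(valid_frac * len(names))
    train = names[:n_train]
    valid = names[n_train:n_train + n_valid]
    test = names[n_train + n_valid:]  # whatever remains, roughly 15%
    return train, valid, test


# Synthetic example with 100 placeholder classes.
train, valid, test = split_classes([f"class_{i}" for i in range(100)])
print(len(train), len(valid), len(test))
```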

  6. 50 States Comparison

    • catalog.data.gov
    • mydata.iowa.gov
    • +1 more
    Updated Sep 1, 2023
    Cite
    data.iowa.gov (2023). 50 States Comparison [Dataset]. https://catalog.data.gov/dataset/50-states-comparison
    Explore at:
    Dataset updated
    Sep 1, 2023
    Dataset provided by
    data.iowa.gov
    Description

    This online application gives manufacturers the ability to compare Iowa to other states on a number of different topics including: business climate, education, operating costs, quality of life and workforce.

  7. Dataset of books called Ordinary difference-differential equations

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called Ordinary difference-differential equations [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Ordinary+difference-differential+equations
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is Ordinary difference-differential equations. It features 7 columns including author, publication date, language, and book publisher.

  8. WORLD by Country Dataset

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Aug 18, 2023
    + more versions
    Cite
    TRADING ECONOMICS (2023). WORLD by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/world-
    Explore at:
    Available download formats: xml, json, csv, excel
    Dataset updated
    Aug 18, 2023
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    World, World
    Description

    This dataset provides values for WORLD reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  9. Data from: Comparison of Algorithms for Anomaly Detection in Flight Recorder...

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • datasets.ai
    • +3 more
    Updated Feb 19, 2025
    + more versions
    Cite
    nasa.gov (2025). Comparison of Algorithms for Anomaly Detection in Flight Recorder Data of Airline Operations [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/comparison-of-algorithms-for-anomaly-detection-in-flight-recorder-data-of-airline-operatio
    Explore at:
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    NASA http://nasa.gov/
    Description

    Published at 12th AIAA Aviation Technology, Integration, and Operations (ATIO) Conference and 14th AIAA/ISSM 17 - 19 September 2012, Indianapolis, Indiana

  10. NADAC Comparison

    • catalog.data.gov
    • data.virginia.gov
    • +2 more
    Updated Jul 2, 2025
    + more versions
    Cite
    Centers for Medicare & Medicaid Services (2025). NADAC Comparison [Dataset]. https://catalog.data.gov/dataset/nadac-comparison-c0aeb
    Explore at:
    Dataset updated
    Jul 2, 2025
    Dataset provided by
    Centers for Medicare & Medicaid Services
    Description

    The NADAC Weekly Comparison identifies the drug products with current NADAC rates that are replaced with new NADAC rates. Other changes (e.g. NDC additions and terminations) to the NADAC file are not reflected in this comparison. Note: Effective Date was not recorded in the dataset until 6/7/2017

  11. PREDIK Data-Driven: Geospatial Data | USA | Tailor-made datasets: Foot...

    • datarade.ai
    Updated Oct 13, 2021
    Cite
    Predik Data-driven (2021). PREDIK Data-Driven: Geospatial Data | USA | Tailor-made datasets: Foot traffic & Places Data [Dataset]. https://datarade.ai/data-products/predik-data-driven-geospatial-data-usa-tailor-made-datas-predik-data-driven
    Explore at:
    Available download formats: .json, .csv, .xls, .sql
    Dataset updated
    Oct 13, 2021
    Dataset authored and provided by
    Predik Data-driven
    Area covered
    United States
    Description

    This location data and foot traffic dataset, available for all countries, includes enriched raw mobility data and visitation at POIs to answer questions such as:

    - How often do people visit a location? (daily, monthly, absolute, and averages)
    - What type of places do they visit? (parks, schools, hospitals, etc.)
    - Which social characteristics do people have in a certain POI? Breakdown by type: residents, workers, visitors.
    - What is their mobility like during night hours and day hours?
    - What is the frequency of visits, partitioned by day of the week and hour of the day?

    Extra insights

    - Visitors' relative income level.
    - Visitors' preferences as derived from their visits to shopping, parks, sports facilities, churches, among others.

    Overview & Key Concepts

    Each record corresponds to a ping from a mobile device, at a particular moment in time and at a particular latitude and longitude. We procure this data from reliable technology partners, which obtain it through partnerships with location-aware apps. The entire process complies with applicable privacy laws.

    We clean and process these massive datasets with a number of complex, computer-intensive calculations to make them easier to use in different data science and machine learning applications, especially those related to understanding customer behavior.

    Featured attributes of the data

    Device speed: based on the distance between each observation and the previous one, we estimate the speed at which the device is moving. This is particularly useful to differentiate between vehicles, pedestrians, and stationary observations.

    Night base of the device: we calculate the approximated location of where the device spends the night, which is usually their home neighborhood.

    Day base of the device: we calculate the most common daylight location during weekdays, which is usually their work location.

    Income level: we use the night neighborhood of the device, and intersect it with available socioeconomic data, to infer the device’s income level. Depending on the country, and the availability of good census data, this figure ranges from a relative wealth index to a currency-calculated income.

    POI visited: we intersect each observation with a number of POI databases, to estimate check-ins to different locations. POI databases can vary significantly, in scope and depth, between countries.

    Category of visited POI: for each observation that can be attributable to a POI, we also include a standardized location category (park, hospital, among others).

    Coverage: Worldwide.
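
    The device speed attribute described above can be illustrated with a short sketch: the great-circle distance between two consecutive pings divided by the elapsed time. This is only an assumption about how such an estimate might be computed, not the provider's method. The ping layout `(unix_seconds, latitude, longitude)` and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speed_mps(prev_ping, ping):
    """Estimate speed in metres per second between two pings.

    Each ping is a hypothetical (unix_seconds, latitude, longitude) tuple.
    Returns None when the elapsed time is not positive.
    """
    elapsed = ping[0] - prev_ping[0]
    if elapsed <= 0:
        return None
    return haversine_m(prev_ping[1], prev_ping[2], ping[1], ping[2]) / elapsed


# One degree of latitude covered in one hour.
print(speed_mps((0, 0.0, 0.0), (3600, 1.0, 0.0)))
```

    A threshold on the resulting speed could then separate stationary, pedestrian and vehicle observations; suitable cut-offs are not given in the description and would need to be chosen by the user.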

    Delivery schemas

    We can deliver the data in three different formats:

    Full dataset: one record per mobile ping. These datasets are very large, and should only be consumed by experienced teams with large computing budgets.

    Visitation stream: one record per attributable visit. This dataset is considerably smaller than the full one but retains most of the more valuable elements in the dataset. This helps understand who visited a specific POI, characterize and understand the consumer's behavior.

    Audience profiles: one record per mobile device in a given period of time (usually monthly). All the visitation stream is aggregated by category. This is the most condensed version of the dataset and is very useful to quickly understand the types of consumers in a particular area and to create cohorts of users.

  12. Dataset of whole genome bisulfite data of 4 different monocyte samples

    • ega-archive.org
    • m.egawon.com
    Cite
    Dataset of whole genome bisulfite data of 4 different monocyte samples [Dataset]. https://ega-archive.org/datasets/EGAD00001003259
    Explore at:
    License

    https://ega-archive.org/dacs/EGAC00001000179

    Description

    Regions of common inter-individual DNA methylation differences in human monocytes – potential function and genetic basis

    WGBS data of samples: 43_Hm03_BlMo_Ct, 43_Hm02_BlMo_Ct, 43_Hm05_BlMo_Ct, 43_Hm01_BlMo_Ct

    For details about sequencing or sample metadata, check http://deep.dkfz.de/

  13. Dataset for paper: Body Positivity but not for everyone

    • sussex.figshare.com
    txt
    Updated May 31, 2023
    Cite
    Kathleen Simon; Megan Hurst (2023). Dataset for paper: Body Positivity but not for everyone [Dataset]. http://doi.org/10.25377/sussex.9885644.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    University of Sussex
    Authors
    Kathleen Simon; Megan Hurst
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Data for a Brief Report/Short Communication published in Body Image (2021). Details of the study are included below via the abstract from the manuscript. The dataset includes online experimental data from 167 women who were recruited via social media and institutional participant pools. The experiment was completed in Qualtrics. Women viewed either neutral travel images (control), body positivity posts with an average-sized model (e.g., ~ UK size 14), or body positivity posts with a larger model (e.g., UK size 18+); which images women viewed is shown in the ‘condition’ variable in the data. The data includes the age range, height, weight, calculated BMI, and Instagram use of participants. After viewing the images, women responded to the Positive and Negative Affect Schedule (PANAS), a state version of the Body Satisfaction Scale (BSS), and reported their immediate social comparison with the images (SAC items). Women then selected a lunch for themselves from a hypothetical menu; these selections are detailed in the data, as are the total calories calculated from this and the proportion of their picks which were (provided as a percentage, and as a categorical variable [as used in the paper analyses]). Women also reported whether they were on a special diet (e.g., vegan or vegetarian), had food intolerances, when they last ate, and how hungry they were.

    Women also completed trait measures of Body Appreciation (BAS-2) and social comparison (PACS-R). Women also were asked to comment on what they thought the experiment was about. Items and computed scales are included within the dataset. This item includes the dataset collected for the manuscript (in SPSS and CSV formats), the variable list for the CSV file (for users working with the CSV datafile; the variable list and details are contained within the .sav file for the SPSS version), and the SPSS syntax for our analyses (.sps). Also included are the information and consent form (collected via Qualtrics) and the questions as completed by participants (both in pdf format). Please note that the survey order in the PDF is not the same as in the datafiles; users should utilise the variable list (either in CSV or SPSS formats) to identify the items in the data. The SPSS syntax can be used to replicate the analyses reported in the Results section of the paper. Annotations within the syntax file guide the user through these.

    A copy of SPSS Statistics is needed to open the .sav and .sps files.

    Manuscript abstract:

    Body Positivity (or ‘BoPo’) social media content may be beneficial for women’s mood and body image, but concerns have been raised that it may reduce motivation for healthy behaviours. This study examines differences in women’s mood, body satisfaction, and hypothetical food choices after viewing BoPo posts (featuring average or larger women) or a neutral travel control. Women (N = 167, 81.8% aged 18-29) were randomly assigned in an online experiment to one of three conditions (BoPo-average, BoPo-larger, or Travel/Control) and viewed three Instagram posts for two minutes, before reporting their mood and body satisfaction, and selecting a meal from a hypothetical menu. Women who viewed the BoPo posts featuring average-size women reported more positive mood than the control group; women who viewed posts featuring larger women did not. There were no effects of condition on negative mood or body satisfaction. Women did not make less healthy food choices than the control in either BoPo condition; women who viewed the BoPo images of larger women showed a stronger association between hunger and calories selected. These findings suggest that concerns over BoPo promoting unhealthy behaviours may be misplaced, but further research is needed regarding women’s responses to different body sizes.

  14. Data from: Password Reset Dataset

    • kaggle.com
    Updated Oct 3, 2023
    Cite
    HariSellowpay (2023). Password Reset Dataset [Dataset]. https://www.kaggle.com/datasets/harisellowpay/password-reset-dataset
    Explore at:
    Dataset updated
    Oct 3, 2023
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    HariSellowpay
    Description

    The dataset is designed to simulate password-related events, creating a synthetic representation of actions related to password management. It includes fields like timestamp, action, event type, location, IP address, password, hour, and time difference.

    • The dataset comprises 50,000 records representing a variety of password-related events.
    • A list of commonly used passwords is incorporated to mimic real-world scenarios.
    • Timestamps are spread throughout the current year.
    • Features like 'hour' and 'time_difference' are derived to provide additional insights into the temporal aspects of the events.

    This synthetic dataset can be used for training and testing machine learning models related to cyber security, anomaly detection, or password management. It allows researchers and practitioners to experiment with data resembling real-world scenarios without compromising actual user information.
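
    The derived 'hour' and 'time_difference' features mentioned above could be computed along the following lines. This is a sketch under stated assumptions: timestamps are ISO 8601 strings, and 'time_difference' is taken to mean seconds elapsed since the previous event, which the description does not confirm. The function name `derive_features` is hypothetical.

```python
from datetime import datetime


def derive_features(timestamps):
    """Return one record per event with its hour and the gap to the previous event."""
    ordered = sorted(datetime.fromisoformat(t) for t in timestamps)
    features = []
    previous = None
    for ts in ordered:
        # Assumed definition: seconds since the previous event, None for the first.
        gap = (ts - previous).total_seconds() if previous is not None else None
        features.append({"timestamp": ts, "hour": ts.hour, "time_difference": gap})
        previous = ts
    return features


events = derive_features(["2023-10-03T14:05:00", "2023-10-03T13:55:00"])
for event in events:
    print(event["hour"], event["time_difference"])
```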

  15. Data from: A consensus compound/bioactivity dataset for data-driven drug...

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +1 more
    Updated Mar 2, 2022
    Cite
    Laura Isigkeit; Apirat Chaikuad; Daniel Merk (2022). A consensus compound/bioactivity dataset for data-driven drug design and chemogenomics [Dataset]. http://doi.org/10.5281/zenodo.6398019
    Explore at:
    Dataset updated
    Mar 2, 2022
    Authors
    Laura Isigkeit; Apirat Chaikuad; Daniel Merk
    Description

    This is the updated version of the dataset from 10.5281/zenodo.6320761

    Information

    The diverse publicly available compound/bioactivity databases constitute a key resource for data-driven applications in chemogenomics and drug design. Analysis of their coverage of compound entries and biological targets revealed considerable differences, however, suggesting benefit of a consensus dataset. Therefore, we have combined and curated information from five esteemed databases (ChEMBL, PubChem, BindingDB, IUPHAR/BPS and Probes&Drugs) to assemble a consensus compound/bioactivity dataset comprising 1144648 compounds with 10915362 bioactivities on 5613 targets (including defined macromolecular targets as well as cell-lines and phenotypic readouts). It also provides simplified information on assay types underlying the bioactivity data and on bioactivity confidence by comparing data from different sources. We have unified the source databases, brought them into a common format and combined them, enabling an ease for generic uses in multiple applications such as chemogenomics and data-driven drug design.

    The consensus dataset provides increased target coverage and contains a higher number of molecules compared to the source databases which is also evident from a larger number of scaffolds. These features render the consensus dataset a valuable tool for machine learning and other data-driven applications in (de novo) drug design and bioactivity prediction. The increased chemical and bioactivity coverage of the consensus dataset may improve robustness of such models compared to the single source databases. In addition, semi-automated structure and bioactivity annotation checks with flags for divergent data from different sources may help data selection and further accurate curation.

    This dataset belongs to the publication: https://doi.org/10.3390/molecules27082513

    Structure and content of the dataset

    Dataset structure:

    ChEMBL ID | PubChem ID | IUPHAR ID | Target | Activity type | Assay type | Unit | Mean C (0) | ... | Mean PC (0) | ... | Mean B (0) | ... | Mean I (0) | ... | Mean PD (0) | ... | Activity check annotation | Ligand names | Canonical SMILES C | ... | Structure check (Tanimoto) | Source

    The dataset was created using the Konstanz Information Miner (KNIME) (https://www.knime.com/) and was exported as a CSV-file and a compressed CSV-file. Except for the canonical SMILES columns, all columns are filled with the datatype ‘string’. The datatype for the canonical SMILES columns is the smiles-format. We recommend the File Reader node for using the dataset in KNIME. With the help of this node the data types of the columns can be adjusted exactly. In addition, only this node can read the compressed format.

    Column content:

    • ChEMBL ID, PubChem ID, IUPHAR ID: chemical identifier of the databases
    • Target: biological target of the molecule expressed as the HGNC gene symbol
    • Activity type: for example, pIC50
    • Assay type: Simplification/Classification of the assay into cell-free, cellular, functional and unspecified
    • Unit: unit of bioactivity measurement
    • Mean columns of the databases: mean of bioactivity values or activity comments denoted with the frequency of their occurrence in the database, e.g. Mean C = 7.5 *(15) -> the value for this compound-target pair occurs 15 times in ChEMBL database
    • Activity check annotation: a bioactivity check was performed by comparing values from the different sources and adding an activity check annotation to provide automated activity validation for additional confidence
      - no comment: bioactivity values are within one log unit;
      - check activity data: bioactivity values are not within one log unit;
      - only one data point: only one value was available, no comparison and no range calculated;
      - no activity value: no precise numeric activity value was available;
      - no log-value could be calculated: no negative decadic logarithm could be calculated, e.g., because the reported unit was not a compound concentration
    • Ligand names: all unique names contained in the five source databases are listed
    • Canonical SMILES columns: Molecular structure of the compound from each database
    • Structure check (Tanimoto): To denote matching or differing compound structures in different source databases
      - match: molecule structures are the same between different sources;
      - no match: the structures differ. We calculated the Jaccard-Tanimoto similarity coefficient from Morgan Fingerprints to reveal true differences between sources and reported the minimum value;
      - 1 structure: no structure comparison is possible, because there was only one structure available;
      - no structure: no structure comparison is possible, because there was no structure available.
    • Source: From which databases the data come from
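
    The activity check and structure check described above can be sketched in a few lines. This is an illustrative reading of the description, not the authors' KNIME workflow. It assumes that values are already on a negative log scale, that a spread of exactly one log unit still counts as "within one log unit", and that fingerprints are given as sets of on-bit indices. The "no log-value could be calculated" case is omitted because it depends on the reported unit. The function names are hypothetical.

```python
def activity_check(values):
    """Annotate per-source bioactivity values (negative log scale, e.g. pIC50)."""
    numeric = [v for v in values if v is not None]
    if not numeric:
        return "no activity value"
    if len(numeric) == 1:
        return "only one data point"
    # Assumption: a spread of exactly 1.0 is treated as within one log unit.
    if max(numeric) - min(numeric) <= 1.0:
        return "no comment"
    return "check activity data"


def tanimoto(bits_a, bits_b):
    """Jaccard-Tanimoto coefficient of two fingerprints given as on-bit index sets."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return None  # undefined when both fingerprints are empty
    return len(a & b) / len(union)


print(activity_check([7.5, 7.9, None]))
print(tanimoto({1, 2, 3}, {2, 3, 4}))
```

    With a cheminformatics toolkit, the on-bit sets would come from Morgan fingerprints of each source's canonical SMILES; that step is left out here to keep the sketch dependency-free.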

  16. Replication Data for: Control Comparison of Impact Actuator

    • darus.uni-stuttgart.de
    Updated Apr 2, 2024
    Cite
    Alexander Schulte (2024). Replication Data for: Control Comparison of Impact Actuator [Dataset]. http://doi.org/10.18419/DARUS-3763
    Explore at:
    Dataset updated
    Apr 2, 2024
    Dataset provided by
    DaRUS
    Authors
    Alexander Schulte
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Dataset funded by
    DFG
    Description

    The dataset contains the data created by simulating and measuring different control strategies for the impact actuator. Each table contains the time value, the set position and the actual position for the base axes as well as the actuators when executing a reference contour. There are two tables, one simulative and one experimental, for each of the four control strategies:

    • EdgeStop: Reference profile with acceleration and jerk limits
    • ImpRef: Previous impact control strategy
    • ImpNewWithoutReset: New strategy, but without controller reset
    • ImpNew: New impact control strategy

  17. Dataset for A Comparison of In Vitro Points of Departure with Human...

    • catalog.data.gov
    • datasets.ai
    Updated May 2, 2024
    Cite
    U.S. EPA Office of Research and Development (ORD) (2024). Dataset for A Comparison of In Vitro Points of Departure with Human Biomonitoring Levels for Per- and Polyfluoroalkyl Substances (PFAS) [Dataset]. https://catalog.data.gov/dataset/dataset-for-a-comparison-of-in-vitro-points-of-departure-with-human-biomonitoring-levels-f
    Dataset updated
    May 2, 2024
    Dataset provided by
    United States Environmental Protection Agency http://www.epa.gov/
    Description

    Dataset for A Comparison of In Vitro Points of Departure with Human Biomonitoring Levels for Per- and Polyfluoroalkyl Substances (PFAS), to be published by Environmental Health Perspectives (EHP). This dataset is associated with the following publication: Judson, R., D. Smith, M. Devito, J. Wambaugh, B. Wetmore, K. Friedman, G. Patlewicz, R. Thomas, R. Sayre, J. Olker, S. Degitz, S. Padilla, J. Harrill, T. Shafer, and K. Carstens. A Comparison of In Vitro Points of Departure with Human Blood Levels for Per- and Polyfluoroalkyl Substances (PFAS) (Toxics). Toxics. MDPI, Basel, SWITZERLAND, 12(4): 271, (2024).

  18. Datasets used in experiments.

    • plos.figshare.com
    xls
    Updated Jan 19, 2024
    Cite
    Antonio Fernando Lavareda Jacob Junior; Fabricio Almeida do Carmo; Adamo Lima de Santana; Ewaldo Eder Carvalho Santana; Fabio Manoel Franca Lobato (2024). Datasets used in experiments. [Dataset]. http://doi.org/10.1371/journal.pone.0297147.t003
    Dataset updated
    Jan 19, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Antonio Fernando Lavareda Jacob Junior; Fabricio Almeida do Carmo; Adamo Lima de Santana; Ewaldo Eder Carvalho Santana; Fabio Manoel Franca Lobato
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Missing data is a prevalent problem that requires attention, as most data analysis techniques are unable to handle it. This is particularly critical in Multi-Label Classification (MLC), where only a few studies have investigated missing data. MLC differs from Single-Label Classification (SLC) by allowing an instance to be associated with multiple classes. Movie classification is a didactic example, since a movie can be both “drama” and “biography” at once. One of the most common treatments for missing data is imputation, which seeks plausible values to fill in the missing ones. In this scenario, we propose a novel imputation method based on a multi-objective genetic algorithm for optimizing multiple data imputations, called Multiple Imputation of Multi-label Classification data with a genetic algorithm, or simply EvoImp. We applied the proposed method in multi-label learning and evaluated its performance using six synthetic databases, considering various distribution scenarios for the missing values. The method was compared with other state-of-the-art imputation strategies, such as K-Means Imputation (KMI) and weighted K-Nearest Neighbors Imputation (WKNNI). The results showed that the proposed method outperformed the baselines in all scenarios, achieving the best Exact Match, Accuracy, and Hamming Loss scores. The superior results held across dataset domains and sizes, demonstrating EvoImp's robustness. Thus, EvoImp represents a feasible solution to missing data treatment for multi-label learning.
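
    The three evaluation measures named above have common textbook definitions for multi-label data, which the sketch below illustrates. It assumes these standard definitions apply here (the description does not spell them out), uses hypothetical function names, and represents each instance as a list of 0/1 label indicators.

```python
def exact_match(y_true, y_pred):
    """Fraction of instances whose full label vector is predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def hamming_loss(y_true, y_pred):
    """Fraction of individual label positions that are predicted wrongly."""
    n_labels = len(y_true[0])
    wrong = sum(
        ti != pi
        for t, p in zip(y_true, y_pred)
        for ti, pi in zip(t, p)
    )
    return wrong / (len(y_true) * n_labels)


def multilabel_accuracy(y_true, y_pred):
    """Mean per-instance ratio |true AND pred| / |true OR pred|."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        inter = sum(1 for ti, pi in zip(t, p) if ti and pi)
        union = sum(1 for ti, pi in zip(t, p) if ti or pi)
        # Assumption: an instance with no true and no predicted labels scores 1.
        total += inter / union if union else 1.0
    return total / len(y_true)
```

    Higher is better for Exact Match and Accuracy, while lower is better for Hamming Loss.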

  19. Pairwise sentence complexity comparison

    • kaggle.com
    Updated Jun 8, 2021
    Cite
    Douglas K.G. Araujo (2021). Pairwise sentence complexity comparison [Dataset]. https://www.kaggle.com/douglaskgaraujo/pairwise-sentence-complexity-comparison
    Dataset updated
    Jun 8, 2021
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    Douglas K.G. Araujo
    Description

    Dataset creation

    The dataset was created by this notebook: https://www.kaggle.com/douglaskgaraujo/sentence-complexity-comparison-dataset

    Context

    This data is a pairwise comparison of sentences, together with information about their relative complexity. The original dataset is from the CommonLit Readability Prize competition, and interested readers are referred there (especially the competition's discussion forums) for more information on the data itself.

    Important notice! As per that competition's rules, the license is as follows:

    1. COMPETITION DATA. "Competition Data" means the data or datasets available from the Competition Website for the purpose of use in the Competition, including any prototype or executable code provided on the Competition Website. The Competition Data will contain private and public test sets. Which data belongs to which set will not be made available to participants.

    A. Data Access and Use. Competition Use and Non-Commercial & Academic Research: You may access and use the Competition Data for non-commercial purposes only, including for participating in the Competition and on Kaggle.com forums, and for academic research and education. The Competition Sponsor reserves the right to disqualify any participant who uses the Competition Data other than as permitted by the Competition Website and these Rules.

    B. Data Security. You agree to use reasonable and suitable measures to prevent persons who have not formally agreed to these Rules from gaining access to the Competition Data. You agree not to transmit, duplicate, publish, redistribute or otherwise provide or make available the Competition Data to any party not participating in the Competition. You agree to notify Kaggle immediately upon learning of any possible unauthorized transmission of or unauthorized access to the Competition Data and agree to work with Kaggle to rectify any unauthorized transmission or access.

    C. External Data. You may use data other than the Competition Data (“External Data”) to develop and test your Submissions. However, you will ensure the External Data is publicly available and equally accessible to use by all participants of the Competition for purposes of the competition at no cost to the other participants. The ability to use External Data under this Section 7.C (External Data) does not limit your other obligations under these Competition Rules, including but not limited to Section 11 (Winners Obligations).

    Content

    This dataset is a pairwise comparison of each sentence in the CommonLit competition with 500 other randomly matched sentences. Sentences are divided into training and validation datasets before being matched randomly. The relative complexity of each sentence is measured, and the dataset includes features such as the distance between the two sentences' scores and a column indicating whether the first sentence's readability score is greater than or equal to the score of the second sentence.
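
    The pairing procedure described above could be sketched as follows. This is an illustration only, not the code from the linked notebook. The function name, the column names, and the use of an absolute difference as the "distance" are assumptions, and the function would be applied separately to the training and validation splits.

```python
import random


def build_pairs(scores, n_matches=500, seed=0):
    """Pair every sentence with up to n_matches other randomly chosen sentences.

    scores: dict mapping a sentence id to its readability score.
    Returns a list of dict rows with hypothetical column names.
    """
    rng = random.Random(seed)
    ids = list(scores)
    rows = []
    for first in ids:
        others = [i for i in ids if i != first]
        # Small splits may hold fewer than n_matches candidate partners.
        k = min(n_matches, len(others))
        for second in rng.sample(others, k):
            rows.append({
                "first": first,
                "second": second,
                # Assumption: 'distance' is the absolute score difference.
                "score_distance": abs(scores[first] - scores[second]),
                "first_ge_second": int(scores[first] >= scores[second]),
            })
    return rows
```

    With the default `n_matches=500`, a split of N sentences (N > 500) yields 500 × N rows, so the pairwise table is far larger than the original sentence list.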

    Acknowledgements

    Thanks to the organisers of this competition for providing this dataset.


  20. Data from: AmsterTime: A Visual Place Recognition Benchmark Dataset for...

    • data.4tu.nl
    zip
    Updated Apr 3, 2022
    Cite
    Burak Yildiz; Seyran Khademi; Ronald Maria Siebes; Jan van Gemert; Tino Mager; Beate Löffler; Carola Hein; Victor de Boer (2022). AmsterTime: A Visual Place Recognition Benchmark Dataset for Severe Domain Shift [Dataset]. http://doi.org/10.4121/19580806.v4
    Dataset updated
    Apr 3, 2022
    Dataset provided by
    4TU.ResearchData
    Authors
    Burak Yildiz; Seyran Khademi; Ronald Maria Siebes; Jan van Gemert; Tino Mager; Beate Löffler; Carola Hein; Victor de Boer
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Amsterdam
    Dataset funded by
    Volkswagen Foundation
    Description

    The AmsterTime dataset offers a collection of 2,500 well-curated images in which street-view images are matched to historical archival images of the same scenes in the city of Amsterdam. The image pairs capture the same place with different cameras, viewpoints, and appearances. Unlike existing benchmark datasets, AmsterTime is directly crowdsourced in a GIS navigation platform (Mapillary). All matching pairs are then verified by a human expert, both to confirm the matches and to evaluate human competence in the Visual Place Recognition (VPR) task for future reference.


    The properties of the dataset are summarized as:

    • 1200+ license-free images from the Amsterdam City Archive, representing urban places in the city of Amsterdam, captured in the past century by many photographers.
    • All archival queries are matched with street view images from Mapillary.
    • All matches are verified by architectural historians and Amsterdam inhabitants.
    • Image pairs are archival and street views capturing the same place with different cameras, time lags, structural changes, occlusion, viewpoint, appearance, and illumination.
    • The dataset exhibits a domain shift between the query and the gallery due to the significant differences between scanned archival and street-view images.

    Two sub-tasks are created on the dataset:

    • Verification is a binary classification (auxiliary) task: decide whether a pair of archival and street-view images shows the same place. All crowdsourced image pairs serve as positive samples, and the same number of negative samples is generated by randomly pairing archival and street-view images, for a total of 2,462 pairs.
    • Retrieval is the main task and corresponds to VPR: a given query image is matched against a set of gallery images. For the retrieval task, the AmsterTime dataset offers 1,231 query images, where the leave-one-out set serves as the gallery for each query.
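
    The construction of the verification set can be sketched as below. This is a hedged illustration, not the dataset authors' code. The function name is hypothetical, and the sketch assumes each positive pair is an (archival id, street-view id) tuple and that a random pairing is rejected if it happens to be a known positive.

```python
import random


def build_verification_pairs(positive_pairs, seed=0):
    """Return (archival, street_view, label) triples with balanced labels.

    positive_pairs: list of (archival_id, street_view_id) matched pairs.
    Negatives are drawn by randomly pairing archival and street-view images.
    """
    if len(positive_pairs) < 2:
        raise ValueError("need at least two positive pairs to form negatives")
    rng = random.Random(seed)
    positives = set(positive_pairs)
    archival = [a for a, _ in positive_pairs]
    street = [s for _, s in positive_pairs]
    negatives = set()
    while len(negatives) < len(positive_pairs):
        candidate = (rng.choice(archival), rng.choice(street))
        # Skip pairings that are true matches (assumption, see lead-in).
        if candidate not in positives:
            negatives.add(candidate)
    labelled = [(a, s, 1) for a, s in positive_pairs]
    labelled += [(a, s, 0) for a, s in sorted(negatives)]
    return labelled
```

    The counts in the description are consistent with this scheme: 1,231 positive pairs plus 1,231 generated negatives give 2,462 verification pairs.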
