11 datasets found
  1. Data from: INSPECT: A Multimodal Dataset for Pulmonary Embolism Diagnosis and Prognosis

    • stanfordaimi.azurewebsites.net
    Updated Jun 26, 2025
    Cite
    Microsoft Research (2025). INSPECT: A Multimodal Dataset for Pulmonary Embolism Diagnosis and Prognosis [Dataset]. https://stanfordaimi.azurewebsites.net/datasets/151848b9-8b31-4129-bc25-cefdf18f95d8
    Dataset updated
    Jun 26, 2025
    Dataset authored and provided by
    Microsoft Research
    License

    https://aimistanford-web-api.azurewebsites.net/licenses/8de476ec-6092-4502-82f0-3e84aa75788f/view

    Description

    Synthesizing information from various data sources plays a crucial role in the practice of modern medicine. Current applications of artificial intelligence in medicine often focus on single-modality data due to a lack of publicly available, multimodal medical datasets. To address this limitation, we introduce INSPECT, which contains de-identified longitudinal records from a large cohort of pulmonary embolism (PE) patients, along with ground truth labels for multiple outcomes. INSPECT contains data from 19,402 patients, including 23,248 CT images, sections of radiology reports, and structured electronic health record (EHR) data (including demographics, diagnoses, procedures, and vitals). Using our provided dataset, we develop and release a benchmark for evaluating several baseline modeling approaches on a variety of important PE-related tasks. We evaluate image-only, EHR-only, and fused models. Trained models and the de-identified dataset are made available for non-commercial use under a data use agreement. To the best of our knowledge, INSPECT is the largest multimodal dataset for enabling reproducible research on strategies for integrating 3D medical imaging and EHR data. EHR modality data is uploaded to the Stanford Redivis website (https://redivis.com/Stanford).

  2. Data from: Fostering cultures of open qualitative research: Dataset 2 – Interview Transcripts

    • orda.shef.ac.uk
    xlsx
    Updated Oct 8, 2025
    Cite
    Matthew Hanchard; Itzel San Roman Pineda (2025). Fostering cultures of open qualitative research: Dataset 2 – Interview Transcripts [Dataset]. http://doi.org/10.15131/shef.data.23567223.v2
    Available download formats: xlsx
    Dataset updated
    Oct 8, 2025
    Dataset provided by
    The University of Sheffield
    Authors
    Matthew Hanchard; Itzel San Roman Pineda
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This dataset was created and deposited onto the University of Sheffield Online Research Data repository (ORDA) on 23-Jun-2023 by Dr. Matthew S. Hanchard, Research Associate at the University of Sheffield iHuman Institute. The dataset forms part of three outputs from a project titled ‘Fostering cultures of open qualitative research’ which ran from January 2023 to June 2023:

    · Fostering cultures of open qualitative research: Dataset 1 – Survey Responses · Fostering cultures of open qualitative research: Dataset 2 – Interview Transcripts · Fostering cultures of open qualitative research: Dataset 3 – Coding Book

    The project was funded with £13,913.85 of Research England monies held internally by the University of Sheffield - as part of their ‘Enhancing Research Cultures’ scheme 2022-2023.

    The dataset aligns with ethical approval granted by the University of Sheffield School of Sociological Studies Research Ethics Committee (ref: 051118) on 23-Jan-2021. This includes due concern for participant anonymity and data management.

    ORDA has full permission to store this dataset and to make it open access for public re-use on the basis that no commercial gain will be made from reuse. It has been deposited under a CC-BY-NC license. Overall, this dataset comprises:

    · 15 x Interview transcripts - in .docx file format which can be opened with Microsoft Word, Google Docs, or an open-source equivalent.

    All participants have read and approved their transcripts and have had an opportunity to retract details should they wish to do so.

    Participants chose whether to be pseudonymised or named directly. The pseudonym can be used to identify individual participant responses in the qualitative coding held within the ‘Fostering cultures of open qualitative research: Dataset 3 – Coding Book’ files.

    For recruitment, 14 x participants were selected based on their responses to the project survey, whilst one participant was recruited based on specific expertise.

    · 1 x Participant sheet – in .csv format which may be opened with Microsoft Excel, Google Sheets, or an open-source equivalent.

    This provides socio-demographic detail on each participant alongside their main field of research and career stage. It includes a RespondentID field/column which can be used to connect interview participants with their responses to the survey questions in the accompanying ‘Fostering cultures of open qualitative research: Dataset 1 – Survey Responses’ files.

    The project was undertaken by two staff:

    Co-investigator: Dr. Itzel San Roman Pineda ORCiD ID: 0000-0002-3785-8057 i.sanromanpineda@sheffield.ac.uk Postdoctoral Research Assistant Labelled as ‘Researcher 1’ throughout the dataset

    Principal Investigator (corresponding dataset author): Dr. Matthew Hanchard ORCiD ID: 0000-0003-2460-8638 m.s.hanchard@sheffield.ac.uk Research Associate iHuman Institute, Social Research Institutes, Faculty of Social Science Labelled as ‘Researcher 2’ throughout the dataset

  3. COKI Open Access Dataset

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Oct 3, 2023
    Cite
    Richard Hosking; Richard Hosking; James P. Diprose; James P. Diprose; Aniek Roelofs; Aniek Roelofs; Tuan-Yow Chien; Tuan-Yow Chien; Lucy Montgomery; Lucy Montgomery; Cameron Neylon; Cameron Neylon (2023). COKI Open Access Dataset [Dataset]. http://doi.org/10.5281/zenodo.7048603
    Available download formats: zip
    Dataset updated
    Oct 3, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Richard Hosking; Richard Hosking; James P. Diprose; James P. Diprose; Aniek Roelofs; Aniek Roelofs; Tuan-Yow Chien; Tuan-Yow Chien; Lucy Montgomery; Lucy Montgomery; Cameron Neylon; Cameron Neylon
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The COKI Open Access Dataset measures open access performance for 142 countries and 5117 institutions and is available in JSON Lines format. The data is visualised at the COKI Open Access Dashboard: https://open.coki.ac/.

    The COKI Open Access Dataset is created with the COKI Academic Observatory data collection pipeline, which fetches data about research publications from multiple sources, synthesises the datasets and creates the open access calculations for each country and institution.

    Each week a number of specialised research publication datasets are collected. The datasets that are used for the COKI Open Access Dataset release include Crossref Metadata, Microsoft Academic Graph, Unpaywall and the Research Organization Registry.

    After fetching the datasets, they are synthesised to produce aggregate time series statistics for each country and institution in the dataset. The aggregate timeseries statistics include publication count, open access status and citation count.

    See https://open.coki.ac/data/ for the dataset schema. A new version of the dataset is deposited every week.
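
    Since the release is plain JSON Lines, it can be consumed with nothing more than the standard library. A minimal sketch (the field names below are illustrative placeholders, not the published schema; see the schema link above for the real layout):

```python
import json
from io import StringIO

# Two made-up records in the dataset's JSON Lines layout: one JSON object
# per line. Field names here are illustrative placeholders, not the
# published schema (see https://open.coki.ac/data/ for the real one).
sample = StringIO(
    '{"name": "Example University", "n_outputs": 1200, "n_outputs_open": 540}\n'
    '{"name": "Example Institute", "n_outputs": 800, "n_outputs_open": 200}\n'
)

records = [json.loads(line) for line in sample]
for r in records:
    # Derive an open-access percentage per entity.
    r["pct_open"] = 100 * r["n_outputs_open"] / r["n_outputs"]

print([r["pct_open"] for r in records])  # [45.0, 25.0]
```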

    Code

    License
    COKI Open Access Dataset © 2022 by Curtin University is licensed under CC BY 4.0.

    Attributions
    This work contains information from:

  4. COVID-19 Open Research Dataset (CORD-19)

    • data.niaid.nih.gov
    • marketplace.sshopencloud.eu
    Updated Jul 22, 2024
    Cite
    Lucy Lu Wang (2024). COVID-19 Open Research Dataset (CORD-19) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3715505
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    Kyle Lo
    JJ Yang
    Lucy Lu Wang
    Sebastian Kohlmeier
    Description

    A full description of this dataset along with updated information can be found here.

    In response to the COVID-19 pandemic, the Allen Institute for AI has partnered with leading research groups to prepare and distribute the COVID-19 Open Research Dataset (CORD-19), a free resource of scholarly articles, including full text content, about COVID-19 and the coronavirus family of viruses for use by the global research community.

    This dataset is intended to mobilize researchers to apply recent advances in natural language processing to generate new insights in support of the fight against this infectious disease. The corpus will be updated weekly as new research is published in peer-reviewed publications and archival services like bioRxiv, medRxiv, and others.

    By downloading this dataset you are agreeing to the Dataset license. Specific licensing information for individual articles in the dataset is available in the metadata file.

    Additional licensing information is available on the PMC website, medRxiv website and bioRxiv website.

    Dataset content:

    Commercial use subset

    Non-commercial use subset

    PMC custom license subset

    bioRxiv/medRxiv subset (pre-prints that are not peer reviewed)

    Metadata file

    Readme

    Each paper is represented as a single JSON object (see schema file for details).

    Description:

    The dataset contains all COVID-19 and coronavirus-related research (e.g. SARS, MERS, etc.) from the following sources:

    PubMed's PMC open access corpus using this query (COVID-19 and coronavirus research)

    Additional COVID-19 research articles from a corpus maintained by the WHO

    bioRxiv and medRxiv pre-prints using the same query as PMC (COVID-19 and coronavirus research)

    We also provide a comprehensive metadata file of coronavirus and COVID-19 research articles with links to PubMed, Microsoft Academic and the WHO COVID-19 database of publications (includes articles without open access full text).

    We recommend using metadata from the comprehensive file when available, instead of parsed metadata in the dataset. Please note the dataset may contain multiple entries for individual PMC IDs in cases when supplementary materials are available.
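
    The duplicate-PMC-ID caveat above can be handled with a simple first-occurrence filter. A sketch over made-up rows (the field names are hypothetical, not the official CORD-19 metadata columns):

```python
# Hypothetical rows mimicking the caveat above: the same PMC ID can appear
# more than once when supplementary materials are available. Field names
# are illustrative, not the official CORD-19 metadata columns.
rows = [
    {"pmcid": "PMC0000001", "title": "Example study", "supplement": False},
    {"pmcid": "PMC0000001", "title": "Example study", "supplement": True},
    {"pmcid": "PMC0000002", "title": "Another study", "supplement": False},
]

# Keep one row per PMC ID; the first occurrence wins.
seen, deduped = set(), []
for row in rows:
    if row["pmcid"] not in seen:
        seen.add(row["pmcid"])
        deduped.append(row)

print(len(deduped))  # 2
```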

    This repository is linked to the WHO database of publications on coronavirus disease and other resources, such as Microsoft Academic Graph, PubMed, and Semantic Scholar. A coalition including the Chan Zuckerberg Initiative, Georgetown University’s Center for Security and Emerging Technology, Microsoft Research, and the National Library of Medicine of the National Institutes of Health came together to provide this service.

    Citation:

    When including CORD-19 data in a publication or redistribution, please cite the dataset as follows:

    In bibliography:

    COVID-19 Open Research Dataset (CORD-19). 2020. Version 2020-MM-DD. Retrieved from https://pages.semanticscholar.org/coronavirus-research. Accessed YYYY-MM-DD. 10.5281/zenodo.3715505

    In text:

    (CORD-19, 2020)

    The Allen Institute for AI and particularly the Semantic Scholar team will continue to provide updates to this dataset as the situation evolves and new research is released.

  5. Mnist Dataset

    • universe.roboflow.com
    • tensorflow.org
    • +3more
    zip
    Updated Aug 8, 2022
    Cite
    Popular Benchmarks (2022). Mnist Dataset [Dataset]. https://universe.roboflow.com/popular-benchmarks/mnist-cjkff/model/2
    Available download formats: zip
    Dataset updated
    Aug 8, 2022
    Dataset authored and provided by
    Popular Benchmarks
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Digits
    Description

    THE MNIST DATABASE of handwritten digits

    Authors:

    • Yann LeCun, Courant Institute, NYU
    • Corinna Cortes, Google Labs, New York
    • Christopher J.C. Burges, Microsoft Research, Redmond

    Dataset Obtained From: http://yann.lecun.com/exdb/mnist/

    All images were sized 28x28 in the original dataset

    The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.

    It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting.

    Version 1 (original-images_trainSetSplitBy80_20):

    • Original, raw images, with the train set split to provide 80% of its images to the training set and 20% of its images to the validation set
    • Trained from Roboflow Classification Model's ImageNet training checkpoint

    Version 2 (original-images_ModifiedClasses_trainSetSplitBy80_20):

    • Original, raw images, with the train set split to provide 80% of its images to the training set and 20% of its images to the validation set
    • Modify Classes, a Roboflow preprocessing feature, was employed to change class names from 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 to one, two, three, four, five, six, seven, eight, nine
    • Trained from the Roboflow Classification Model's ImageNet training checkpoint

    Version 3 (original-images_Original-MNIST-Splits):

    • Original images, with the original splits for MNIST: train (86% of images - 60,000 images) set and test (14% of images - 10,000 images) set only.
    • This version was not trained
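
    The split percentages quoted above follow directly from the 60,000/10,000 train/test counts; a quick arithmetic check:

```python
# MNIST ships 60,000 training and 10,000 test images.
n_train, n_test = 60_000, 10_000
total = n_train + n_test

# Version 3 keeps the original splits: roughly 86% / 14% of the full set.
print(round(100 * n_train / total), round(100 * n_test / total))  # 86 14

# Versions 1 and 2 re-split the training set 80/20 into train/validation.
print(int(n_train * 0.8), int(n_train * 0.2))  # 48000 12000
```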

    Citation:

    @article{lecun2010mnist,
     title={MNIST handwritten digit database},
     author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
     journal={ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist},
     volume={2},
     year={2010}
    }
    
  6. Global Forest Ecosystem Structure and Function Data For Carbon Balance...

    • data.nasa.gov
    Updated Apr 1, 2025
    Cite
    nasa.gov (2025). Global Forest Ecosystem Structure and Function Data For Carbon Balance Research - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/global-forest-ecosystem-structure-and-function-data-for-carbon-balance-research-e3cd6
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    A comprehensive global database has been assembled to quantify CO2 fluxes and pathways across different levels of integration (from photosynthesis up to net ecosystem production) in forest ecosystems. The database fills an important gap for model calibration, model validation, and hypothesis testing at global and regional scales. The database archive includes: a Microsoft Office Access Database; data files for all tables in the database; query outputs from the database; and an SQL script file for re-creating the database from the tables. The database is structured by site (i.e., a forest or stand of known geographical location, biome, species composition, and management regime). It contains carbon budget variables (fluxes and stocks), ecosystem traits (standing biomass, leaf area index, age), and ancillary information (management regime, climate, soil characteristics) for 529 sites from eight forest biomes. Data entries originated from peer-reviewed literature and personal communications with researchers involved in Fluxnet. Flux estimates were included in the database when they were based on direct measurements (e.g., tower-based eddy covariance system measurements), derived from single or multiple direct measurements, or modeled. Stand description was based on observed values, and climatic description was based on the CRU data set and ORCHIDEE model output. Uncertainty for each carbon balance component in the database was estimated in a uniform way by expert judgment. Robustness of CO2 balances was tested, and closure terms were introduced as a numerical way to approach data quality and flux uncertainty at the biome level.

  7. GAPs Data Repository on Return: Guideline, Data Samples and Codebook

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    Updated Feb 13, 2025
    Cite
    Zeynep Sahin Mencutek; Zeynep Sahin Mencutek; Fatma Yılmaz-Elmas; Fatma Yılmaz-Elmas (2025). GAPs Data Repository on Return: Guideline, Data Samples and Codebook [Dataset]. http://doi.org/10.5281/zenodo.14862490
    Dataset updated
    Feb 13, 2025
    Dataset provided by
    REDCap
    Authors
    Zeynep Sahin Mencutek; Zeynep Sahin Mencutek; Fatma Yılmaz-Elmas; Fatma Yılmaz-Elmas
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The GAPs Data Repository provides a comprehensive overview of available qualitative and quantitative data on national return regimes, now accessible through an advanced web interface at https://data.returnmigration.eu/.

    This updated guideline outlines the complete process, starting from the initial data collection for the return migration data repository to the development of a comprehensive web-based platform. Through iterative development, participatory approaches, and rigorous quality checks, we have ensured a systematic representation of return migration data at both national and comparative levels.

    The Repository organizes data into five main categories, covering diverse aspects and offering a holistic view of return regimes: country profiles, legislation, infrastructure, international cooperation, and descriptive statistics. These categories, further divided into subcategories, are based on insights from a literature review, existing datasets, and empirical data collection from 14 countries. The selection of categories prioritizes relevance for understanding return and readmission policies and practices, data accessibility, reliability, clarity, and comparability. Raw data is meticulously collected by the national experts.

    The transition to a web-based interface builds upon the Repository’s original structure, which was initially developed using REDCap (Research Electronic Data Capture), a secure web application for building and managing online surveys and databases. REDCap ensures systematic data entry and stores the data on Uppsala University’s servers while significantly improving accessibility, usability, and data security. It also enables users to export any or all data from the Project when granted full data export privileges. Data can be exported in various formats, including Microsoft Excel, SAS, Stata, R, or SPSS, for analysis. At this stage, the Data Repository design team also converted tailored records of available data into public reports accessible to anyone with a unique URL, without the need to log in to REDCap or obtain permission to access the GAPs Project Data Repository. Public reports can be used to share information with stakeholders or external partners without granting them access to the Project or requiring them to set up a personal account. Currently, all public report links inserted in this report are also available on the Repository’s webpage, allowing users to export original data.

    This report also includes a detailed codebook to help users understand the structure, variables, and methodologies used in data collection and organization. This addition ensures transparency and provides a comprehensive framework for researchers and practitioners to effectively interpret the data.

    The GAPs Data Repository is committed to providing accessible, well-organized, and reliable data by moving to a centralized web platform and incorporating advanced visuals. This Repository aims to contribute inputs for research, policy analysis, and evidence-based decision-making in the return and readmission field.

    Explore the GAPs Data Repository at https://data.returnmigration.eu/.

  8. Comparison of R1 and R2 Online Research Data Services

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 8, 2023
    Cite
    Szkirpan, Elizabeth (2023). Comparison of R1 and R2 Online Research Data Services [Dataset]. http://doi.org/10.7910/DVN/SHJABB
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Szkirpan, Elizabeth
    Description

    Compiled in mid-2022, this dataset contains the raw data file, randomized ranked lists of R1 and R2 research institutions, and files created to support data visualization for Elizabeth Szkirpan's 2022 study regarding availability of data services and research data information via university libraries for online users. Files are available in Microsoft Excel formats.

  9. Lecture Notes - CS Tools - 2024/2025 – deZarza

    • figshare.com
    pdf
    Updated Jul 16, 2025
    Cite
    I. de Zarzà; J. de Curto (2025). Lecture Notes - CS Tools - 2024/2025 – deZarza [Dataset]. http://doi.org/10.6084/m9.figshare.29582690.v1
    Available download formats: pdf
    Dataset updated
    Jul 16, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    I. de Zarzà; J. de Curto
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This compilation of additional lecture materials offers a practical introduction to key Computer Science (CS) and digital tools and concepts aimed at enhancing research, teaching, and administrative efficiency. Prepared by Dr. I. de Zarzà, and reviewed and edited by Dr. J. de Curtò, the notes are designed as a transversal resource to support students from diverse disciplines, ranging from engineering and business to public management and health sciences. Topics include:

    · Introduction to Programming
    · Spreadsheet software and Excel functions
    · Word processing and Overleaf (LaTeX)
    · Presentation tools including PowerPoint, SlidesAI, and Genially
    · Prompt engineering and AI-assisted writing with Copilot and ChatGPT
    · Web and blog creation using HTML and Blogger
    · Introduction to databases (SQL and NoSQL)
    · Cybersecurity fundamentals and safe digital practices
    · Multimedia generation with AI (voice, video, and music tools like Suno and Sora)

    Developed across various undergraduate programs at the Universidad de Zaragoza, the notes combine technical know-how with real-world applications in academic and public sector contexts.

  10. MRA-MIDAS: Multimodal Image Dataset for AI-based Skin Cancer

    • stanfordaimi.azurewebsites.net
    Updated Jun 18, 2024
    Cite
    Microsoft Research (2024). MRA-MIDAS: Multimodal Image Dataset for AI-based Skin Cancer [Dataset]. https://stanfordaimi.azurewebsites.net/datasets/f4c2020f-801a-42dd-a477-a1a8357ef2a5
    Dataset updated
    Jun 18, 2024
    Dataset authored and provided by
    Microsoft Research
    License

    https://aimistanford-web-api.azurewebsites.net/licenses/f1f352a6-243f-4905-8e00-389edbca9e83/view

    Description

    We introduce the Melanoma Research Alliance Multimodal Image Dataset for AI-based Skin Cancer (MRA-MIDAS) dataset, the first publicly available, prospectively-recruited, systematically-paired dermoscopic and clinical image-based dataset across a range of skin-lesion diagnoses. This dataset encompasses a wide array of skin lesions and includes well-annotated, patient-level, clinical metadata. It aims to more accurately mirror real-world clinical scenarios than retrospectively curated datasets and is enhanced by extensive histopathologic confirmation to ensure data integrity. This research was approved by the Institutional Review Board at Stanford University under IRB#36050, along with the Cleveland Clinic Foundation under IRB#20-666, and adhered to the Helsinki Declaration. Patients presenting to the dermatology clinics of participating dermatologists at Stanford Medicine or Cleveland Clinic Foundation between August 18, 2020, and April 17, 2023, were eligible for the study if 1) they had at least one solitary skin lesion of concern identified where a skin biopsy was deemed medically necessary by the dermatologist investigator or 2) patients were directed to in-clinic evaluation for a lesion that was previously identified as concerning through a teledermatology encounter or dermatologist review of a patient photo submitted through the electronic patient messaging portal. Patients underwent written informed consent with either the physician or research coordinator, after which both clinical and dermoscopic digital photography were obtained of any eligible skin lesions. Each lesion underwent standardized photography with a contemporary model iPhone or iPad device (iPhone SE to iPhone 12 Pro and iPod touch to iPad mini) without flash photography at 15-cm and 30-cm distances, along with digital dermatoscope photography. 
For each lesion, clinical information about the patient was obtained and recorded including sex assigned at birth, age, Fitzpatrick skin type, personal history of melanoma, anatomic location, and the lesion’s length and width. Investigators had the discretion to identify additional control lesions that clinically appeared benign on a corresponding contralateral body site that were similarly enrolled for digital photography as an un-biopsied control lesion to include in the dataset, though model analysis was restricted to biopsied lesions. This dataset contains images obtained from patients at Stanford who provided consent for public release of their images and represents the near entirety of cases enrolled at this site. At the time of first enrollment, the Stanford dermatologists at the specialized pigmented lesion and melanoma clinics had an average of 15.7 years of post-residency experience while those in general medical dermatology clinics had an average of 3.9 years’ experience. Dermatologists noted their top-five ranked clinical impressions at the time of evaluation, along with their binary level of confidence (Yes/No) in their top impression. For any biopsied lesions, associated histopathologic final diagnoses were recorded and categorized into a previously described taxonomy. Biopsy results were interpreted by three board-certified dermatopathologists at Stanford. A dermatopathology consensus conference reviewed any diagnosis of severely dysplastic melanocytic nevus or worse. Melanocytic lesions were specifically grouped in the following manner: benign melanocytic nevi, melanomas (including melanoma in-situ and invasive melanoma), and surgically-eligible intermediate melanocytic tumors where complete excision is typically recommended (including severely dysplastic melanocytic nevi and melanocytomas such as typical/atypical Spitz tumors, such as BAP-1-inactivated melanocytic tumors, deep penetrating nevi/tumors, and cellular blue nevi with atypia). 
Cases were included in the dataset if a second reviewing independent board-certified dermatologist agreed with the favored diagnosis based on a review of the associated images. Funding: This project is based on research supported by the Melanoma Research Alliance (MRA)- L’Oreal Dermatological Beauty Brands Team Science Award, along with philanthropic funding from the David Mair and Vanessa Vu-Mair Artificial Intelligence in Skin Cancer Fund and the Tal & Cinthia Simon Melanoma Research Fund at Stanford Medicine. Acknowledgments: This material is the result of work supported with resources and the use of facilities at the Veterans Affairs Palo Alto Health Care System in Palo Alto, California.

  11. Cloud-based Database Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Aug 12, 2025
    Cite
    Data Insights Market (2025). Cloud-based Database Report [Dataset]. https://www.datainsightsmarket.com/reports/cloud-based-database-1454611
    Available download formats: doc, pdf, ppt
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The cloud-based database market is experiencing explosive growth, projected to reach $30.86 billion in 2025 and exhibiting a remarkable Compound Annual Growth Rate (CAGR) of 53.6% from 2019 to 2033. This phenomenal expansion is fueled by several key drivers. The increasing adoption of cloud computing across diverse industries, coupled with the inherent scalability and cost-effectiveness of cloud-based databases, are primary factors. Furthermore, the growing demand for real-time data analytics and the need for robust data management solutions are significantly contributing to market expansion. Businesses are increasingly migrating from on-premise solutions to leverage the agility, enhanced security features, and improved disaster recovery capabilities offered by cloud databases. The market's competitive landscape is dominated by major players like Amazon Web Services, Google, Microsoft, and Oracle, each offering a comprehensive suite of services. However, the emergence of specialized solutions and open-source options like MongoDB and Cassandra is also driving innovation and expanding market accessibility. The shift towards serverless databases and the increasing adoption of managed services are shaping market trends, while challenges like data security concerns and vendor lock-in remain areas of ongoing concern. The forecast period (2025-2033) promises continued growth, with the market expected to surpass $300 billion. This is predicated on the continued adoption of cloud technologies across all sectors, including healthcare, finance, retail, and manufacturing. Further advancements in database technology, such as AI-powered database management systems and improved integration with other cloud services, will continue to propel market expansion. However, potential restraints include the need for skilled professionals to manage and maintain these complex systems, and the ongoing concern about regulatory compliance and data sovereignty. 
The continuous evolution of hybrid cloud deployments will offer a path for organizations seeking a balanced approach between public and private cloud deployments, creating another exciting avenue for market growth. The geographically diverse player base ensures that the market's growth will be felt globally, with regional variations depending on digital infrastructure development and adoption rates.
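
    A quick sanity check of the projection: compounding the 2025 base at the stated CAGR over the eight forecast years (an assumption on our part, since the report quotes that CAGR for 2019-2033) comfortably clears the $300 billion mark:

```python
# Compound the 2025 market size at the stated CAGR. Applying the 53.6%
# rate across the 2025-2033 forecast window is an assumption; the report
# quotes that CAGR for the 2019-2033 period.
base_2025 = 30.86          # USD billions
cagr = 0.536
years = 2033 - 2025        # 8 compounding years

projected_2033 = base_2025 * (1 + cagr) ** years
print(projected_2033 > 300)  # True
```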
