2 datasets found
  1. n2c2_2011

    • huggingface.co
    Cite
    BigScience Biomedical Datasets, n2c2_2011 [Dataset]. https://huggingface.co/datasets/bigbio/n2c2_2011
    Dataset authored and provided by
    BigScience Biomedical Datasets
    License

    https://choosealicense.com/licenses/other/

    Description

    The i2b2/VA corpus contained de-identified discharge summaries from Beth Israel Deaconess Medical Center, Partners Healthcare, and University of Pittsburgh Medical Center (UPMC). In addition, UPMC contributed de-identified progress notes to the i2b2/VA corpus. This dataset contains the records from Beth Israel and Partners.

    The i2b2/VA corpus contained five concept categories: problem, person, pronoun, test, and treatment. Each record in the i2b2/VA corpus was annotated by two independent annotators for coreference pairs. Then the pairs were post-processed in order to create coreference chains. These chains were presented to an adjudicator, who resolved the disagreements between the original annotations, and added or deleted annotations as necessary. The outputs of the adjudicators were then re-adjudicated, with particular attention being paid to duplicates and enforcing consistency in the annotations.

  2. n2c2_2010

    • huggingface.co
    Cite
    BigScience Biomedical Datasets, n2c2_2010 [Dataset]. https://huggingface.co/datasets/bigbio/n2c2_2010
    Dataset authored and provided by
    BigScience Biomedical Datasets
    License

    https://choosealicense.com/licenses/other/

    Description

    The i2b2/VA corpus contained de-identified discharge summaries from Beth Israel Deaconess Medical Center, Partners Healthcare, and University of Pittsburgh Medical Center (UPMC). In addition, UPMC contributed de-identified progress notes to the i2b2/VA corpus. This dataset contains the records from Beth Israel and Partners.

    The 2010 i2b2/VA Workshop on Natural Language Processing Challenges for Clinical Records comprises three tasks: 1) a concept extraction task focused on the extraction of medical concepts from patient reports; 2) an assertion classification task focused on assigning assertion types for medical problem concepts; 3) a relation classification task focused on assigning relation types that hold between medical problems, tests, and treatments.

    i2b2 and the VA provided an annotated reference standard corpus for the three tasks. Using this reference standard, 22 systems were developed for concept extraction, 21 for assertion classification, and 16 for relation classification.


Cite
BigScience Biomedical Datasets, n2c2_2011 [Dataset]. https://huggingface.co/datasets/bigbio/n2c2_2011

n2c2_2011

bigbio/n2c2_2011

n2c2 2011 Coreference

Dataset authored and provided by
BigScience Biomedical Datasets
License

https://choosealicense.com/licenses/other/

Description

The i2b2/VA corpus contained de-identified discharge summaries from Beth Israel Deaconess Medical Center, Partners Healthcare, and University of Pittsburgh Medical Center (UPMC). In addition, UPMC contributed de-identified progress notes to the i2b2/VA corpus. This dataset contains the records from Beth Israel and Partners.

The i2b2/VA corpus contained five concept categories: problem, person, pronoun, test, and treatment. Each record in the i2b2/VA corpus was annotated by two independent annotators for coreference pairs. Then the pairs were post-processed in order to create coreference chains. These chains were presented to an adjudicator, who resolved the disagreements between the original annotations, and added or deleted annotations as necessary. The outputs of the adjudicators were then re-adjudicated, with particular attention being paid to duplicates and enforcing consistency in the annotations.
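
The description does not say how the annotated pairs were post-processed into chains. One common way to do this, shown here only as a minimal sketch, is to treat each pair as an edge and take connected components with a union-find structure. The function name `build_chains` and the mention strings are hypothetical and do not come from the corpus.

```python
from collections import defaultdict


def build_chains(pairs):
    """Merge coreference pairs into chains via union-find (an assumed method)."""
    parent = {}

    def find(x):
        # Register unseen mentions as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    # Each pair links two mentions, so their sets are merged.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Group every mention under its final root to form the chains.
    chains = defaultdict(list)
    for mention in list(parent):
        chains[find(mention)].append(mention)
    return sorted(sorted(chain) for chain in chains.values())


if __name__ == "__main__":
    # Hypothetical mention pairs, for illustration only.
    example_pairs = [
        ("chest pain", "the pain"),
        ("the pain", "it"),
        ("CT scan", "the scan"),
    ]
    for chain in build_chains(example_pairs):
        print(chain)
```

In the real corpus a mention would be identified by its text span and offsets rather than by a bare string, so two occurrences of the same word would stay distinct.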
