13 datasets found
  1. Community homogeneity measures for θ varying.

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Christine Largeron; Pierre-Nicolas Mougel; Reihaneh Rabbany; Osmar R. Zaïane (2023). Community homogeneity measures for θ varying. [Dataset]. http://doi.org/10.1371/journal.pone.0122777.t005
    Explore at:
    xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Christine Largeron; Pierre-Nicolas Mougel; Reihaneh Rabbany; Osmar R. Zaïane
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Community homogeneity measures for θ varying.

  2. The Residential Population Generator (RPGen): A tool to parameterize residential, demographic, and physiological data to model intraindividual exposure, dose, and risk

    • catalog.data.gov
    Updated Mar 9, 2021
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2021). The Residential Population Generator (RPGen): A tool to parameterize residential, demographic, and physiological data to model intraindividual exposure, dose, and risk [Dataset]. https://catalog.data.gov/dataset/the-residential-population-generator-rpgen-a-tool-to-parameterize-residential-demographic-
    Dataset updated
    Mar 9, 2021
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    This repository contains scripts, input files, and some example output files for the Residential Population Generator (RPGen), an R-based tool that generates synthetic human residential populations for estimating near-field chemical exposures. The tool is most readily adapted for use in the workflow of CHEM, the Combined Human Exposure Model, available in two other GitHub repositories in the HumanExposure project, including ProductUseScheduler and source2dose. CHEM is currently best suited to estimating exposure from product use. Outputs from RPGen feed into ProductUseScheduler, whose outputs are in turn used by source2dose.

  3. LAM Synthetic Forecast Generation Dataset

    • beta.hydroshare.org
    • hydroshare.org
    zip
    Updated May 6, 2024
    Cite
    Zachary Paul Brodeur (2024). LAM Synthetic Forecast Generation Dataset [Dataset]. https://beta.hydroshare.org/resource/e51d9821c8d84682b642eb0818ac3137/
    Explore at:
    zip (600.0 MB)
    Dataset updated
    May 6, 2024
    Dataset provided by
    HydroShare
    Authors
    Zachary Paul Brodeur
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Oct 3, 1984 - Sep 10, 2019
    Description

    Pre-processed subset of raw HEFS hindcast data for Lake Mendocino (LAM), configured for compatibility with the repository structure of the version 1 and version 2 synthetic forecast models available at https://github.com/zpb4/Synthetic-Forecast-v1-FIRO-DISES and https://github.com/zpb4/Synthetic-Forecast-v2-FIRO-DISES. The data are pre-structured for the repository setup, and README files in both GitHub repos include instructions on how to set up the data contained in this resource.

    Contains HEFS hindcast .csv files and observed full-natural-flow files for the following sites:

    • LAMC1 - main reservoir inflow to Lake Mendocino
    • UKAC1 - downstream flows at Ukiah junction
    • HOPC1L - downstream local flows at Hopland junction

    The data also include the R scripts used to preprocess the raw HEFS data contained in the associated public HydroShare resource: https://www.hydroshare.org/resource/ccffddde118f4145854c960295f520cb/
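    For orientation, here is a minimal Python sketch that loads one of the hindcast .csv files with pandas. The file name "LAMC1_hefs_hindcast.csv" and the assumption of a leading date column are placeholders for illustration; the actual names and layout are documented in the GitHub repos' README files.

      # Minimal sketch: peek at one HEFS hindcast CSV after extracting the zip.
      # File name and column layout below are assumptions, not the real schema.
      import pandas as pd

      df = pd.read_csv("LAMC1_hefs_hindcast.csv", parse_dates=[0])
      print(df.shape)   # e.g. (n_dates, 1 + n_ensemble_members)
      print(df.head())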

  4. Synthetic XES Event Log of Malignant Melanoma Treatment

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 23, 2024
    Cite
    Grüger, Joscha (2024). Synthetic XES Event Log of Malignant Melanoma Treatment [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13828518
    Dataset updated
    Sep 23, 2024
    Dataset provided by
    Grüger, Joscha
    Kuhn, Martin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The synthetic event log described in this document consists of 25,000 traces, generated using the process model outlined in Geyer et al. (2024) [1] and the DALG tool [2]. This event log simulates the treatment process of malignant melanoma patients, adhering to clinical guidelines. Each trace in the log represents a unique patient journey through various stages of melanoma treatment, providing detailed insights into decision points, treatments, and outcomes.

    The DALG tool [2] was employed to generate this data-aware event log, ensuring realistic data distribution and variability.

    DALG: https://github.com/DavidJilg/DALG
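    As a usage note, the log can be inspected with the pm4py process mining library; a minimal sketch follows, assuming the published XES file is saved locally (the file name below is a placeholder). The column keys are the standard XES attribute names.

      # Minimal sketch: load the XES log and count traces/activities with pm4py.
      import pm4py

      log = pm4py.read_xes("melanoma_treatment_log.xes")   # placeholder file name
      df = pm4py.convert_to_dataframe(log)
      print(df["case:concept:name"].nunique(), "traces")   # expected: 25,000
      print(df["concept:name"].nunique(), "distinct activities")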

    [1] Geyer, T., Grüger, J., & Kuhn, M. (2024). Clinical Guideline-based Model for the Treatment of Malignant Melanoma (Data Petri Net) (1.0). Zenodo. https://doi.org/10.5281/zenodo.10785431

    [2] Jilg, D., Grüger, J., Geyer, T., Bergmann, R.: DALG: the data aware event log generator. In: BPM 2023 - Demos & Resources. CEUR Workshop Proceedings, vol. 3469, pp. 142–146. CEUR-WS.org (2023)

  5. Tango Spacecraft Dataset for Region of Interest Estimation and Semantic Segmentation

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated May 23, 2023
    + more versions
    Cite
    Bechini Michele; Lunghi Paolo; Lavagna Michèle (2023). Tango Spacecraft Dataset for Region of Interest Estimation and Semantic Segmentation [Dataset]. http://doi.org/10.5281/zenodo.6507864
    Explore at:
    zip
    Dataset updated
    May 23, 2023
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Bechini Michele; Lunghi Paolo; Lavagna Michèle
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Reference Paper:

    M. Bechini, M. Lavagna, P. Lunghi, Dataset generation and validation for spacecraft pose estimation via monocular images processing, Acta Astronautica 204 (2023) 358–369

    M. Bechini, P. Lunghi, M. Lavagna. "Spacecraft Pose Estimation via Monocular Image Processing: Dataset Generation and Validation". In 9th European Conference for Aeronautics and Aerospace Sciences (EUCASS)

    General Description:

    The "Tango Spacecraft Dataset for Region of Interest Estimation and Semantic Segmentation" dataset here published should be used for Region of Interest (ROI) and/or semantic segmentation tasks. It is split into 30002 train images and 3002 test images representing the Tango spacecraft from Prisma mission, being the largest publicly available dataset of synthetic space-borne noise-free images tailored to ROI extraction and Semantic Segmentation tasks (up to our knowledge). The label of each image gives, for the Bounding Box annotations, the filename of the image, the ROI top-left corner (minimum x, minimum y) in pixels, the ROI bottom-right corner (maximum x, maximum y) in pixels, and the center point of the ROI in pixels. The annotation are taken in image reference frame with the origin located at the top-left corner of the image, positive x rightward and positive y downward. Concerning the Semantic Segmentation, RGB masks are provided. Each RGB mask correspond to a single image in both train and test dataset. The RGB images are such that the R channel corresponds to the spacecraft, the G channel corresponds to the Earth (if present), and the B channel corresponds to the background (deep space). Per each channel the pixels have non-zero value only in correspondence of the object that they represent (Tango, Earth, Deep Space). More information on the dataset split and on the label format are reported below.

    Images Information:

    The dataset comprises 30002 synthetic grayscale images of the Tango spacecraft from the Prisma mission that serve as the training set, while the test set consists of 3002 synthetic grayscale images of the same spacecraft, all in PNG format. About one sixth of the images in both the training and test sets have a non-black background, obtained by rendering an Earth-like model during the raytracing process used to generate the images. The images are noise-free to increase the flexibility of the dataset. The illumination direction of the spacecraft in the scene is uniformly distributed in 3D space, in agreement with the Sun position constraints.


    Labels Information:

    Labels for bounding-box extraction are provided in separate JSON files, formatted for each image as in the following example:

    • filename : tango_img_1 # name of the image to which the data are referred
    • roi_tl : [x, y] # ROI top-left corner (minimum x, minimum y) in pixels
    • roi_br : [x, y] # ROI bottom-right corner (maximum x, maximum y) in pixels
    • roi_cc : [x, y] # center point of the ROI in pixels

    Notice that the annotations are expressed in the image reference frame, with the origin at the top-left corner of the image, positive x rightward and positive y downward. To make the dataset easier to use, both the training set and the test set are split into two folders, one containing the images with Earth in the background and one without background.
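    A minimal Python sketch of reading one annotation and cropping its ROI follows; the JSON file name and its top-level structure (a list of records) are assumptions, while the field names follow the example above.

      # Minimal sketch: crop the annotated ROI from one training image.
      import json
      from PIL import Image

      # Assumed: the JSON file holds a list of records shaped like the example above.
      with open("train_labels.json") as f:
          labels = json.load(f)

      rec = labels[0]
      img = Image.open(rec["filename"] + ".png")   # grayscale PNG (extension assumed)
      (x0, y0) = rec["roi_tl"]                     # top-left corner, pixels
      (x1, y1) = rec["roi_br"]                     # bottom-right corner, pixels
      roi = img.crop((x0, y0, x1, y1))             # PIL order: (left, upper, right, lower)
      roi.save(rec["filename"] + "_roi.png")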

    Concerning the semantic segmentation labels, they are provided as RGB masks named "filename_mask.png", where "filename" is the name of the training or test image to which the mask refers. The masks are such that the R channel corresponds to the spacecraft, the G channel to the Earth (if present), and the B channel to the background (deep space); in each channel, pixels are non-zero only where the corresponding object (Tango, Earth, deep space) appears.
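    Because exactly one channel is non-zero at each pixel, a per-pixel class map can be recovered with a channel-wise argmax, as in this minimal sketch (the mask file name follows the documented "filename_mask.png" pattern applied to the example image above).

      # Minimal sketch: RGB mask -> class map (0 = spacecraft, 1 = Earth, 2 = deep space).
      import numpy as np
      from PIL import Image

      mask = np.array(Image.open("tango_img_1_mask.png"))  # H x W x 3 array
      class_map = np.argmax(mask, axis=-1)                 # channel index = class label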

    VERSION CONTROL

    • v1.0: This version contains the dataset (both train and test) of full-scale images with ROI annotations and RGB masks for semantic segmentation tasks. The images have width = height = 1024 pixels. The position of Tango with respect to the camera is randomly drawn from a uniform distribution, but full visibility is ensured in all images.

    Note: this dataset contains the same images as the "Tango Spacecraft Wireframe Dataset Model for Line Segments Detection" v2.0 full-scale (DOI: https://doi.org/10.5281/zenodo.6372848) and the "Tango Spacecraft Dataset for Monocular Pose Estimation" v1.0 (DOI: https://doi.org/10.5281/zenodo.6499007); they can be used together by combining the relative-pose annotations, the reprojected wireframe model of Tango, and the ROI annotations. Together, these three datasets form, to the best of our knowledge, the most comprehensive collection of space-borne synthetic images published to date.

  6. Data from: Algorithm 2.

    • figshare.com
    xls
    Updated May 31, 2023
    + more versions
    Cite
    Christine Largeron; Pierre-Nicolas Mougel; Reihaneh Rabbany; Osmar R. Zaïane (2023). Algorithm 2. [Dataset]. http://doi.org/10.1371/journal.pone.0122777.t003
    Explore at:
    xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Christine Largeron; Pierre-Nicolas Mougel; Reihaneh Rabbany; Osmar R. Zaïane
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Algorithm 2.

  7. Synthetic gene expression data with underlying gene network

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Aug 15, 2023
    Cite
    Jianchang Hu; Silke Szymczak (2023). Synthetic gene expression data with underlying gene network [Dataset]. http://doi.org/10.5281/zenodo.8242661
    Explore at:
    bin
    Dataset updated
    Aug 15, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jianchang Hu; Silke Szymczak
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the synthetic gene expression data, along with the underlying gene network, used in the simulation studies of Hu and Szymczak (2023) for evaluating network-guided random forest.

    In this dataset we consider the situation of 1000 genes and 1000 samples each for the training and testing sets. Each file contains a list of 100 replications of the scenario identified by the file name. In particular, we consider 6 different scenarios, depending on the number of disease modules and on how the effects of disease genes are distributed within the disease module. When there are disease genes, we also consider 3 different levels of effect sizes. The binary responses are then generated via a logistic regression model. More details on these scenarios and the data generation mechanism can be found in Hu and Szymczak (2023).
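    The sketch below illustrates the logistic generation step in Python; it is not the authors' code (the actual scenarios, module structure, and effect sizes are implemented by gen_data in the networkRF R package described below), and the placeholder numbers are assumptions.

      # Illustrative sketch: binary responses from gene expression via a logistic model.
      import numpy as np

      rng = np.random.default_rng(0)
      n, p = 1000, 1000                  # samples and genes, as in this dataset
      X = rng.standard_normal((n, p))    # placeholder expression matrix
      beta = np.zeros(p)
      beta[:20] = 0.5                    # assumed effects for disease genes in one module
      prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
      y = rng.binomial(1, prob)          # binary responses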

    The data are generated by the function gen_data in the R package networkRF, which can be accessed at https://github.com/imbs-hl/networkRF. To obtain the datasets with 3000 genes, which form the other part of the data used in the simulation studies of Hu and Szymczak (2023), simply modify the num.var argument of gen_data. More details on the implementation and the output format can be found in the package's help pages.

  8. NHG Synthetic Forecast generation dataset

    • hydroshare.org
    • beta.hydroshare.org
    • +1 more
    zip
    Updated Apr 17, 2024
    Cite
    Zachary Paul Brodeur (2024). NHG Synthetic Forecast generation dataset [Dataset]. https://www.hydroshare.org/resource/dfa02b83bbde4ae3888ffafeb4446a5b
    Explore at:
    zip (308.6 MB)
    Dataset updated
    Apr 17, 2024
    Dataset provided by
    HydroShare
    Authors
    Zachary Paul Brodeur
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Oct 1, 1979 - Sep 30, 2019
    Description

    Pre-processed subset of raw HEFS hindcast data for New Hogan Lake (NHG), configured for compatibility with the repository structure of the version 1 and version 2 synthetic forecast models available at https://github.com/zpb4/Synthetic-Forecast-v1-FIRO-DISES and https://github.com/zpb4/Synthetic-Forecast-v2-FIRO-DISES. The data are pre-structured for the repository setup, and README files in both GitHub repos include instructions on how to set up the data contained in this resource.

    Contains HEFS hindcast .csv files and observed full-natural-flow files for the following sites:

    • NHGC1 - main reservoir inflow to New Hogan Lake
    • MSGC1L - downstream local flows from Mud Slough

    The data also include the R scripts used to preprocess the raw HEFS data contained in the associated public HydroShare resource: https://www.hydroshare.org/resource/f63ead2d62414940a7d90acdc234a5d1/

  9. UTHealth - Fundus and Synthetic OCT-A Dataset (UT-FSOCTA)

    • zenodo.org
    bin, zip
    Updated Dec 11, 2023
    Cite
    Ivan Coronado; Samiksha Pachade; Rania Abdelkhaleq; Juntao Yan; Sergio Salazar-Marioni; Amanda Jagolino; Mozhdeh Bahrainian; Roomasa Channa; Sunil Sheth; Luca Giancardo (2023). UTHealth - Fundus and Synthetic OCT-A Dataset (UT-FSOCTA) [Dataset]. http://doi.org/10.5281/zenodo.6476639
    Explore at:
    zip, bin
    Dataset updated
    Dec 11, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ivan Coronado; Samiksha Pachade; Rania Abdelkhaleq; Juntao Yan; Sergio Salazar-Marioni; Amanda Jagolino; Mozhdeh Bahrainian; Roomasa Channa; Sunil Sheth; Luca Giancardo
    Description

    Introduction

    Vessel segmentation in fundus images is essential in the diagnosis and prognosis of retinal diseases and the identification of image-based biomarkers. However, creating a vessel segmentation map can be a tedious and time-consuming process, requiring careful delineation of the vasculature, which is especially hard for microcapillary plexi in fundus images. Optical coherence tomography angiography (OCT-A) is a relatively novel modality visualizing blood flow and microcapillary plexi not clearly observed in fundus photography. Unfortunately, current commercial OCT-A cameras have various limitations due to their complex optics, making them more expensive, less portable, and narrower in field of view (FOV) than fundus cameras. Moreover, the vast majority of population health data collection efforts do not include OCT-A data.

    We believe that strategies able to map fundus images to en-face OCT-A can create precise vascular vessel segmentation with less effort.

    In this dataset, called the UTHealth - Fundus and Synthetic OCT-A Dataset (UT-FSOCTA), we include fundus images and en-face OCT-A images for 112 subjects. The two modalities have been manually aligned to allow for the training of medical imaging machine learning pipelines. This dataset is accompanied by a manuscript that describes an approach to generating fundus vessel segmentations using OCT-A for training (Coronado et al., 2023). We refer to this approach as "Synthetic OCT-A".

    Fundus Imaging

    We include 45-degree macula-centered fundus images that cover both the macula and the optic disc. All images were acquired using an OptoVue iVue fundus camera without pupil dilation.

    The full images are available in the fov45/fundus directory. In addition, we extracted the FOVs corresponding to the en-face OCT-A images, collected in cropped/fundus/disc and cropped/fundus/macula.

    Enface OCT-A

    We include the en-face OCT-A images of the superficial capillary plexus. All images were acquired using an OptoVue Avanti OCT camera with OCT-A reconstruction software (AngioVue). Low-quality images with errors in the retina layer segmentations were not included.

    En-face OCT-A images are located in cropped/octa/disc and cropped/octa/macula. In addition, we include a denoised version of these images in which only vessels are retained; denoising was performed automatically using the ROSE algorithm (Ma et al. 2021). These can be found in cropped/GT_OCT_net/noThresh and cropped/GT_OCT_net/Thresh: the former contains the ROSE algorithm's probability maps, the latter binary maps.

    Synthetic OCT-A

    We train a custom conditional generative adversarial network (cGAN) to map a fundus image to an en face OCT-A image. Our model consists of a generator synthesizing en face OCT-A images from corresponding areas in fundus photographs and a discriminator judging the resemblance of the synthesized images to the real en face OCT-A samples. This allows us to avoid the use of manual vessel segmentation maps altogether.

    The full images are available in the fov45/synthetic_octa directory. We then extracted the FOVs corresponding to the en-face OCT-A images, collected in cropped/synthetic_octa/disc and cropped/synthetic_octa/macula. In addition, we applied the same ROSE denoising (Ma et al. 2021) used for the original en-face OCT-A images; the results are available in cropped/denoised_synthetic_octa/noThresh and cropped/denoised_synthetic_octa/Thresh, the former containing the ROSE probability maps and the latter binary maps.

    Other Fundus Vessel Segmentations Included

    In this dataset, we have also included the output of two recent vessel segmentation algorithms trained on external datasets with manual vessel segmentations: SA-UNet (Guo et al., 2021) and IterNet (Li et al., 2020).

    • SA-UNet. The full images are available in the fov45/SA_Unet directory. We then extracted the FOVs corresponding to the en-face OCT-A images, collected in cropped/SA_Unet/disc and cropped/SA_Unet/macula.

    • IterNet. The full images are available in the fov45/Iternet directory. We then extracted the FOVs corresponding to the en-face OCT-A images, collected in cropped/Iternet/disc and cropped/Iternet/macula.

    Train/Validation/Test Replication

    In order to replicate or compare your model against the results of our paper, we report below the data split used (a small path-listing sketch follows the list).

    • Training subjects IDs: 1 - 25

    • Validation subjects IDs: 26 - 30

    • Testing subjects IDs: 31 - 112
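    The following sketch builds the subject-ID lists for this split and maps them to hypothetical file paths under the cropped/ directory; the file-naming scheme is an assumption, so check the dataset's actual names.

      # Minimal sketch: subject-level split as reported above.
      from pathlib import Path

      train_ids = list(range(1, 26))    # subjects 1-25
      val_ids   = list(range(26, 31))   # subjects 26-30
      test_ids  = list(range(31, 113))  # subjects 31-112

      root = Path("cropped")
      # Hypothetical naming: <subject id>.png inside cropped/fundus/macula
      train_paths = [root / "fundus" / "macula" / f"{i}.png" for i in train_ids]
      print(len(train_paths), "training fundus images (macula FOV)")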

    Data Acquisition

    This dataset was acquired at the Texas Medical Center - Memorial Hermann Hospital in accordance with the guidelines from the Helsinki Declaration and it was approved by the UTHealth IRB with protocol HSC-MS-19-0352.

    User Agreement

    The UT-FSOCTA dataset is free to use for non-commercial scientific research only. Any publication based on it must cite the following paper:

    
    Coronado I, Pachade S, Trucco E, Abdelkhaleq R, Yan J, Salazar-Marioni S, Jagolino-Cole A, Bahrainian M, Channa R, Sheth SA, Giancardo L. Synthetic OCT-A blood vessel maps using fundus images and generative adversarial networks. Sci Rep 2023;13:15325. https://doi.org/10.1038/s41598-023-42062-9.
    

    Funding

    This work is supported by the Translational Research Institute for Space Health through NASA Cooperative Agreement NNX16AO69A.

    Research Team and Acknowledgements

    Here are the people behind this data acquisition effort:

    Ivan Coronado, Samiksha Pachade, Rania Abdelkhaleq, Juntao Yan, Sergio Salazar-Marioni, Amanda Jagolino, Mozhdeh Bahrainian, Roomasa Channa, Sunil Sheth, Luca Giancardo

    We would also like to acknowledge, for their support: the Institute for Stroke and Cerebrovascular Diseases at UTHealth, the VAMPIRE team at the University of Dundee, UK, and the Memorial Hermann Hospital System.

    References

    Coronado I, Pachade S, Trucco E, Abdelkhaleq R, Yan J, Salazar-Marioni S, Jagolino-Cole A, Bahrainian M, Channa R, Sheth SA, Giancardo L. Synthetic OCT-A blood vessel maps using fundus images and generative adversarial networks. Sci Rep 2023;13:15325. https://doi.org/10.1038/s41598-023-42062-9.
    
    
    C. Guo, M. Szemenyei, Y. Yi, W. Wang, B. Chen, and C. Fan, "SA-UNet: Spatial Attention U-Net for Retinal Vessel Segmentation," in 2020 25th International Conference on Pattern Recognition (ICPR), Jan. 2021, pp. 1236–1242. doi: 10.1109/ICPR48806.2021.9413346.
    
    L. Li, M. Verma, Y. Nakashima, H. Nagahara, and R. Kawasaki, "IterNet: Retinal Image Segmentation Utilizing Structural Redundancy in Vessel Networks," 2020 IEEE Winter Conf. Appl. Comput. Vis. WACV, 2020, doi: 10.1109/WACV45572.2020.9093621.
    
    Y. Ma et al., "ROSE: A Retinal OCT-Angiography Vessel Segmentation Dataset and New Model," IEEE Trans. Med. Imaging, vol. 40, no. 3, pp. 928–939, Mar. 2021, doi: 10.1109/TMI.2020.3042802.
    
  10. Data from: Algorithm 1.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    + more versions
    Cite
    Christine Largeron; Pierre-Nicolas Mougel; Reihaneh Rabbany; Osmar R. Zaïane (2023). Algorithm 1. [Dataset]. http://doi.org/10.1371/journal.pone.0122777.t002
    Explore at:
    xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Christine Largeron; Pierre-Nicolas Mougel; Reihaneh Rabbany; Osmar R. Zaïane
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Algorithm 1.

  11. Structural measures for Ewthmax varying.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Christine Largeron; Pierre-Nicolas Mougel; Reihaneh Rabbany; Osmar R. Zaïane (2023). Structural measures for Ewthmax varying. [Dataset]. http://doi.org/10.1371/journal.pone.0122777.t007
    Explore at:
    xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Christine Largeron; Pierre-Nicolas Mougel; Reihaneh Rabbany; Osmar R. Zaïane
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Structural measures for Ewthmax varying.

  12. Structural measures for Ebtwmax varying and Ewthmax=14.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Christine Largeron; Pierre-Nicolas Mougel; Reihaneh Rabbany; Osmar R. Zaïane (2023). Structural measures for Ebtwmax varying and Ewthmax=14. [Dataset]. http://doi.org/10.1371/journal.pone.0122777.t006
    Explore at:
    xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Christine Largeron; Pierre-Nicolas Mougel; Reihaneh Rabbany; Osmar R. Zaïane
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Structural measures for Ebtwmax varying and Ewthmax=14.

  13. MATHWELL Human Annotation Dataset

    • paperswithcode.com
    Updated Feb 23, 2024
    + more versions
    Cite
    Bryan R Christ; Jonathan Kropko; Thomas Hartvigsen (2024). MATHWELL Human Annotation Dataset [Dataset]. https://paperswithcode.com/dataset/mathwell-human-annotation-dataset
    Dataset updated
    Feb 23, 2024
    Authors
    Bryan R Christ; Jonathan Kropko; Thomas Hartvigsen
    Description

    The MATHWELL Human Annotation Dataset contains 5,084 synthetic word problems and answers generated by MATHWELL, a reference-free educational grade school math word problem generator released in "MATHWELL: Generating Educational Math Word Problems Using Teacher Annotations", and by comparison models (GPT-4, GPT-3.5, Llama-2, MAmmoTH, and LLEMMA), with expert human annotations for solvability, accuracy, appropriateness, and meets all criteria (MaC). Solvability means the problem is mathematically possible to solve; accuracy means the Program of Thought (PoT) solution arrives at the correct answer; appropriateness means that the mathematical topic is familiar to a grade school student and the question's context is suitable for a young learner; and MaC denotes questions labeled as solvable, accurate, and appropriate. Null values for accuracy and appropriateness indicate a question labeled as unsolvable, which cannot have an accurate solution and is automatically inappropriate. Based on our annotations, 82.2% of the question/answer pairs are solvable, 87.3% have accurate solutions, 78.1% are appropriate, and 58.4% meet all criteria.

    This dataset is designed to train text classifiers that automatically label word problem generator outputs for solvability, accuracy, and appropriateness. More details about the dataset can be found in our paper.
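    As an illustration of the labeling logic, the sketch below derives the MaC label while treating null accuracy/appropriateness as failing, per the description above; the file and column names are assumptions, not the published schema.

      # Minimal sketch: recompute meets-all-criteria (MaC) from the annotations.
      import pandas as pd

      df = pd.read_csv("mathwell_annotations.csv")  # placeholder file name
      mac = (
          df["solvable"].eq(1)
          & df["accurate"].fillna(0).eq(1)      # null accuracy -> unsolvable -> not MaC
          & df["appropriate"].fillna(0).eq(1)   # null appropriateness likewise
      )
      print(f"MaC rate: {mac.mean():.1%}")      # description above reports 58.4%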
