7 datasets found
  1. Data from: Automatic estimation of unknown chemical components in a mixed material by XPS analysis using a genetic algorithm

    • tandf.figshare.com
    pdf
    Updated May 31, 2023
    Cite
    Ryo Murakami; Hideki Yoshikawa; Kenji Nagata; Hiroshi Shinotsuka; Hiromi Tanaka; Takeshi Iizuka; Hayaru Shouno (2023). Automatic estimation of unknown chemical components in a mixed material by XPS analysis using a genetic algorithm [Dataset]. http://doi.org/10.6084/m9.figshare.19704889.v1
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Ryo Murakami; Hideki Yoshikawa; Kenji Nagata; Hiroshi Shinotsuka; Hiromi Tanaka; Takeshi Iizuka; Hayaru Shouno
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    There is an urgent need to develop automatic analysis methods for the large number of X-ray photoelectron spectroscopy (XPS) spectra being obtained by methods such as 3D chemical analysis and operando analysis. In a previous study, we developed an automatic analysis method for mixed materials that can decompose XPS spectra and estimate the compositional ratios by comparison with XPS reference spectra of many candidate single-phase compounds. This method needs access to the XPS reference spectrum of every possible compound in the sample. However, in many practical cases, the necessary XPS reference spectra are not all available. In this study, we developed an automatic analysis method to estimate the compositional ratios even when not all of the necessary XPS reference spectra are available, i.e., when the measured XPS spectra contain unknown peak structures. In particular, the new method can automatically estimate the number of unknown peaks by combining a genetic algorithm with the Bayesian information criterion. We applied the method to analyze the depth-resolved XPS spectra of a lead zirconate titanate (PZT) piezoelectric film and successfully identified the change in the chemical states of the components in the film without ambiguity.
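The model-selection step described above (choosing the number of unknown peaks by minimizing the Bayesian information criterion) can be sketched as follows. This is an illustrative sketch only, not the authors' code: the Gaussian-noise BIC form, the assumption of three parameters per peak, and the residual values are all made up for the example.

```python
import math

def bic(rss: float, n_points: int, n_params: int) -> float:
    """Bayesian information criterion under a Gaussian noise model
    (an assumed form for illustration): BIC = N*ln(RSS/N) + k*ln(N)."""
    return n_points * math.log(rss / n_points) + n_params * math.log(n_points)

# Hypothetical residual sums of squares from fitting 1..4 unknown peaks
# to a spectrum of N = 200 points; assume 3 parameters per peak
# (position, width, amplitude). All numbers are made up.
n_points = 200
rss_by_k = {1: 100.0, 2: 20.0, 3: 18.0, 4: 17.9}

scores = {k: bic(rss, n_points, n_params=3 * k) for k, rss in rss_by_k.items()}
best_k = min(scores, key=scores.get)  # the peak count the BIC selects
print(best_k)
```

In the paper the candidate peak configurations are explored with a genetic algorithm rather than exhaustively; the BIC comparison itself is the same idea.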

  2. Synthetic data for assessing and comparing local post-hoc explanation of detected process shift

    • zenodo.org
    zip
    Updated Mar 10, 2025
    Cite
    Martin Macas; Ondrej Misar (2025). Synthetic data for assessing and comparing local post-hoc explanation of detected process shift [Dataset]. http://doi.org/10.5281/zenodo.15000635
    Available download formats: zip
    Dataset updated
    Mar 10, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Martin Macas; Ondrej Misar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    Synthetic data for assessing and comparing local post-hoc explanation of detected process shift
    DOI
    10.5281/zenodo.15000635
    The synthetic dataset contains the data used in the experiment described in an article submitted to the Computers in Industry journal, entitled
    Assessing and Comparing Local Post-hoc Explanation for Shift Detection in Process Monitoring.
    The citation will be updated as soon as the article is accepted.
    The individual data.mat files are stored in a subfolder structure that clearly assigns each file to one of the tested cases.
    For example, data for experiments with normally distributed data, a known number of shifted variables, and 5 variables are stored in the path normal\known_number\5_vars\rho0.1.
    The meaning of the particular folders is explained here:
    normal - all variables are normally distributed
    not-normal - copula-based multivariate distribution based on normal and gamma marginal distributions and a defined correlation
    known_number - known number of shifted variables (the methods use this information, which is not available in the real world)
    unknown_number - unknown number of shifted variables, the realistic case
    2_vars - data with 2 variables (n=2)
    ...
    10_vars - data with 10 variables (n=10)
    rho0.1 - correlation among all variables is 0.1
    ...
    rho0.9 - correlation among all variables is 0.9
    Each data.mat file contains the following variables:
    LIME_res (nval x n) - results of the LIME explanation
    MYT_res (nval x n) - results of the MYT explanation
    NN_res (nval x n) - results of the ANN explanation
    X (p x 11000) - unshifted data
    S (n x n) - sigma matrix (covariance matrix) for the unshifted data
    mu (1 x n) - mean parameter for the unshifted data
    n (1 x 1) - number of variables (dimensionality)
    trn_set (n x ntrn x 2) - training set for the ANN explainer; trn_set(:,:,1) are the values of variables from the shifted process, trn_set(:,:,2) are labels denoting which variables are shifted; trn_set(i,j,2) is 1 if the ith variable of the jth sample trn_set(:,j,1) is shifted
    val_set (n x 95 x 2) - validation set used for testing and generating LIME_res, MYT_res, and NN_res
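As a small sketch of the folder layout above, the following helper (hypothetical, not part of the dataset) builds the subfolder path for one tested case. Reading the data.mat inside it would typically be done with scipy.io.loadmat, which returns a dict keyed by the variable names listed above.

```python
from pathlib import PureWindowsPath

def case_path(distribution: str, known_number: bool, n_vars: int, rho: float) -> str:
    """Build the subfolder path for one tested case, following the
    layout described above (folder names taken from the listing; the
    root directory is up to the user)."""
    parts = [
        distribution,                                    # "normal" or "not-normal"
        "known_number" if known_number else "unknown_number",
        f"{n_vars}_vars",                                # 2_vars ... 10_vars
        f"rho{rho}",                                     # rho0.1 ... rho0.9
    ]
    return str(PureWindowsPath(*parts))

# The data.mat inside each folder could then be read with e.g.
# scipy.io.loadmat(case_path(...) + "\\data.mat"), which yields a dict
# with keys LIME_res, MYT_res, NN_res, X, S, mu, n, trn_set, val_set.
print(case_path("normal", True, 5, 0.1))
```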
  3. Refined, automatic organelle segmentations in near-isotropic, reconstructed volume electron microscopy (FIB-SEM) of A431 epithelial cells (aic_desmosome-2)

    • janelia.figshare.com
    bin
    Updated Sep 6, 2023
    + more versions
    Cite
    David Ackerman; Jesse Aaron; Davis Bennett; Navaneetha Krishnan Bharathan; John Bogovic; CellMap Project Team; Teng Leong Chew; COSEM Project Team; William Giang; Larissa Heinrich; Satya Khuon; Andrew P. Kowalczyk; Woohyun Park; Alyson Petruncio; Stephan Preibisch; Stephan Saalfeld; Sara N. Stahley; Eric Trautman; A. Wayne Vogl; Aubrey Weigel (2023). Refined, automatic organelle segmentations in near-isotropic, reconstructed volume electron microscopy (FIB-SEM) of A431 epithelial cells (aic_desmosome-2) [Dataset]. http://doi.org/10.25378/janelia.22674466.v2
    Available download formats: bin
    Dataset updated
    Sep 6, 2023
    Dataset provided by
    Janelia Research Campus
    Authors
    David Ackerman; Jesse Aaron; Davis Bennett; Navaneetha Krishnan Bharathan; John Bogovic; CellMap Project Team; Teng Leong Chew; COSEM Project Team; William Giang; Larissa Heinrich; Satya Khuon; Andrew P. Kowalczyk; Woohyun Park; Alyson Petruncio; Stephan Preibisch; Stephan Saalfeld; Sara N. Stahley; Eric Trautman; A. Wayne Vogl; Aubrey Weigel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sample: Cultured A431 (ATCC CRL-1555) cells stably expressing Desmoplakin-eGFP (Addgene #32227) and mApple-VAPB, stained with MitoView650 Sample Description: The endoplasmic reticulum (ER) forms a dynamic network that contacts other cellular membranes to regulate stress responses, calcium signaling, and lipid transfer. Using high-resolution volume electron microscopy, we find that the ER forms a previously unknown association with keratin intermediate filaments and desmosomal cell-cell junctions. Peripheral ER assembles into mirror image-like arrangements at desmosomes and exhibits nanometer proximity to keratin filaments and the desmosome cytoplasmic plaque. ER tubules exhibit stable associations with desmosomes, and perturbation of desmosomes or keratin filaments alters ER organization and mobility. These findings indicate that desmosomes and the keratin cytoskeleton pattern the distribution of the ER network. Overall, this study reveals a previously unknown subcellular architecture defined by the structural integration of ER tubules with an epithelial intercellular junction. Protocol: High pressure freezing, freeze-substitution resin embedding with 2% OsO₄, 0.1% UA, 3% H₂O in acetone; resin embedding in Eponate 12. Contributions: Sample prepared by Jesse Aaron and Satya Khuon (AIC Janelia), staining and resin embedding by Nirmala Iyer (HHMI/Janelia), trimming and imaging by Jesse Aaron and Satya Khuon (AIC Janelia), post-processing by Eric Trautman and Stephan Preibisch (HHMI/Janelia), manual segmentations by COSEM Project Team (HHMI/Janelia) and Kowalczyk Lab (Pennsylvania State College of Medicine), and automated segmentations by COSEM Project Team.
    Acquisition ID: aic_desmosome-2
    Voxel size (nm): 8 x 8 x 8 (X, Y, Z)
    Data dimensions (µm): 42.3 x 10.4 x 52.5 (X, Y, Z)
    Imaging start date: 2021-01-27
    Dataset URL (redirect): https://data.janelia.org/T8l9C
    EM DOI: https://doi.org/10.25378/janelia.22670176
    Visualization website: https://openorganelle.janelia.org/datasets/aic_desmosome-2
    Publications: "Architecture and dynamics of a novel desmosome-endoplasmic reticulum organelle" by Bharathan et al., 2022
    Imaging duration (days): 4
    Landing energy (eV): 1200
    Imaging current (nA): 2.0
    Scanning speed (MHz): 0.500

  4. default of credit card clients Data Set Dataset

    • paperswithcode.com
    Updated May 7, 2024
    + more versions
    Cite
    (2024). default of credit card clients Data Set Dataset [Dataset]. https://paperswithcode.com/dataset/default-of-credit-card-clients-data-set
    Dataset updated
    May 7, 2024
    Description

    This research examines the case of customers' default payments in Taiwan and compares the predictive accuracy of the probability of default among six data mining methods. From the perspective of risk management, the predictive accuracy of the estimated probability of default is more valuable than the binary classification result - credible or not credible clients. Because the real probability of default is unknown, this study presents the novel Sorting Smoothing Method to estimate it. With the real probability of default as the response variable (Y) and the predictive probability of default as the independent variable (X), the simple linear regression result (Y = A + BX) shows that the forecasting model produced by the artificial neural network has the highest coefficient of determination; its regression intercept (A) is close to zero, and its regression coefficient (B) is close to one. Therefore, among the six data mining techniques, the artificial neural network is the only one that can accurately estimate the real probability of default.
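The validation idea described above (regress the estimated real probability of default on the predicted one and check that the intercept A is near zero and the slope B near one) can be sketched with a closed-form least-squares fit. The numbers below are toy values, not from the dataset.

```python
def simple_linear_regression(x, y):
    """Closed-form ordinary least squares fit of y = A + B*x.
    Returns (A, B)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Toy values (not from the dataset): predicted vs. "real" default
# probabilities. A well-calibrated model should give A near 0, B near 1.
pred = [0.1, 0.2, 0.4, 0.6, 0.8]
real = [0.12, 0.19, 0.41, 0.62, 0.78]
a, b = simple_linear_regression(pred, real)
print(a, b)
```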

  5. Data from: On Nonadaptive Search Problem

    • explore.openaire.eu
    Updated Jun 17, 2012
    Cite
    Emil Kolev (2012). On Nonadaptive Search Problem [Dataset]. https://explore.openaire.eu/search/other?pid=10525%2F1718
    Dataset updated
    Jun 17, 2012
    Authors
    Emil Kolev
    Description

    We consider the nonadaptive search problem for an unknown element x from the set A = {1, 2, 3, . . . , 2^n}, n ≥ 3. For a fixed integer S, the questions are of the form: does x belong to a subset B of A such that the sum of the elements of B is equal to S? We wish to find all integers S for which nonadaptive search with n questions finds x. We continue our investigation from [4] and solve the last remaining case n = 2^k, k ≥ 2. 2000 Mathematics Subject Classification: 91A46, 91A35.

  6. Data from: SONYC Urban Sound Tagging (SONYC-UST): a multilabel dataset from an urban acoustic sensor network

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 14, 2020
    Cite
    Justin Salamon (2020). SONYC Urban Sound Tagging (SONYC-UST): a multilabel dataset from an urban acoustic sensor network [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_2590741
    Explore at:
    Dataset updated
    Sep 14, 2020
    Dataset provided by
    Yu Wang
    Mark Cartwright
    Ana Elisa Mendez Mendez
    Jason Cramer
    Oded Nov
    Magdalena Fuentes
    Juan Pablo Bello
    Justin Salamon
    Graham Dove
    Vincent Lostanlen
    Charlie Mydlarz
    Ho-Hsiang Wu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SONYC Urban Sound Tagging (SONYC-UST): a multilabel dataset from an urban acoustic sensor network

    Version 2.3, September 2020

    Created by

    Mark Cartwright (1,2,3), Jason Cramer (1), Ana Elisa Mendez Mendez (1), Yu Wang (1), Ho-Hsiang Wu (1), Vincent Lostanlen (1,2,4), Magdalena Fuentes (1), Graham Dove (2), Charlie Mydlarz (1,2), Justin Salamon (5), Oded Nov (6), Juan Pablo Bello (1,2,3)

    (1) Music and Audio Research Lab, New York University

    (2) Center for Urban Science and Progress, New York University

    (3) Department of Computer Science and Engineering, New York University

    (4) Cornell Lab of Ornithology

    (5) Adobe Research

    (6) Department of Technology Management and Innovation, New York University

    Publication

    If using this data in an academic work, please reference the DOI and version, as well as cite the following paper, which presented the data collection procedure and the first version of the dataset:

    Cartwright, M., Cramer, J., Mendez, A.E.M., Wang, Y., Wu, H., Lostanlen, V., Fuentes, M., Dove, G., Mydlarz, C., Salamon, J., Nov, O., Bello, J.P. SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context. In Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020.

    Description

    SONYC Urban Sound Tagging (SONYC-UST) is a dataset for the development and evaluation of machine listening systems for realistic urban noise monitoring. The audio was recorded from the SONYC acoustic sensor network. Volunteers on the Zooniverse citizen science platform tagged the presence of 23 classes that were chosen in consultation with the New York City Department of Environmental Protection. These 23 fine-grained classes can be grouped into 8 coarse-grained classes. The recordings are split into three sets: training, validation, and test. The training and validation sets are disjoint with respect to the sensor from which each recording came, and the test set is displaced in time. For increased reliability, three volunteers annotated each recording. In addition, members of the SONYC team subsequently created a subset of verified, ground-truth tags using a two-stage annotation procedure in which two annotators independently tagged and then collectively resolved any disagreements. This subset of recordings with verified annotations intersects with all three recording splits. All of the recordings in the test set have these verified annotations. In the v2 version of this dataset, we have also included coarse spatiotemporal context information to aid tag prediction when time and location are known. For more details on the motivation and creation of this dataset, see the DCASE 2020 Urban Sound Tagging with Spatiotemporal Context Task website.

    Audio data

    The provided audio has been acquired using the SONYC acoustic sensor network for urban noise pollution monitoring. Over 60 different sensors have been deployed in New York City, and these sensors have collectively gathered the equivalent of over 50 years of audio data, of which we provide a small subset. The data was sampled by selecting the nearest neighbors on VGGish features of recordings known to have classes of interest. All recordings are 10 seconds and were recorded with identical microphones at identical gain settings. To maintain privacy, we quantized the spatial information to the level of a city block, and we quantized the temporal information to the level of an hour. We also limited the occurrence of recordings with positive human voice annotations to one per hour per sensor.

    Label taxonomy

    The label taxonomy is as follows:

    engine 1: small-sounding-engine 2: medium-sounding-engine 3: large-sounding-engine X: engine-of-uncertain-size

    machinery-impact 1: rock-drill 2: jackhammer 3: hoe-ram 4: pile-driver X: other-unknown-impact-machinery

    non-machinery-impact 1: non-machinery-impact

    powered-saw 1: chainsaw 2: small-medium-rotating-saw 3: large-rotating-saw X: other-unknown-powered-saw

    alert-signal 1: car-horn 2: car-alarm 3: siren 4: reverse-beeper X: other-unknown-alert-signal

    music 1: stationary-music 2: mobile-music 3: ice-cream-truck X: music-from-uncertain-source

    human-voice 1: person-or-small-group-talking 2: person-or-small-group-shouting 3: large-crowd 4: amplified-speech X: other-unknown-human-voice

    dog 1: dog-barking-whining

    The classes preceded by an X code indicate when an annotator was able to identify the coarse class, but couldn’t identify the fine class because either they were uncertain which fine class it was or the fine class was not included in the taxonomy. dcase-ust-taxonomy.yaml contains this taxonomy in an easily machine-readable form.
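As a rough illustration of the taxonomy structure that dcase-ust-taxonomy.yaml encodes, the same information can be transcribed into a plain Python dict (the coarse_of helper below is hypothetical, not part of the dataset):

```python
# Taxonomy transcribed from the list above; "X" marks the
# uncertain/other fine class within each coarse class.
TAXONOMY = {
    "engine": {"1": "small-sounding-engine", "2": "medium-sounding-engine",
               "3": "large-sounding-engine", "X": "engine-of-uncertain-size"},
    "machinery-impact": {"1": "rock-drill", "2": "jackhammer", "3": "hoe-ram",
                         "4": "pile-driver", "X": "other-unknown-impact-machinery"},
    "non-machinery-impact": {"1": "non-machinery-impact"},
    "powered-saw": {"1": "chainsaw", "2": "small-medium-rotating-saw",
                    "3": "large-rotating-saw", "X": "other-unknown-powered-saw"},
    "alert-signal": {"1": "car-horn", "2": "car-alarm", "3": "siren",
                     "4": "reverse-beeper", "X": "other-unknown-alert-signal"},
    "music": {"1": "stationary-music", "2": "mobile-music",
              "3": "ice-cream-truck", "X": "music-from-uncertain-source"},
    "human-voice": {"1": "person-or-small-group-talking",
                    "2": "person-or-small-group-shouting", "3": "large-crowd",
                    "4": "amplified-speech", "X": "other-unknown-human-voice"},
    "dog": {"1": "dog-barking-whining"},
}

def coarse_of(fine_name: str) -> str:
    """Return the coarse class containing a given fine class name."""
    for coarse, fines in TAXONOMY.items():
        if fine_name in fines.values():
            return coarse
    raise KeyError(fine_name)

print(coarse_of("jackhammer"))
```

In practice the YAML file would be parsed with a YAML library rather than transcribed by hand.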

    Data splits

    This release contains a training subset (13538 recordings from 35 sensors), a validation subset (4308 recordings from 9 sensors), and a test subset (669 recordings from 48 sensors). The training and validation subsets are disjoint with respect to the sensor from which each recording came. The sensors in the test set are not disjoint from those in the training and validation subsets, but the test recordings are displaced in time, occurring after any of the recordings in the training and validation subsets. The subset of recordings with verified annotations (1380 recordings) intersects with all three recording splits. All of the recordings in the test set have these verified annotations.

    Annotation data

    The annotation data are contained in annotations.csv, and encompass the training, validation, and test subsets. Each row in the file represents one multi-label annotation of a recording—it could be the annotation of a single citizen science volunteer, a single SONYC team member, or the agreed-upon ground truth by the SONYC team (see the annotator_id column description for more information). Note that since the SONYC team members annotated each class group separately, there may be multiple annotation rows by a single SONYC team annotator for a particular audio recording.

    Columns

    split

    The data split. (train, validate, test)

    sensor_id

    The ID of the sensor the recording is from.

    audio_filename

    The filename of the audio recording.

    annotator_id

    The anonymous ID of the annotator. If this value is positive, it is a citizen science volunteer from the Zooniverse platform. If it is negative, it is a SONYC team member. If it is 0, then it is the ground truth agreed-upon by the SONYC team.

    year

    The year the recording is from.

    week

    The week of the year the recording is from.

    day

    The day of the week the recording is from, with Monday as the start (i.e. 0=Monday).

    hour

    The hour of the day the recording is from.

    borough

    The NYC borough in which the sensor is located (1=Manhattan, 3=Brooklyn, 4=Queens). This corresponds to the first digit in the 10-digit NYC parcel number system known as Borough, Block, Lot (BBL).

    block

    The NYC block in which the sensor is located. This corresponds to digits 2 through 6 of the 10-digit NYC parcel number system known as Borough, Block, Lot (BBL).

    latitude

    The latitude coordinate of the block in which the sensor is located.

    longitude

    The longitude coordinate of the block in which the sensor is located.

    Columns of this form indicate the presence of fine-level class. 1 if present, 0 if not present. If -1, then the class was not labeled in this annotation because the annotation was performed by a SONYC team member who only annotated one coarse group of classes at a time when annotating the verified subset.

    Columns of this form indicate the presence of a coarse-level class. 1 if present, 0 if not present. If -1, then the class was not labeled in this annotation because the annotation was performed by a SONYC team member who only annotated one coarse group of classes at a time when annotating the verified subset. These columns are computed from the fine-level class presence columns and are presented here for convenience when training on only coarse-level classes.

    Columns of this form indicate the proximity of a fine-level class. After indicating the presence of a fine-level class, citizen science annotators were asked to indicate the proximity of the sound event to the sensor. Only the citizen science volunteers performed this task, and therefore this data is not included in the verified annotations. This column may take on one of the following four values: (near, far, notsure, -1). If -1, then the proximity was not annotated because either the annotation was not performed by a citizen science volunteer, or the citizen science volunteer did not indicate the presence of the class.
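As a minimal sketch of how the annotator_id convention above might be applied when reading annotations.csv: the toy rows, filenames, and the annotator_kind helper below are invented for illustration; only the column names and the sign convention come from the description.

```python
import csv
import io

# A toy annotations.csv fragment using only columns described above
# (real files also contain per-class presence and proximity columns).
toy = io.StringIO("""split,sensor_id,audio_filename,annotator_id
train,s01,00_000001.wav,42
train,s01,00_000001.wav,-3
test,s07,00_000002.wav,0
""")

def annotator_kind(annotator_id: int) -> str:
    """Positive -> Zooniverse citizen science volunteer, negative ->
    SONYC team member, zero -> agreed-upon ground truth (as documented
    above)."""
    if annotator_id > 0:
        return "volunteer"
    if annotator_id < 0:
        return "team"
    return "ground-truth"

rows = list(csv.DictReader(toy))
kinds = [annotator_kind(int(r["annotator_id"])) for r in rows]
ground_truth = [r for r in rows if int(r["annotator_id"]) == 0]
print(kinds, len(ground_truth))
```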

    Conditions of use

    Dataset created by Mark Cartwright, Jason Cramer, Ana Elisa Mendez Mendez, Yu Wang, Ho-Hsiang Wu, Vincent Lostanlen, Magdalena Fuentes, Graham Dove, Charlie Mydlarz, Justin Salamon, Oded Nov, and Juan Pablo Bello

    The SONYC-UST dataset is offered free of charge under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license: https://creativecommons.org/licenses/by/4.0/

    The dataset and its contents are made available on an “as is” basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, New York University is not liable for, and expressly excludes all liability for, loss or damage however and whenever caused to anyone by any use of the SONYC-UST dataset or any part of it.

    Feedback

    Please help us improve SONYC-UST by sending your feedback to:

    Mark Cartwright: mcartwright@gmail.com

    In case of a problem, please include as many details as possible.

    Acknowledgments

    We would like to thank all

  7. Data from: Into Unknown Territory: late XMM-Newton observations of GRB130427A

    • esdcdoi.esac.esa.int
    Updated Jan 7, 2017
    Cite
    European Space Agency (2017). Into Unknown Territory: late XMM-Newton observations of GRB130427A [Dataset]. http://doi.org/10.5270/esa-70j92kd
    Available download formats: FITS (https://www.iana.org/assignments/media-types/application/fits)
    Dataset updated
    Jan 7, 2017
    Dataset authored and provided by
    European Space Agency (http://www.esa.int/)
    Time period covered
    May 31, 2015 - Dec 24, 2015
    Description

    Gamma-Ray Burst (GRB) 130427A is an outstanding event, having the highest gamma-ray fluence of any GRB detected in almost 30 years. Its X-ray and optical afterglows are extraordinarily bright as well, and this event has the potential to be the longest observable GRB since the launch of Swift. We propose to observe this exceptional GRB with XMM-Newton ~2 years after the trigger, to determine the late behaviour of its X-ray afterglow. It represents a unique opportunity to detect and study an X-ray afterglow at such late times. Determining the spectral and temporal indices of the X-ray emission will give us a better grip on the environment and test explosion models. The proposed observations are an extension of those already performed on this burst by the PI with XMM-Newton. [Truncated; please see actual data for full text.]
