Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
There is an urgent need to develop automatic analysis methods for the large number of X-ray photoelectron spectroscopy (XPS) spectra being obtained by methods such as 3D chemical analysis and operando analysis. In a previous study, we developed an automatic analysis method for mixed materials that can decompose XPS spectra and estimate the compositional ratios by comparison with XPS reference spectra of many candidate single-phase compounds. This method needs access to the XPS reference spectrum of every possible compound in the sample. However, in many practical cases, the necessary XPS reference spectra are not all available. In this study, we developed an automatic analysis method to estimate the compositional ratios even when some necessary XPS reference spectra are unavailable, i.e., when the measured XPS spectra contain unknown peak structures. In particular, the new method can automatically estimate the number of unknown peaks by combining a genetic algorithm with the Bayesian information criterion. We applied the method to analyze the depth-resolved XPS spectra of a lead zirconate titanate (PZT) piezoelectric film and successfully identified the change in the chemical states of the components in the film without ambiguity.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sample: Cultured A431 (ATCC CRL-1555) cells stably expressing Desmoplakin-eGFP (Addgene #32227) and mApple-VAPB, stained with MitoView650
Sample Description: The endoplasmic reticulum (ER) forms a dynamic network that contacts other cellular membranes to regulate stress responses, calcium signaling, and lipid transfer. Using high-resolution volume electron microscopy, we find that the ER forms a previously unknown association with keratin intermediate filaments and desmosomal cell-cell junctions. Peripheral ER assembles into mirror image-like arrangements at desmosomes and exhibits nanometer proximity to keratin filaments and the desmosome cytoplasmic plaque. ER tubules exhibit stable associations with desmosomes, and perturbation of desmosomes or keratin filaments alters ER organization and mobility. These findings indicate that desmosomes and the keratin cytoskeleton pattern the distribution of the ER network. Overall, this study reveals a previously unknown subcellular architecture defined by the structural integration of ER tubules with an epithelial intercellular junction.
Protocol: High pressure freezing; freeze-substitution with 2% OsO₄, 0.1% UA, 3% H₂O in acetone; resin embedding in Eponate 12.
Contributions: Sample prepared by Jesse Aaron and Satya Khuon (AIC Janelia), staining and resin embedding by Nirmala Iyer (HHMI/Janelia), trimming and imaging by Jesse Aaron and Satya Khuon (AIC Janelia), post-processing by Eric Trautman and Stephan Preibisch (HHMI/Janelia), manual segmentations by COSEM Project Team (HHMI/Janelia) and Kowalczyk Lab (Pennsylvania State College of Medicine), and automated segmentations by COSEM Project Team.
Acquisition ID: aic_desmosome-2
Voxel size (nm): 8 x 8 x 8 (X, Y, Z)
Data dimensions (µm): 42.3 x 10.4 x 52.5 (X, Y, Z)
Imaging start date: 2021-01-27
Dataset URL (Redirect): https://data.janelia.org/T8l9C
EM DOI: https://doi.org/10.25378/janelia.22670176
Visualization Website: https://openorganelle.janelia.org/datasets/aic_desmosome-2
Publications: "Architecture and dynamics of a novel desmosome-endoplasmic reticulum organelle" by Bharathan et al., 2022
Imaging duration (days): 4
Landing energy (eV): 1200
Imaging current (nA): 2.0
Scanning speed (MHz): 0.500
This research examined the case of customers' default payments in Taiwan and compared the predictive accuracy of the probability of default among six data mining methods. From the perspective of risk management, the predictive accuracy of the estimated probability of default is more valuable than the binary result of classification into credible or not credible clients. Because the real probability of default is unknown, this study presented the novel Sorting Smoothing Method to estimate it. With the real probability of default as the response variable (Y) and the predictive probability of default as the independent variable (X), the simple linear regression result (Y = A + BX) shows that the forecasting model produced by the artificial neural network has the highest coefficient of determination; its regression intercept (A) is close to zero and its regression coefficient (B) is close to one. Therefore, among the six data mining techniques, the artificial neural network is the only one that can accurately estimate the real probability of default.
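A minimal sketch of the evaluation pipeline described above, assuming the Sorting Smoothing Method sorts cases by predicted probability and averages the binary outcomes over a window of ±v sorted neighbours (v is a hypothetical smoothing parameter here), followed by the simple linear regression Y = A + BX:

```python
def sorting_smoothing(pred, actual, v=25):
    """Sorting Smoothing Method sketch: sort cases by predicted probability
    of default, then estimate each case's 'real' probability as the mean
    outcome over its window of up to 2v+1 sorted neighbours."""
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    xs = [pred[i] for i in order]
    ys = [actual[i] for i in order]
    smoothed = []
    for i in range(len(ys)):
        lo, hi = max(0, i - v), min(len(ys), i + v + 1)
        smoothed.append(sum(ys[lo:hi]) / (hi - lo))
    return xs, smoothed

def linreg(x, y):
    """Ordinary least squares for Y = A + B*X; returns (A, B, R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot

# Demo on a synthetic, well-calibrated classifier: the outcome rate tracks
# the predicted probability, so A should be near 0 and B near 1.
pred = [i / 999 for i in range(1000)]
actual = [1 if (i * 37 % 1000) / 1000.0 < p else 0 for i, p in enumerate(pred)]
A, B, r2 = linreg(*sorting_smoothing(pred, actual))
```

A poorly calibrated model would instead yield an intercept far from zero or a slope far from one, which is exactly the diagnostic the study applies to the six methods.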
We consider the nonadaptive search problem for an unknown element x from the set A = {1, 2, 3, . . . , 2^n}, n ≥ 3. For a fixed integer S, the questions are of the form: Does x belong to a subset B of A, where the sum of the elements of B is equal to S? We wish to find all integers S for which nonadaptive search with n questions finds x. We continue our investigation from [4] and solve the last remaining case n = 2^k, k ≥ 2. 2000 Mathematics Subject Classification: 91A46, 91A35.
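For intuition, the classic nonadaptive strategy asks all n membership questions up front, where question i uses the set B_i of elements whose binary expansion (of x − 1) has bit i set; the answers spell out x directly. The sketch below demonstrates this unconstrained scheme only; it does not enforce the paper's restriction that each question set must have element-sum equal to a prescribed S.

```python
def question_sets(n):
    """B_i = elements of {1, ..., 2^n} for which bit i of (x - 1) is set.
    Fixing all n sets in advance makes the search nonadaptive."""
    return [{x for x in range(1, 2 ** n + 1) if (x - 1) >> i & 1}
            for i in range(n)]

def locate(n, answers):
    """Reconstruct x from the n yes/no answers."""
    return 1 + sum(1 << i for i, yes in enumerate(answers) if yes)

n = 4
qs = question_sets(n)
hidden = 11                         # (11 - 1) = 0b1010, so answers are F,T,F,T
answers = [hidden in B for B in qs]
assert locate(n, answers) == hidden
```

The paper's question is which sums S still permit such a family of n question sets when every B is forced to sum to S.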
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SONYC Urban Sound Tagging (SONYC-UST): a multilabel dataset from an urban acoustic sensor network
Version 2.3, September 2020
Created by
Mark Cartwright (1,2,3), Jason Cramer (1), Ana Elisa Mendez Mendez (1), Yu Wang (1), Ho-Hsiang Wu (1), Vincent Lostanlen (1,2,4), Magdalena Fuentes (1), Graham Dove (2), Charlie Mydlarz (1,2), Justin Salamon (5), Oded Nov (6), Juan Pablo Bello (1,2,3)
(1) Music and Audio Research Lab, New York University
(2) Center for Urban Science and Progress, New York University
(3) Department of Computer Science and Engineering, New York University
(4) Cornell Lab of Ornithology
(5) Adobe Research
(6) Department of Technology Management and Innovation, New York University
Publication
If using this data in an academic work, please reference the DOI and version, as well as cite the following paper, which presented the data collection procedure and the first version of the dataset:
Cartwright, M., Cramer, J., Mendez, A.E.M., Wang, Y., Wu, H., Lostanlen, V., Fuentes, M., Dove, G., Mydlarz, C., Salamon, J., Nov, O., Bello, J.P. SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context. In Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020. [pdf]
Description
SONYC Urban Sound Tagging (SONYC-UST) is a dataset for the development and evaluation of machine listening systems for realistic urban noise monitoring. The audio was recorded from the SONYC acoustic sensor network. Volunteers on the Zooniverse citizen science platform tagged the presence of 23 classes that were chosen in consultation with the New York City Department of Environmental Protection. These 23 fine-grained classes can be grouped into 8 coarse-grained classes. The recordings are split into three sets: training, validation, and test. The training and validation sets are disjoint with respect to the sensor from which each recording came, and the test set is displaced in time. For increased reliability, three volunteers annotated each recording. In addition, members of the SONYC team subsequently created a subset of verified, ground-truth tags using a two-stage annotation procedure in which two annotators independently tagged and then collectively resolved any disagreements. This subset of recordings with verified annotations intersects with all three recording splits. All of the recordings in the test set have these verified annotations. In the v2 release of this dataset, we have also included coarse spatiotemporal context information to aid in tag prediction when time and location are known. For more details on the motivation and creation of this dataset, see the DCASE 2020 Urban Sound Tagging with Spatiotemporal Context Task website.
Audio data
The provided audio has been acquired using the SONYC acoustic sensor network for urban noise pollution monitoring. Over 60 different sensors have been deployed in New York City, and these sensors have collectively gathered the equivalent of over 50 years of audio data, of which we provide a small subset. The data was sampled by selecting, in VGGish feature space, the nearest neighbors of recordings known to contain the classes of interest. All recordings are 10 seconds long and were recorded with identical microphones at identical gain settings. To maintain privacy, we quantized the spatial information to the level of a city block, and we quantized the temporal information to the level of an hour. We also limited the occurrence of recordings with positive human-voice annotations to one per hour per sensor.
Label taxonomy
The label taxonomy is as follows:
engine
  1: small-sounding-engine
  2: medium-sounding-engine
  3: large-sounding-engine
  X: engine-of-uncertain-size
machinery-impact
  1: rock-drill
  2: jackhammer
  3: hoe-ram
  4: pile-driver
  X: other-unknown-impact-machinery
non-machinery-impact
  1: non-machinery-impact
powered-saw
  1: chainsaw
  2: small-medium-rotating-saw
  3: large-rotating-saw
  X: other-unknown-powered-saw
alert-signal
  1: car-horn
  2: car-alarm
  3: siren
  4: reverse-beeper
  X: other-unknown-alert-signal
music
  1: stationary-music
  2: mobile-music
  3: ice-cream-truck
  X: music-from-uncertain-source
human-voice
  1: person-or-small-group-talking
  2: person-or-small-group-shouting
  3: large-crowd
  4: amplified-speech
  X: other-unknown-human-voice
dog
  1: dog-barking-whining
The classes preceded by an X code indicate cases where an annotator was able to identify the coarse class but could not identify the fine class, either because they were uncertain which fine class it was or because the fine class was not included in the taxonomy. dcase-ust-taxonomy.yaml contains this taxonomy in an easily machine-readable form.
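For quick scripting, the taxonomy above can be transcribed as a plain Python mapping (the canonical machine-readable form is dcase-ust-taxonomy.yaml; this dict is hand-copied from the list above):

```python
# Coarse class -> {fine code: fine class}, transcribed from the taxonomy above.
TAXONOMY = {
    "engine": {"1": "small-sounding-engine", "2": "medium-sounding-engine",
               "3": "large-sounding-engine", "X": "engine-of-uncertain-size"},
    "machinery-impact": {"1": "rock-drill", "2": "jackhammer", "3": "hoe-ram",
                         "4": "pile-driver", "X": "other-unknown-impact-machinery"},
    "non-machinery-impact": {"1": "non-machinery-impact"},
    "powered-saw": {"1": "chainsaw", "2": "small-medium-rotating-saw",
                    "3": "large-rotating-saw", "X": "other-unknown-powered-saw"},
    "alert-signal": {"1": "car-horn", "2": "car-alarm", "3": "siren",
                     "4": "reverse-beeper", "X": "other-unknown-alert-signal"},
    "music": {"1": "stationary-music", "2": "mobile-music",
              "3": "ice-cream-truck", "X": "music-from-uncertain-source"},
    "human-voice": {"1": "person-or-small-group-talking",
                    "2": "person-or-small-group-shouting", "3": "large-crowd",
                    "4": "amplified-speech", "X": "other-unknown-human-voice"},
    "dog": {"1": "dog-barking-whining"},
}

def coarse_of(fine_name):
    """Map a fine tag name back to its coarse class."""
    for coarse, fines in TAXONOMY.items():
        if fine_name in fines.values():
            return coarse
    raise KeyError(fine_name)
```

Counting entries recovers the numbers quoted above: 8 coarse classes, and 23 fine classes once the X (uncertain/other) codes are excluded.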
Data splits
This release contains a training subset (13538 recordings from 35 sensors), a validation subset (4308 recordings from 9 sensors), and a test subset (669 recordings from 48 sensors). The training and validation subsets are disjoint with respect to the sensor from which each recording came. The sensors in the test set are not disjoint from those in the training and validation subsets, but the test recordings are displaced in time, occurring after all of the recordings in the training and validation subsets. The subset of recordings with verified annotations (1380 recordings) intersects with all three recording splits. All of the recordings in the test set have these verified annotations.
Annotation data
The annotation data are contained in annotations.csv, and encompass the training, validation, and test subsets. Each row in the file represents one multi-label annotation of a recording—it could be the annotation of a single citizen science volunteer, a single SONYC team member, or the agreed-upon ground truth by the SONYC team (see the annotator_id column description for more information). Note that since the SONYC team members annotated each class group separately, there may be multiple annotation rows by a single SONYC team annotator for a particular audio recording.
Columns
split
The data split. (train, validate, test)
sensor_id
The ID of the sensor the recording is from.
audio_filename
The filename of the audio recording.
annotator_id
The anonymous ID of the annotator. If this value is positive, it is a citizen science volunteer from the Zooniverse platform. If it is negative, it is a SONYC team member. If it is 0, then it is the ground truth agreed-upon by the SONYC team.
year
The year the recording is from.
week
The week of the year the recording is from.
day
The day of the week the recording is from, with Monday as the start (i.e. 0=Monday).
hour
The hour of the day the recording is from.
borough
The NYC borough in which the sensor is located (1=Manhattan, 3=Brooklyn, 4=Queens). This corresponds to the first digit in the 10-digit NYC parcel number system known as Borough, Block, Lot (BBL).
block
The NYC block in which the sensor is located. This corresponds to digits 2 through 6 in the 10-digit NYC parcel number system known as Borough, Block, Lot (BBL).
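A small helper illustrating the BBL layout described in the borough and block columns (the example BBL value below is hypothetical, and the trailing lot digits are not used in this dataset):

```python
def parse_bbl(bbl: str):
    """Split a 10-digit NYC Borough-Block-Lot (BBL) parcel number:
    digit 1 is the borough, digits 2-6 the block, digits 7-10 the lot."""
    if len(bbl) != 10 or not bbl.isdigit():
        raise ValueError("BBL must be exactly 10 digits")
    return int(bbl[0]), int(bbl[1:6]), int(bbl[6:])

# Hypothetical example: borough 3 (Brooklyn), block 1234, lot 56.
borough, block, lot = parse_bbl("3012340056")
```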
latitude
The latitude coordinate of the block in which the sensor is located.
longitude
The longitude coordinate of the block in which the sensor is located.
Columns of this form indicate the presence of a fine-level class. 1 if present, 0 if not present. If -1, then the class was not labeled in this annotation because the annotation was performed by a SONYC team member who only annotated one coarse group of classes at a time when annotating the verified subset.
Columns of this form indicate the presence of a coarse-level class. 1 if present, 0 if not present. If -1, then the class was not labeled in this annotation because the annotation was performed by a SONYC team member who only annotated one coarse group of classes at a time when annotating the verified subset. These columns are computed from the fine-level class presence columns and are presented here for convenience when training on only coarse-level classes.
Columns of this form indicate the proximity of a fine-level class. After indicating the presence of a fine-level class, citizen science annotators were asked to indicate the proximity of the sound event to the sensor. Only the citizen science volunteers performed this task, and therefore this data is not included in the verified annotations. This column may take on one of the following four values: (near, far, notsure, -1). If -1, then the proximity was not annotated because either the annotation was not performed by a citizen science volunteer, or the citizen science volunteer did not indicate the presence of the class.
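As a sketch of how these annotation rows might be aggregated into per-recording labels, the snippet below majority-votes the citizen science rows and lets the agreed-upon ground truth (annotator_id 0) override them. The presence column name and the sample rows are hypothetical; real code should read the actual column names from annotations.csv.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature annotations file; the real presence column names
# follow the patterns documented above and should be taken from annotations.csv.
SAMPLE = """audio_filename,annotator_id,small-sounding-engine_presence
a.wav,101,1
a.wav,102,1
a.wav,103,0
b.wav,0,1
"""

def aggregate(csv_text, presence_col):
    """Per-file label: majority vote of volunteer rows (annotator_id > 0),
    overridden by SONYC-team ground truth rows (annotator_id == 0).
    Rows with -1 (class not labeled in that annotation) are skipped."""
    votes = defaultdict(list)
    truth = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = int(row[presence_col])
        annotator = int(row["annotator_id"])
        if value == -1:
            continue
        if annotator == 0:                       # agreed-upon ground truth
            truth[row["audio_filename"]] = value
        elif annotator > 0:                      # citizen science volunteer
            votes[row["audio_filename"]].append(value)
    labels = {f: 1 if sum(v) * 2 > len(v) else 0 for f, v in votes.items()}
    labels.update(truth)                         # ground truth wins
    return labels
```

Since each recording was tagged by three volunteers, a strict-majority vote is a natural baseline; the verified subset makes it unnecessary for the test split.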
Conditions of use
Dataset created by Mark Cartwright, Jason Cramer, Ana Elisa Mendez Mendez, Yu Wang, Ho-Hsiang Wu, Vincent Lostanlen, Magdalena Fuentes, Graham Dove, Charlie Mydlarz, Justin Salamon, Oded Nov, and Juan Pablo Bello
The SONYC-UST dataset is offered free of charge under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license: https://creativecommons.org/licenses/by/4.0/
The dataset and its contents are made available on an “as is” basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, New York University is not liable for, and expressly excludes all liability for, loss or damage however and whenever caused to anyone by any use of the SONYC-UST dataset or any part of it.
Feedback
Please help us improve SONYC-UST by sending your feedback to:
Mark Cartwright: mcartwright@gmail.com
In case of a problem, please include as many details as possible.
Acknowledgments
We would like to thank all of the volunteers who contributed annotations to this dataset.
Gamma-ray burst (GRB) 130427A is an outstanding event, having the highest gamma-ray fluence of any GRB detected in almost 30 years. Its X-ray and optical afterglows are extraordinarily bright as well, and this event has the potential to be the longest observable GRB since the launch of Swift. We propose to observe this exceptional GRB with XMM-Newton ~2 years after the trigger, to determine the late behaviour of its X-ray afterglow. It represents a unique opportunity to detect and study an X-ray afterglow at such late times. Determining the spectral and temporal indices of the X-ray emission will give us a better grip on the environment and test explosion models. The proposed observations are an extension of those already performed on this burst by the PI with XMM-Newton. [truncated; see the source record for the full text]