Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
There is an urgent need to develop automatic analysis methods for the large number of X-ray photoelectron spectroscopy (XPS) spectra being obtained by methods such as 3D chemical analysis and operando analysis. In a previous study, we developed an automatic analysis method for mixed materials that can decompose XPS spectra and estimate the compositional ratios by comparison with XPS reference spectra of many candidate single-phase compounds. This method needs access to the XPS reference spectrum of every possible compound in the sample. However, in many practical cases, the necessary XPS reference spectra are not all available. In this study, we developed an automatic analysis method to estimate the compositional ratios even when some necessary XPS reference spectra are unavailable, i.e., when the measured XPS spectra contain unknown peak structures. In particular, the new method can automatically estimate the number of unknown peaks by combining a genetic algorithm with the Bayesian information criterion. We applied the method to analyze the depth-resolved XPS spectra of a lead zirconate titanate (PZT) piezoelectric film and successfully identified the change in the chemical states of the components in the film without ambiguity.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sample: Cultured A431 (ATCC CRL-1555) cells stably expressing Desmoplakin-eGFP (Addgene #32227) and mApple-VAPB, stained with MitoView650
Sample Description: The endoplasmic reticulum (ER) forms a dynamic network that contacts other cellular membranes to regulate stress responses, calcium signaling, and lipid transfer. Using high-resolution volume electron microscopy, we find that the ER forms a previously unknown association with keratin intermediate filaments and desmosomal cell-cell junctions. Peripheral ER assembles into mirror image-like arrangements at desmosomes and exhibits nanometer proximity to keratin filaments and the desmosome cytoplasmic plaque. ER tubules exhibit stable associations with desmosomes, and perturbation of desmosomes or keratin filaments alters ER organization and mobility. These findings indicate that desmosomes and the keratin cytoskeleton pattern the distribution of the ER network. Overall, this study reveals a previously unknown subcellular architecture defined by the structural integration of ER tubules with an epithelial intercellular junction.
Protocol: High pressure freezing; freeze-substitution with 2% OsO₄, 0.1% UA, 3% H₂O in acetone; resin embedding in Eponate 12.
Contributions: Sample prepared by Jesse Aaron and Satya Khuon (AIC Janelia), staining and resin embedding by Nirmala Iyer (HHMI/Janelia), trimming and imaging by Jesse Aaron and Satya Khuon (AIC Janelia), post-processing by Eric Trautman and Stephan Preibisch (HHMI/Janelia), manual segmentations by COSEM Project Team (HHMI/Janelia) and Kowalczyk Lab (Pennsylvania State College of Medicine), and automated segmentations by COSEM Project Team.
Acquisition ID: aic_desmosome-2
Voxel size (nm): 8 x 8 x 8 (X, Y, Z)
Data dimensions (µm): 42.3 x 10.4 x 52.5 (X, Y, Z)
Imaging start date: 2021-01-27
Dataset URL (Redirect): https://data.janelia.org/T8l9C
EM DOI: https://doi.org/10.25378/janelia.22670176
Visualization Website: https://openorganelle.janelia.org/datasets/aic_desmosome-2
Publications: "Architecture and dynamics of a novel desmosome-endoplasmic reticulum organelle" by Bharathan et al., 2022
Imaging duration (days): 4
Landing energy (eV): 1200
Imaging current (nA): 2.0
Scanning speed (MHz): 0.500
This research examined the case of customers' default payments in Taiwan and compared the predictive accuracy of the probability of default among six data mining methods. From the perspective of risk management, the predictive accuracy of the estimated probability of default is more valuable than the binary result of classification into credible or not credible clients. Because the real probability of default is unknown, this study presented the novel Sorting Smoothing Method to estimate it. With the real probability of default as the response variable (Y) and the predictive probability of default as the independent variable (X), the simple linear regression result (Y = A + BX) shows that the forecasting model produced by the artificial neural network has the highest coefficient of determination; its regression intercept (A) is close to zero and its regression coefficient (B) is close to one. Therefore, among the six data mining techniques, the artificial neural network is the only one that can accurately estimate the real probability of default.
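A minimal sketch of the evaluation pipeline described above, assuming the Sorting Smoothing Method sorts cases by predicted probability and averages the binary outcomes over a window of ±v sorted neighbours (v is a hypothetical smoothing parameter here), followed by the simple linear regression Y = A + BX:

```python
def sorting_smoothing(pred, actual, v=25):
    """Sorting Smoothing Method sketch: sort cases by predicted probability
    of default, then estimate each case's 'real' probability as the mean
    outcome over its window of up to 2v+1 sorted neighbours."""
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    xs = [pred[i] for i in order]
    ys = [actual[i] for i in order]
    smoothed = []
    for i in range(len(ys)):
        lo, hi = max(0, i - v), min(len(ys), i + v + 1)
        smoothed.append(sum(ys[lo:hi]) / (hi - lo))
    return xs, smoothed

def linreg(x, y):
    """Ordinary least squares for Y = A + B*X; returns (A, B, R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot

# Demo on a synthetic, well-calibrated classifier: the outcome rate tracks
# the predicted probability, so A should be near 0 and B near 1.
pred = [i / 999 for i in range(1000)]
actual = [1 if (i * 37 % 1000) / 1000.0 < p else 0 for i, p in enumerate(pred)]
A, B, r2 = linreg(*sorting_smoothing(pred, actual))
```

A poorly calibrated model would instead yield an intercept far from zero or a slope far from one, which is exactly the diagnostic the study applies to the six methods.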
We consider the nonadaptive search problem for an unknown element x from the set A = {1, 2, 3, . . . , 2^n}, n ≥ 3. For a fixed integer S, the questions are of the form: Does x belong to a subset B of A, where the sum of the elements of B is equal to S? We wish to find all integers S for which nonadaptive search with n questions finds x. We continue our investigation from [4] and solve the last remaining case n = 2^k, k ≥ 2. 2000 Mathematics Subject Classification: 91A46, 91A35.
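For intuition, the classic nonadaptive strategy asks all n membership questions up front, where question i uses the set B_i of elements whose binary expansion (of x − 1) has bit i set; the answers spell out x directly. The sketch below demonstrates this unconstrained scheme only; it does not enforce the paper's restriction that each question set must have element-sum equal to a prescribed S.

```python
def question_sets(n):
    """B_i = elements of {1, ..., 2^n} for which bit i of (x - 1) is set.
    Fixing all n sets in advance makes the search nonadaptive."""
    return [{x for x in range(1, 2 ** n + 1) if (x - 1) >> i & 1}
            for i in range(n)]

def locate(n, answers):
    """Reconstruct x from the n yes/no answers."""
    return 1 + sum(1 << i for i, yes in enumerate(answers) if yes)

n = 4
qs = question_sets(n)
hidden = 11                         # (11 - 1) = 0b1010, so answers are F,T,F,T
answers = [hidden in B for B in qs]
assert locate(n, answers) == hidden
```

The paper's question is which sums S still permit such a family of n question sets when every B is forced to sum to S.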
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SONYC Urban Sound Tagging (SONYC-UST): a multilabel dataset from an urban acoustic sensor network
Version 2.3, September 2020
Created by
Mark Cartwright (1,2,3), Jason Cramer (1), Ana Elisa Mendez Mendez (1), Yu Wang (1), Ho-Hsiang Wu (1), Vincent Lostanlen (1,2,4), Magdalena Fuentes (1), Graham Dove (2), Charlie Mydlarz (1,2), Justin Salamon (5), Oded Nov (6), Juan Pablo Bello (1,2,3)
(1) Music and Audio Research Lab, New York University
(2) Center for Urban Science and Progress, New York University
(3) Department of Computer Science and Engineering, New York University
(4) Cornell Lab of Ornithology
(5) Adobe Research
(6) Department of Technology Management and Innovation, New York University
Publication
If using this data in an academic work, please reference the DOI and version, as well as cite the following paper, which presented the data collection procedure and the first version of the dataset:
Cartwright, M., Cramer, J., Mendez, A.E.M., Wang, Y., Wu, H., Lostanlen, V., Fuentes, M., Dove, G., Mydlarz, C., Salamon, J., Nov, O., Bello, J.P. SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context. In Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020. [pdf]
Description
SONYC Urban Sound Tagging (SONYC-UST) is a dataset for the development and evaluation of machine listening systems for realistic urban noise monitoring. The audio was recorded from the SONYC acoustic sensor network. Volunteers on the Zooniverse citizen science platform tagged the presence of 23 classes that were chosen in consultation with the New York City Department of Environmental Protection. These 23 fine-grained classes can be grouped into 8 coarse-grained classes. The recordings are split into three sets: training, validation, and test. The training and validation sets are disjoint with respect to the sensor from which each recording came, and the test set is displaced in time. For increased reliability, three volunteers annotated each recording. In addition, members of the SONYC team subsequently created a subset of verified, ground-truth tags using a two-stage annotation procedure in which two annotators independently tagged and then collectively resolved any disagreements. This subset of recordings with verified annotations intersects with all three recording splits. All of the recordings in the test set have these verified annotations. In the v2 release of this dataset, we have also included coarse spatiotemporal context information to aid in tag prediction when time and location are known. For more details on the motivation and creation of this dataset, see the DCASE 2020 Urban Sound Tagging with Spatiotemporal Context Task website.
Audio data
The provided audio has been acquired using the SONYC acoustic sensor network for urban noise pollution monitoring. Over 60 different sensors have been deployed in New York City, and these sensors have collectively gathered the equivalent of over 50 years of audio data, of which we provide a small subset. The data was sampled by selecting, in VGGish feature space, the nearest neighbors of recordings known to contain the classes of interest. All recordings are 10 seconds long and were recorded with identical microphones at identical gain settings. To maintain privacy, we quantized the spatial information to the level of a city block, and we quantized the temporal information to the level of an hour. We also limited the occurrence of recordings with positive human-voice annotations to one per hour per sensor.
Label taxonomy
The label taxonomy is as follows:
engine
  1: small-sounding-engine
  2: medium-sounding-engine
  3: large-sounding-engine
  X: engine-of-uncertain-size
machinery-impact
  1: rock-drill
  2: jackhammer
  3: hoe-ram
  4: pile-driver
  X: other-unknown-impact-machinery
non-machinery-impact
  1: non-machinery-impact
powered-saw
  1: chainsaw
  2: small-medium-rotating-saw
  3: large-rotating-saw
  X: other-unknown-powered-saw
alert-signal
  1: car-horn
  2: car-alarm
  3: siren
  4: reverse-beeper
  X: other-unknown-alert-signal
music
  1: stationary-music
  2: mobile-music
  3: ice-cream-truck
  X: music-from-uncertain-source
human-voice
  1: person-or-small-group-talking
  2: person-or-small-group-shouting
  3: large-crowd
  4: amplified-speech
  X: other-unknown-human-voice
dog
  1: dog-barking-whining
The classes preceded by an X code indicate cases where an annotator was able to identify the coarse class but could not identify the fine class, either because they were uncertain which fine class it was or because the fine class was not included in the taxonomy. dcase-ust-taxonomy.yaml contains this taxonomy in an easily machine-readable form.
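For quick scripting, the taxonomy above can be transcribed as a plain Python mapping (the canonical machine-readable form is dcase-ust-taxonomy.yaml; this dict is hand-copied from the list above):

```python
# Coarse class -> {fine code: fine class}, transcribed from the taxonomy above.
TAXONOMY = {
    "engine": {"1": "small-sounding-engine", "2": "medium-sounding-engine",
               "3": "large-sounding-engine", "X": "engine-of-uncertain-size"},
    "machinery-impact": {"1": "rock-drill", "2": "jackhammer", "3": "hoe-ram",
                         "4": "pile-driver", "X": "other-unknown-impact-machinery"},
    "non-machinery-impact": {"1": "non-machinery-impact"},
    "powered-saw": {"1": "chainsaw", "2": "small-medium-rotating-saw",
                    "3": "large-rotating-saw", "X": "other-unknown-powered-saw"},
    "alert-signal": {"1": "car-horn", "2": "car-alarm", "3": "siren",
                     "4": "reverse-beeper", "X": "other-unknown-alert-signal"},
    "music": {"1": "stationary-music", "2": "mobile-music",
              "3": "ice-cream-truck", "X": "music-from-uncertain-source"},
    "human-voice": {"1": "person-or-small-group-talking",
                    "2": "person-or-small-group-shouting", "3": "large-crowd",
                    "4": "amplified-speech", "X": "other-unknown-human-voice"},
    "dog": {"1": "dog-barking-whining"},
}

def coarse_of(fine_name):
    """Map a fine tag name back to its coarse class."""
    for coarse, fines in TAXONOMY.items():
        if fine_name in fines.values():
            return coarse
    raise KeyError(fine_name)
```

Counting entries recovers the numbers quoted above: 8 coarse classes, and 23 fine classes once the X (uncertain/other) codes are excluded.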
Data splits
This release contains a training subset (13538 recordings from 35 sensors), a validation subset (4308 recordings from 9 sensors), and a test subset (669 recordings from 48 sensors). The training and validation subsets are disjoint with respect to the sensor from which each recording came. The sensors in the test set are not disjoint from those in the training and validation subsets, but the test recordings are displaced in time, occurring after all of the recordings in the training and validation subsets. The subset of recordings with verified annotations (1380 recordings) intersects with all three recording splits. All of the recordings in the test set have these verified annotations.
Annotation data
The annotation data are contained in annotations.csv, and encompass the training, validation, and test subsets. Each row in the file represents one multi-label annotation of a recording—it could be the annotation of a single citizen science volunteer, a single SONYC team member, or the agreed-upon ground truth by the SONYC team (see the annotator_id column description for more information). Note that since the SONYC team members annotated each class group separately, there may be multiple annotation rows by a single SONYC team annotator for a particular audio recording.
Columns
split
The data split. (train, validate, test)
sensor_id
The ID of the sensor the recording is from.
audio_filename
The filename of the audio recording.
annotator_id
The anonymous ID of the annotator. If this value is positive, it is a citizen science volunteer from the Zooniverse platform. If it is negative, it is a SONYC team member. If it is 0, then it is the ground truth agreed-upon by the SONYC team.
year
The year the recording is from.
week
The week of the year the recording is from.
day
The day of the week the recording is from, with Monday as the start (i.e. 0=Monday).
hour
The hour of the day the recording is from.
borough
The NYC borough in which the sensor is located (1=Manhattan, 3=Brooklyn, 4=Queens). This corresponds to the first digit in the 10-digit NYC parcel number system known as Borough, Block, Lot (BBL).
block
The NYC block in which the sensor is located. This corresponds to digits 2 through 6 in the 10-digit NYC parcel number system known as Borough, Block, Lot (BBL).
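A small helper illustrating the BBL layout described in the borough and block columns (the example BBL value below is hypothetical, and the trailing lot digits are not used in this dataset):

```python
def parse_bbl(bbl: str):
    """Split a 10-digit NYC Borough-Block-Lot (BBL) parcel number:
    digit 1 is the borough, digits 2-6 the block, digits 7-10 the lot."""
    if len(bbl) != 10 or not bbl.isdigit():
        raise ValueError("BBL must be exactly 10 digits")
    return int(bbl[0]), int(bbl[1:6]), int(bbl[6:])

# Hypothetical example: borough 3 (Brooklyn), block 1234, lot 56.
borough, block, lot = parse_bbl("3012340056")
```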
latitude
The latitude coordinate of the block in which the sensor is located.
longitude
The longitude coordinate of the block in which the sensor is located.
Columns of this form indicate the presence of a fine-level class. 1 if present, 0 if not present. If -1, then the class was not labeled in this annotation because the annotation was performed by a SONYC team member who only annotated one coarse group of classes at a time when annotating the verified subset.
Columns of this form indicate the presence of a coarse-level class. 1 if present, 0 if not present. If -1, then the class was not labeled in this annotation because the annotation was performed by a SONYC team member who only annotated one coarse group of classes at a time when annotating the verified subset. These columns are computed from the fine-level class presence columns and are presented here for convenience when training on only coarse-level classes.
Columns of this form indicate the proximity of a fine-level class. After indicating the presence of a fine-level class, citizen science annotators were asked to indicate the proximity of the sound event to the sensor. Only the citizen science volunteers performed this task, and therefore this data is not included in the verified annotations. This column may take on one of the following four values: (near, far, notsure, -1). If -1, then the proximity was not annotated because either the annotation was not performed by a citizen science volunteer, or the citizen science volunteer did not indicate the presence of the class.
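As a sketch of how these annotation rows might be aggregated into per-recording labels, the snippet below majority-votes the citizen science rows and lets the agreed-upon ground truth (annotator_id 0) override them. The presence column name and the sample rows are hypothetical; real code should read the actual column names from annotations.csv.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature annotations file; the real presence column names
# follow the patterns documented above and should be taken from annotations.csv.
SAMPLE = """audio_filename,annotator_id,small-sounding-engine_presence
a.wav,101,1
a.wav,102,1
a.wav,103,0
b.wav,0,1
"""

def aggregate(csv_text, presence_col):
    """Per-file label: majority vote of volunteer rows (annotator_id > 0),
    overridden by SONYC-team ground truth rows (annotator_id == 0).
    Rows with -1 (class not labeled in that annotation) are skipped."""
    votes = defaultdict(list)
    truth = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = int(row[presence_col])
        annotator = int(row["annotator_id"])
        if value == -1:
            continue
        if annotator == 0:                       # agreed-upon ground truth
            truth[row["audio_filename"]] = value
        elif annotator > 0:                      # citizen science volunteer
            votes[row["audio_filename"]].append(value)
    labels = {f: 1 if sum(v) * 2 > len(v) else 0 for f, v in votes.items()}
    labels.update(truth)                         # ground truth wins
    return labels
```

Since each recording was tagged by three volunteers, a strict-majority vote is a natural baseline; the verified subset makes it unnecessary for the test split.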
Conditions of use
Dataset created by Mark Cartwright, Jason Cramer, Ana Elisa Mendez Mendez, Yu Wang, Ho-Hsiang Wu, Vincent Lostanlen, Magdalena Fuentes, Graham Dove, Charlie Mydlarz, Justin Salamon, Oded Nov, and Juan Pablo Bello
The SONYC-UST dataset is offered free of charge under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license: https://creativecommons.org/licenses/by/4.0/
The dataset and its contents are made available on an “as is” basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, New York University is not liable for, and expressly excludes all liability for, loss or damage however and whenever caused to anyone by any use of the SONYC-UST dataset or any part of it.
Feedback
Please help us improve SONYC-UST by sending your feedback to:
Mark Cartwright: mcartwright@gmail.com
In case of a problem, please include as many details as possible.
Acknowledgments
We would like to thank all of the volunteers who contributed annotations to this dataset.
Gamma-ray burst (GRB) 130427A is an outstanding event, having the highest gamma-ray fluence of any GRB detected in almost 30 years. Its X-ray and optical afterglows are extraordinarily bright as well, and this event has the potential to be the longest observable GRB since the launch of Swift. We propose to observe this exceptional GRB with XMM-Newton ~2 years after the trigger, to determine the late behaviour of its X-ray afterglow. It represents a unique opportunity to detect and study an X-ray afterglow at such late times. Determining the spectral and temporal indices of the X-ray emission will give us a better grip on the environment and test explosion models. The proposed observations are an extension of those already performed on this burst by the PI with XMM-Newton. [truncated; see the source record for the full text]