Estimating the properties of galaxies, and even where they are, is a challenging process. The Rubin Observatory, a sky survey telescope located in Chile, will, once operational, image tens of billions of astronomical objects, the vast majority of which will be galaxies that have never been imaged before. Analyses of these data will require sophisticated methodologies that allow us first to determine where the galaxies are (i.e., how far away they are) and then, conditional on the distance, how massive they are. Given galaxy distance and mass data, we can test theories of how the Universe evolves by comparing simulated galaxy data with these observations.
The Buzzard-V1.0 simulation was used to generate a realistic sample of Rubin Observatory data. In this dataset are measurements for 111,172 galaxies. Developers used these data to benchmark, e.g., methods for estimating galaxy distance. Here, we can assume the distance has been estimated well, and use these data to try to model galaxy mass as a function of brightness and distance.
The dataset contains measures of magnitude and magnitude uncertainty in six astronomical bands (u for ultraviolet, g for green, r for red, i for infrared, and z and y for two additional infrared bands). Magnitude is a logarithmic measure of brightness, with an increase of 5 representing a decrease in brightness by a factor of 100, and with a value of zero being represented (roughly) by how the star Vega appears in the night sky. In addition, there is a redshift measured for each galaxy; it represents how much the light from the galaxy is stretched (by the expansion of the Universe) as it travels to us. Thus higher redshifts represent larger distances. The last measurement is log.mass, which is the base-10 logarithm of the galaxy stellar mass in units of solar mass; for instance, log.mass = 10 means that the galaxy has a mass 10 billion times that of the Sun.
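The scales described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical and not part of the dataset, and the redshift relation uses the standard definition in which observed wavelength equals emitted wavelength times (1 + redshift).

```python
import math


def brightness_ratio(delta_mag):
    """Factor by which brightness drops when magnitude increases by delta_mag."""
    # 5 magnitudes correspond to a factor of 100, so 1 magnitude is 100 ** (1 / 5).
    return 10.0 ** (0.4 * delta_mag)


def stellar_mass_solar(log_mass):
    """Convert log.mass (base-10 log of stellar mass) to solar masses."""
    return 10.0 ** log_mass


def observed_wavelength(emitted_wavelength, redshift):
    """Stretch an emitted wavelength by the factor (1 + redshift)."""
    return emitted_wavelength * (1.0 + redshift)


print(brightness_ratio(5.0))          # magnitude difference of 5
print(stellar_mass_solar(10.0))       # log.mass = 10
print(observed_wavelength(500.0, 1.0))
```

A magnitude difference of 5 gives a factor of 100 in brightness, and log.mass = 10 gives ten billion solar masses, matching the description.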
As noted above, the idea here is to learn a statistical association between measures of magnitude and distance, and galaxy mass.
One wrinkle here that analysts can exploit is that the data contain standard error estimates for the magnitudes (though not for redshift, for which, in practice, the error would be ).
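One simple way to learn such an association is an ordinary least-squares fit of log.mass on magnitudes and redshift. The sketch below is only an illustration under stated assumptions: it uses synthetic data rather than the Buzzard sample, the variable names and the "true" coefficients are made up, and a linear model is a starting point, not the method used by the dataset's authors. It also ignores the magnitude standard errors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins for two magnitudes and a redshift (not Buzzard values).
mag_r = rng.normal(22.0, 1.0, n)
mag_i = rng.normal(21.5, 1.0, n)
redshift = rng.uniform(0.0, 1.0, n)
predictors = np.column_stack([mag_r, mag_i, redshift])

# Hypothetical relation used only to generate a toy response.
true_coef = np.array([-0.3, -0.1, 1.5])
log_mass = 18.0 + predictors @ true_coef + rng.normal(0.0, 0.1, n)


def fit_linear(x, y):
    """Least-squares fit of y on x with an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


coef = fit_linear(predictors, log_mass)
residuals = log_mass - (coef[0] + predictors @ coef[1:])
rmse = float(np.sqrt(np.mean(residuals ** 2)))
print("slopes:", coef[1:], "rmse:", rmse)
```

On the real data, the same pattern would apply with the six band magnitudes and the redshift as predictors; more flexible models may well fit better.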
Schmidt, Malz, Soo, Almosallam, Brescia, Cavuoti, Cohen-Tanugi, Connolly, DeRose, Freeman, Graham, Iyer, Jarvis, Kalmbach, Kovacs, Lee, Longo, Morrison, Newman, Nourbakhsh, Nuss, Pospisil, Tranin, Wechsler, Zhou, Izbicki, (The LSST Dark Energy Science Collaboration). “Evaluation of probabilistic photometric redshift estimation approaches for The Rubin Observatory Legacy Survey of Space and Time (LSST)”. Monthly Notices of the Royal Astronomical Society 499, December 2020, pages 1587–1606. https://doi.org/10.1093/mnras/staa2799
Photo from Unsplash
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
R scripts for statistical analysis of streamflow and sediment data, including flow duration curves, double mass analysis, nonlinear regression analysis for suspended sediment rating curves, and stationarity tests, together with several plots.
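As an illustration of one of the listed analyses, a flow duration curve can be sketched as follows. This is not taken from the scripts in this dataset: the function name is hypothetical, and the Weibull plotting position rank / (n + 1) is an assumed, commonly used choice.

```python
def flow_duration_curve(flows):
    """Return (exceedance probability in percent, flow) pairs, highest flow first."""
    ordered = sorted(flows, reverse=True)
    n = len(ordered)
    # Weibull plotting position: the rank-th largest flow is exceeded
    # or equalled about rank / (n + 1) of the time.
    return [(100.0 * rank / (n + 1), q) for rank, q in enumerate(ordered, start=1)]


# Made-up daily flows, for illustration only.
curve = flow_duration_curve([10.0, 30.0, 20.0])
for percent, flow in curve:
    print(f"{percent:5.1f}% of the time flow >= {flow}")
```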
Mass spectrometry imaging (MSI) experiments result in complex multi-dimensional datasets, which require specialist data analysis tools. Here we have developed massPix, an R package for analysing and interpreting data from MSI of lipids in tissue. MassPix is an open-source tool for the analysis and statistical interpretation of MSI data, and is particularly useful for lipidomics applications. MassPix produces single ion images, performs multivariate statistics and provides putative lipid annotations based on accurate mass matching against generated lipid libraries. Classification of tissue regions with high spectral similarity can be carried out by principal components analysis (PCA) or k-means clustering. Mouse cerebellum was analysed using matrix assisted laser desorption ionisation (MALDI) MSI. The resulting MSI dataset forms the test data for massPix.
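Accurate mass matching of the kind mentioned above is usually expressed as a parts-per-million (ppm) tolerance between an observed m/z and a library m/z. The sketch below shows that idea only; it is not massPix code, and the function names, the 5 ppm tolerance, and the library entries (with made-up m/z values) are all hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def annotate(observed_mz, library, tolerance_ppm=5.0):
    """Return (name, ppm error) for library entries within the tolerance, best first."""
    hits = []
    for name, mz in library.items():
        err = ppm_error(observed_mz, mz)
        if abs(err) <= tolerance_ppm:
            hits.append((name, err))
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Placeholder library with invented m/z values, not real lipid masses.
toy_library = {"lipid_A": 760.5850, "lipid_B": 762.6000}
matches = annotate(760.5860, toy_library)
print(matches)
```

Annotations obtained this way are putative, since different lipids can share nearly the same mass.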
This dataset contains aerosol gravimetric analysis of mass, using 2-stage multi-jet cascade impactors, taken aboard the Ron Brown ship during the ACE-Asia field project. This dataset contains the tab delimited (.acf) data files. Data can also be downloaded in a netCDF format.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
The FragPipe computational proteomics platform is gaining widespread popularity among the proteomics research community because of its fast processing speed and user-friendly graphical interface. Although FragPipe produces well-formatted output tables that are ready for analysis, there is still a need for an easy-to-use downstream statistical analysis and visualization tool. FragPipe-Analyst addresses this need by providing an R Shiny web server to assist FragPipe users in conducting downstream analyses of the resulting quantitative proteomics data. It supports major quantification workflows, including label-free quantification, tandem mass tags, and data-independent acquisition. FragPipe-Analyst offers a range of useful functionalities, such as various missing value imputation options, data quality control, unsupervised clustering, differential expression (DE) analysis using Limma, and gene ontology and pathway enrichment analysis using Enrichr. To support advanced analysis and customized visualizations, we also developed FragPipeAnalystR, an R package encompassing all FragPipe-Analyst functionalities that is extended to support site-specific analysis of post-translational modifications (PTMs). FragPipe-Analyst and FragPipeAnalystR are both open-source and freely available.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
The repository contains three mzML and four imzML mass spectrometry datasets,
The mzML data are compiled in a single directory 'mzML' and zipped:
The imzML mass spectrometry imaging data are zipped individually:
All these datasets are publicly available from different repositories; however, if you reuse them, please attribute the original authors.
U.S. Government Works (https://www.usa.gov/government-works)
License information was derived automatically
This data release contains extended estimates of daily groundwater levels and monthly percentiles at 27 short-term monitoring wells in Massachusetts. The Maintenance of Variance Extension Type 1 (MOVE.1) regression method was used to extend short-term groundwater levels at wells with less than 10 years of continuous data. This method uses groundwater level data from a correlated long-term monitoring well (index well) to estimate the groundwater level record for the short-term monitoring well. MOVE.1 regressions are used widely throughout the hydrologic community to extend flow records from streamgaging stations but are less commonly used to extend groundwater records at wells. The data in this data release document the results of the MOVE.1 regressions to estimate groundwater levels and compute updated monthly percentiles for select wells used in the groundwater index in the Massachusetts Drought Management Plan (2019). The U.S. Geological Survey (USGS) groundwater identification ...
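The MOVE.1 idea can be sketched in a few lines. This is an illustration, not the code behind the data release: the function names are hypothetical and the sample values are made up. It assumes the usual MOVE.1 form, in which the fitted line passes through the two means and its slope is the ratio of the standard deviations (carrying the sign of the correlation), so the extended record keeps the variance of the short-term record.

```python
import math
import statistics


def fit_move1(index_levels, short_levels):
    """Fit MOVE.1 on the concurrent period; returns (slope, intercept)."""
    mean_x = statistics.fmean(index_levels)
    mean_y = statistics.fmean(short_levels)
    sd_x = statistics.stdev(index_levels)
    sd_y = statistics.stdev(short_levels)
    # The slope magnitude is sd_y / sd_x; its sign follows the correlation.
    co_moment = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(index_levels, short_levels)
    )
    slope = math.copysign(sd_y / sd_x, co_moment)
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extend_record(index_levels, slope, intercept):
    """Estimate short-term well levels from index-well levels."""
    return [intercept + slope * x for x in index_levels]


# Made-up concurrent observations (index well, short-term well).
slope, intercept = fit_move1([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
estimates = extend_record([0.0, 5.0], slope, intercept)
print(slope, intercept, estimates)
```

Unlike ordinary least squares, the slope here does not shrink with the correlation coefficient, which is why the method is described as maintaining variance.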
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Datasets and R scripts from González-Suárez, M; Gonzalez-Voyer, A; von Hardenberg, A; Santini, L (2021) The role of brain size on mammalian population densities. Journal of Animal Ecology, 90: 653–661. DOI: 10.1111/1365-2656.13397. Additional details in the README.pdf file.
SUMMARY OF FILES INCLUDED
· 12 csv datasets from other sources (described below) with brain and body mass data in the zip file Brain and Mass data.
· Six csv datasets from other sources and compilations (described below) with population density and diet information
· Two csv files (Complete_dataset_published.csv, Brain_data_compilation_published.csv) produced during this study. Details of the files and the compilation protocol are provided in the README file.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Additional file 2. Markdown for using the Raman2imzML package.
Attribution 3.0 (CC BY 3.0) (https://creativecommons.org/licenses/by/3.0/)
License information was derived automatically
This dataset is about: (Appendix 2) Compilation of mass accumulation rates from deep sea sediments during the Holocene. Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.726364 for more information.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Supplemental data for R analyses; each sheet should be saved as its own .csv file for use in R. This version contains the humerus and femur circumference metrics in sheet 2 (BodyMass_RegMtrx) that were missing in version 1.
Supporting datasets and algorithms (R-based) for the manuscript entitled "Development of a Tool to Determine the Variability of Consensus Mass Spectra", including an R Markdown script to reproduce the manuscript's figures.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Here we describe the release of the measurements of the galaxy Stellar Mass Function and quiescent mass fractions based on the COSMOS2020 Farmer Catalogue and LePhare estimates of redshifts, masses, and rest-frame colours as described in Weaver et al. 2023 (arXiv:2212.02512v1).
When using these data products please cite both this SMF paper (Weaver et al. 2023) and the COSMOS2020 Catalogue (Weaver et al. 2022). Links to ADS export citations:
SMF | https://ui.adsabs.harvard.edu/abs/2022arXiv221202512W
COSMOS2020 | https://ui.adsabs.harvard.edu/abs/2022ApJS..258...11W
Please reach out if you have any questions or concerns.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
Replication data for the publication: "rSIREM: an R package for MALDI spectral deconvolution" by Del Castillo Pérez et al. The deposited data are SALDI-MSI data of three consecutive thin tissue sections from mouse cerebellum, measured at different mass resolutions on the same instrument (MALDI-MSI: Spectroglyph Injector - Orbitrap Exploris). The paper describes a new R package (rSIREM) to computationally improve the mass resolution of MSI data after measurement. The developed R package (https://github.com/EdelCastillo/rSirem) applies a statistical treatment to the spatial concentration images obtained by considering each m/z separately over all the pixels. A representative scalar is associated with each image by applying a new measure (SIREM), derived from Shannon's entropy. Perturbations of this measure across a sequence of consecutive images reveal overlap where it exists. This information serves as a seed to initialize the EM algorithm in the Gaussian Mixture Model context. The efficiency of the method has been verified using three independent procedures.
This dataset was collected by taking linear measurements on postcranial bones in museum collections. The data has been processed using R scripts.
VEPP-4 collider. Measurement of R in e+ e- interactions in the centre-of-mass energy range 7.25 to 10.34 GeV using the MD-1 detector. Data corrected for background and radiative effects.
Scripts for analysis of DI-qTOF data recorded for studies of refractory dissolved organic matter
Attribution 3.0 (CC BY 3.0) (https://creativecommons.org/licenses/by/3.0/)
License information was derived automatically
This dataset is about: (Appendix 2) Compilation of mass accumulation rates from deep sea sediments during the last glacial maximum. Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.726364 for more information.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Requisite R files for the Proteo-SAFARI app