11 datasets found

Subsetting
paper.erudition.co.in
html
Updated Mar 17, 2025
Cite
Einetic (2025). Subsetting [Dataset]. https://paper.erudition.co.in/makaut/bachelor-of-computer-application-2023-2024/2/data-analysis-with-r/subsetting
Explore at:
Available download formats: html
Dataset updated
Mar 17, 2025
Dataset authored and provided by
Einetic
License
https://paper.erudition.co.in/terms
Description
Question Paper Solutions of chapter Subsetting of Data Analysis with R, 2nd Semester, Bachelor of Computer Application 2023-2024
Simulation
paper.erudition.co.in
html
Updated Mar 17, 2025
Cite
Einetic (2025). Simulation [Dataset]. https://paper.erudition.co.in/makaut/bachelor-of-computer-application-2023-2024/2/data-analysis-with-r/subsetting
Explore at:
Available download formats: html
Dataset updated
Mar 17, 2025
Dataset authored and provided by
Einetic
License
https://paper.erudition.co.in/terms
Description
Question Paper Solutions of chapter Simulation of Data Analysis with R, 2nd Semester, Bachelor of Computer Application 2023-2024
SDSS Galaxy Subset
data.niaid.nih.gov
zenodo.org
Updated Sep 6, 2022
Cite
Carvalho, Nuno Ramos (2022). SDSS Galaxy Subset [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_6393487
Explore at:
Dataset updated
Sep 6, 2022
Dataset authored and provided by
Carvalho, Nuno Ramos
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Sloan Digital Sky Survey (SDSS) is a comprehensive survey of the northern sky. This dataset contains a subset of that survey: 100077 objects classified as galaxies. It includes a CSV file with a collection of information and a set of files for each object, namely JPG image files, FITS and spectra data. The dataset is used to train and explore the astromlp-models collection of deep learning models for galaxy characterisation.

The dataset includes a CSV data file where each row is an object from the SDSS database, and with the following columns (note that some data may not be available for all objects):

objid: unique SDSS object identifier

mjd: MJD of observation

plate: plate identifier

tile: tile identifier

fiberid: fiber identifier

run: run number

rerun: rerun number

camcol: camera column

field: field number

ra: right ascension

dec: declination

class: spectroscopic class (only objects with GALAXY are included)

subclass: spectroscopic subclass

modelMag_u: better of DeV/Exp magnitude fit for band u

modelMag_g: better of DeV/Exp magnitude fit for band g

modelMag_r: better of DeV/Exp magnitude fit for band r

modelMag_i: better of DeV/Exp magnitude fit for band i

modelMag_z: better of DeV/Exp magnitude fit for band z

redshift: final redshift from SDSS data z

stellarmass: stellar mass extracted from the eBOSS Firefly catalog

w1mag: WISE W1 "standard" aperture magnitude

w2mag: WISE W2 "standard" aperture magnitude

w3mag: WISE W3 "standard" aperture magnitude

w4mag: WISE W4 "standard" aperture magnitude

gz2c_f: Galaxy Zoo 2 classification from Willett et al 2013

gz2c_s: simplified version of Galaxy Zoo 2 classification (labels set)
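
The column list above can be illustrated with a minimal Python sketch that reads such a CSV and tolerates missing values, since the description notes that some data may not be available for all objects. The sample rows, their values, and the assumption that missing fields appear as empty strings are hypothetical; only the column names come from the description.

```python
import csv
import io

# Hypothetical two-row sample using a few of the documented column names.
# The values are made up for illustration and are not taken from data.csv.
SAMPLE = """objid,ra,dec,class,modelMag_u,modelMag_r,redshift
1001,150.1,2.2,GALAXY,19.5,17.0,0.08
1002,151.3,2.9,GALAXY,,17.4,0.11
"""


def to_float(value):
    """Return a float, or None when the field is empty (assumed to mean 'not available')."""
    return float(value) if value not in ("", None) else None


def load_rows(handle):
    """Parse rows and derive a u - r colour where both magnitudes are present."""
    rows = []
    for rec in csv.DictReader(handle):
        u = to_float(rec["modelMag_u"])
        r = to_float(rec["modelMag_r"])
        rows.append({
            "objid": rec["objid"],
            "redshift": to_float(rec["redshift"]),
            "u_minus_r": None if u is None or r is None else u - r,
        })
    return rows


rows = load_rows(io.StringIO(SAMPLE))
```

With the real file, `io.StringIO(SAMPLE)` would be replaced by an open handle on `data.csv`; how missing values are actually encoded there should be checked before relying on `to_float`.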

Besides the CSV file, the dataset includes a set of directories. Each directory holds files named after the objid column of the CSV file, containing the corresponding data. The following directory tree is available:

sdss-gs/
├── data.csv
├── fits
├── img
├── spectra
└── ssel

where each directory contains:

img: RGB images from the object in JPEG format, 150x150 pixels, generated using the SkyServer DR16 API

fits: FITS data subsets around the object across the u, g, r, i, z bands; cut is done using the ImageCutter library

spectra: full best fit spectra data from SDSS between 4000 and 9000 wavelengths

ssel: best fit spectra data from SDSS for specific selected intervals of wavelengths discussed by Sánchez Almeida 2010

Changelog

v0.0.4 - Increase number of objects to ~100k.

v0.0.3 - Increase number of objects to ~80k.

v0.0.2 - Increase number of objects to ~60k.

v0.0.1 - Initial import.
Scoping Rules
paper.erudition.co.in
html
Updated Mar 17, 2025
Cite
Einetic (2025). Scoping Rules [Dataset]. https://paper.erudition.co.in/makaut/bachelor-of-computer-application-2023-2024/2/data-analysis-with-r/subsetting
Explore at:
Available download formats: html
Dataset updated
Mar 17, 2025
Dataset authored and provided by
Einetic
License
https://paper.erudition.co.in/terms
Description
Question Paper Solutions of chapter Scoping Rules of Data Analysis with R, 2nd Semester, Bachelor of Computer Application 2023-2024
Data from: Effects of nutrient enrichment on freshwater macrophyte and...
zenodo.org
Updated Dec 4, 2023
Cite
Floris K. Neijnens; Hadassa Moreira; Melinda M.J. De Jonge; Bart B.H.P. Linssen; Mark A.J. Huijbregts; Gertjan W. Geerling; Aafke M. Schipper (2023). Effects of nutrient enrichment on freshwater macrophyte and invertebrate abundance: A meta-analysis [Dataset]. http://doi.org/10.5281/zenodo.10251772
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.10251772
Dataset updated
Dec 4, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Floris K. Neijnens; Hadassa Moreira; Melinda M.J. De Jonge; Bart B.H.P. Linssen; Mark A.J. Huijbregts; Gertjan W. Geerling; Aafke M. Schipper
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The zip-file contains the data and code accompanying the paper 'Effects of nutrient enrichment on freshwater macrophyte and invertebrate abundance: A meta-analysis'. Together, these files should allow for the replication of the results.
The 'raw_data' folder contains the 'MA_database.csv' file, which contains the extracted data from all primary studies that are used in the analysis. Furthermore, this folder contains the file 'MA_database_description.txt', which gives a description of each data column in the database.
The 'derived_data' folder contains the files that are produced by the R-scripts in this study and used for data analysis. The 'MA_database_processed.csv' and 'MA_database_processed.RData' files contain the converted raw database that is suitable for analysis. The 'DB_IA_subsets.RData' file contains the 'Individual Abundance' (IA) data subsets based on taxonomic group (invertebrates/macrophytes) and inclusion criteria. The 'DB_IA_VCV_matrices.RData' contains for all IA data subsets the variance-covariance (VCV) matrices. The 'DB_AM_subsets.RData' file contains the 'Total Abundance' (TA) and 'Mean Abundance' (MA) data subsets based on taxonomic group (invertebrates/macrophytes) and inclusion criteria.
The 'output_data' folder contains maps with the output data for each data subset (i.e. for each metric, taxonomic group and set of inclusion criteria). For each data subset, the map contains random effects selection results ('Results1_REsel_
The 'scripts' folder contains all R-scripts that we used for this study. The 'PrepareData.R' script takes the database as input and adjusts the file so that it can be used for data analysis. The 'PrepareDataIA.R' and 'PrepareDataAM.R' scripts make subsets of the data and prepare the data for the meta-regression analysis and mixed-effects regression analysis, respectively. The regression analyses are performed in the 'SelectModelsIA.R' and 'SelectModelsAM.R' scripts to calculate the regression model results for the IA metric and MA/TA metrics, respectively. These scripts require the 'RandomAndFixedEffects.R' script, containing the random and fixed effects parameter combinations, as well as the 'Functions.R' script. The 'CreateMap.R' script creates a global map with the location of all studies included in the analysis (figure 1 in the paper). The 'CreateForestPlots.R' script creates plots showing the IA data distribution for both taxonomic groups (figure 2 in the paper). The 'CreateHeatMaps.R' script creates heat maps for all metrics and taxonomic groups (figure 3 in the paper, figures S11.1 and S11.2 in the appendix). The 'CalculateStatistics.R' script calculates the descriptive statistics that are reported throughout the paper, and creates the figures that describe the dataset characteristics (figures S3.1 to S3.5 in the appendix). The 'CreateFunnelPlots.R' script creates the funnel plots for both taxonomic groups (figures S6.1 and S6.2 in the appendix) and performs Egger's tests. The 'CreateControlGraphs.R' script creates graphs showing the dependency of the nutrient response to control concentrations for all metrics and taxonomic groups (figures S10.1 and S10.2 in the appendix).
The 'figures' folder contains all figures that are included in this study.
MISR L1B2 Ellipsoid Product subset for the SAMUM region V003
datasets.ai
data.nasa.gov
+4 more
21
Updated Aug 9, 2024
Cite
National Aeronautics and Space Administration (2024). MISR L1B2 Ellipsoid Product subset for the SAMUM region V003 [Dataset]. https://datasets.ai/datasets/misr-l1b2-ellipsoid-product-subset-for-the-samum-region-v003
Explore at:
Available download formats: 21
Dataset updated
Aug 9, 2024
Dataset provided by
NASA (http://nasa.gov/)
Authors
National Aeronautics and Space Administration
Description
This file contains Ellipsoid-projected TOA Radiance, resampled at the surface and topographically corrected, as well as geometrically corrected by PGE22 for the SAMUM_2006 theme.
Data from: Effects of nutrient enrichment on freshwater macrophyte and...
zenodo.org
Updated Dec 13, 2023
Cite
Floris K. Neijnens; Hadassa Moreira; Melinda M.J. De Jonge; Bart B.H.P. Linssen; Mark A.J. Huijbregts; Gertjan W. Geerling; Aafke M. Schipper (2023). Effects of nutrient enrichment on freshwater macrophyte and invertebrate abundance: A meta-analysis [Dataset]. http://doi.org/10.5281/zenodo.10372444
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.10372444
Dataset updated
Dec 13, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Floris K. Neijnens; Hadassa Moreira; Melinda M.J. De Jonge; Bart B.H.P. Linssen; Mark A.J. Huijbregts; Gertjan W. Geerling; Aafke M. Schipper
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The zip-file contains the data and code accompanying the paper 'Effects of nutrient enrichment on freshwater macrophyte and invertebrate abundance: A meta-analysis'. Together, these files should allow for the replication of the results.

The 'raw_data' folder contains the 'MA_database.csv' file, which contains the extracted data from all primary studies that are used in the analysis. Furthermore, this folder contains the file 'MA_database_description.txt', which gives a description of each data column in the database.

The 'derived_data' folder contains the files that are produced by the R-scripts in this study and used for data analysis. The 'MA_database_processed.csv' and 'MA_database_processed.RData' files contain the converted raw database that is suitable for analysis. The 'DB_IA_subsets.RData' file contains the 'Individual Abundance' (IA) data subsets based on taxonomic group (invertebrates/macrophytes) and inclusion criteria. The 'DB_IA_VCV_matrices.RData' contains for all IA data subsets the variance-covariance (VCV) matrices. The 'DB_AM_subsets.RData' file contains the 'Total Abundance' (TA) and 'Mean Abundance' (MA) data subsets based on taxonomic group (invertebrates/macrophytes) and inclusion criteria.

The 'output_data' folder contains maps with the output data for each data subset (i.e. for each metric, taxonomic group and set of inclusion criteria). For each data subset, the map contains random effects selection results ('Results1_REsel_

The 'scripts' folder contains all R-scripts that we used for this study. The 'PrepareData.R' script takes the database as input and adjusts the file so that it can be used for data analysis. The 'PrepareDataIA.R' and 'PrepareDataAM.R' scripts make subsets of the data and prepare the data for the meta-regression analysis and mixed-effects regression analysis, respectively. The regression analyses are performed in the 'SelectModelsIA.R' and 'SelectModelsAM.R' scripts to calculate the regression model results for the IA metric and MA/TA metrics, respectively. These scripts require the 'RandomAndFixedEffects.R' script, containing the random and fixed effects parameter combinations, as well as the 'Functions.R' script. The 'CreateMap.R' script creates a global map with the location of all studies included in the analysis (figure 1 in the paper). The 'CreateForestPlots.R' script creates plots showing the IA data distribution for both taxonomic groups (figure 2 in the paper). The 'CreateHeatMaps.R' script creates heat maps for all metrics and taxonomic groups (figure 3 in the paper, figures S11.1 and S11.2 in the appendix). The 'CalculateStatistics.R' script calculates the descriptive statistics that are reported throughout the paper, and creates the figures that describe the dataset characteristics (figures S3.1 to S3.5 in the appendix). The 'CreateFunnelPlots.R' script creates the funnel plots for both taxonomic groups (figures S6.1 and S6.2 in the appendix) and performs Egger's tests. The 'CreateControlGraphs.R' script creates graphs showing the dependency of the nutrient response to control concentrations for all metrics and taxonomic groups (figures S10.1 and S10.2 in the appendix).

The 'figures' folder contains all figures that are included in this study.
CELEX Dutch lexical database - Syntax Subset
live.european-language-grid.eu
catalogue.elra.info
Cite
CELEX Dutch lexical database - Syntax Subset [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/2239
Explore at:
License
http://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
http://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
Description
The Dutch CELEX data is derived from R.H. Baayen, R. Piepenbrock & L. Gulikers, The CELEX Lexical Database (CD-ROM), Release 2, Dutch Version 3.1, Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, 1995. Apart from orthographic features, the CELEX database comprises representations of the phonological, morphological, syntactic and frequency properties of lemmata. For the Dutch data, frequencies have been disambiguated on the basis of the 42.4m Dutch Instituut voor Nederlandse Lexicologie text corpora. To make for greater compatibility with other operating systems, the databases have not been tailored to fit any particular database management program. Instead, the information is presented in a series of plain ASCII files, which can be queried with tools such as AWK and ICON. Unique identity numbers allow the linking of information from different files.

This database can be divided into different subsets:
· orthography: with or without diacritics, with or without word division positions, alternative spellings, number of letters/syllables;
· phonology: phonetic transcriptions with syllable boundaries or primary and secondary stress markers, consonant-vowel patterns, number of phonemes/syllables, alternative pronunciations, frequency per phonetic syllable within words;
· morphology: division into stems and affixes, flat or hierarchical representations, stems and their inflections;
· syntax: word class, subcategorisations per word class;
· frequency of the entries: disambiguated for homographic lemmata.
MERRA-2 subset for evaluation of renewables with merra2ools R-package:...
zenodo.org
data.niaid.nih.gov
+1 more
bin, txt
Updated Jun 3, 2022
Cite
Oleg Lugovoy; Shuo Gao (2022). MERRA-2 subset for evaluation of renewables with merra2ools R-package: 1980-2020 hourly, 0.5° lat x 0.625° lon global grid [Dataset]. http://doi.org/10.5061/dryad.v41ns1rtt
Explore at:
Available download formats: bin, txt
Unique identifier
https://doi.org/10.5061/dryad.v41ns1rtt
Dataset updated
Jun 3, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Oleg Lugovoy; Shuo Gao
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Renewable variable energy resources (VER), namely solar and wind energy, are becoming increasingly important sources of electricity worldwide. Assessing the potential and the reliability of these resources requires long-term historical data. Directly measured solar radiation and wind speed are limited to the locations of weather stations, and even when available, the observations are not directly suitable for evaluating VER potential (for example, wind speed is rarely measured at wind turbine heights). Reanalysis data based on satellite imagery and Earth system models, such as MERRA-2, offer a broad set of long-term time series on a global grid.
`merra2ools` is a preprocessed subset of MERRA-2 variables and a software package (R-package) designed for quick estimation of the hourly output of solar photovoltaics and wind turbines. The grid of the MERRA-2 dataset has a step of 0.625° along longitude (-180° to 180°) and 0.5° along latitude (-90° to 90°), making a 576 x 361 grid, or 207936 locations. The subset of the hourly data covers the period from 1980-Jan-01 00:30 UTC to 2020-Jan-31 23:30 UTC. It includes eight variables: wind speed at 10 and 50 meters height (W10M and W50M), wind direction (WDIR), the atmospheric temperature at 10 meters height (T10M), surface incoming shortwave flux (SWGDN), surface albedo (ALBEDO), bias-corrected total precipitation (PRECTOTCORR), and air density at the surface (RHOA). The dataset's key variables are date-time in the Coordinated Universal Time timezone (UTC) and location identifiers (locid). In total, the subset has 290,357,084,160 data points (362,946,355,200 including the key variables). To reduce the dataset's memory footprint (~3TB uncompressed), the original MERRA-2 variables have been rounded, scaled, and stored as integers in a highly compressed data format with high-speed full random access (`fst` package for R). The resulting dataset is saved in separate files by month (41 years x 12 months, 492 data files in total). Additionally, some summary statistics, such as mean values of each variable by month and location ID and annual spatial correlations with the nearest neighbors, have been calculated for wind speed and solar irradiance and added to the dataset.
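
The figures quoted in this description can be cross-checked with a few lines of arithmetic. The Python sketch below is only an illustration: the variable names are made up, and it assumes that longitude wraps around (so 180° and -180° share one column) while latitude includes both poles, and that the two key variables (date-time and locid) are stored once per row.

```python
# Grid dimensions implied by the quoted step sizes.
lon_points = int(360 / 0.625)        # longitude wraps around: 576 columns
lat_points = int(180 / 0.5) + 1      # latitude includes both poles: 361 rows
locations = lon_points * lat_points  # total grid locations

# One data file per month over 41 years.
files = 41 * 12

# Relation between the two data-point totals quoted in the description.
data_points = 290_357_084_160        # values of the eight variables
variables = 8
key_columns = 2                      # date-time and locid (assumed one each per row)
rows = data_points // variables
total_with_keys = rows * (variables + key_columns)

print(locations, files, total_with_keys)
```

Under these assumptions the computed grid size, file count and total including key variables agree with the 207936 locations, 492 files and 362,946,355,200 data points stated in the description.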
Data from: Subset of turbulent energy fluxes, meteorology and soil physics...
catalogue.ceh.ac.uk
data-search.nerc.ac.uk
+1 more
zip
Updated Jan 24, 2020
Cite
R. Morrison; H.M. Cooper; A.M.J. Cumming; C. Evans; S. Oakley; N.P. McNamara; R. Pywell; P. Scarlett (2020). Subset of turbulent energy fluxes, meteorology and soil physics observations collected at eddy covariance sites in southeast England, June 2019 [Dataset]. http://doi.org/10.5285/0254620f-9cf1-4d5b-af3f-bd8a6af95e96
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5285/0254620f-9cf1-4d5b-af3f-bd8a6af95e96
Dataset updated
Jan 24, 2020
Dataset provided by
NERC EDS Environmental Information Data Centre
Authors
R. Morrison; H.M. Cooper; A.M.J. Cumming; C. Evans; S. Oakley; N.P. McNamara; R. Pywell; P. Scarlett
Time period covered
Jun 22, 2019 - Jul 6, 2019
Dataset funded by
Natural Environment Research Council (https://www.ukri.org/councils/nerc)
Description
This dataset contains time series observations of surface-atmosphere exchanges of sensible heat (H), latent heat (LE) and momentum (τ) measured at UKCEH eddy covariance flux observation sites during summer 2019. The dataset includes ancillary weather and soil physics observations made at each site. Eddy covariance (EC) and micrometeorological observations were collected using open-path eddy covariance systems. Flux, meteorological and soil physics observations were collected and processed using harmonised protocols across all sites. This work was supported by the Natural Environment Research Council award number NE/R016429/1 as part of the UK-SCAPE programme delivering National Capability.
MISR Geometric Parameters subset for the GoMACCS region V002
data.nasa.gov
access.earthdata.nasa.gov
+3 more
application/rdfxml +5
Updated Sep 20, 2019
Cite
(2019). MISR Geometric Parameters subset for the GoMACCS region V002 [Dataset]. https://data.nasa.gov/dataset/MISR-Geometric-Parameters-subset-for-the-GoMACCS-r/pstf-x3wh
Explore at:
Available download formats: csv, xml, json, application/rssxml, tsv, application/rdfxml
Dataset updated
Sep 20, 2019
Description
Multi-angle Imaging SpectroRadiometer (MISR) is an instrument designed to view Earth with cameras pointed in 9 different directions. As the instrument flies overhead, each piece of Earth's surface below is successively imaged by all 9 cameras, in each of 4 wavelengths (blue, green, red, and near-infrared). The goal of MISR is to improve our understanding of the fate of sunlight in Earth's environment, as well as to distinguish different types of clouds, particles and surfaces. Specifically, MISR monitors the monthly, seasonal, and long-term trends in three areas: 1) amount and type of atmospheric particles (aerosols), including those formed by natural sources and by human activities; 2) amounts, types, and heights of clouds; and 3) distribution of land surface cover, including vegetation canopy structure. MISR Geometric Parameters subset for the GoMACCS region V002 contains the Geometric Parameters, which measure the sun and view angles at the reference ellipsoid.