53 datasets found
  1. Subsetting

    • paper.erudition.co.in
    html
    Updated Aug 11, 2025
    Cite
    Einetic (2025). Subsetting [Dataset]. https://paper.erudition.co.in/makaut/bachelor-of-computer-application-2023-2024/2/data-analysis-with-r/subsetting
    Available download formats: html
    Dataset updated
    Aug 11, 2025
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/terms

    Description

    Question paper solutions for the chapter "Subsetting" of Data Analysis with R, 2nd Semester, Bachelor of Computer Application 2023-2024

  2. Data release for solar-sensor angle analysis subset associated with the...

    • catalog.data.gov
    • data.amerigeoss.org
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). Data release for solar-sensor angle analysis subset associated with the journal article "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" [Dataset]. https://catalog.data.gov/dataset/data-release-for-solar-sensor-angle-analysis-subset-associated-with-the-journal-article-so
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    United States, Western United States
    Description

    This dataset provides geospatial location data and scripts used to analyze the relationship between MODIS-derived NDVI and solar and sensor angles in a pinyon-juniper ecosystem in Grand Canyon National Park. The data are provided in support of the following publication: "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States". The data and scripts allow users to replicate, test, or further explore results.

    The file GrcaScpnModisCellCenters.csv contains locations (latitude-longitude) of all the 250-m MODIS (MOD09GQ) cell centers associated with the Grand Canyon pinyon-juniper ecosystem that the Southern Colorado Plateau Network (SCPN) is monitoring through its land surface phenology and integrated upland monitoring programs. The file SolarSensorAngles.csv contains MODIS angle measurements for the pixel at the phenocam location plus a random 100-point subset of pixels within the GRCA-PJ ecosystem. The script files (folder: 'Code') consist of 1) a Google Earth Engine (GEE) script used to download MODIS data through the GEE JavaScript interface, and 2) a script used to calculate derived variables and to test relationships between solar and sensor angles and NDVI using the statistical software package 'R'.

    The file Fig_8_NdviSolarSensor.JPG shows NDVI dependence on solar and sensor geometry, demonstrated for both a single pixel/year and for multiple pixels over time. (Left) MODIS NDVI versus solar-to-sensor angle for the Grand Canyon phenocam location in 2018, the year for which there is corresponding phenocam data. (Right) Modeled r-squared values by year for 100 randomly selected MODIS pixels in the SCPN-monitored Grand Canyon pinyon-juniper ecosystem. The model for forward-scatter MODIS-NDVI is log(NDVI) ~ solar-to-sensor angle. The model for back-scatter MODIS-NDVI is log(NDVI) ~ solar-to-sensor angle + sensor zenith angle. Boxplots show interquartile ranges; whiskers extend to the 10th and 90th percentiles. The horizontal line marking the average median value for forward-scatter r-squared (0.835) is nearly indistinguishable from the back-scatter line (0.833).

    The dataset folder also includes supplemental R-project and packrat files that allow the user to apply the workflow by opening a project that will use the same package versions used in this study (e.g., the folders .Rproj.user and packrat, and the files .RData and PhenocamPR.Rproj). The empty folder GEE_DataAngles is included so that the user can save the data files from the Google Earth Engine scripts to this location, where they can then be incorporated into the R processing scripts without needing to change folder names. To successfully use the packrat information to replicate the exact processing steps that were used, the user should refer to the packrat documentation available at https://cran.r-project.org/web/packages/packrat/index.html and at https://www.rdocumentation.org/packages/packrat/versions/0.5.0. Alternatively, the user may use the descriptive documentation, the phenopix package documentation, and the description and references provided in the associated journal article to process the data and achieve the same results using newer packages or other software programs.
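
    The two model forms named in the description can be written as R formulas. The following is only a minimal sketch on synthetic data: the column names (ndvi, angle, zenith) and all values are assumptions for illustration and are not taken from SolarSensorAngles.csv or from the authors' scripts.

```r
# Minimal sketch with synthetic data; the column names (ndvi, angle, zenith)
# are hypothetical and do not come from SolarSensorAngles.csv.
set.seed(1)
n <- 200
angle  <- runif(n, 20, 120)   # solar-to-sensor angle, degrees (synthetic)
zenith <- runif(n, 0, 60)     # sensor zenith angle, degrees (synthetic)
ndvi   <- exp(-1.5 + 0.004 * angle + rnorm(n, sd = 0.05))
pixels <- data.frame(ndvi, angle, zenith)

# Forward-scatter form: log(NDVI) ~ solar-to-sensor angle
fit_forward <- lm(log(ndvi) ~ angle, data = pixels)

# Back-scatter form: log(NDVI) ~ solar-to-sensor angle + sensor zenith angle
fit_back <- lm(log(ndvi) ~ angle + zenith, data = pixels)

# r-squared of each fit, the quantity summarized per year in the figure
r2_forward <- summary(fit_forward)$r.squared
r2_back    <- summary(fit_back)$r.squared
```

    In the study the fits are made per pixel and year; here a single fit per model form is shown to keep the sketch short.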

  3. Data Mining Project - Boston

    • kaggle.com
    Updated Nov 25, 2019
    Cite
    SophieLiu (2019). Data Mining Project - Boston [Dataset]. https://www.kaggle.com/sliu65/data-mining-project-boston/discussion
    Dataset updated
    Nov 25, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    SophieLiu
    Area covered
    Boston
    Description

    Context

    To make this a seamless process, I cleaned the data and deleted many variables that I thought were not important to our dataset. I then uploaded all of those files to Kaggle for each of you to download. The rideshare_data file has both Lyft and Uber rides, but it is still a cleaned version of the dataset we downloaded from Kaggle.

    Use of Data Files

    You can easily subset the data into the car types that you will be modeling by first loading the csv into R. Here is the code for how you do this:

    This loads the file into R:

    # Read the Uber rides into a data frame
    uber_df <- read.csv('uber.csv')

    The next code subsets the data into a specific car type. The example below keeps only Uber 'Black' car types.

    # Keep only the rows whose name column is 'Black'
    df_black <- subset(uber_df, name == 'Black')

    Next, write this data frame to a csv file on your computer so that you can load it into R again later.

    write.csv(df_black, "nameofthefileyouwanttosaveas.csv")

    The file will appear in your working directory. If you are not familiar with your working directory, run this code:

    getwd()

    The output will be the file path to your working directory. You will find the file you just created in that folder.
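
    For anyone who wants to try the steps without the Kaggle files, here is a self-contained sketch of the same load, subset, and write round trip. The toy data frame, its price column, and the output file name are made up for illustration; only the name column and the 'Black' value come from the instructions above.

```r
# Toy stand-in for uber.csv; the columns and values are invented for illustration.
uber_df <- data.frame(
  name  = c("Black", "UberX", "Black", "UberPool"),
  price = c(27.5, 9.0, 31.0, 6.5),
  stringsAsFactors = FALSE
)

# Logical indexing; on this data it selects the same rows as
# subset(uber_df, name == "Black")
df_black <- uber_df[uber_df$name == "Black", ]

# Write to a temporary file, then read it back to confirm the round trip
out_file <- file.path(tempdir(), "uber_black.csv")
write.csv(df_black, out_file, row.names = FALSE)
df_check <- read.csv(out_file, stringsAsFactors = FALSE)
```

    Passing row.names = FALSE keeps write.csv from adding an extra column of row numbers, which would otherwise show up as a column named X when the file is read back.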

    Inspiration

    Your data will be in front of the world's largest data science community. What questions do you want to see answered?

  4. Source Code - Characterizing Variability and Uncertainty for Parameter...

    • catalog.data.gov
    • s.cnmilf.com
    Updated May 1, 2025
    Cite
    U.S. EPA Office of Research and Development (ORD) (2025). Source Code - Characterizing Variability and Uncertainty for Parameter Subset Selection in PBPK Models [Dataset]. https://catalog.data.gov/dataset/source-code-characterizing-variability-and-uncertainty-for-parameter-subset-selection-in-p
    Dataset updated
    May 1, 2025
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Source Code for the manuscript "Characterizing Variability and Uncertainty for Parameter Subset Selection in PBPK Models" -- This R code generates the results presented in this manuscript; the zip folder contains PBPK model files (for chloroform and DCM) and corresponding scripts to compile the models, generate human equivalent doses, and run sensitivity analysis.

  5. MERRA-2 subset for evaluation of renewables with merra2ools R-package:...

    • datadryad.org
    zip
    Updated Mar 29, 2021
    Cite
    Oleg Lugovoy; Shuo Gao (2021). MERRA-2 subset for evaluation of renewables with merra2ools R-package: 1980-2020 hourly, 0.5° lat x 0.625° lon global grid [Dataset]. http://doi.org/10.5061/dryad.v41ns1rtt
    Available download formats: zip
    Dataset updated
    Mar 29, 2021
    Dataset provided by
    Dryad
    Authors
    Oleg Lugovoy; Shuo Gao
    Time period covered
    Mar 19, 2021
    Description

    The merra2ools dataset has been assembled through the following steps:

    The MERRA-2 collections tavg1_2d_flx_Nx (Surface Flux Diagnostics), tavg1_2d_rad_Nx (Radiation Diagnostics), and tavg1_2d_slv_Nx (Single-level atmospheric state variables) were downloaded from the NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC) (https://disc.gsfc.nasa.gov/datasets?project=MERRA-2) using the GNU Wget network utility (https://disc.gsfc.nasa.gov/data-access). Each of the three collections consists of daily netCDF-4 files with 3-dimensional variables (lon x lat x hour).
    The following variables were obtained from the netCDF-4 files and merged into long-term time series:

    Northward (V) and eastward (U) wind at 10 and 50 meters (V10M, V50M, U10M, U50M, respectively), and 10-meter air temperature (T10M) from the tavg1_2d_slv_Nx collection;
    Incident shortwave land (SWGDN) and Surface albedo (ALBEDO) fro...
    
  6. Subset of Zurich Summer Dataset - SDAP - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Dec 24, 2018
    Cite
    (2018). Subset of Zurich Summer Dataset - SDAP - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/6ad6a27f-5c7d-5720-ade2-5b0aa94ee9d1
    Dataset updated
    Dec 24, 2018
    Description

    The dataset and its description are available at https://sites.google.com/site/michelevolpiresearch/data/zurich-dataset. A subset of 4 crops is selected. Morphological profiles are computed over each band (R, G, B, NIR) with the area attribute. The data are provided in LIBSVM format. The evaluation is performed with leave-one-out estimation: one image is held out and the model is trained on the remaining 3 scenes. The resulting file pairs are:
    1) training_set_sdap_1, test_set_sdap_1
    2) training_set_sdap_2, test_set_sdap_2
    3) training_set_sdap_3, test_set_sdap_3
    4) training_set_sdap_4, test_set_sdap_4
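
    The leave-one-out scheme over four scenes can be sketched as follows. The scene numbering is an assumption for illustration; the description does not say which scene is held out in which numbered file pair.

```r
scenes <- 1:4   # the four selected crops, numbered arbitrarily

# For each fold, hold one scene out for testing and train on the other three
folds <- lapply(scenes, function(held_out) {
  list(test = held_out, train = setdiff(scenes, held_out))
})

for (f in folds) {
  cat("test scene:", f$test, " train scenes:", f$train, "\n")
}
```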

  7. OpenML R Bot Benchmark Data (final subset)

    • figshare.com
    application/gzip
    Updated May 18, 2018
    Cite
    Daniel Kühn; Philipp Probst; Janek Thomas; Bernd Bischl (2018). OpenML R Bot Benchmark Data (final subset) [Dataset]. http://doi.org/10.6084/m9.figshare.5882230.v2
    Available download formats: application/gzip
    Dataset updated
    May 18, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Daniel Kühn; Philipp Probst; Janek Thomas; Bernd Bischl
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a clean subset of the data created by the OpenML R Bot, which executed benchmark experiments on the binary classification tasks of the OpenML100 benchmarking suite with six R algorithms: glmnet, rpart, kknn, svm, ranger and xgboost. The hyperparameters of these algorithms were drawn randomly. In total it contains more than 2.6 million benchmark experiments and can be used by other researchers. The subset was created by taking 500000 results of each learner (except for kknn, for which only 1140 results are available). The csv file for each learner is a table with one row per benchmark experiment, containing: OpenML data ID, hyperparameter values, performance measures (AUC, accuracy, Brier score), runtime, scimark (runtime reference of the machine), and some meta features of the dataset. OpenMLRandomBotResults.RData (format for R) contains all data in separate tables for the results, the hyperparameters, the meta features, the runtime, the scimark results and reference results.
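
    Capping the number of results per learner, as described above, might look like the sketch below. This is an illustration only: the data are synthetic, the counts and the cap are scaled down, and choosing the retained rows at random is an assumption, since the description does not say how the 500000 results per learner were picked.

```r
set.seed(42)
# Synthetic stand-in for the raw results; learner names follow the description,
# the counts and the cap are scaled down for illustration.
results <- data.frame(
  learner = rep(c("glmnet", "rpart", "kknn"), times = c(50, 40, 7)),
  auc     = runif(97)
)
cap <- 20   # the description uses 500000

# Keep at most `cap` randomly chosen rows per learner (random choice is an assumption)
by_learner <- split(seq_len(nrow(results)), results$learner)
keep <- unlist(lapply(by_learner, function(idx) {
  if (length(idx) > cap) idx[sample.int(length(idx), cap)] else idx
}), use.names = FALSE)
subset_results <- results[sort(keep), ]

counts <- table(subset_results$learner)
print(counts)
```

    A learner with fewer rows than the cap, like kknn here, keeps all of its rows, which mirrors the kknn exception in the description.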

  8. Loop Functions

    • paper.erudition.co.in
    html
    Updated Aug 11, 2025
    Cite
    Einetic (2025). Loop Functions [Dataset]. https://paper.erudition.co.in/makaut/bachelor-of-computer-application-2023-2024/2/data-analysis-with-r/subsetting
    Available download formats: html
    Dataset updated
    Aug 11, 2025
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/terms

    Description

    Question paper solutions for the chapter "Loop Functions" of Data Analysis with R, 2nd Semester, Bachelor of Computer Application 2023-2024

  9. Grib and ASCII data, subset ERA-I for shallow water waves Ocean Science...

    • zenodo.org
    bin, txt
    Updated Jan 24, 2020
    Cite
    J.-R. Bidlot (2020). Grib and ASCII data, subset ERA-I for shallow water waves Ocean Science study [Dataset]. http://doi.org/10.5281/zenodo.831329
    Available download formats: txt, bin
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    J.-R. Bidlot
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Specific output from the ERA-I reanalysis (wave model component) containing integrated parameters; see https://doi.org/10.5194/os-13-1-2017

  10. NEON Woody plant survey data: ACCE DTP analytical subset

    • annakrystalli.me
    csv
    Updated Mar 19, 2025
    Cite
    Anna Krystalli (2025). NEON Woody plant survey data: ACCE DTP analytical subset [Dataset]. https://annakrystalli.me/project/data/index.html
    Available download formats: csv
    Dataset updated
    Mar 19, 2025
    Dataset provided by
    R-RSE SMPC
    Authors
    Anna Krystalli
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    May 18, 2015 - Nov 16, 2018
    Area covered
    Variables measured
    uid, date, height, easting, plot_id, site_id, uid_map, uid_ppl, event_id, northing, and 22 more
    Dataset funded by
    National Science Foundation
    Description

    This data product, sourced from the NEON data portal for the purposes of the ACCE DTP tutorial, contains processed individual level data from measurements of woody individuals and shrub groups.

  11. Appendix S1 - parallelMCMCcombine: An R Package for Bayesian Methods for Big...

    • plos.figshare.com
    doc
    Updated May 30, 2023
    Cite
    Alexey Miroshnikov; Erin M. Conlon (2023). Appendix S1 - parallelMCMCcombine: An R Package for Bayesian Methods for Big Data and Analytics [Dataset]. http://doi.org/10.1371/journal.pone.0108425.s001
    Available download formats: doc
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Alexey Miroshnikov; Erin M. Conlon
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Remarks on kernels and bandwidth selection for semiparametric density product estimator method. (DOC)

  12. Data and Code for: Plasticity and not adaptation is the primary source of...

    • zenodo.org
    Updated Nov 28, 2023
    Cite
    Tadeo H. Ramirez-Parada; Isaac W. Park; Sydne Record; Charles C. Davis; Aaron M. Ellison; Susan J. Mazer (2023). Data and Code for: Plasticity and not adaptation is the primary source of temperature-mediated variation in flowering phenology in North America [Dataset]. http://doi.org/10.5281/zenodo.8310387
    Dataset updated
    Nov 28, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Tadeo H. Ramirez-Parada; Isaac W. Park; Sydne Record; Charles C. Davis; Aaron M. Ellison; Susan J. Mazer
    Description

    This submission contains all the code and data necessary for reproducing 1) the dataset, 2) the main results, and 3) all supplemental analyses appearing in the manuscript titled: Plasticity and not adaptation is the primary source of temperature-mediated variation in flowering phenology in North America (Ramirez-Parada, Park, Record, Davis, Ellison, and Mazer, 2023). A preprint of this manuscript can be accessed at: https://doi.org/10.21203/rs.3.rs-3131821/v1.

    Extracting the compressed file will generate a folder titled "Project folder", containing sub-folders named "Data" and "R code". In order for the code to work, users need to preserve the folder structure of the code and data, as the R Markdown files in the "R code" folder have relative file paths that read and write data within the "Data" folder. Moving either would require re-writing the filepaths across Rmds for the code to run.


    To replicate the results, the following R Markdowns must be run in sequence (once they have been run, the Rmds for supplemental analyses can be used in any order):


    "1. Subsetting Dataset.Rmd"

    This file processes a specimen dataset of ca. 2.3 million specimens that we assembled for this project (publicly available on Dryad: https://doi.org/10.25349/D9WP6S), filtering out duplicates, specimens out of the spatial scope of the PRISM data used for all analyses, and subsetting to only those species represented by a minimum of 300 specimens. This filtering yields a dataset of 1,038,047 specimens in flower across 1,605 species.

    For an in-depth description of the starting dataset, please refer to the "READ ME.txt" file within the "Project folder", and visit its corresponding Dryad repository (linked above).
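
    Two of the filtering steps described for "1. Subsetting Dataset.Rmd" (dropping duplicates and keeping only species with a minimum number of specimens) can be sketched in base R as below. This is not the authors' code: the table, its column names, and the scaled-down threshold are assumptions for illustration, and the spatial filter against the PRISM extent is left out.

```r
# Synthetic specimen table; column names and the threshold are illustrative
# (the description uses a minimum of 300 specimens per species).
specimens <- data.frame(
  specimen_id = c(1, 2, 2, 3, 4, 5, 6),
  species     = c("A", "A", "A", "A", "B", "B", "C"),
  stringsAsFactors = FALSE
)
min_specimens <- 2

# 1) Drop exact duplicate records
specimens <- specimens[!duplicated(specimens), ]

# 2) Count specimens per species and keep species meeting the minimum
counts <- table(specimens$species)
keep_species <- names(counts[counts >= min_specimens])
specimens_subset <- specimens[specimens$species %in% keep_species, ]
```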


    "2. Main Analysis - Estimating S_space, S_time, and S_diff.Rmd"

    This file uses the subset dataset produced by the previous Rmd to fit the varying-intercepts, varying-slopes model that produced the estimates of apparent plasticity and apparent adaptation underlying all main analyses. This Rmd exports a dataset of species-specific estimates of Sspace, Stime, and Sspace - Stime that is used to recreate Figures 2, 3, and 4 of the main text in the next step. This is the most time consuming R Markdown file to run, as each MCMC chain used to fit the model in Stan must be run on a dedicated processor (limiting the usefulness of parallel computation). Fitting the model using 3 MCMC chains, 1000 iterations for warmup, and 4000 iterations for sampling, took approximately 24 hours using an Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz processor.

    "3. Main Analysis - Figures 2, 3, and 4.Rmd"

    Finally, this Rmd uses the dataset of species-specific estimates to conduct all analyses underlying Figures 2, 3, and 4, recreating each of these figures.

    For detailed descriptions of all materials (code and data) and instructions for using them, please refer to the "READ ME.txt" file within "Project folder".

  13. SDSS Galaxy Subset

    • data.niaid.nih.gov
    • explore.openaire.eu
    Updated Sep 6, 2022
    Cite
    Carvalho, Nuno Ramos (2022). SDSS Galaxy Subset [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_6393487
    Dataset updated
    Sep 6, 2022
    Dataset authored and provided by
    Carvalho, Nuno Ramos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Sloan Digital Sky Survey (SDSS) is a comprehensive survey of the northern sky. This dataset contains a subset of that survey: 100077 objects classified as galaxies. It includes a CSV file with a collection of information and a set of files for each object, namely JPG image files, FITS data, and spectra data. The dataset is used to train and explore the astromlp-models collection of deep learning models for galaxy characterisation.

    The dataset includes a CSV data file where each row is an object from the SDSS database, and with the following columns (note that some data may not be available for all objects):

    objid: unique SDSS object identifier

    mjd: MJD of observation

    plate: plate identifier

    tile: tile identifier

    fiberid: fiber identifier

    run: run number

    rerun: rerun number

    camcol: camera column

    field: field number

    ra: right ascension

    dec: declination

    class: spectroscopic class (only objects with GALAXY are included)

    subclass: spectroscopic subclass

    modelMag_u: better of DeV/Exp magnitude fit for band u

    modelMag_g: better of DeV/Exp magnitude fit for band g

    modelMag_r: better of DeV/Exp magnitude fit for band r

    modelMag_i: better of DeV/Exp magnitude fit for band i

    modelMag_z: better of DeV/Exp magnitude fit for band z

    redshift: final redshift from SDSS data z

    stellarmass: stellar mass extracted from the eBOSS Firefly catalog

    w1mag: WISE W1 "standard" aperture magnitude

    w2mag: WISE W2 "standard" aperture magnitude

    w3mag: WISE W3 "standard" aperture magnitude

    w4mag: WISE W4 "standard" aperture magnitude

    gz2c_f: Galaxy Zoo 2 classification from Willett et al 2013

    gz2c_s: simplified version of Galaxy Zoo 2 classification (labels set)

    Besides the CSV file, the dataset includes a set of directories. Each directory holds files named after the objid column of the CSV file, containing the corresponding data. The following directory tree is available:

    sdss-gs/
    ├── data.csv
    ├── fits
    ├── img
    ├── spectra
    └── ssel

    where each directory contains:

    img: RGB images from the object in JPEG format, 150x150 pixels, generated using the SkyServer DR16 API

    fits: FITS data subsets around the object across the u, g, r, i, z bands; cut is done using the ImageCutter library

    spectra: full best fit spectra data from SDSS between 4000 and 9000 wavelengths

    ssel: best fit spectra data from SDSS for specific selected intervals of wavelengths discussed by Sánchez Almeida 2010
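
    Because the per-object files are named after objid, a catalog row can be linked to its files by name. The sketch below builds a throwaway copy of the layout in a temporary directory to show the idea; the objid values, the .jpg extension, and the helper files_for are hypothetical and not part of the dataset documentation.

```r
# Build a throwaway copy of the layout in a temp directory; the objid values
# and file extensions below are made up for illustration.
root <- file.path(tempdir(), "sdss-gs")
for (d in c("img", "fits", "spectra", "ssel")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
}
catalog <- data.frame(objid = c("1001", "1002"), redshift = c(0.05, 0.12),
                      stringsAsFactors = FALSE)
write.csv(catalog, file.path(root, "data.csv"), row.names = FALSE)
file.create(file.path(root, "img", "1001.jpg"))
file.create(file.path(root, "img", "1002.jpg"))

# Read objid as character: long integer identifiers can lose precision
# if R parses them as doubles
catalog <- read.csv(file.path(root, "data.csv"),
                    colClasses = c(objid = "character"))

# Hypothetical helper: find the files in a subdirectory named after an objid
files_for <- function(objid, subdir) {
  list.files(file.path(root, subdir),
             pattern = paste0("^", objid, "(\\.|$)"), full.names = TRUE)
}
img_files <- files_for(catalog$objid[1], "img")
```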

    Changelog

    v0.0.4 - Increase number of objects to ~100k.

    v0.0.3 - Increase number of objects to ~80k.

    v0.0.2 - Increase number of objects to ~60k.

    v0.0.1 - Initial import.

  14. Data release for winter peak extent analysis subset, 2003-2018, associated...

    • catalog.data.gov
    • data.usgs.gov
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). Data release for winter peak extent analysis subset, 2003-2018, associated with the journal article "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" [Dataset]. https://catalog.data.gov/dataset/data-release-for-winter-peak-extent-analysis-subset-2003-2018-associated-with-the-journal-
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Western United States
    Description

    This dataset is provided in support of the following publication: "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States". The data and code provided allow users to replicate, test, or further explore results.

    The dataset includes 2 raster datasets (folder: Rasters): 1) 'cntWinterPks2003_2018DR' provides a count of years with winter peaks from 2003-2018 in an 11-state area in the western United States. 2) The 'VegClassGte5_2003_2018' raster, within the zip file 'WinterPeaksVegTypes.zip', identifies the broad vegetation types for locations with common winter peaks (5 or more years out of 16). The dataset also includes the Google Earth Engine and R code files used to create the datasets.

    Additional files/folders provided include 1) Google Earth Engine scripts used to download MODIS data through the GEE JavaScript interface (folder: 'Code'). 2) Scripts used to manipulate rasters and to calculate and map the occurrence of winter NDVI peaks from 2003-2018 using the statistical software package 'R'. 3) Supplemental R-project and packrat files that allow the user to apply the workflow by opening a project that will use the same package versions used in this study, for example the folders 'Rproj.user' and 'packrat', and the files '.RData' and 'WinterPeakExtentPR.Rproj'. 4) Empty folders ('GEE_DataAnnPeak', 'GEE_DataLoose', and 'GEE_DataStrict') that should be used to contain the output from the GEE code files as follows: 'GEE_DataAnnPeak' should contain output from the S3 and S4 scripts, 'GEE_DataLoose' should contain output from the S1 script, and 'GEE_DataStrict' should contain output from the S2 script. 5) The graphic file 'Fig_9_MapsOfExtentPortrait2.jpg', which shows the temporal and ecosystem distribution of winter NDVI peaks in the western continental US, 2003 to 2018, derived from the MODIS MCD43A4 product.

    TOP: Number of years with winter peaks in areas that meet defined thresholds for biomass (median annual peak NDVI >= 0.15) and temperature (mean December minimum daily temperature <= 0°C). BOTTOM: Predominant LANDFIRE Existing Vegetation Type physiognomy (i.e., mode of each 500-m MODIS pixel) in areas with >= 5 years of winter peaks. Present in lesser proportions but not identified on the map for legibility reasons are conifer-hardwood, exotics, riparian, and sparsely vegetated physiognomic categories as well as non-natural/non-terrestrial ecosystem categories. State abbreviations are AZ (Arizona), CA (California), CO (Colorado), ID (Idaho), MT (Montana), NV (Nevada), NM (New Mexico), OR (Oregon), WA (Washington), and WY (Wyoming).

    The final steps of overlaying common winter peak extent data on the Landfire data were done using ArcGIS and the publicly available Landfire dataset (see the source datasets section of the metadata and the process steps). To successfully use the packrat information to replicate the exact processing steps that were used, the user should refer to the packrat documentation available at https://cran.r-project.org/web/packages/packrat/index.html and at https://www.rdocumentation.org/packages/packrat/versions/0.5.0. Alternatively, the user may use the descriptive documentation within this metadata along with the workflow described in the associated journal article to process the data and achieve the same results using newer packages or other software programs.
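
    The count-of-years raster and the "5 or more years out of 16" threshold amount to a per-pixel sum followed by a comparison. The sketch below shows that arithmetic on a small logical matrix; it is an illustration with random placeholder values and does not use the released rasters or the authors' scripts.

```r
set.seed(7)
years <- 2003:2018   # 16 years, as in the description
n_pixels <- 6        # tiny synthetic example

# Rows are pixels, columns are years; TRUE marks a year with a winter peak.
# The values are random placeholders, not MODIS output.
has_peak <- matrix(runif(n_pixels * length(years)) < 0.3,
                   nrow = n_pixels,
                   dimnames = list(NULL, as.character(years)))

# Count of years with winter peaks per pixel
n_peak_years <- rowSums(has_peak)

# "Common" winter peaks: 5 or more years out of 16
common_peak <- n_peak_years >= 5
```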

  15. Data from: Subset of turbulent energy fluxes, meteorology and soil physics...

    • catalogue.ceh.ac.uk
    • hosted-metadata.bgs.ac.uk
    • +2more
    zip
    Updated Jan 24, 2020
    Cite
    R. Morrison; H.M. Cooper; A.M.J. Cumming; C. Evans; S. Oakley; N.P. McNamara; R. Pywell; P. Scarlett (2020). Subset of turbulent energy fluxes, meteorology and soil physics observations collected at eddy covariance sites in southeast England, June 2019 [Dataset]. http://doi.org/10.5285/0254620f-9cf1-4d5b-af3f-bd8a6af95e96
    Available download formats: zip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    NERC EDS Environmental Information Data Centre
    Authors
    R. Morrison; H.M. Cooper; A.M.J. Cumming; C. Evans; S. Oakley; N.P. McNamara; R. Pywell; P. Scarlett
    Time period covered
    Jun 22, 2019 - Jul 6, 2019
    Area covered
    Dataset funded by
    Natural Environment Research Council (https://www.ukri.org/councils/nerc)
    Description

    This dataset contains time series observations of surface-atmosphere exchanges of sensible heat (H) and latent heat (LE) and momentum (τ) measured at UKCEH eddy covariance flux observation sites during summer 2019. The dataset includes ancillary weather and soil physics observations made at each site. Eddy covariance (EC) and micrometeorological observations were collected using open-path eddy covariance systems. Flux, meteorological and soil physics observations were collected and processed using harmonised protocols across all sites. This work was supported by the Natural Environment Research Council award number NE/R016429/1 as part of the UK-SCAPE programme delivering National Capability.

  16. Data from: Effects of nutrient enrichment on freshwater macrophyte and...

    • zenodo.org
    Updated Dec 13, 2023
    Cite
    Floris K. Neijnens; Hadassa Moreira; Melinda M.J. De Jonge; Bart B.H.P. Linssen; Mark A.J. Huijbregts; Gertjan W. Geerling; Aafke M. Schipper (2023). Effects of nutrient enrichment on freshwater macrophyte and invertebrate abundance: A meta-analysis [Dataset]. http://doi.org/10.5281/zenodo.10372444
    Dataset updated
    Dec 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Floris K. Neijnens; Hadassa Moreira; Melinda M.J. De Jonge; Bart B.H.P. Linssen; Mark A.J. Huijbregts; Gertjan W. Geerling; Aafke M. Schipper
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The zip-file contains the data and code accompanying the paper 'Effects of nutrient enrichment on freshwater macrophyte and invertebrate abundance: A meta-analysis'. Together, these files should allow for the replication of the results.

    The 'raw_data' folder contains the 'MA_database.csv' file, which contains the extracted data from all primary studies that are used in the analysis. Furthermore, this folder contains the file 'MA_database_description.txt', which gives a description of each data column in the database.

    The 'derived_data' folder contains the files that are produced by the R-scripts in this study and used for data analysis. The 'MA_database_processed.csv' and 'MA_database_processed.RData' files contain the converted raw database in a form suitable for analysis. The 'DB_IA_subsets.RData' file contains the 'Individual Abundance' (IA) data subsets based on taxonomic group (invertebrates/macrophytes) and inclusion criteria. The 'DB_IA_VCV_matrices.RData' file contains the variance-covariance (VCV) matrices for all IA data subsets. The 'DB_AM_subsets.RData' file contains the 'Total Abundance' (TA) and 'Mean Abundance' (MA) data subsets based on taxonomic group (invertebrates/macrophytes) and inclusion criteria.
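
    An .RData file can hold several named objects, so it helps to check what a file contains before using it. The sketch below saves two stand-in objects and loads them back into a separate environment; the object names and the file name are invented for illustration and say nothing about what the released .RData files actually contain.

```r
# Stand-in objects; the real contents of the released .RData files are not listed here.
subset_a <- data.frame(effect = c(0.2, -0.1))
subset_b <- data.frame(effect = c(0.4))
rdata_file <- file.path(tempdir(), "example_subsets.RData")
save(subset_a, subset_b, file = rdata_file)

# Load into a separate environment so nothing in the workspace is overwritten;
# load() returns the names of the objects it restored.
loaded <- new.env()
object_names <- load(rdata_file, envir = loaded)
print(object_names)
str(loaded$subset_a)
```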

    The 'output_data' folder contains subfolders with the output data for each data subset (i.e. for each metric, taxonomic group and set of inclusion criteria). For each data subset, the subfolder contains random effects selection results ('Results1_REsel_

    The 'scripts' folder contains all R-scripts that we used for this study. The 'PrepareData.R' script takes the database as input and adjusts the file so that it can be used for data analysis. The 'PrepareDataIA.R' and 'PrepareDataAM.R' scripts make subsets of the data and prepare the data for the meta-regression analysis and mixed-effects regression analysis, respectively. The regression analyses are performed in the 'SelectModelsIA.R' and 'SelectModelsAM.R' scripts, which calculate the regression model results for the IA metric and the MA/TA metrics, respectively. These scripts require the 'RandomAndFixedEffects.R' script, which contains the random and fixed effects parameter combinations, as well as the 'Functions.R' script. The 'CreateMap.R' script creates a global map with the location of all studies included in the analysis (figure 1 in the paper). The 'CreateForestPlots.R' script creates plots showing the IA data distribution for both taxonomic groups (figure 2 in the paper). The 'CreateHeatMaps.R' script creates heat maps for all metrics and taxonomic groups (figure 3 in the paper, figures S11.1 and S11.2 in the appendix). The 'CalculateStatistics.R' script calculates the descriptive statistics that are reported throughout the paper, and creates the figures that describe the dataset characteristics (figures S3.1 to S3.5 in the appendix). The 'CreateFunnelPlots.R' script creates the funnel plots for both taxonomic groups (figures S6.1 and S6.2 in the appendix) and performs Egger's tests. The 'CreateControlGraphs.R' script creates graphs showing the dependence of the nutrient response on control concentrations for all metrics and taxonomic groups (figures S10.1 and S10.2 in the appendix).

    The 'figures' folder contains all figures that are included in this study.

  17. Data from: Data and R Scripts for: "Transcriptome network analysis...

    • b2find.eudat.eu
    • dataverse.nl
    Updated Apr 6, 2023
    (2023). Data and R Scripts for: "Transcriptome network analysis implicates CX3CR1-positive type 3 dendritic cells in non-infectious uveitis" [Dataset]. https://b2find.eudat.eu/dataset/0db8909d-be52-5eaa-8b77-a278822c71ee
    Dataset updated
    Apr 6, 2023
    Description

    This dataset contains the processed RNA sequencing data of purified CD1c-positive conventional type 2 dendritic cells (CD1c+ cDC2s), functional enrichment analysis, manual and automatic (i.e., flowSOM) gating data of flow cytometry, and multiplex cytokine analyses as outlined in Hiddingh et al. 2022, "Transcriptome network analysis implicates CX3CR1-positive type 3 dendritic cells in non-infectious uveitis" (see preprint on bioRxiv).

    Data are from two cohorts (cohort I, n=36, and cohort II, n=42) comprising in total 51 patients with non-infectious uveitis (HLA-B27-positive acute anterior uveitis, idiopathic intermediate uveitis, HLA-A29-positive Birdshot Uveitis (Birdshot chorioretinopathy)) and 27 sex/age-matched healthy controls without ocular inflammatory disease. All raw sequencing data are available at NCBI SRA under the accession numbers GSE195501 (FACS-sorted cohort I) and GSE194060 (MACS-sorted cohort II).

    This dataverseNL dataset contains additional raw, processed, and metadata (see the readme file) and the reproducible R notebooks (R script and image) used for the analysis in the manuscript.

    R scripts (markdown + R image) with step-by-step analyses
    • Figure_1.rmd (see "Figure_1.html")
    • Figure_2.rmd (see "Figure_2.html")
    • Figure_3.rmd (see "Figure_3.html")
    • Figure_4.rmd (see "Figure_4.html")
    • Figure_5.rmd (see "Figure_5.html")

    Processed RNA seq data (including WGCNA) (see folder Uveitis_mDC in files)

    Experimental data
    • Manual gating data of MACS-sorted fractions cohort I
    • Manual gating data for CD1c+ cDC2 subsets in PBMCs
    • Manual gating CD14+ and CD14- CD1c+ cDC2 fractions from Buffy
    • qPCR data for CX3CR1, CCR5, CCR2, IRF8, TLR7, RUNX3 and CD36 in sorted CD14+ and CD14- CD1c+ DCs
    • qPCR data (fold change compared to medium) for RUNX3 and CD36 in overnight stimulated cDC2 cultures
    • Cell phenotypes identified by flowSOM (7x7 grid) using the cDC2-subset flow cytometry panel
    • IL-23 ELISA concentration in supernatant of overnight LTA-stimulated cDC2 subset cultures
    • Luminex Multiplex Cytokine analysis of supernatant of overnight LTA-stimulated cDC2 subset cultures

    Other transcriptomic data used in the R scripts (above)
    • WT untreated cDC2 versus cDC2 from Runx3-11cKO mice: GSE48590, generated by Dicken et al., PLoS One 2013
    • WT untreated cDC2 versus cDC2 from Notch2-11cKO mice: GSE119242, generated by Briseño et al., Proc Natl Acad Sci U S A 2018
    • Sorted CD14+CD5-CD163+ and CD14-CD5-CD163+ cDC2s from SLE and Scleroderma patients: GSE136731, generated by Dutertre et al., Immunity 2019
    • Single-cell RNA-seq of aqueous humor from 4 HLA-B27-positive uveitis patients and control: GSE178833, generated by Kasper et al., Elife 2021
    • Inflammatory [inf-]cDC2s: GSE149619, generated by Bosteels et al., Immunity 2020
    • RNA-seq data from cDC2s generated from murine bone marrow cells in co-culture with stromal OP-9 cell line transduced with or without expression of the Notch ligand Delta-like 1: GSE110577, generated by Kirkling et al., Cell Rep 2018

    Transcriptome network analysis implicates CX3CR1-positive type 3 dendritic cells in non-infectious uveitis

    Background: Type I interferons (IFNs) promote the expansion of subsets of CD1c+ conventional dendritic cells (CD1c+ DCs), but the molecular basis of CD1c+ DCs involvement in conditions not associated with elevated type I IFNs remains unclear.

    Methods: We analyzed CD1c+ DCs from two cohorts of non-infectious uveitis patients and healthy donors using RNA-sequencing followed by high-dimensional flow cytometry to characterize the CD1c+ DC populations.

    Results: We report that the CD1c+ DCs pool from patients with non-infectious uveitis is skewed towards a gene module with the chemokine receptor CX3CR1 as the key hub gene. We confirmed these results in an independent case-control cohort and show that the disease-associated gene module is not mediated by type I IFNs. An analysis of peripheral blood using flow cytometry revealed that CX3CR1+ DC3s were diminished, whereas CX3CR1- DC3s were not. Stimulated CX3CR1+ DC3s secrete high levels of inflammatory cytokines, including TNF-alpha, and CX3CR1+ DC3-like cells can be detected in inflamed eyes of patients.

    Conclusion: These results show that CX3CR1+ DC3s are implicated in non-infectious uveitis and can secrete proinflammatory mediators implicated in its pathophysiology.

  18. Indonesian Family Life Study, merged subset

    • laurabotzet.github.io
    Updated 2016
    RAND corporation (2016). Indonesian Family Life Study, merged subset [Dataset]. https://laurabotzet.github.io/birth_order_ifls/2_codebook.html
    Dataset updated
    2016
    Authors
    RAND corporation
    Time period covered
    2014 - 2015
    Area covered
    13 Indonesian provinces. The sample is representative of about 83% of the Indonesian population and contains over 30,000 individuals living in 13 of the 27 provinces in the country. See URL for more.
    Variables measured
    a1, a2, c1, c3, e1, e3, n2, n3, o1, o2, and 138 more
    Description

    Data from the IFLS, merged across waves, with most outcomes taken from wave 5. Includes birth order, family structure, Big 5 personality, intelligence tests, and risk lotteries.

    Table of variables

    This table contains variable names, labels, and number of missing values. See the complete codebook for more.

    [truncated]

    Note

    This dataset was automatically described using the codebook R package (version 0.8.2).

  19. Data from: Defining Privileged Reagents Using Subsimilarity Comparison

    • figshare.com
    txt
    Updated Jun 1, 2023
    Brett A. Tounge; Charles H. Reynolds (2023). Defining Privileged Reagents Using Subsimilarity Comparison [Dataset]. http://doi.org/10.1021/ci049854j.s001
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    ACS Publications
    Authors
    Brett A. Tounge; Charles H. Reynolds
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    We have developed a new method for assigning a drug-like score to reagents. This algorithm uses topological torsion (TT) 2D descriptors to compute the subsimilarity of any given reagent to a substructural element of any compound in the CMC. The utility of this approach is demonstrated by scoring a test set of reagents derived from the “Comprehensive Survey of Combinatorial Library Synthesis: 2000” (J. Comb. Chem.). R-groups were extracted from the most-active compounds found in each of the reviewed libraries, and the distribution of the subsimilarity scores for these monomers was compared to that of the ACD. This comparison showed a dramatic shift in the distribution of the JCC R-group subset toward higher subsimilarity scores in comparison to the entire ACD database. The ACD was also used to examine the relationship between molecular weight and various subsimilarity scoring algorithms. This analysis was used to derive a subsimilarity score that is less biased by molecular weight.

  20. Replication Data for: Free riding or discounted riding? How the framing of a...

    • dataverse.harvard.edu
    csv, tsv, txt
    Updated Oct 31, 2019
    Harvard Dataverse (2019). Replication Data for: Free riding or discounted riding? How the framing of a bike share offer impacts offer-redemption [Dataset]. http://doi.org/10.7910/DVN/BFSGZI
    Available download formats: tsv(91032), tsv(175036), txt(4289), csv(308111)
    Dataset updated
    Oct 31, 2019
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This data set contains three .csv files and one .txt file; the .txt file contains the R code used to produce the results shown in the paper. The three .csv files are PortlandALL (all the data in one place), PortlandNewDocks (only the subset of data for residents with a new station built near their existing home), and PortlandNewMovers (only the subset of data for residents who have newly moved to the area). The data are split across three files because the author is not proficient in R and found it easier to work with three separate datasets.
