100+ datasets found
  1. Data from: Source code for R tutorials and dataset for empirical case study...

    • datadryad.org
    • data.niaid.nih.gov
    • +2 more
    zip
    Updated Jul 27, 2021
    Cite
    Martijn van de Pol; Lyanne Brouwer (2021). Source code for R tutorials and dataset for empirical case study on Malurus elegans (red-winged fairy wren) [Dataset]. http://doi.org/10.5061/dryad.7h44j0ztw
    Available download formats: zip
    Dataset updated
    Jul 27, 2021
    Dataset provided by
    Dryad
    Authors
    Martijn van de Pol; Lyanne Brouwer
    Time period covered
    Jul 23, 2021
    Description

    Biological processes exhibit complex temporal dependencies due to the sequential nature of allocation decisions in organisms’ life-cycles, feedback loops, and two-way causality. Consequently, longitudinal data often contain cross-lags: the predictor variable depends on the response variable of the previous time-step. Although statisticians have warned that regression models that ignore such covariate endogeneity in time series are likely to be inappropriate, this has received relatively little attention in biology. Furthermore, the resulting degree of estimation bias remains largely unexplored.

    We use a graphical model and numerical simulations to understand why and how regression models that ignore cross-lags can be biased, and how this bias depends on the length and number of time series. Ecological and evolutionary examples are provided to illustrate that cross-lags may be more common than is typically appreciated and that they occur in functionally different ways.

    We show that rou...

  2. Open-Source Spatial Analytics (R) - Datasets - AmericaView - CKAN

    • ckan.americaview.org
    Updated Sep 10, 2022
    Cite
    ckan.americaview.org (2022). Open-Source Spatial Analytics (R) - Datasets - AmericaView - CKAN [Dataset]. https://ckan.americaview.org/dataset/open-source-spatial-analytics-r
    Dataset updated
    Sep 10, 2022
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this course, you will learn to work within the free and open-source R environment with a specific focus on working with and analyzing geospatial data. We will cover a wide variety of data and spatial data analytics topics, and you will learn how to code in R along the way. The Introduction module provides more background info about the course and course set up. This course is designed for someone with some prior GIS knowledge. For example, you should know the basics of working with maps, map projections, and vector and raster data. You should be able to perform common spatial analysis tasks and make map layouts. If you do not have a GIS background, we would recommend checking out the West Virginia View GIScience class. We do not assume that you have any prior experience with R or with coding. So, don't worry if you haven't developed these skill sets yet. That is a major goal in this course.

    Background material will be provided using code examples, videos, and presentations. We have provided assignments to offer hands-on learning opportunities. Data links for the lecture modules are provided within each module while data for the assignments are linked to the assignment buttons below. Please see the sequencing document for our suggested order in which to work through the material.

    After completing this course you will be able to:

    • prepare, manipulate, query, and generally work with data in R.
    • perform data summarization, comparisons, and statistical tests.
    • create quality graphs, map layouts, and interactive web maps to visualize data and findings.
    • present your research, methods, results, and code as web pages to foster reproducible research.
    • work with spatial data in R.
    • analyze vector and raster geospatial data to answer a question with a spatial component.
    • make spatial models and predictions using regression and machine learning.
    • code in the R language at an intermediate level.

  3. Replication Data for: Revisiting 'The Rise and Decline' in a Population of...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    TeBlunthuis, Nathan; Aaron Shaw; Benjamin Mako Hill (2023). Replication Data for: Revisiting 'The Rise and Decline' in a Population of Peer Production Projects [Dataset]. http://doi.org/10.7910/DVN/SG3LP1
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    TeBlunthuis, Nathan; Aaron Shaw; Benjamin Mako Hill
    Description

    This archive contains code and data for reproducing the analysis for “Replication Data for Revisiting ‘The Rise and Decline’ in a Population of Peer Production Projects”. Depending on what you hope to do with the data you probably do not want to download all of the files. Depending on your computation resources you may not be able to run all stages of the analysis.

    The code for all stages of the analysis, including typesetting the manuscript and running the analysis, is in code.tar. If you only want to run the final analysis or to play with datasets used in the analysis of the paper, you want intermediate_data.7z or the uncompressed tab and csv files.

    The data files are created in a four-stage process. The first stage uses the program “wikiq” to parse mediawiki xml dumps and create tsv files that have edit data for each wiki. The second stage generates the all.edits.RDS file, which combines these tsvs into a dataset of edits from all the wikis. This file is expensive to generate and at 1.5GB is pretty big. The third stage builds smaller intermediate files that contain the analytical variables from these tsv files. The fourth stage uses the intermediate files to generate smaller RDS files that contain the results. Finally, knitr and latex typeset the manuscript.

    A stage will only run if the outputs from the previous stages do not exist. So if the intermediate files exist they will not be regenerated. Only the final analysis will run. The exception is that stage 4, fitting models and generating plots, always runs.

    If you only want to replicate from the second stage onward, you want wikiq_tsvs.7z. If you want to replicate everything, you want wikia_mediawiki_xml_dumps.7z.001, wikia_mediawiki_xml_dumps.7z.002, and wikia_mediawiki_xml_dumps.7z.003.

    These instructions work backwards from building the manuscript using knitr, loading the datasets, running the analysis, to building the intermediate datasets.

    Building the manuscript using knitr

    This requires working latex, latexmk, and knitr installations. Depending on your operating system you might install these packages in different ways. On Debian Linux you can run apt install r-cran-knitr latexmk texlive-latex-extra. Alternatively, you can upload the necessary files to a project on Overleaf.com.

    1. Download code.tar. This has everything you need to typeset the manuscript.
    2. Unpack the tar archive. On a unix system this can be done by running tar xf code.tar.
    3. Navigate to code/paper_source.
    4. Install R dependencies. In R, run install.packages(c("data.table","scales","ggplot2","lubridate","texreg"))
    5. On a unix system you should be able to run make to build the manuscript generalizable_wiki.pdf. Otherwise you should try uploading all of the files (including the tables, figure, and knitr folders) to a new project on Overleaf.com.

    Loading intermediate datasets

    The intermediate datasets are found in the intermediate_data.7z archive. They can be extracted on a unix system using the command 7z x intermediate_data.7z. The files are 95MB uncompressed. These are RDS (R data set) files and can be loaded in R using readRDS. For example, newcomer.ds <- readRDS("newcomers.RDS"). If you wish to work with these datasets using a tool other than R, you might prefer to work with the .tab files.

    Running the analysis

    Fitting the models may not work on machines with less than 32GB of RAM. If you have trouble, you may find the functions in lib-01-sample-datasets.R useful to create stratified samples of data for fitting models. See line 89 of 02_model_newcomer_survival.R for an example.

    1. Download code.tar and intermediate_data.7z to your working folder and extract both archives. On a unix system this can be done with the command tar xf code.tar && 7z x intermediate_data.7z.
    2. Install R dependencies. In R, run install.packages(c("data.table","ggplot2","urltools","texreg","optimx","lme4","bootstrap","scales","effects","lubridate","devtools","roxygen2")).
    3. On a unix system you can simply run regen.all.sh to fit the models, build the plots and create the RDS files.

    Generating datasets

    Building the intermediate files

    The intermediate files are generated from all.edits.RDS. This process requires about 20GB of memory.

    1. Download all.edits.RDS, userroles_data.7z, selected.wikis.csv, and code.tar.
    2. Unpack code.tar and userroles_data.7z. On a unix system this can be done using tar xf code.tar && 7z x userroles_data.7z.
    3. Install R dependencies. In R, run install.packages(c("data.table","ggplot2","urltools","texreg","optimx","lme4","bootstrap","scales","effects","lubridate","devtools","roxygen2")).
    4. Run 01_build_datasets.R.

    Building all.edits.RDS

    The intermediate RDS files used in the analysis are created from all.edits.RDS. To replicate building all.edits.RDS, you only need to run 01_build_datasets.R when the int...

    Visit https://dataone.org/datasets/sha256%3Acfa4980c107154267d8eb6dc0753ed0fde655a73a062c0c2f5af33f237da3437 for complete metadata about this dataset.
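
    The skip rule described above (a stage is not rerun when its output files already exist, except stage 4, which always runs) can be illustrated with a minimal Python sketch. This is one reading of the description, not the archive's actual build logic, which lives in its own R and shell scripts. The file names all.edits.RDS and newcomers.RDS come from the description; results.RDS and every function name here are hypothetical.

```python
from pathlib import Path


def run_stage(name, output, build, force=False):
    """Run `build` only when `output` is missing, unless `force` is set."""
    output = Path(output)
    if output.exists() and not force:
        return f"{name}: skipped"
    build(output)
    return f"{name}: ran"


def touch(path):
    # Stand-in for the real (expensive) work of a stage.
    path.write_text("placeholder\n")


def run_pipeline(workdir):
    """Run stages 2-4 in order; only stage 4 is forced to rerun."""
    workdir = Path(workdir)
    stages = [
        ("stage 2: combine tsvs", workdir / "all.edits.RDS", False),
        ("stage 3: intermediate files", workdir / "newcomers.RDS", False),
        # results.RDS is a hypothetical name for the stage 4 output.
        ("stage 4: models and plots", workdir / "results.RDS", True),
    ]
    return [run_stage(name, out, touch, force=f) for name, out, f in stages]
```

    Calling run_pipeline twice on the same folder runs every stage the first time; the second time, stages 2 and 3 are skipped and only stage 4 runs again.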

  4. Plocamium reproductive system data and R code

    • usap-dc.org
    • search.dataone.org
    html, xml
    Updated Nov 21, 2022
    Cite
    Amsler, Charles (2022). Plocamium reproductive system data and R code [Dataset]. http://doi.org/10.15784/601622
    Available download formats: xml, html
    Dataset updated
    Nov 21, 2022
    Dataset provided by
    United States Antarctic Program (http://www.usap.gov/)
    Authors
    Amsler, Charles
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data and R code from Sabrina Heiser's study of the reproductive system of Plocamium sp. in the Palmer Station region.

  5. Clustering of samples and variables with mixed-type data

    • plos.figshare.com
    tiff
    Updated Jun 1, 2023
    Cite
    Manuela Hummel; Dominic Edelmann; Annette Kopp-Schneider (2023). Clustering of samples and variables with mixed-type data [Dataset]. http://doi.org/10.1371/journal.pone.0188274
    Available download formats: tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Manuela Hummel; Dominic Edelmann; Annette Kopp-Schneider
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of data measured on different scales is a relevant challenge. Biomedical studies often focus on high-throughput datasets of, e.g., quantitative measurements. However, the need for integration of other features possibly measured on different scales, e.g. clinical or cytogenetic factors, becomes increasingly important. The analysis results (e.g. a selection of relevant genes) are then visualized, while adding further information, like clinical factors, on top. However, a more integrative approach is desirable, where all available data are analyzed jointly, and where also in the visualization different data sources are combined in a more natural way. Here we specifically target integrative visualization and present a heatmap-style graphic display. To this end, we develop and explore methods for clustering mixed-type data, with special focus on clustering variables. Clustering of variables does not receive as much attention in the literature as does clustering of samples. We extend the variables clustering methodology by two new approaches, one based on the combination of different association measures and the other on distance correlation. With simulation studies we evaluate and compare different clustering strategies. Applying specific methods for mixed-type data proves to be comparable and in many cases beneficial as compared to standard approaches applied to corresponding quantitative or binarized data. Our two novel approaches for mixed-type variables show similar or better performance than the existing methods ClustOfVar and bias-corrected mutual information. Further, in contrast to ClustOfVar, our methods provide dissimilarity matrices, which is an advantage, especially for the purpose of visualization. Real data examples aim to give an impression of various kinds of potential applications for the integrative heatmap and other graphical displays based on dissimilarity matrices. 
    We demonstrate that the presented integrative heatmap provides more information than common data displays about the relationship among variables and samples. The described clustering and visualization methods are implemented in our R package CluMix available from https://cran.r-project.org/web/packages/CluMix.

  6. data_neo.Rdata

    • psycharchives.org
    Updated Dec 20, 2021
    Cite
    (2021). data_neo.Rdata [Dataset]. https://psycharchives.org/handle/20.500.12034/4717
    Dataset updated
    Dec 20, 2021
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    R is a very powerful language for statistical computing in many disciplines of research and has a steep learning curve. The software is open source, freely available and has a thriving community. This crash course provides an overview of Base-R concepts for beginners and covers the topics 1) introduction into R, 2) reading, saving, and viewing data, 3) selecting and changing objects in R, and 4) descriptive statistics. This course was held by Lisa Spitzer on September 3, 2021, as a precursor to the R tidyverse Workshop by Aurélien Ginolhac and Roland Krause (September 8 - 10, 2021). This entry features the slides, exercises/results, and chat messages of the crash course. Related to this entry are the recordings of the course and the R tidyverse workshop materials. Click on "related PsychArchives objects" to view or download the recordings of the workshop.

  7. Semantic Segmentation - BEV

    • kaggle.com
    zip
    Updated Dec 10, 2024
    Cite
    Sakshay Mahna (2024). Semantic Segmentation - BEV [Dataset]. https://www.kaggle.com/datasets/sakshaymahna/semantic-segmentation-bev/versions/922
    Available download formats: zip (2155380825 bytes)
    Dataset updated
    Dec 10, 2024
    Authors
    Sakshay Mahna
    Description

    Context

    This dataset has been created as part of the Cam2BEV project. There, the datasets are used for the computation of a semantically segmented bird's eye view (BEV) image given the images of multiple vehicle-mounted cameras as presented in the paper:

    A Sim2Real Deep Learning Approach for the Transformation of Images from Multiple Vehicle-Mounted Cameras to a Semantically Segmented Image in Bird’s Eye View (arXiv)

    Lennart Reiher, Bastian Lampe, and Lutz Eckstein
    Institute for Automotive Engineering (ika), RWTH Aachen University

    Content

    360° Surround Cameras

    • front camera
    • rear camera
    • left camera
    • right camera
    • bird's eye view
    • bird's eye view incl. occlusion
    • homography view

    https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data/-/raw/master/1_FRLR/examples/front.png
    https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data/-/raw/master/1_FRLR/examples/rear.png
    https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data/-/raw/master/1_FRLR/examples/left.png
    https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data/-/raw/master/1_FRLR/examples/right.png
    https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data/-/raw/master/1_FRLR/examples/bev.png
    https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data/-/raw/master/1_FRLR/examples/bev+occlusion.png
    https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data/-/raw/master/1_FRLR/examples/homography.png

    Characteristics

    • # Training Samples: 33199
    • # Validation Samples: 3731
    • # Vehicle Cameras: 4 (front, rear, left, right)
    • # Semantic Classes: 30 (CityScapes)

    Note: The CityScapes colors for semantic classes Pedestrian and Rider are switched due to technical reasons.

    Front Camera

    • Resolution (x,y): 964, 604
    • Focal Length (x,y): 278.283, 408.1295
    • Principal Point (x,y): 482, 302
    • Position (X,Y,Z): 1.7, 0.0, 1.4
    • Rotation (H, P, R): 0.0, 0.0, 0.0

    Rear Camera

    • Resolution (x,y): 964, 604
    • Focal Length (x,y): 278.283, 408.1295
    • Principal Point (x,y): 482, 302
    • Position (X,Y,Z): -0.6, 0.0, 1.4
    • Rotation (H, P, R): 3.1415, 0.0, 0.0

    Left Camera

    • Resolution (x,y): 964, 604
    • Focal Length (x,y): 278.283, 408.1295
    • Principal Point (x,y): 482, 302
    • Position (X,Y,Z): 0.5, 0.5, 1.5
    • Rotation (H, P, R): 1.5708, 0.0, 0.0

    Right Camera

    • Resolution (x,y): 964, 604
    • Focal Length (x,y): 278.283, 408.1295
    • Principal Point (x,y): 482, 302
    • Position (X,Y,Z): 0.5, -0.5, 1.5
    • Rotation (H, P, R): -1.5708, 0.0, 0.0

    Drone Camera

    • Resolution (x,y): 964, 604
    • Focal Length (x,y): 682.578, 682.578
    • Principal Point (x,y): 482, 302
    • Position (X,Y,Z): 0.0, 0.0, 50.0
    • Rotation (H, P, R): 0.0, 1.5708, -1.5708
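
    As a rough illustration of how the focal length and principal point listed above are typically used, the sketch below builds a 3x3 intrinsic matrix for the front camera and projects a point given in camera coordinates. It assumes an ideal pinhole model with zero skew and a camera frame with z pointing forward; the dataset description does not state its conventions, so treat the function names and the frame convention as assumptions.

```python
def intrinsic_matrix(fx, fy, cx, cy):
    """Pinhole intrinsic matrix with zero skew (an assumption of this sketch)."""
    return [
        [fx, 0.0, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ]


def project(K, point):
    """Project a camera-frame point (x, y, z) with z > 0 to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = K[0][0] * x / z + K[0][2]
    v = K[1][1] * y / z + K[1][2]
    return u, v


# Front camera values as listed: focal length (278.283, 408.1295),
# principal point (482, 302).
K_front = intrinsic_matrix(278.283, 408.1295, 482.0, 302.0)

# A point on the optical axis lands on the principal point.
print(project(K_front, (0.0, 0.0, 10.0)))  # (482.0, 302.0)
```

    The listed principal point (482, 302) is exactly half of the listed resolution (964, 604), which is what one would expect for a simulated camera with a centered optical axis.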

    Acknowledgements

    The original dataset is taken from this website.

    License

    The Cam2BEV data available from the corresponding website has the following license:

    This dataset is made freely available to academic and non-academic entities for non-commercial purposes such as academic research, teaching or scientific publications. Permission is granted to use the data given that you agree:

    1. That the dataset comes "AS IS", without express or implied warranty. Although every effort has been made to ensure accuracy, we (ika) do not accept any responsibility for errors or omissions.
    2. That you include a reference to the Cam2BEV dataset in any work that makes use of the dataset. For research papers, cite our preferred publication as listed in the readme of this repository; for other media cite our preferred publication as listed in the readme of this repository or link to the github page of the Cam2BEV project.
    3. That you do not distribute this dataset or modified versions. It is permissible to distribute derivative works in as far as they are abstract representations of this dataset (such as models trained on it or additional annotations that do not directly include any of our data) and do not allow to recover the dataset or something similar in character.
    4. That you may not use the dataset or any derivative work for commercial purposes as, for example, licensing or selling the data, or using the data with a purpose to procure a commercial gain.
    5. That all rights not expressly granted to you are r...
  8. Data from: The temporal specificity of BOLD fMRI is systematically related...

    • openneuro.org
    Updated Mar 13, 2025
    Cite
    Daniel E.P. Gomez; Jonathan R. Polimeni; Laura D. Lewis (2025). The temporal specificity of BOLD fMRI is systematically related to anatomical and vascular features of the human brain [Dataset]. http://doi.org/10.18112/openneuro.ds006005.v1.0.1
    Dataset updated
    Mar 13, 2025
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Daniel E.P. Gomez; Jonathan R. Polimeni; Laura D. Lewis
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Dataset associated with "The temporal specificity of BOLD fMRI is systematically related to anatomical and vascular features of the human brain"

    In the anat folder, three additional files beyond the T1w MPRAGE can be found:

    1. An inversion recovery T1 shuffled EPI, named with acq-epi and a suffix _inplaneT1.
    2. A point spread function (PSF) mapping scan, named with acq-psf and a suffix _T2starw.
    3. A larger field of view dataset used to aid registration, named with acq-epi and a suffix _T2starw.

    The larger field of view dataset was used in the work that resulted in our publication, but the T1 shuffled EPI and the PSF were not. They are included here in the hope that they may be useful for other purposes.

  9. Broad View Textile Limited Kwun Tong R Export Import Data | Eximpedia

    • eximpedia.app
    Updated Sep 26, 2025
    Cite
    (2025). Broad View Textile Limited Kwun Tong R Export Import Data | Eximpedia [Dataset]. https://www.eximpedia.app/companies/broad-view-textile-limited-kwun-tong-r/10834021
    Dataset updated
    Sep 26, 2025
    Area covered
    Kwun Tong
    Description

    Broad View Textile Limited Kwun Tong R Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.

  10. Jay r smith mfg company USA Import & Buyer Data

    • seair.co.in
    Cite
    Seair Exim Solutions, Jay r smith mfg company USA Import & Buyer Data [Dataset]. https://www.seair.co.in/us-importers/jay-r-smith-mfg-company.aspx
    Available download formats: .text/.csv/.xml/.xls/.bin
    Dataset authored and provided by
    Seair Exim Solutions
    Area covered
    United States
    Description

    View Jay r smith mfg company import data USA including customs records, shipments, HS codes, suppliers, buyer details & company profile at Seair Exim.

  11. HLY-08-01 Raw Knudsen 320B/R Depth Sounder Data [Cooper, L./LDEO]

    • data.ucar.edu
    • search.dataone.org
    • +1 more
    archive
    Updated Oct 7, 2025
    Cite
    Lee W. Cooper (2025). HLY-08-01 Raw Knudsen 320B/R Depth Sounder Data [Cooper, L./LDEO] [Dataset]. http://doi.org/10.5065/D6M043DN
    Available download formats: archive
    Dataset updated
    Oct 7, 2025
    Authors
    Lee W. Cooper
    Time period covered
    Mar 14, 2008 - Mar 26, 2008
    Description

    This dataset includes data from the Knudsen 320B/R Depth Sounder system onboard the US Coast Guard Cutter Healy during the Bering Sea Ecosystem Study-Bering Sea Integrated Ecosystem Research Program (BEST-BSIERP) 2008 0801 (early spring) cruise. BEST-BSIERP together are the Bering Sea project. The Knudsen depth sounder records the 3 - 6 kHz data (Sub Bottom Profile) underway. The following is a list of the file extensions and their meaning:

    • .keb - Binary Knudsen Playback File
    • .kea - ASCII log of depth, settings and environmental data
    • .sgy - Binary SEG-Y extended Seismic format

    These files are available in tar files by day.

  12. CTD Data Acquired by R/V Xue Long in the Prydz Bay- Amery Ice Shelf Region,...

    • usap-dc.org
    • get.iedadata.org
    • +1 more
    html, xml
    Updated May 2, 2016
    Cite
    Yuan, Xiaojun (2016). CTD Data Acquired by R/V Xue Long in the Prydz Bay- Amery Ice Shelf Region, 2015-2017 [Dataset]. http://doi.org/10.15784/600174
    Available download formats: html, xml
    Dataset updated
    May 2, 2016
    Dataset provided by
    United States Antarctic Program (http://www.usap.gov/)
    Authors
    Yuan, Xiaojun
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This dataset contains inventories and location maps for CTD data acquired by the icebreaker R/V Xue Long in the Prydz Bay- Amery Ice Shelf region. A total of 68 stations were acquired in February 2015 and 24 stations in March 2017, as part of a joint US/China project to study Antarctic Bottom Water (AABW) formation.

  13. HadISD: Global sub-daily, surface meteorological station data, 1931-2023,...

    • catalogue.ceda.ac.uk
    • data-search.nerc.ac.uk
    Updated Mar 21, 2025
    Cite
    NERC EDS Centre for Environmental Data Analysis (2025). HadISD: Global sub-daily, surface meteorological station data, 1931-2023, v3.4.0.2023f [Dataset]. https://catalogue.ceda.ac.uk/uuid/b82b58d085d0433b821f4ae31cb608de
    Dataset updated
    Mar 21, 2025
    Dataset provided by
    Centre for Environmental Data Analysis (http://www.ceda.ac.uk/)
    License

    http://www.nationalarchives.gov.uk/doc/non-commercial-government-licence/version/2/

    Time period covered
    Jan 1, 1931 - Dec 31, 2023
    Area covered
    Earth
    Variables measured
    time, altitude, latitude, longitude, wind_speed, air_temperature, wind_speed_of_gust, cloud_area_fraction, cloud_base_altitude, wind_from_direction, and 7 more
    Description

    This is version v3.4.0.2023f of Met Office Hadley Centre's Integrated Surface Database, HadISD. These data are global sub-daily surface meteorological data.

    This update (v3.4.0.2023f) to HadISD corrects a long-standing bug which was discovered in autumn 2023 whereby the neighbour checks (and associated [un]flagging for some other tests) were not being implemented. For more details see the posts on the HadISD blog: https://hadisd.blogspot.com/2023/10/bug-in-buddy-checks.html & https://hadisd.blogspot.com/2024/01/hadisd-v3402023f-future-look.html

    The quality controlled variables in this dataset are: temperature, dewpoint temperature, sea-level pressure, wind speed and direction, cloud data (total, low, mid and high level). Past significant weather and precipitation data are also included, but have not been quality controlled, so their quality and completeness cannot be guaranteed. Quality control flags and data values which have been removed during the quality control process are provided in the qc_flags and flagged_values fields, and ancillary data files show the station listing with IDs, names and location information.

    The data are provided as one NetCDF file per station. Files in the station_data folder have the format "station_code"_HadISD_HadOBS_19310101-20240101_v3.4.1.2023f.nc. The station codes can be found under the docs tab. The station codes file has five columns as follows: 1) station code, 2) station name, 3) station latitude, 4) station longitude, 5) station height.
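
    The file naming pattern and the five-column station listing described above can be sketched in Python as follows. This is a hedged illustration only: the function names are hypothetical, the station code and row values are made-up placeholders, and the delimiter of the station codes file is not stated in the description, so the row is assumed to be already split into fields.

```python
def station_filename(station_code, start="19310101", end="20240101",
                     version="v3.4.1.2023f"):
    """Build a per-station file name following the quoted pattern."""
    return f"{station_code}_HadISD_HadOBS_{start}-{end}_{version}.nc"


def parse_station_row(fields):
    """Map the five described columns of one station row to named values."""
    code, name, latitude, longitude, height = fields
    return {
        "code": code,
        "name": name,
        "latitude": float(latitude),
        "longitude": float(longitude),
        "height": float(height),
    }


# Placeholder row, not a real station.
row = ["000000-00000", "EXAMPLE STATION", "0.0", "0.0", "0.0"]
station = parse_station_row(row)
print(station_filename(station["code"]))
```

    The default dates and version string are copied from the pattern quoted in the description; the resulting name still has to be checked against the actual contents of the station_data folder.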

    To keep informed about updates, news and announcements follow the HadOBS team on twitter @metofficeHadOBS.

    For more detailed information, e.g. bug fixes, routine updates and other exploratory analysis, see the HadISD blog: http://hadisd.blogspot.co.uk/

    References: When using the dataset in a paper you must cite the following papers (see Docs for link to the publications) and this dataset (using the "citable as" reference) :

    Dunn, R. J. H., (2019), HadISD version 3: monthly updates, Hadley Centre Technical Note.

    Dunn, R. J. H., Willett, K. M., Parker, D. E., and Mitchell, L.: Expanding HadISD: quality-controlled, sub-daily station data from 1931, Geosci. Instrum. Method. Data Syst., 5, 473-491, doi:10.5194/gi-5-473-2016, 2016.

    Dunn, R. J. H., et al. (2012), HadISD: A Quality Controlled global synoptic report database for selected variables at long-term stations from 1973-2011, Clim. Past, 8, 1649-1679, 2012, doi:10.5194/cp-8-1649-2012

    Smith, A., N. Lott, and R. Vose, 2011: The Integrated Surface Database: Recent Developments and Partnerships. Bulletin of the American Meteorological Society, 92, 704–708, doi:10.1175/2011BAMS3015.1

    For a homogeneity assessment of HadISD, please see the following reference:

    Dunn, R. J. H., K. M. Willett, C. P. Morice, and D. E. Parker. "Pairwise homogeneity assessment of HadISD." Climate of the Past 10, no. 4 (2014): 1501-1522. doi:10.5194/cp-10-1501-2014, 2014.

  14. R & d equipment company inc USA Import & Buyer Data

    • seair.co.in
    Updated Oct 6, 2018
    Cite
    Seair Exim Solutions (2018). R & d equipment company inc USA Import & Buyer Data [Dataset]. https://www.seair.co.in/us-importers/r--d-equipment-company-inc.aspx
    Available download formats: .text/.csv/.xml/.xls/.bin
    Dataset updated
    Oct 6, 2018
    Dataset authored and provided by
    Seair Exim Solutions
    Area covered
    United States
    Description

    View R & d equipment company inc import data USA including customs records, shipments, HS codes, suppliers, buyer details & company profile at Seair Exim.

  15. Galvanising the Open Access Community: A Study on the Impact of Plan S -...

    • zenodo.org
    bin, csv
    Updated Oct 15, 2024
    Cite
    W. Benedikt Schmal; W. Benedikt Schmal (2024). Galvanising the Open Access Community: A Study on the Impact of Plan S - Data and Code [Dataset]. http://doi.org/10.5281/zenodo.12523229
    Available download formats: csv, bin
    Dataset updated
    Oct 15, 2024
    Dataset provided by
    Scidecode
    Authors
    W. Benedikt Schmal; W. Benedikt Schmal
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains the datasets and code underpinning Chapter 3 "Counterfactual Impact Evaluation of Plan S" of the report "Galvanising the Open Access Community: A Study on the Impact of Plan S" commissioned by the cOAlition S to scidecode science consulting.

    Two categories of files are part of this repository:

    1. Datasets

    The 21 CSV source files contain the subsets of publications funded by the funding agencies that are part of this study. These files have been provided by OA.Works, with whom scidecode has collaborated for the data collection process. Data sources and collection and processing workflows applied by OA.Works are described on their website and specifically at https://about.oa.report/docs/data.

    The file "plan_s.dta" is the aggregated data file stored in the format ".dta", which can be accessed with STATA by default or with plenty of programming languages using the respective packages, e.g., R or Python.

    2. Code files

    The associated code files that have been used to process the data files are:

    - data_prep_and_analysis_script.do
    - coef_plots_script.R

    The first file has been used to process the CSV data files above for data preparation and analysis. It performs the data aggregation and preprocessing, and it also lists all statistical regressions for the counterfactual impact evaluation. The second code file, "coef_plots_script.R", uses the computed results of the counterfactual impact evaluation to create the final plots with the ggplot2 package.

    The first ".do" file has to be run in STATA, the second one (".R") requires the use of an integrated development environment for R.

    Further information is available in the final report and via the following URLs:
    https://www.coalition-s.org/
    https://scidecode.com/
    https://oa.works/
    https://openalex.org/
    https://sites.google.com/view/wbschmal
  16. Health and Retirement Study (HRS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Health and Retirement Study (HRS) [Dataset]. http://doi.org/10.7910/DVN/ELEKOY
    Explore at:
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the health and retirement study (hrs) with r

    the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original.

    figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle.

    but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you.

    the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.

    this new github repository contains five scripts:

    1992 - 2010 download HRS microdata.R
    - loop through every year and every file, download, then unzip everything in one big party

    import longitudinal RAND contributed files.R
    - create a SQLite database (.db) on the local disk
    - load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)

    longitudinal RAND - analysis examples.R
    - connect to the sql database created by the 'import longitudinal RAND contributed files' program
    - create two database-backed complex sample survey objects, using a taylor-series linearization design
    - perform a mountain of analysis examples with wave weights from two different points in the panel

    import example HRS file.R
    - load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html)
    - parse through the IF block at the bottom of the sas importation script, blank out a number of variables
    - save the file as an R data file (.rda) for fast loading later

    replicate 2002 regression.R
    - connect to the sql database created by the 'import longitudinal RAND contributed files' program
    - create a database-backed complex sample survey object, using a taylor-series linearization design
    - exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document.

    click here to view these five scripts

    for more detail about the health and retirement study (hrs), visit:
    - michigan's hrs homepage
    - rand's hrs homepage
    - the hrs wikipedia page
    - a running list of publications using hrs

    notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself.

    confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
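The description above mentions loading large files into a SQLite database in chunks so that RAM is not overloaded. The repository's own scripts are written in R and are not reproduced here. As a rough illustration of the same idea, here is a minimal Python sketch using only the standard library. The table name `rand_demo`, the column names, and the tiny in-memory sample are hypothetical stand-ins, not the actual RAND file layout.

```python
import csv
import io
import sqlite3
from itertools import islice


def load_rows_in_chunks(conn, table, rows, columns, chunk_size=1000):
    """Insert rows into `table` one chunk at a time; return the row count."""
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    # SQLite allows untyped columns, which keeps this sketch short.
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_list})')
    insert_sql = f'INSERT INTO "{table}" VALUES ({placeholders})'
    total = 0
    rows = iter(rows)
    while True:
        # Only `chunk_size` rows are held in memory at any moment.
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            break
        conn.executemany(insert_sql, chunk)
        conn.commit()
        total += len(chunk)
    return total


# Hypothetical stand-in for a large respondent-level file.
sample = io.StringIO("respondent_id,wave,weight\n1,1,1.5\n2,1,0\n3,1,2.25\n")
reader = csv.reader(sample)
header = next(reader)

conn = sqlite3.connect(":memory:")  # a file path would create a .db on disk
n_rows = load_rows_in_chunks(conn, "rand_demo", reader, header, chunk_size=2)
print(n_rows)
```

Because the reader is consumed lazily, memory use depends on `chunk_size` rather than on the size of the input file, which is the point of chunked loading.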

  17. Dataset for 'From vision toward best practices: Evaluating in vitro...

    • catalog.data.gov
    • datasets.ai
    Updated Jun 29, 2023
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2023). Dataset for 'From vision toward best practices: Evaluating in vitro transcriptomic points of departure for application in risk assessment using a uniform workflow' [Dataset]. https://catalog.data.gov/dataset/dataset-for-from-vision-toward-best-practices-evaluating-in-vitro-transcriptomic-points-of
    Explore at:
    Dataset updated
    Jun 29, 2023
    Dataset provided by
    United States Environmental Protection Agency http://www.epa.gov/
    Description

    Data for Reardon AJF, et al., From vision toward best practices: Evaluating in vitro transcriptomic points of departure for application in risk assessment using a uniform workflow. Front. Toxicol. 5:1194895. doi: 10.3389/ftox.2023.1194895. PMC10242042. This dataset is associated with the following publication: Reardon, A., R. Farmahin, A. Williams, M. Meier, G. Addicks, C. Yauk, G. Matteo, E. Atlas, J. Harrill, L. Everett, I. Shah, R. Judson, S. Ramaiahgari, S. Ferguson, and T. Barton-Maclaren. From vision toward best practices: Evaluating in vitro transcriptomic points of departure for application in risk assessment using a uniform workflow. Frontiers in Toxicology. Frontiers, Lausanne, SWITZERLAND, 5: 1194895, (2023).

  18. ELKI Multi-View Clustering Data Sets Based on the Amsterdam Library of...

    • data.niaid.nih.gov
    • elki-project.github.io
    • +1more
    Updated May 2, 2024
    Cite
    Schubert, Erich; Zimek, Arthur (2024). ELKI Multi-View Clustering Data Sets Based on the Amsterdam Library of Object Images (ALOI) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6355683
    Explore at:
    Dataset updated
    May 2, 2024
    Dataset provided by
    Ludwig-Maximilians-Universität München
    Authors
    Schubert, Erich; Zimek, Arthur
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These data sets were originally created for the following publications:

    M. E. Houle, H.-P. Kriegel, P. Kröger, E. Schubert, A. Zimek. Can Shared-Neighbor Distances Defeat the Curse of Dimensionality? In Proceedings of the 22nd International Conference on Scientific and Statistical Database Management (SSDBM), Heidelberg, Germany, 2010.

    H.-P. Kriegel, E. Schubert, A. Zimek. Evaluation of Multiple Clustering Solutions. In 2nd MultiClust Workshop: Discovering, Summarizing and Using Multiple Clusterings, Held in Conjunction with ECML PKDD 2011, Athens, Greece, 2011.

    The outlier data set versions were introduced in:

    E. Schubert, R. Wojdanowski, A. Zimek, H.-P. Kriegel. On Evaluation of Outlier Rankings and Outlier Scores. In Proceedings of the 12th SIAM International Conference on Data Mining (SDM), Anaheim, CA, 2012.

    They are derived from the original image data available at https://aloi.science.uva.nl/

    The image acquisition process is documented in the original ALOI work: J. M. Geusebroek, G. J. Burghouts, and A. W. M. Smeulders, The Amsterdam library of object images, Int. J. Comput. Vision, 61(1), 103-112, January, 2005

    Additional information is available at: https://elki-project.github.io/datasets/multi_view

    The following views are currently available:

    | Feature type | Description | Files |
    |---|---|---|
    | Object number | Sparse 1000 dimensional vectors that give the true object assignment | objs.arff.gz |
    | RGB color histograms | Standard RGB color histograms (uniform binning) | aloi-8d.csv.gz aloi-27d.csv.gz aloi-64d.csv.gz aloi-125d.csv.gz aloi-216d.csv.gz aloi-343d.csv.gz aloi-512d.csv.gz aloi-729d.csv.gz aloi-1000d.csv.gz |
    | HSV color histograms | Standard HSV/HSB color histograms in various binnings | aloi-hsb-2x2x2.csv.gz aloi-hsb-3x3x3.csv.gz aloi-hsb-4x4x4.csv.gz aloi-hsb-5x5x5.csv.gz aloi-hsb-6x6x6.csv.gz aloi-hsb-7x7x7.csv.gz aloi-hsb-7x2x2.csv.gz aloi-hsb-7x3x3.csv.gz aloi-hsb-14x3x3.csv.gz aloi-hsb-8x4x4.csv.gz aloi-hsb-9x5x5.csv.gz aloi-hsb-13x4x4.csv.gz aloi-hsb-14x5x5.csv.gz aloi-hsb-10x6x6.csv.gz aloi-hsb-14x6x6.csv.gz |
    | Color similarity | Average similarity to 77 reference colors (not histograms): 18 colors x 2 sat x 2 bri + 5 grey values (incl. white, black) | aloi-colorsim77.arff.gz (feature subsets are meaningful here, as these features are computed independently of each other) |
    | Haralick features | First 13 Haralick features (radius 1 pixel) | aloi-haralick-1.csv.gz |
    | Front to back | Vectors representing front face vs. back faces of individual objects | front.arff.gz |
    | Basic light | Vectors indicating basic light situations | light.arff.gz |
    | Manual annotations | Manually annotated object groups of semantically related objects such as cups | manual1.arff.gz |

    Outlier Detection Versions

    Additionally, we generated a number of subsets for outlier detection:

    | Feature type | Description | Files |
    |---|---|---|
    | RGB Histograms | Downsampled to 100000 objects (553 outliers) | aloi-27d-100000-max10-tot553.csv.gz aloi-64d-100000-max10-tot553.csv.gz |
    | RGB Histograms | Downsampled to 75000 objects (717 outliers) | aloi-27d-75000-max4-tot717.csv.gz aloi-64d-75000-max4-tot717.csv.gz |
    | RGB Histograms | Downsampled to 50000 objects (1508 outliers) | aloi-27d-50000-max5-tot1508.csv.gz aloi-64d-50000-max5-tot1508.csv.gz |
    
  19. Data and R code for What you see is where you go: visibility influences...

    • repository.uantwerpen.be
    • datadryad.org
    Updated 2020
    Cite
    Aben, Job; Signer, Johannes; Heiskanen, Janne; Pelllikka, Petri; Travis, Justing (2020). Data and R code for What you see is where you go: visibility influences movement decisions of a forest bird navigating a 3D structured matrix [Dataset]. http://doi.org/10.5061/dryad.69p8cz905
    Explore at:
    Dataset updated
    2020
    Dataset provided by
    University of Antwerp
    Faculty of Sciences. Biology
    Dryad
    Authors
    Aben, Job; Signer, Johannes; Heiskanen, Janne; Pelllikka, Petri; Travis, Justing
    Description

    Animal spatial behaviour is often presumed to reflect responses to visual cues. However, inference of behaviour in relation to the environment is challenged by the lack of objective methods to identify the information that effectively is available to an animal from a given location. In general, animals are assumed to have unconstrained information on the environment within a detection circle of a certain radius (the perceptual range; PR). However, visual cues are only available up to the first physical obstruction within an animal’s PR, making information availability a function of an animal’s location within the physical environment (the effective visual perceptual range; EVPR). By using LiDAR data and viewshed analysis, we model forest birds’ EVPRs at each step along a movement path. We found that the EVPR was on average 0.063% that of an unconstrained PR and, by applying a step-selection analysis, that individuals are 1.57 times more likely to move to a tree within their EVPR than to an equivalent tree outside it. This demonstrates that behavioural choices can be substantially impacted by the characteristics of an individual’s EVPR and highlights that inferences made from movement data may be improved by accounting for the EVPR.

  20. WATER RESOURCE INFORMATION

    • researchdata.edu.au
    Updated Oct 15, 2025
    Cite
    NSW Department of Climate Change, Energy, the Environment and Water (2025). WATER RESOURCE INFORMATION [Dataset]. https://researchdata.edu.au/water-resource-information/3852640
    Explore at:
    Dataset updated
    Oct 15, 2025
    Dataset provided by
    data.nsw.gov.au
    Authors
    NSW Department of Climate Change, Energy, the Environment and Water
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Water Resource Information collection brings together key datasets and reports describing how water is allocated, accounted for, and reported across New South Wales. Developed by the NSW Department of Climate Change, Energy, the Environment and Water (DCCEEW), these assets promote transparency, consistency, and informed decision-making across the water sector.

    The collection includes:

    - Allocation Account Balance Summary: Summarised information showing how water is held, used, traded, or carried over within licensed allocation accounts at the end of each water year.
    - Available Water Determinations (AWD): Annual Allocation Announcements that define the percentage of licensed entitlement credited to water accounts each year, ensuring equitable and sustainable access in accordance with Water Sharing Plans.
    - NSW General Purpose Water Accounting Reports (GPWAR): Annual reports prepared under the Australian Water Accounting Standard 1 (AWAS 1), providing consistent, transparent accounting across NSW’s major regulated inland valleys.
    - NSW Water Dashboards: Interactive dashboards presenting views of water availability, trade, and usage information, enhancing public access, transparency, and understanding of the state’s water resources.

    Together, these datasets form a connected picture of how water moves through the allocation and accounting cycle, supporting evidence-based management and the Department’s commitment to open data and public accountability.

    How to Access the Data on the SEED Open Data Portal

    This landing page provides an overview only. To view or download the data, perform the following:

    1. Scroll down and click “Related Datasets” (Dataset relationship).
    2. Select one of the listed datasets.
    3. On that dataset’s page, the data will either be available in the Dataset Packages section (if files are attached) or in the External Links section (if the dataset redirects you to another site to view or download the data).

    Note: If you would like to ask a question, make any suggestions, or tell us how you are using this dataset, please visit the NSW Water Hub which has an online forum you can join.
