28 datasets found
  1. Data from: Data to Assess Nitrogen Export from Forested Watersheds in and...

    • catalog.data.gov
    • data.usgs.gov
    Updated Sep 12, 2025
    Cite
    U.S. Geological Survey (2025). Data to Assess Nitrogen Export from Forested Watersheds in and near the Long Island Sound Basin with Weighted Regressions on Time, Discharge, and Season (WRTDS) [Dataset]. https://catalog.data.gov/dataset/data-to-assess-nitrogen-export-from-forested-watersheds-in-and-near-the-long-island-sound-
    Explore at:
    Dataset updated
    Sep 12, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Long Island Sound, Long Island
    Description

    The U.S. Geological Survey, in cooperation with the U.S. Environmental Protection Agency's Long Island Sound Study (https://longislandsoundstudy.net), characterized nitrogen export from forested watersheds and whether nitrogen loading has been increasing or decreasing to help inform Long Island Sound management strategies. The Weighted Regressions on Time, Discharge, and Season (WRTDS; Hirsch and others, 2010) method was used to estimate annual concentrations and fluxes of nitrogen species using long-term records (14 to 37 years in length) of stream total nitrogen, dissolved organic nitrogen, nitrate, and ammonium concentrations and daily discharge data from 17 watersheds located in the Long Island Sound basin or in nearby areas of Massachusetts, New Hampshire, or New York. This data release contains the input water-quality and discharge data, annual outputs (including concentrations, fluxes, yields, and confidence intervals about these estimates), statistical tests for trends between the periods of water years 1999-2000 and 2016-2018, and model diagnostic statistics. These datasets are organized into one zip file (WRTDSeLists.zip) and six comma-separated values (csv) data files (StationInformation.csv, AnnualResults.csv, TrendResults.csv, ModelStatistics.csv, InputWaterQuality.csv, and InputStreamflow.csv). The csv file (StationInformation.csv) contains information about the stations and input datasets. Finally, a short R script (SampleScript.R) is included to facilitate viewing the input and output data and to re-run the model. Reference: Hirsch, R.M., Moyer, D.L., and Archfield, S.A., 2010, Weighted Regressions on Time, Discharge, and Season (WRTDS), with an application to Chesapeake Bay River inputs: Journal of the American Water Resources Association, v. 46, no. 5, p. 857–880.
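The relationship between the concentration, flux, and yield outputs mentioned above can be sketched in a few lines. This is not the WRTDS method itself (WRTDS estimates concentrations through weighted regression); it only illustrates the unit arithmetic that links a concentration and a discharge to a daily flux, and a flux to a yield. The units (mg/L, m^3/s, km^2) and the function names are assumptions, since the listing does not state the units used in the csv files.

```python
SECONDS_PER_DAY = 86_400


def daily_flux_kg(concentration_mg_per_l: float, discharge_m3_per_s: float) -> float:
    """Return a daily load in kg/day from concentration (mg/L) and discharge (m^3/s)."""
    # 1 mg/L equals 1 g/m^3, so concentration * discharge is in g/s.
    grams_per_day = concentration_mg_per_l * discharge_m3_per_s * SECONDS_PER_DAY
    return grams_per_day / 1000.0  # g -> kg


def annual_yield_kg_per_km2(daily_fluxes_kg, drainage_area_km2: float) -> float:
    """Sum daily fluxes over a year and normalize by drainage area (assumed km^2)."""
    if drainage_area_km2 <= 0:
        raise ValueError("drainage area must be positive")
    return sum(daily_fluxes_kg) / drainage_area_km2
```

Normalizing by drainage area is what makes yields comparable across watersheds of different size.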

  2. Petre_Slide_CategoricalScatterplotFigShare.pptx

    • figshare.com
    pptx
    Updated Sep 19, 2016
    Cite
    Benj Petre; Aurore Coince; Sophien Kamoun (2016). Petre_Slide_CategoricalScatterplotFigShare.pptx [Dataset]. http://doi.org/10.6084/m9.figshare.3840102.v1
    Explore at:
    Available download formats: pptx
    Dataset updated
    Sep 19, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Benj Petre; Aurore Coince; Sophien Kamoun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Categorical scatterplots with R for biologists: a step-by-step guide

    Benjamin Petre1, Aurore Coince2, Sophien Kamoun1

    1 The Sainsbury Laboratory, Norwich, UK; 2 Earlham Institute, Norwich, UK

    Weissgerber and colleagues (2015) recently stated that ‘as scientists, we urgently need to change our practices for presenting continuous data in small sample size studies’. They called for more scatterplot and boxplot representations in scientific papers, which ‘allow readers to critically evaluate continuous data’ (Weissgerber et al., 2015). In the Kamoun Lab at The Sainsbury Laboratory, we recently implemented a protocol to generate categorical scatterplots (Petre et al., 2016; Dagdas et al., 2016). Here we describe the three steps of this protocol: 1) formatting of the data set in a .csv file, 2) execution of the R script to generate the graph, and 3) export of the graph as a .pdf file.

    Protocol

    • Step 1: format the data set as a .csv file. Store the data in a three-column excel file as shown in Powerpoint slide. The first column ‘Replicate’ indicates the biological replicates. In the example, the month and year during which the replicate was performed is indicated. The second column ‘Condition’ indicates the conditions of the experiment (in the example, a wild type and two mutants called A and B). The third column ‘Value’ contains continuous values. Save the Excel file as a .csv file (File -> Save as -> in ‘File Format’, select .csv). This .csv file is the input file to import in R.

    • Step 2: execute the R script (see Notes 1 and 2). Copy the script shown in Powerpoint slide and paste it in the R console. Execute the script. In the dialog box, select the input .csv file from step 1. The categorical scatterplot will appear in a separate window. Dots represent the values for each sample; colors indicate replicates. Boxplots are superimposed; black dots indicate outliers.

    • Step 3: save the graph as a .pdf file. Shape the window at your convenience and save the graph as a .pdf file (File -> Save as). See Powerpoint slide for an example.
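The three-column layout from Step 1 can also be produced and checked programmatically before it is imported in R. The sketch below is written in Python rather than R and is only an illustration: the column names come from the protocol, while the helper names and the example rows are hypothetical.

```python
import csv
import io

COLUMNS = ["Replicate", "Condition", "Value"]


def write_scatter_csv(rows, handle):
    """Write (replicate, condition, value) tuples in the three-column layout."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for replicate, condition, value in rows:
        writer.writerow(
            {"Replicate": replicate, "Condition": condition, "Value": float(value)}
        )


def check_scatter_csv(handle):
    """Validate the header and that every Value is numeric; return the row count."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != COLUMNS:
        raise ValueError(f"expected columns {COLUMNS}, got {reader.fieldnames}")
    rows = list(reader)
    for row in rows:
        float(row["Value"])  # raises ValueError on a non-numeric entry
    return len(rows)


# Hypothetical example rows: one replicate, a wild type and two mutants.
example_rows = [("Jan2016", "WT", 1.2), ("Jan2016", "A", 0.8), ("Jan2016", "B", 2.4)]
buffer = io.StringIO()
write_scatter_csv(example_rows, buffer)
buffer.seek(0)
row_count = check_scatter_csv(buffer)
```

Checking the header first catches the most common import problem, a renamed or reordered column, before the plotting script runs.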

    Notes

    • Note 1: install the ggplot2 package. The R script requires the package ‘ggplot2’ to be installed. To install it, Packages & Data -> Package Installer -> enter ‘ggplot2’ in the Package Search space and click on ‘Get List’. Select ‘ggplot2’ in the Package column and click on ‘Install Selected’. Install all dependencies as well.

    • Note 2: use a log scale for the y-axis. To use a log scale for the y-axis of the graph, use the command line below in place of command line #7 in the script.

    # 7 Display the graph in a separate window. Dot colors indicate replicates
    graph + geom_boxplot(outlier.colour='black', colour='black') + geom_jitter(aes(col=Replicate)) + scale_y_log10() + theme_bw()

    References

    Dagdas YF, Belhaj K, Maqbool A, Chaparro-Garcia A, Pandey P, Petre B, et al. (2016) An effector of the Irish potato famine pathogen antagonizes a host autophagy cargo receptor. eLife 5:e10856.

    Petre B, Saunders DGO, Sklenar J, Lorrain C, Krasileva KV, Win J, et al. (2016) Heterologous Expression Screens in Nicotiana benthamiana Identify a Candidate Effector of the Wheat Yellow Rust Pathogen that Associates with Processing Bodies. PLoS ONE 11(2):e0149035

    Weissgerber TL, Milic NM, Winham SJ, Garovic VD (2015) Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm. PLoS Biol 13(4):e1002128

    https://cran.r-project.org/

    http://ggplot2.org/

  3. Global Landslide Catalog Export

    • catalog.data.gov
    • s.cnmilf.com
    • +1more
    Updated May 31, 2025
    Cite
    National Aeronautics and Space Administration (2025). Global Landslide Catalog Export [Dataset]. https://catalog.data.gov/dataset/global-landslide-catalog-export
    Explore at:
    Dataset updated
    May 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The Global Landslide Catalog (GLC) was developed with the goal of identifying rainfall-triggered landslide events around the world, regardless of size, impacts or location. The GLC considers all types of mass movements triggered by rainfall, which have been reported in the media, disaster databases, scientific reports, or other sources. The GLC has been compiled since 2007 at NASA Goddard Space Flight Center. This is a unique data set with the ID tag “GLC” in the landslide editor. This dataset on data.nasa.gov was a one-time export from the Global Landslide Catalog maintained separately. It is current as of March 7, 2016. The original catalog is available here: http://www.arcgis.com/home/webmap/viewer.html?url=https%3A%2F%2Fmaps.nccs.nasa.gov%2Fserver%2Frest%2Fservices%2Fglobal_landslide_catalog%2Fglc_viewer_service%2FFeatureServer&source=sd To export GLC data, you must agree to the “Terms and Conditions”. We request that anyone using the GLC cite the two sources of this database: Kirschbaum, D. B., Adler, R., Hong, Y., Hill, S., & Lerner-Lam, A. (2010). A global landslide catalog for hazard applications: method, results, and limitations. Natural Hazards, 52(3), 561–575. doi:10.1007/s11069-009-9401-4. [1] Kirschbaum, D.B., T. Stanley, Y. Zhou (In press, 2015). Spatial and Temporal Analysis of a Global Landslide Catalog. Geomorphology. doi:10.1016/j.geomorph.2015.03.016. [2]

  4. Global Landslide Catalog Export - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Mar 26, 2016
    Cite
    nasa.gov (2016). Global Landslide Catalog Export - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/global-landslide-catalog-export
    Explore at:
    Dataset updated
    Mar 26, 2016
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The Global Landslide Catalog (GLC) was developed with the goal of identifying rainfall-triggered landslide events around the world, regardless of size, impacts or location. The GLC considers all types of mass movements triggered by rainfall, which have been reported in the media, disaster databases, scientific reports, or other sources. The GLC has been compiled since 2007 at NASA Goddard Space Flight Center. This is a unique data set with the ID tag “GLC” in the landslide editor. This dataset on data.nasa.gov was a one-time export from the Global Landslide Catalog maintained separately. It is current as of March 7, 2016. The original catalog is available here: http://www.arcgis.com/home/webmap/viewer.html?url=https%3A%2F%2Fmaps.nccs.nasa.gov%2Fserver%2Frest%2Fservices%2Fglobal_landslide_catalog%2Fglc_viewer_service%2FFeatureServer&source=sd To export GLC data, you must agree to the “Terms and Conditions”. We request that anyone using the GLC cite the two sources of this database: Kirschbaum, D. B., Adler, R., Hong, Y., Hill, S., & Lerner-Lam, A. (2010). A global landslide catalog for hazard applications: method, results, and limitations. Natural Hazards, 52(3), 561–575. doi:10.1007/s11069-009-9401-4. [1] Kirschbaum, D.B., T. Stanley, Y. Zhou (In press, 2015). Spatial and Temporal Analysis of a Global Landslide Catalog. Geomorphology. doi:10.1016/j.geomorph.2015.03.016. [2]

  5. Data supporting the Master thesis "Monitoring von Open Data Praktiken -...

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Nov 21, 2024
    Cite
    Katharina Zinke (2024). Data supporting the Master thesis "Monitoring von Open Data Praktiken - Herausforderungen beim Auffinden von Datenpublikationen am Beispiel der Publikationen von Forschenden der TU Dresden" [Dataset]. http://doi.org/10.5281/zenodo.14196539
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 21, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Katharina Zinke
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Dresden
    Description

    Data supporting the Master thesis "Monitoring von Open Data Praktiken - Herausforderungen beim Auffinden von Datenpublikationen am Beispiel der Publikationen von Forschenden der TU Dresden" (Monitoring open data practices - challenges in finding data publications using the example of publications by researchers at TU Dresden) - Katharina Zinke, Institut für Bibliotheks- und Informationswissenschaften, Humboldt-Universität Berlin, 2023

    This ZIP-File contains the data the thesis is based on, interim exports of the results and the R script with all pre-processing, data merging and analyses carried out. The documentation of the additional, explorative analysis is also available. The actual PDFs and text files of the scientific papers used are not included as they are published open access.

    The folder structure is shown below with the file names and a brief description of the contents of each file. For details concerning the analysis approach, please refer to the master's thesis (publication following soon).

    ## Data sources

    Folder 01_SourceData/

    - PLOS-Dataset_v2_Mar23.csv (PLOS-OSI dataset)

    - ScopusSearch_ExportResults.csv (export of Scopus search results from Scopus)

    - ScopusSearch_ExportResults.ris (export of Scopus search results from Scopus)

    - Zotero_Export_ScopusSearch.csv (export of the file names and DOIs of the Scopus search results from Zotero)

    ## Automatic classification

    Folder 02_AutomaticClassification/

    - (NOT INCLUDED) PDFs folder (Folder for PDFs of all publications identified by the Scopus search, named AuthorLastName_Year_PublicationTitle_Title)

    - (NOT INCLUDED) PDFs_to_text folder (Folder for all texts extracted from the PDFs by ODDPub, named AuthorLastName_Year_PublicationTitle_Title)

    - PLOS_ScopusSearch_matched.csv (merge of the Scopus search results with the PLOS_OSI dataset for the files contained in both)

    - oddpub_results_wDOIs.csv (results file of the ODDPub classification)

    - PLOS_ODDPub.csv (merge of the results file of the ODDPub classification with the PLOS-OSI dataset for the publications contained in both)

    ## Manual coding

    Folder 03_ManualCheck/

    - CodeSheet_ManualCheck.txt (Code sheet with descriptions of the variables for manual coding)

    - ManualCheck_2023-06-08.csv (Manual coding results file)

    - PLOS_ODDPub_Manual.csv (Merge of the results file of the ODDPub and PLOS-OSI classification with the results file of the manual coding)

    ## Explorative analysis for the discoverability of open data

    Folder 04_FurtherAnalyses/

    - Proof_of_of_Concept_Open_Data_Monitoring.pdf (Description of the explorative analysis of the discoverability of open data publications using the example of a researcher) - in German

    ## R-Script

    Analyses_MA_OpenDataMonitoring.R (R-Script for preparing, merging and analyzing the data and for performing the ODDPub algorithm)
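Several of the files listed above are described as merges that keep only the publications contained in both sources, which is an inner join. The package performs these merges in its R script; the sketch below only illustrates the idea in Python. The key column name `DOI`, the DOI normalization, and the function names are assumptions and are not taken from the dataset.

```python
def normalize_doi(doi: str) -> str:
    """Lower-case a DOI and strip common resolver prefixes so variants compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def inner_join_on_doi(left_rows, right_rows, key="DOI"):
    """Keep only rows whose DOI occurs in both inputs; left values win on name clashes."""
    right_by_doi = {normalize_doi(r[key]): r for r in right_rows if r.get(key)}
    merged = []
    for row in left_rows:
        doi = normalize_doi(row.get(key) or "")
        match = right_by_doi.get(doi)
        if doi and match is not None:
            merged.append({**match, **row})
    return merged
```

Rows read with `csv.DictReader` can be passed in directly, since each row is a dictionary keyed by column name.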

  6. Data export CSV files from HDX Workbench, software platform for the analysis...

    • figshare.com
    txt
    Updated Oct 18, 2023
    Cite
    Christiane Brugger; Jacob Schwartz; Scott Novick; Song Tong; Joel Hoskins; Nadim Majdalani; Rebecca Kim; Martin Filipovski; Sue Wickner; Susan Gottesman; Patrick R. Griffin; Alexandra M. Deaconescu (2023). Data export CSV files from HDX Workbench, software platform for the analysis of hydrogen/deuterium exchange (HDX) mass spectrometry data. [Dataset]. http://doi.org/10.6084/m9.figshare.24329482.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Oct 18, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Christiane Brugger; Jacob Schwartz; Scott Novick; Song Tong; Joel Hoskins; Nadim Majdalani; Rebecca Kim; Martin Filipovski; Sue Wickner; Susan Gottesman; Patrick R. Griffin; Alexandra M. Deaconescu
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    In enterobacteria such as Escherichia coli, the general stress response is mediated by σs, the stationary phase dissociable promoter specificity subunit of RNA polymerase. σs is degraded by ClpXP during active growth in a process dependent on the RssB adaptor, which is thought to be stimulated by phosphorylation of a conserved aspartate in its N-terminal receiver domain. Here we present the crystal structure of full-length RssB bound to a beryllofluoride phosphomimic. Compared to the structure of RssB bound to the IraD anti-adaptor, our new RssB structure with bound beryllofluoride reveals conformational differences and coil-to-helix transitions in the C-terminal region of the RssB receiver domain and in the inter-domain segmented helical linker. These are accompanied by masking of the α4-β5-α5 (4-5-5) “signaling” face of the RssB receiver domain by its C-terminal domain. Critically, using hydrogen-deuterium exchange mass spectrometry we identify σs binding determinants on the 4-5-5 face, implying that this surface needs to be unmasked to effect an interdomain interface switch and enable full σs engagement and hand-off to ClpXP. In activated receiver domains, the 4-5-5 face is often the locus of intermolecular interactions, but its masking by intramolecular contacts upon phosphorylation is unusual, emphasizing that RssB is a response regulator that undergoes atypical regulation.

    Files included are data export from HDX Workbench software from the HDX-MS experiments in support of this work. The files are in CSV format.

  7. Data from: Indoor air quality in California homes with code-required...

    • datadryad.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Apr 22, 2020
    Cite
    Wanyu Chan; Yang-Seon Kim; William Delp; Iain Walker; Brett Singer (2020). Indoor air quality in California homes with code-required mechanical ventilation [Dataset]. http://doi.org/10.7941/D1ZS7X
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 22, 2020
    Dataset provided by
    Dryad
    Authors
    Wanyu Chan; Yang-Seon Kim; William Delp; Iain Walker; Brett Singer
    Time period covered
    Feb 7, 2020
    Area covered
    California
    Description

    Time Series Data Handling and Quality Assurance Review

    Most instruments had internal logging and special software to download data from the field instruments as binary files or ascii/csv files. The instruments for which files downloaded as binary provide software to view the data or export the data to csv files.

    One-minute resolution time-series data files were created for each house using an R script that pulled data from the csv files, aligned data by time, executed unit conversions, and translated from instruments with longer or different data intervals (e.g. 30 min formaldehyde data and 1.5 min for anemometer data). Visual review was conducted on the compiled files (and primary csv or binary files were consulted as needed) to check for translation or writing errors (especially from terminal emulator), indications of instrument malfunction, mislabeled units or unit conversion errors, mislabeled location, and time stamp errors.
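The alignment step described above, placing readings logged at longer intervals onto a one-minute time base, can be sketched as follows. The description does not say how the longer-interval data were translated, so the forward-fill rule used here is an assumption, and the function name is hypothetical. The original work used an R script; this Python sketch only illustrates the idea.

```python
from datetime import datetime, timedelta


def to_one_minute(samples, start, end):
    """Forward-fill (timestamp, value) samples onto a one-minute grid over [start, end)."""
    ordered = sorted(samples)
    grid = []
    index = 0
    current = None  # stays None until the first sample has been reached
    t = start
    while t < end:
        # Consume every sample taken at or before this grid minute.
        while index < len(ordered) and ordered[index][0] <= t:
            current = ordered[index][1]
            index += 1
        grid.append((t, current))
        t += timedelta(minutes=1)
    return grid


# Hypothetical 30-minute readings expanded to a one-hour, one-minute grid.
readings = [
    (datetime(2020, 1, 1, 0, 0), 10.0),
    (datetime(2020, 1, 1, 0, 30), 20.0),
]
aligned = to_one_minute(readings, datetime(2020, 1, 1, 0, 0), datetime(2020, 1, 1, 1, 0))
```

Grid minutes that precede the first reading keep the value `None`, which makes gaps visible during the kind of visual review the description mentions.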

    The draft final set of time-series data...

  8. Replication Package: Unboxing Default Argument Breaking Changes in 1 + 2...

    • zenodo.org
    application/gzip
    Updated Jul 15, 2024
    Cite
    João Eduardo Montandon; Luciana Lourdes Silva; Cristiano Politowski; Daniel Prates; Arthur Bonifácio; Ghizlane El Boussaidi (2024). Replication Package: Unboxing Default Argument Breaking Changes in 1 + 2 Data Science Libraries in Python [Dataset]. http://doi.org/10.5281/zenodo.11584961
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jul 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    João Eduardo Montandon; Luciana Lourdes Silva; Cristiano Politowski; Daniel Prates; Arthur Bonifácio; Ghizlane El Boussaidi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Replication Package

    This repository contains data and source files needed to replicate our work described in the paper "Unboxing Default Argument Breaking Changes in Scikit Learn".

    Requirements

    We recommend the following requirements to replicate our study:

    1. Internet access
    2. At least 100GB of space
    3. Docker installed
    4. Git installed

    Package Structure

    We relied on Docker containers to provide a working environment that is easier to replicate. Specifically, we configure the following containers:

    • data-analysis, an R-based Container we used to run our data analysis.
    • data-collection, a Python Container we used to collect Scikit's default arguments and detect them in client applications.
    • database, a Postgres Container we used to store clients' data, obtained from Grotov et al.
    • storage, a directory used to store the data processed in data-analysis and data-collection. This directory is shared in both containers.
    • docker-compose.yml, the Docker file that configures all containers used in the package.

    In the remainder of this document, we describe how to set up each container properly.

    Using VSCode to Setup the Package

    We selected VSCode as the IDE of choice because its extensions allow us to implement our scripts directly inside the containers. In this package, we provide configuration parameters for both data-analysis and data-collection containers. This way you can directly access and run each container inside it without any specific configuration.

    You first need to set up the containers

    $ cd /replication/package/folder
    $ docker-compose build
    $ docker-compose up
    # Wait for Docker to create and start all containers
    

    Then, you can open them in Visual Studio Code:

    1. Open VSCode in project root folder
    2. Access the command palette and select "Dev Container: Reopen in Container"
      1. Select either Data Collection or Data Analysis.
    3. Start working

    If you want/need a more customized organization, the remainder of this file describes it in detail.

    Longest Road: Manual Package Setup

    Database Setup

    The database container will automatically restore the dump in dump_matroskin.tar in its first launch. To set up and run the container, you should:

    Build an image:

    $ cd ./database
    $ docker build --tag 'dabc-database' .
    $ docker image ls
    REPOSITORY      TAG      IMAGE ID       CREATED          SIZE
    dabc-database   latest   b6f8af99c90d   50 minutes ago   18.5GB
    

    Create and enter inside the container:

    $ docker run -it --name dabc-database-1 dabc-database
    $ docker exec -it dabc-database-1 /bin/bash
    root# psql -U postgres -h localhost -d jupyter-notebooks
    jupyter-notebooks=# \dt
                List of relations
     Schema |       Name        | Type  | Owner
    --------+-------------------+-------+-------
     public | Cell              | table | root
     public | Code_cell         | table | root
     public | Md_cell           | table | root
     public | Notebook          | table | root
     public | Notebook_features | table | root
     public | Notebook_metadata | table | root
     public | repository        | table | root
    

    If you get the table list shown above, your database is set up properly.

    It is important to mention that this database is extended from the one provided by Grotov et al. Basically, we added three columns to the table Notebook_features (API_functions_calls, defined_functions_calls, and other_functions_calls) containing the function calls performed by each client in the database.

    Data Collection Setup

    This container is responsible for collecting the data to answer our research questions. It has the following structure:

    • dabcs.py, extract DABCs from Scikit Learn source code, and export them to a CSV file.
    • dabcs-clients.py, extract function calls from clients and export them to a CSV file. We rely on a modified version of Matroskin to leverage the function calls. You can find the tool's source code in the matroskin directory.
    • Makefile, commands to set up and run both dabcs.py and dabcs-clients.py
    • matroskin, the directory containing the modified version of matroskin tool. We extended the library to collect the function calls performed on the client notebooks of Grotov's dataset.
    • storage, a docker volume where the data-collection should save the exported data. This data will be used later in Data Analysis.
    • requirements.txt, Python dependencies adopted in this module.
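The idea behind detecting a default argument breaking change (DABC), a parameter whose default value differs between two versions of a function, can be sketched with Python's standard `ast` module. This is not the implementation in dabcs.py, which is not shown here; the function names and the two example source snippets are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


def default_arguments(source: str, function_name: str) -> dict:
    """Map each parameter name to the source text of its default value."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == function_name:
            args = node.args
            positional = args.posonlyargs + args.args
            defaults = {}
            # Positional defaults belong to the last len(args.defaults) parameters.
            offset = len(positional) - len(args.defaults)
            for param, default in zip(positional[offset:], args.defaults):
                defaults[param.arg] = ast.unparse(default)
            # Keyword-only parameters without a default are stored as None.
            for param, default in zip(args.kwonlyargs, args.kw_defaults):
                if default is not None:
                    defaults[param.arg] = ast.unparse(default)
            return defaults
    raise LookupError(f"function {function_name!r} not found")


def changed_defaults(old_source: str, new_source: str, function_name: str) -> dict:
    """Return {parameter: (old_default, new_default)} for defaults that differ."""
    old = default_arguments(old_source, function_name)
    new = default_arguments(new_source, function_name)
    return {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }


# Hypothetical before/after versions of the same function.
OLD = "def fit(X, n_estimators=10, *, solver='auto'):\n    pass\n"
NEW = "def fit(X, n_estimators=100, *, solver='auto'):\n    pass\n"
changes = changed_defaults(OLD, NEW, "fit")
```

Comparing the unparsed source text of each default avoids evaluating library code, at the cost of treating equivalent spellings such as `1e2` and `100.0` as different.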

    Note that the container will automatically configure this module for you, e.g., install dependencies, configure matroskin, download scikit learn source code, etc. For this, you must run the following commands:

    $ cd ./data-collection
    $ docker build --tag "data-collection" .
    $ docker run -it -d --name data-collection-1 -v $(pwd)/:/data-collection -v $(pwd)/../storage/:/data-collection/storage/ data-collection
    $ docker exec -it data-collection-1 /bin/bash
    $ ls
    Dockerfile Makefile config.yml dabcs-clients.py dabcs.py matroskin storage requirements.txt utils.py
    

    If you see project files, it means the container is configured accordingly.

    Data Analysis Setup

    We use this container to conduct the analysis over the data produced by the Data Collection container. It has the following structure:

    • dependencies.R, an R script containing the dependencies used in our data analysis.
    • data-analysis.Rmd, the R notebook we used to perform our data analysis
    • datasets, a docker volume pointing to the storage directory.

    Execute the following commands to run this container:

    $ cd ./data-analysis
    $ docker build --tag "data-analysis" .
    $ docker run -it -d --name data-analysis-1 -v $(pwd)/:/data-analysis -v $(pwd)/../storage/:/data-collection/datasets/ data-analysis
    $ docker exec -it data-analysis-1 /bin/bash
    $ ls
    data-analysis.Rmd datasets dependencies.R Dockerfile figures Makefile
    

    If you see project files, it means the container is configured accordingly.

    A note on storage shared folder

    As mentioned, the storage folder is mounted as a volume and shared between data-collection and data-analysis containers. We compressed the content of this folder due to space constraints. Therefore, before starting working on Data Collection or Data Analysis, make sure you extracted the compressed files. You can do this by running the Makefile inside storage folder.

    $ make unzip # extract files
    $ ls
    clients-dabcs.csv clients-validation.csv dabcs.csv Makefile scikit-learn-versions.csv versions.csv
    $ make zip # compress files
    $ ls
    csv-files.tar.gz Makefile
  9. Electronic Disclosure System - State and Local Election Funding and...

    • researchdata.edu.au
    • data.qld.gov.au
    • +1more
    Updated Jan 10, 2019
    Cite
    data.qld.gov.au (2019). Electronic Disclosure System - State and Local Election Funding and Donations [Dataset]. https://researchdata.edu.au/electronic-disclosure-state-funding-donations/1360703
    Explore at:
    Dataset updated
    Jan 10, 2019
    Dataset provided by
    Queensland Government (http://qld.gov.au/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Electoral Commission of Queensland is responsible for the Electronic Disclosure System (EDS), which provides real-time reporting of political donations. It aims to streamline the disclosure process while increasing transparency surrounding gifts.

    All entities conducting or supporting political activity in Queensland are required to submit a disclosure return to the Electoral Commission of Queensland. These include reporting of gifts and loans, as well as periodic reporting of other dealings such as advertising and expenditure. EDS makes these returns readily available to the public, providing faster and easier access to political financial disclosure information.

    The EDS is an outcome of the Electoral Commission of Queensland's ongoing commitment to the people of Queensland, to drive improvements to election services and meet changing community needs.

    To export the data from the EDS as a CSV file, consult this page: https://helpcentre.disclosures.ecq.qld.gov.au/hc/en-us/articles/115003351428-Can-I-export-the-data-I-can-see-in-the-map-

    For a detailed glossary of terms used by the EDS, please consult this page: https://helpcentre.disclosures.ecq.qld.gov.au/hc/en-us/articles/115002784587-Glossary-of-Terms-in-EDS

    For other information about how to use the EDS, please consult the FAQ page here: https://helpcentre.disclosures.ecq.qld.gov.au/hc/en-us/categories/115000599068-FAQs

  10. Data from: RAW data from Towards Holistic Environmental Policy Assessment:...

    • data.europa.eu
    • research.science.eus
    unknown
    Updated Jul 8, 2025
    Cite
    Zenodo (2025). RAW data from Towards Holistic Environmental Policy Assessment: Multi-Criteria Frameworks and recommendations for modelers paper [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-13909413?locale=cs
    Explore at:
    Available download formats: unknown (2990)
    Dataset updated
    Jul 8, 2025
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Name: Data used to rate the relevance of each dimension necessary for a Holistic Environmental Policy Assessment.
    Summary: This dataset contains answers from a panel of experts and the public rating the relevance of each dimension on a scale of 0 (Not relevant at all) to 100 (Extremely relevant).
    License: CC-BY-SA
    Acknowledge: These data have been collected in the framework of the DECIPHER project. This project has received funding from the European Union’s Horizon Europe programme under grant agreement No. 101056898.
    Disclaimer: Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.
    Collection Date: 2024-1 / 2024-04
    Publication Date: 22/04/2025
    DOI: 10.5281/zenodo.13909413
    Other repositories: -
    Author: University of Deusto
    Objective of collection: This data was originally collected to prioritise the dimensions to be further used for Environmental Policy Assessment and IAMs enlarged scope.
    Description:

    Data Files (CSV)

    • decipher-public.csv: Public participants' general survey results in the framework of the Decipher project, including sociodemographic characteristics and overall perception of each dimension necessary for a Holistic Environmental Policy Assessment.
    • decipher-risk.csv: Contains individual survey responses regarding prioritisation of dimensions in risk situations. Includes demographic and opinion data from a targeted sample.
    • decipher-experts.csv: Experts’ opinions collected on risk topics through surveys in the framework of the Decipher project, targeting professionals in relevant fields.
    • decipher-modelers.csv: Answers given by the developers of models about the characteristics of the models and the dimensions covered by them.
    • prolific_export_risk.csv: Exported survey data from Prolific, focusing specifically on ratings in risk situations. Includes response times, demographic details, and survey metadata.
    • prolific_export_public_{1,2}.csv: Public survey exports from Prolific, gathering prioritisation of dimensions necessary for environmental policy assessment.
    • curated.csv: Final cleaned and harmonized dataset combining multiple survey sources. Designed for direct statistical analysis with standardized variable names.

    Scripts files (R)

    • decipher-modelers.R: Script to assess the answers given by modelers about the characteristics of the models.
    • joint.R: Script to clean and join the raw answers from the different surveys to retrieve the overall perception of each dimension necessary for a Holistic Environmental Policy Assessment.

    Report Files

    • decipher-modelers.pdf: Diagram with the result of the
    • full-Country.html: Full interactive report showing dimension prioritisation broken down by participant country.
    • full-Gender.html: Visualization report displaying differences in dimension prioritisation by gender.
    • full-Education.html: Detailed breakdown of dimension prioritisation results based on education level.
    • full-Work.html: Report focusing on participant occupational categories and associated dimension prioritisation.
    • full-Income.html: Analysis report showing how income level correlates with dimension prioritisation.
    • full-PS.html: Report analyzing Political Sensitivity scores across all participants.
    • full-type.html: Visualization report comparing participant dimension prioritisation (public vs experts) in normal and risk situations.
    • full-joint-Country.html: Joint analysis report integrating multiple dimensions of country-based dimension prioritisation in normal and risk situations. Combines demographic and response patterns.
    • full-joint-Gender.html: Combined gender-based analysis across datasets, exploring intersections of demographic factors and dimension prioritisation in normal and risk situations.
    • full-joint-Education.html: Education-focused report merging various datasets to show consistent or divergent patterns of dimension prioritisation in normal and risk awareness.
    • full-joint-Work.html: Cross-dataset analysis of occupational groups and their dimension prioritisation in normal and risk situations.
    • full-joint-Income.html: Income-stratified joint analysis, merging public and expert datasets to find common trends and significant differences in dimension prioritisation in normal and risk situations.
    • full-joint-PS.html: Comprehensive Political Sensitivity score report from merged datasets, highlighting general patterns and subgroup variations in normal and risk situations.

    5 star: ⭐⭐⭐
    Preprocessing steps: The data has been re-coded and cleaned using the scripts provided.
    Reuse: NA
    Update policy: No more updates are planned.
    Ethics and legal aspects: Names of the persons involved have been removed.
    Technical aspects:
    Other:

  11. Raw data (CSVs and pipelines) for Cell Painting and beta catenin...

    • deepblue.lib.umich.edu
    Updated Jul 31, 2024
    Cite
    A. Tapaswi; N. Cemalovic; K. Polemi; J. Sexton; J. Colacino (2024). Raw data (CSVs and pipelines) for Cell Painting and beta catenin immunofluorescence in MCF10A cells exposed to common chemical exposures [Dataset]. http://doi.org/10.7302/seb7-cc14
    Explore at:
    Dataset updated
    Jul 31, 2024
    Dataset provided by
    Deep Blue Data
    Authors
    A. Tapaswi; N. Cemalovic; K. Polemi; J. Sexton; J. Colacino
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    Aug 13, 2020
    Description

    MCF10A non-tumorigenic breast cells were dosed with environmental toxicants and stained with multiple cellular stains to study morphological perturbations. Following up on feature results, MCF10A cells were stained with an anti-beta catenin antibody to study beta catenin nuclear translocation. CellProfiler software was used to measure and export per-cell data in .CSV format for further analysis in BMDExpress2 and RStudio.

  12. 2007-08 V3 CEAMARC-CASO Bathymetry Plots Over Time During Events | gimi9.com...

    • gimi9.com
    Updated Apr 20, 2008
    + more versions
    Cite
    (2008). 2007-08 V3 CEAMARC-CASO Bathymetry Plots Over Time During Events | gimi9.com [Dataset]. https://gimi9.com/dataset/au_2007-08-v3-ceamarc-caso-bathymetry-plots-over-time-during-events1/
    Explore at:
    Dataset updated
    Apr 20, 2008
    Description

    A routine was developed in R ('bathy_plots.R') to plot bathymetry data over time during individual CEAMARC events. This is so we can analyse benthic data in relation to habitat, i.e. did we trawl over a slope or was the sea floor relatively flat. Note that the depth range in the plots is autoscaled to the data, so a small range in depths appears as a scattering of points. As long as you look at the depth scale, though, interpretation will be OK. The R script needs two input files: '200708V3_one_minute.csv', a bathymetry data export from the underway PostgreSQL ship database, and 'events.csv', a stripped-down version of the events export from the shipboard events database. If you wish to run the code again you may need to change the pathnames in the R script to relevant locations. If you have opened the csv files in Excel at any stage and the R script gets an error, you may need to format the date/time columns as yyyy-mm-dd hh:mm:ss, save and close the file as csv without opening it again, and then run the R script. However, all output files are here for every CEAMARC event. Filenames contain a reference to the CEAMARC event id. Files are in eps format and can be viewed using Ghostview, which is available as a free download on the internet.

  13. Data from: Commercial harvest and export of snapping turtles (Chelydra...

    • datadryad.org
    • data-staging.niaid.nih.gov
    zip
    Updated Nov 17, 2017
    Cite
    Benjamin C. Colteaux; Derek M. Johnson (2017). Commercial harvest and export of snapping turtles (Chelydra serpentina) in the United States: trends and the efficacy of size limits at reducing harvest [Dataset]. http://doi.org/10.5061/dryad.j5v05
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 17, 2017
    Dataset provided by
    Dryad
    Authors
    Benjamin C. Colteaux; Derek M. Johnson
    Time period covered
    Nov 16, 2016
    Area covered
    United States
    Description

    State Harvest Data (csv): Commercial snapping turtle harvest data (in individuals) for eleven states from 1998 - 2013. States reporting are Arkansas, Delaware, Iowa, Maryland, Massachusetts, Michigan, Minnesota, New Jersey, North Carolina, Pennsylvania, and Virginia. File: StateHarvestData.csv

    Input and execution code for Colteaux_Johnson_2016: Attached R file includes the code described in the listed publication. The companion JAGS (just another Gibbs sampler) code is also stored in this repository under separate cover. File: ColteauxJohnsonNatureConservation.R

    JAGS model code for Colteaux_Johnson_2016: Attached R file includes the JAGS (just another Gibbs sampler) code described in the listed publication. The companion input and execution code is also stored in this repository under separate cover. File: ColteauxJohnsonNatureConservationJAGS.R

  14. ESG rating of general stock indices

    • narcis.nl
    • data.mendeley.com
    Updated Oct 22, 2021
    Cite
    Erhart, S (via Mendeley Data) (2021). ESG rating of general stock indices [Dataset]. http://doi.org/10.17632/58mwkj5pf8.1
    Explore at:
    Dataset updated
    Oct 22, 2021
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Erhart, S (via Mendeley Data)
    Description

    THE FILES HAVE BEEN CREATED BY SZILÁRD ERHART FOR A RESEARCH PAPER: ERHART (2021): ESG RATINGS OF GENERAL STOCK EXCHANGE INDICES, INTERNATIONAL REVIEW OF FINANCIAL ANALYSIS.
    USERS OF THE FILES AGREE TO QUOTE THE ABOVE PAPER.
    THE PYTHON SCRIPT (PYTHONESG_ERHART.TXT) HELPS USERS TO GET TICKERS BY STOCK EXCHANGES AND EXTRACT ESG SCORES FOR THE UNDERLYING STOCKS FROM YAHOO FINANCE.
    THE R SCRIPT (ESG_UA.TXT) HELPS TO REPLICATE THE MONTE CARLO EXPERIMENT DETAILED IN THE STUDY.
    THE EXPORT_ALL CSV CONTAINS THE DOWNLOADED ESG DATA (SCORES, CONTROVERSIES, ETC) ORGANIZED BY STOCKS AND EXCHANGES.

    DISCLAIMER

    The author takes no responsibility for the timeliness, accuracy, completeness or quality of the information provided. The author is in no event liable for damages of any kind incurred or suffered as a result of the use or non-use of the information presented or the use of defective or incomplete information. The contents are subject to confirmation and not binding. The author expressly reserves the right to alter, amend, whole and in part, without prior notice or to discontinue publication for a period of time or even completely.

    READ ME

    BEFORE USING THE MONTE CARLO SIMULATIONS SCRIPT:
    (1) COPY THE goascores.csv and goalscores_alt.csv FILES ONTO YOUR OWN COMPUTER DRIVE. THE TWO FILES ARE IDENTICAL.
    (2) SET THE EXACT FILE LOCATION INFORMATION IN THE 'Read in data' SECTION OF THE MONTE CARLO SCRIPT AND FOR THE OUTPUT FILES AT THE END OF THE SCRIPT.
    (3) LOAD MISC TOOLS AND MATRIXSTATS IN YOUR R APPLICATION.
    (4) RUN THE CODE.

  15. Reddit /r/Bitcoin Data for Jun 2022

    • kaggle.com
    zip
    Updated Jul 25, 2022
    Cite
    Lexyr (2022). Reddit /r/Bitcoin Data for Jun 2022 [Dataset]. https://www.kaggle.com/pavellexyr/reddit-r-bitcoin-data-for-jun-2022
    Explore at:
    Available download formats: zip (15104905 bytes)
    Dataset updated
    Jul 25, 2022
    Authors
    Lexyr
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Context

    As anyone who's been keeping track for the last ten years can tell you, the world of cryptocurrency moves fast. Its movements are all too often supported or hindered by viral fads, be it posts on Reddit, Twitter takes, or something else entirely. We have compiled a month of the most famous cryptocurrency subreddit, /r/Bitcoin, into two convenient CSV files, creating a large cryptocurrency dataset for both enterprise and academic use.

    For a larger version, please see our Reddit /r/Bitcoin dataset.

    Content

    This dataset contains a comprehensive collection of posts and comments mentioning AAPL in their title and body text respectively. The data is procured using SocialGrep.

    To preserve users' anonymity and to prevent targeted harassment, the data does not include usernames.

    Acknowledgements

    This dataset was created using SocialGrep Exports. If social data analysis is your thing, we also have a good Reddit search tool.

    We would also like to thank André François McKenzie for providing us with the background image for this dataset.

    Inspiration

    Cryptocurrency is still a new topic in everyone's minds. It fluctuates wildly as time goes on - can we predict any future trends from seeing the public opinion shift?

  16. Endoscopy Procedures and Video Analysis Dataset

    • kaggle.com
    zip
    Updated Dec 15, 2023
    Cite
    Alevtyna Mozolyuk (2023). Endoscopy Procedures and Video Analysis Dataset [Dataset]. https://www.kaggle.com/datasets/alyamozolyuk/endoscopy-procedures-and-video-analysis-dataset/code
    Explore at:
    Available download formats: zip (285507156 bytes)
    Dataset updated
    Dec 15, 2023
    Authors
    Alevtyna Mozolyuk
    Description

    Overview

    This dataset, generated using R, provides a detailed perspective on various aspects of healthcare data. It encompasses information on patient demographics, endoscopy procedures, video analysis data, and hardware utilization. This rich dataset is ideal for studies in medical research, healthcare analytics, and machine learning applications.

    Dataset Description

    The dataset is divided into four main tables, each focusing on different aspects of healthcare information:

    1. Patient Table

    Content: Demographic and health-related information for 1,000,000 patients. Fields: PatientID: Unique identifier for each patient. Age: Age of the patient. Gender: Gender of the patient. Ethnicity: Ethnic background of the patient. MaritalStatus: Marital status of the patient. EmploymentStatus: Employment status of the patient. MedicalHistory: Medical history details. CurrentMedications: Information on current medications. LifestyleChoices: Lifestyle choices that might impact health. InsuranceProvider: Details of the insurance provider.

    2. Endoscopy Procedure Table

    Content: Detailed records of endoscopy procedures. Fields: ProcedureID: Unique identifier for each procedure. PatientID: Linked patient identifier. ProcedureDate: Date of the procedure. ProcedureTime: Time when the procedure was performed. EndoscopyType: Type of endoscopy procedure. Indication: Reason or indication for the procedure. FindingsAnomalies: Findings or anomalies detected. BiopsyResults: Results of any biopsies taken. Complications: Any complications during the procedure. PhysicianID: Identifier of the physician. Location: Location where the procedure was performed.

    3. Video Analysis Table

    Content: Data from computer vision analysis of endoscopy videos. Fields: VideoAnalysisID: Unique identifier for each video analysis. ProcedureID: Linked procedure identifier. AnalysisDate: Date of the video analysis. AnalysisTime: Time of the video analysis. VideoDuration: Duration of the endoscopy video. DetectedAnomalies: Anomalies detected by the software. SoftwareConfidenceLevel: Confidence level of the software in its analysis. ComparisonWithPhysicianAnalysis: Comparison of software analysis with physician’s observation. PhysicianComments: Comments from the physician. AdditionalVideoMetadata: Additional metadata about the video. FrameRate: Frame rate of the video.

    4. Hardware Utilization Table

    Content: Data on the utilization of hardware in procedures. Fields: ProcedureID: Linked procedure identifier. Utilization: Metric indicating the utilization level of the hardware.

    Data Generation

    The dataset was generated using the R programming language. Key libraries used include dplyr for data manipulation, lubridate for handling dates and times, and stringi for string operations. Custom functions random_dates and random_times were developed to create realistic date and time entries.

    Intended Use and Applications

    This simulated dataset is ideal for:

    • Testing and developing healthcare data analysis methodologies.
    • Exploring specific use cases in healthcare, such as patient care strategies, procedural efficiency, or AI in medical imaging.

    Export Format

    The data is exported in CSV format, making it easy to use in a variety of data analysis tools and environments.

  17. Music Learning by older adult novices

    • researchdata.edu.au
    Updated Oct 13, 2025
    Cite
    MacRitchie Jennifer; Stevens Kate; Dean Roger; Chmiel Anthony; Roger Thornton Dean; Jennifer MacRitchie; Catherine J Stevens; Catherine J Stevens; Anthony Chmiel (2025). Music Learning by older adult novices [Dataset]. http://doi.org/10.17605/OSF.IO/2GVTY
    Explore at:
    Dataset updated
    Oct 13, 2025
    Dataset provided by
    Western Sydney University
    osf
    Authors
    MacRitchie Jennifer; Stevens Kate; Dean Roger; Chmiel Anthony; Roger Thornton Dean; Jennifer MacRitchie; Catherine J Stevens; Catherine J Stevens; Anthony Chmiel
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    These are the data for the PLoS paper of Chmiel, A. et al, 2025, concerning music learning by novice older people. The data files were exported from R's .RDS format to .csv. After such exports, Excel does not display the ctimprep (counts) data correctly, i.e. as i-r (e.g. 1-2); instead it converts the second number, unless it is 0, to a three-letter month (Jan or Feb). However, when the .csv is read back into R without further change, the correct format is regenerated. If a user has any problems with this, ctimprep can easily be reassembled from the coexistent separate imprct and repct data, or extracted from the Excel .csv representation. The data also appear correctly in many text and script readers.

    The five (5) files are listed here (_di indicates that they are anonymised, i.e. de-identified):

    • replicreslong_di.csv: replication data (4184 obs of 12 variables)
    • improvreslong_di.csv: improvisation data for analysing performance items 4 and 5 (7902 obs of 25 variables)
    • improv1reslong_di.csv: improvisation data for analysing improvisations given as Performance item 1 in improvisation blocks (1980 obs of 24 variables)
    • kdbfluency_di.csv: data on self-assessed fluency in keyboard usage (391 obs of 5 variables)
    • MDT_data_di.csv: Melody detection task data (371 obs of 10 variables)

    Like most datasets, ours contains a few elements with missing entries (such as NAs), which in general are disregarded in R analyses. For example, there is one incomplete row in the kbdfluency data (with two NA values). There are also a few instances in which 3 personal ids duplicate others, apparently as a result of data acquisition errors. In certain modelling approaches these would be included as participants, but as the Group and Session information is correct, there would be only a slight impact on the number of participants in the group effects. We viewed their retention as preferable to assuming that our understanding of their causes was absolutely correct and therefore merging the corresponding id pairs. However, a user may conflate the matching pairs into a single pid if they see benefit: lilij and fahif belong together, as do jojav and tikit, and holar and gokaj.

    The data files are suffixed by _di to indicate that they are anonymised (de-identified). Note that in order to develop models like those we present, it is essential to check the factor vs numeric status of all the outcome and predictor variables being used, since this status may be lost in translation into and out of R. Replication and Improvisation data are in the 'long' format, but subsets and other formats can easily be generated if required for modelling (e.g. in R, with dplyr, or by the use of tibbles).
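
    The two optional clean-up steps mentioned in this description (reassembling ctimprep from imprct and repct, and conflating the duplicate id pairs into a single pid) could be sketched in Python as below. This is only an illustration under stated assumptions: the column names pid, imprct, repct and ctimprep are taken from the description, but the sample rows are invented, the order "imprct-repct" for the i-r format is assumed, and which id of each pair is kept is an arbitrary choice.

```python
import csv
import io

# Id pairs the description says belong together.
# Assumption: we arbitrarily keep the first id named in each pair.
PID_ALIASES = {"fahif": "lilij", "tikit": "jojav", "gokaj": "holar"}


def rebuild_ctimprep(row):
    """Reassemble the 'i-r' count string from the separate count columns.

    Assumes 'i' is imprct and 'r' is repct; check this against the real files.
    """
    return f"{row['imprct']}-{row['repct']}"


def canonical_pid(pid):
    """Map a duplicate participant id onto its partner; leave others as-is."""
    return PID_ALIASES.get(pid, pid)


# Invented sample standing in for one of the *_di.csv files.
SAMPLE = "pid,imprct,repct\nlilij,1,2\nfahif,0,3\nabcde,2,0\n"

# DictReader keeps every field as text, so '1-2' is never turned into a date.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows:
    row["ctimprep"] = rebuild_ctimprep(row)
    row["pid"] = canonical_pid(row["pid"])

for row in rows:
    print(row["pid"], row["ctimprep"])
```

    Reading the fields as plain text, rather than letting a spreadsheet infer types, is what avoids the month conversion described above.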

  18. Data and script for the GenABEL paper

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv
    Updated Jan 24, 2020
    Cite
    Lennart C. Karssen; Lennart C. Karssen; Cornelia M. Van Duijn; Yurii S. Aulchenko; Yurii S. Aulchenko; Cornelia M. Van Duijn (2020). Data and script for the GenABEL paper [Dataset]. http://doi.org/10.5281/zenodo.51008
    Explore at:
    Available download formats: csv, bin
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Lennart C. Karssen; Lennart C. Karssen; Cornelia M. Van Duijn; Yurii S. Aulchenko; Yurii S. Aulchenko; Cornelia M. Van Duijn
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    This dataset contains the automatically collected data used for the overview paper about the GenABEL Project (Karssen et al, 2016, DOI:10.12688/f1000research.8733.1). Some data used for the paper was collected manually and is therefore not included in this dataset.

    The file "tracker_report-2016-04-16.csv" is an export of the bug reports from the GenABEL R-forge bug tracker on the date listed in the file name.

    The file "Analytics www.genabel.org Locatie Lennart 20150428-20160428.csv" is a custom export of the Google Analytics data for visits to the GenABEL website (www.genabel.org) in the period marked by the dates listed in the file name. The columns contain the ISO code of the country, city, number of sessions, number of new viewers, bounce percentage, pages per session and average session duration, respectively.

    The file analysis_GenABELpaper.org contains the source code
    used for the automated data extraction for this paper in Emacs
    Org mode literate programming format (http://orgmode.org, Schulte 2012, doi:10.18637/jss.v046.i03)

  19. Salmon age, sex, and length data from Westward and Southeast Alaska,...

    • search.dataone.org
    • knb.ecoinformatics.org
    • +3more
    Updated Aug 19, 2021
    + more versions
    Cite
    Alaska Department of Fish and Game, Division of Commercial Fisheries (2021). Salmon age, sex, and length data from Westward and Southeast Alaska, 1979-2017 [Dataset]. http://doi.org/10.5063/J38QX8
    Explore at:
    Dataset updated
    Aug 19, 2021
    Dataset provided by
    Knowledge Network for Biocomplexity
    Authors
    Alaska Department of Fish and Game, Division of Commercial Fisheries
    Time period covered
    Jul 21, 1979 - Mar 18, 2017
    Area covered
    Variables measured
    SEX, Sex, GEAR, Gear, MESH, FW_AGE, Length, SW_AGE, Source, WEIGHT, and 32 more
    Description

    Age, sex and length data provide population dynamics information that can indicate how population trends occur and may be changing. These data can help researchers estimate population growth rates, age-class distribution and population demographics. Knowing population demographics, growth rates and trends is particularly valuable to fisheries managers who must perform population assessments to inform management decisions. These data are therefore particularly important in valuable fisheries like the salmon fisheries of Alaska. This dataset includes age, sex and length data compiled from annual sampling of commercial and subsistence salmon harvests and research projects in westward and southeast Kodiak. It includes data on five salmon species: chinook, chum, coho, pink and sockeye. Age estimates were made by examining scales or bony structures (e.g. otoliths, or ear bones). Scales were removed from the side of the fish, usually the left side above the lateral line. Scales or bony structures were then mounted on gummed cards and pressed on acetate to make an impression. The number of freshwater and saltwater annuli (i.e. rings) was counted to estimate age in years. Age is recorded in European Notation, which is a method of recording both fresh and saltwater annuli. For example, for a fish that spent one year in freshwater and 3 years in saltwater, its age is recorded as 1.3. The total fish age is the sum of the first and second numbers, plus one to account for the time between deposition and emergence. Therefore the fish in this example is 5 years old. Fish sex was determined by examining either external morphology (e.g. head and belly shape) or internal sex organs. Length was measured in millimeters, generally from mid-eye to the fork of the tail. This data package includes the original data file (ASL DATA EXPORT.csv), a script that reformats the original data file into a consistent format (ASL_Formatting_SoutheastKodiak.R), and the reformatted dataset as a .csv file (ASL_formatted_SoutheastKodiak.csv).
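
    The European Notation rule described here (total age = freshwater annuli + saltwater annuli + 1) can be illustrated with a short Python sketch. The function name is hypothetical and is not part of the data package; the sketch also assumes the age is held as text, since parsing a value such as "1.10" as a decimal number would lose the second count.

```python
def total_age_from_european(notation):
    """Return total age in years from a European Notation age such as '1.3'.

    The part before the dot is the freshwater annuli count and the part
    after it is the saltwater annuli count; one year is added for the
    time between deposition and emergence.
    """
    # Unpacking raises ValueError if the text is not of the form 'F.S'.
    freshwater, saltwater = notation.strip().split(".")
    return int(freshwater) + int(saltwater) + 1


# One freshwater year and three saltwater years, as in the description.
print(total_age_from_european("1.3"))  # 5
```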

  20. Taylor Swift Spotify Data

    • kaggle.com
    zip
    Updated Nov 24, 2024
    Cite
    ARTHUR BOARI (2024). Taylor Swift Spotify Data [Dataset]. https://www.kaggle.com/datasets/arthurboari/taylor-swift-spotify-data/code
    Explore at:
    Available download formats: zip (904070 bytes)
    Dataset updated
    Nov 24, 2024
    Authors
    ARTHUR BOARI
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset presents the full non-altered Spotify data of the U.S. singer-songwriter, Taylor Swift, as of 2024-11-24. The data was obtained through the usage of the spotifyr package in the R programming language. The readr package was used to export the dataset to .csv.

    Note that multiple versions of all albums are available in the most recent version of this dataset.

    UPDATE: the new version contains not only album data, but singles and compilations too. The most recent release was I Can Do It With A Broken Heart (Remix).
