63 datasets found
  1. Dataset 1. Contains all the variables necessary to reproduce the results of...

    • zenodo.org
    zip
    Updated Jan 21, 2020
    Cite
    Michael Liebrenz; Michael Liebrenz (2020). Dataset 1. Contains all the variables necessary to reproduce the results of Liebrenz et al. [Dataset]. http://doi.org/10.5281/zenodo.19623
    Explore at:
    zip (available download formats)
    Dataset updated
    Jan 21, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Michael Liebrenz; Michael Liebrenz
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    File formats:

    .xls: Excel file with variable names in the first row and variable labels in the second row.

    .xpt/.xpf: SAS XPORT data file (.xpt) and value labels (formats.xpf).

    .dta: Stata 13 data file.

    Note that the following variables were renamed in the output file: sumcadhssb -> SUMCADHS, sumcwursk -> SUMCWURS, adhdnotest -> ADHDNOTE, subs_subnotob -> SUBS_SUB, and that the internally recorded dataset name was shortened to "Liebrenz".
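
    As a quick illustration (not part of the deposit), the SAS XPORT file can be read in Python and the renames undone; the file name below is an assumption:

      import pandas as pd

      # pandas reads SAS XPORT (.xpt) files via read_sas with format="xport"
      df = pd.read_sas("Liebrenz.xpt", format="xport")  # hypothetical file name

      # Undo the renames applied when the XPORT file was written (see note above)
      renames = {
          "SUMCADHS": "sumcadhssb",
          "SUMCWURS": "sumcwursk",
          "ADHDNOTE": "adhdnotest",
          "SUBS_SUB": "subs_subnotob",
      }
      df = df.rename(columns=renames)
      print(df.columns.tolist())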

  2. quakes

    • rstudio-pubs-static.s3.amazonaws.com
    • rpubs.com
    Updated Dec 31, 2021
    Cite
    (2021). quakes [Dataset]. https://rstudio-pubs-static.s3.amazonaws.com/852040_3106898049c64edda3a2b49f893a0c41.html
    Explore at:
    Dataset updated
    Dec 31, 2021
    Variables measured
    lat, mag, long, depth, stations
    Description

    The dataset has N=1000 rows and 5 columns. 1000 rows have no missing values on any column.

    Table of variables

    This table contains variable names, labels, and number of missing values. See the complete codebook for more.

    name      label  n_missing
    lat       NA     0
    long      NA     0
    depth     NA     0
    mag       NA     0
    stations  NA     0

    Note

    This dataset was automatically described using the codebook R package (version 0.9.2).
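
    The n_missing column of the codebook table can be reproduced in a few lines; a hypothetical sketch assuming a local csv export of the data (quakes also ships with base R and can be exported with write.csv):

      import pandas as pd

      df = pd.read_csv("quakes.csv")  # assumed path; columns: lat, long, depth, mag, stations
      n_missing = df[["lat", "long", "depth", "mag", "stations"]].isna().sum()
      print(n_missing)                # all zeros for this dataset, per the codebook above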

  3. epr

    • rpubs.com
    Updated Sep 18, 2020
    + more versions
    Cite
    (2020). epr [Dataset]. https://rpubs.com/zhmy89/codebook_EPR
    Explore at:
    Dataset updated
    Sep 18, 2020
    Variables measured
    Q57, Q60, Q65, Q75, CSNT, Q2.1, Q2.2, Q2.6, Q2.7, Q4.4, and 255 more
    Description

    The dataset has N=2778 rows and 265 columns. 0 rows have no missing values on any column.

    Table of variables

    This table contains variable names, labels, and number of missing values. See the complete codebook for more.

    [truncated]

    Note

    This dataset was automatically described using the codebook R package (version 0.9.2).

  4. Production data at HS6 level from Solleder, Silvy, and Olarreaga (2024)

    • zenodo.org
    bin, csv
    Updated Sep 2, 2024
    Cite
    Jean-Marc Solleder; Jean-Marc Solleder; Fulvio Silvy; Fulvio Silvy; Marcelo Olarreaga; Marcelo Olarreaga (2024). Production data at HS6 level from Solleder, Silvy, and Olarreaga (2024) [Dataset]. http://doi.org/10.5281/zenodo.13365106
    Explore at:
    bin, csv (available download formats)
    Dataset updated
    Sep 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jean-Marc Solleder; Jean-Marc Solleder; Fulvio Silvy; Fulvio Silvy; Marcelo Olarreaga; Marcelo Olarreaga
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Aug 26, 2024
    Description

    This dataset contains the data from the paper "Protection for sale without aggregation bias" (Solleder, Silvy, and Olarreaga, 2024).

    Production data indicated by "Estimates" in the source field of the dataset were estimated using the Regression-Enhanced Random Forest (RERF) algorithm developed by Zhang et al. (2017). The model was trained and tested on the Prodcom dataset, which covers mainly European countries for which production data are available at a high level of disaggregation with a correspondence to the HS six-digit classification, and was then used to predict production data elsewhere. For more information on the procedure, see Solleder, Silvy, and Olarreaga (2024).

    Production data indicated by "FAO" in the source field of the dataset is sourced from FAO and has been converted by the authors to 6-digit HS codes.

    The .dta file was created with Stata 17. The .csv file is a comma-separated values file: the separator is ',' and the first row contains variable names. The content of both files is the same. Variables are:

    • country: ISO 3166 3-character country codes, string;
    • year: years (1999-2015), numeric;
    • commoditycode: product 6-digit HS codes in HS revision 1992 (H0), numeric;
    • production: production in the country-year, in 1,000 USD, numeric;
    • source: "Estimates" / "FAO" (see above)
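
    A minimal Python sketch of working with this layout (the csv file name and the example country code are assumptions; the column names follow the variable list above):

      import pandas as pd

      prod = pd.read_csv("production_hs6.csv")         # hypothetical file name; ',' separator, header row

      # Split observations by provenance, as described in the source field
      estimated = prod[prod["source"] == "Estimates"]  # RERF-predicted production
      fao       = prod[prod["source"] == "FAO"]        # FAO data converted to 6-digit HS codes

      # Example: total production (1,000 USD) per year for one country ("CHE" is just an example)
      print(estimated[estimated["country"] == "CHE"].groupby("year")["production"].sum())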

    References

    Solleder, J, F Silvy and M Olarreaga, 2024. Protection for sale without aggregation bias, CEPR Discussion Paper No. 19418. CEPR Press, Paris & London. https://cepr.org/publications/dp19418

    Zhang, Haozhe, Dan Nettleton and Zhengyuan Zhu, 2017. Regression-Enhanced Random Forests. JSM Proceedings, Section on Statistical Learning and Data Science, Alexandria, 636-647. https://doi.org/10.48550/arXiv.1904.10416

  5. Parameter estimates of mixed generalized Gaussian distribution for modelling...

    • research-data.cardiff.ac.uk
    • datasetcatalog.nlm.nih.gov
    zip
    Updated Sep 18, 2024
    Cite
    Zoe Salinger; Alla Sikorskii; Michael J. Boivin; Nenad Šuvak; Maria Veretennikova; Nikolai N. Leonenko (2024). Parameter estimates of mixed generalized Gaussian distribution for modelling the increments of electroencephalogram data [Dataset]. http://doi.org/10.17035/d.2023.0277307170
    Explore at:
    zip (available download formats)
    Dataset updated
    Sep 18, 2024
    Dataset provided by
    Cardiff University
    Authors
    Zoe Salinger; Alla Sikorskii; Michael J. Boivin; Nenad Šuvak; Maria Veretennikova; Nikolai N. Leonenko
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Electroencephalogram (EEG) is used to monitor a child's brain during coma by recording the electrical neural activity of the brain. Signals are captured by multiple electrodes, called channels, located over the scalp. Statistical analyses of EEG data include classification and prediction using arrays of EEG features, but few models for the underlying stochastic processes have been proposed. For this purpose, a new strictly stationary, strong mixing diffusion model with a marginal multimodal (three-peak) distribution (MixGGDiff) and an exponentially decaying autocorrelation function was proposed for modelling increments of EEG data. The increments were treated as discrete-time observations, and a diffusion process was constructed whose stationary distribution is a mixture of three non-central generalized Gaussian distributions (MixGGD).

    The probability density function of a MixGGD consists of three components and is described using a total of 12 parameters: μ_k, the location parameter of each component; s_k, the shape parameter of each component; σ²_k, a parameter related to the scale of each component; and w_k, the weight of each component, where k, k={1,2,3}, is the index of the component of the MixGGD. The parameters of this distribution were estimated using the expectation-maximization algorithm, where the added shape parameter is estimated using a higher-order-statistics approach based on an analytical relationship between the shape parameter and kurtosis.

    To illustrate an application of the MixGGDiff to real data, EEG data collected in Uganda between 2008 and 2015 from 78 children aged 18 months to 12 years who were in coma due to cerebral malaria were analysed. EEG was recorded using the International 10-20 system with a sampling rate of 500 Hz and an average record duration of 30 min. The EEG signal for every child was the result of a recording from 19 channels, and a MixGGD was fitted to each channel of every child's recording separately; hence, for each channel a total of 12 parameter estimates were obtained.

    The data are presented in matrix form (dimension 79*228) in a .csv file consisting of 79 rows: the first row is a header row containing the variable names, and the subsequent 78 rows represent the parameter estimates of one instance each (i.e. one child, without identifiers that could be related back to a specific child). There are a total of 228 columns (19 channels times 12 parameter estimates), where each column represents one parameter estimate of one component of the MixGGD in the order of the channels; thus columns 1 to 12 refer to parameter estimates on the first channel, columns 13 to 24 refer to parameter estimates on the second channel, and so on. Each variable name starts with "chi", where "ch" is an abbreviation of "channel" and i refers to the order of the channel in the EEG recording. The remaining characters in the variable names refer to the parameter estimate names of the components of the MixGGD; for example, "ch3sigmasq1" refers to the parameter estimate of σ² of the first component of the MixGGD obtained from EEG increments on the third channel. Parameter estimates contained in the .csv file are all real numbers in the range -671.11 to 259326.96.

    Research results based upon these data are published at https://doi.org/10.1007/s00477-023-02524-y
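
    A hypothetical Python sketch of parsing the column-name scheme described above (the csv file name is an assumption; only "ch3sigmasq1" is quoted from the description, so other parameter spellings may differ):

      import re
      import pandas as pd

      df = pd.read_csv("mixggd_parameter_estimates.csv")  # assumed name; 78 rows x 228 columns

      # Column names look like "ch<i><param><k>", e.g. "ch3sigmasq1":
      # channel index, parameter name, component index k in {1,2,3}
      pattern = re.compile(r"^ch(\d+)([a-z]+)([123])$")
      for col in df.columns[:5]:
          m = pattern.match(col)
          if m:
              channel, param, component = m.groups()
              print(f"{col}: channel {channel}, parameter '{param}', component {component}")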

  6. Film Circulation dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 12, 2024
    Cite
    Loist, Skadi; Samoilova, Evgenia (Zhenya) (2024). Film Circulation dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7887671
    Explore at:
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Film University Babelsberg KONRAD WOLF
    Authors
    Loist, Skadi; Samoilova, Evgenia (Zhenya)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Complete dataset of “Film Circulation on the International Film Festival Network and the Impact on Global Film Culture”

    A peer-reviewed data paper for this dataset is under review for publication in NECSUS_European Journal of Media Studies, an open-access journal aiming at enhancing data transparency and reusability, and will be available from https://necsus-ejms.org/ and https://mediarep.org

    Please cite this when using the dataset.

    Detailed description of the dataset:

    1 Film Dataset: Festival Programs

    The Film Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook (csv file “1_codebook_film-dataset_festival-program”) offers a detailed description of all variables within the Film Dataset. Along with the definition of variables it lists explanations for the units of measurement, data sources, coding and information on missing data.

    The csv file “1_film-dataset_festival-program_long” comprises a dataset of all films and the festivals, festival sections, and the year of the festival edition that they were sampled from. The dataset is structured in the long format, i.e. the same film can appear in several rows when it appeared in more than one sample festival. However, films are identifiable via their unique ID.

    The csv file “1_film-dataset_festival-program_wide” consists of the dataset listing only unique films (n=9,348). The dataset is in the wide format, i.e. each row corresponds to a unique film, identifiable via its unique ID. For easy analysis, and since the overlap is only six percent, in this dataset the variable sample festival (fest) corresponds to the first sample festival where the film appeared. For instance, if a film was first shown at Berlinale (in February) and then at Frameline (in June of the same year), the sample festival will list “Berlinale”. This file includes information on unique and IMDb IDs, the film title, production year, length, categorization in length, production countries, regional attribution, director names, genre attribution, the festival, festival section and festival edition the film was sampled from, and information whether there is festival run information available through the IMDb data.

    2 Survey Dataset

    The Survey Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook “2_codebook_survey-dataset” includes coding information for both survey datasets. It lists the definition of the variables or survey questions (corresponding to Samoilova/Loist 2019), units of measurement, data source, variable type, range and coding, and information on missing data.

    The csv file “2_survey-dataset_long-festivals_shared-consent” consists of a subset (n=161) of the original survey dataset (n=454), where respondents provided festival run data for films (n=206) and gave consent to share their data for research purposes. This dataset consists of the festival data in a long format, so that each row corresponds to the festival appearance of a film.

    The csv file “2_survey-dataset_wide-no-festivals_shared-consent” consists of a subset (n=372) of the original dataset (n=454) of survey responses corresponding to sample films. It includes data only for those films for which respondents provided consent to share their data for research purposes. This dataset is shown in wide format of the survey data, i.e. information for each response corresponding to a film is listed in one row. This includes data on film IDs, film title, survey questions regarding completeness and availability of provided information, information on number of festival screenings, screening fees, budgets, marketing costs, market screenings, and distribution. As the file name suggests, no data on festival screenings is included in the wide format dataset.

    3 IMDb & Scripts

    The IMDb dataset consists of a data scheme image file, one codebook and eight datasets, all in csv format. It also includes the R scripts that we used for scraping and matching.

    The codebook “3_codebook_imdb-dataset” includes information for all IMDb datasets. This includes ID information and their data source, coding and value ranges, and information on missing data.

    The csv file “3_imdb-dataset_aka-titles_long” contains film title data in different languages scraped from IMDb in a long format, i.e. each row corresponds to a title in a given language.

    The csv file “3_imdb-dataset_awards_long” contains film award data in a long format, i.e. each row corresponds to an award of a given film.

    The csv file “3_imdb-dataset_companies_long” contains data on production and distribution companies of films. The dataset is in a long format, so that each row corresponds to a particular company of a particular film.

    The csv file “3_imdb-dataset_crew_long” contains data on names and roles of crew members in a long format, i.e. each row corresponds to each crew member. The file also contains binary gender assigned to directors based on their first names using the GenderizeR application.

    The csv file “3_imdb-dataset_festival-runs_long” contains festival run data scraped from IMDb in a long format, i.e. each row corresponds to the festival appearance of a given film. The dataset does not include each film screening, but the first screening of a film at a festival within a given year. The data includes festival runs up to 2019.

    The csv file “3_imdb-dataset_general-info_wide” contains general information about films such as genre as defined by IMDb, languages in which a film was shown, ratings, and budget. The dataset is in wide format, so that each row corresponds to a unique film.

    The csv file “3_imdb-dataset_release-info_long” contains data about non-festival release (e.g., theatrical, digital, tv, dvd/blueray). The dataset is in a long format, so that each row corresponds to a particular release of a particular film.

    The csv file “3_imdb-dataset_websites_long” contains data on available websites (official websites, miscellaneous, photos, video clips). The dataset is in a long format, so that each row corresponds to a website of a particular film.

    The dataset includes 8 text files containing the scripts for web scraping. They were written using R version 3.6.3 for Windows.

    The R script “r_1_unite_data” demonstrates the structure of the dataset that we use in the following steps to identify, scrape, and match the film data.

    The R script “r_2_scrape_matches” reads in the dataset with the film characteristics described in “r_1_unite_data” and uses various R packages to create a search URL for each film from the core dataset on the IMDb website. The script attempts to match each film from the core dataset to IMDb records by first conducting an advanced search based on the movie title and year, and then potentially using an alternative title and a basic search if no matches are found in the advanced search. The script scrapes the title, release year, directors, running time, genre, and IMDb film URL from the first page of the suggested records on the IMDb website. The script then defines a loop that matches (including matching scores) each film in the core dataset with suggested films on the IMDb search page. Matching was done using data on directors, production year (+/- one year), and title, with a fuzzy matching approach using two methods, “cosine” and “osa”: cosine similarity is used to match titles with a high degree of similarity, and the OSA (optimal string alignment) algorithm is used to match titles that may have typos or minor variations.
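
    The original matching code is in R; purely as an illustration (not the authors' script), the following Python sketch implements the two title-similarity measures named above: cosine similarity over character trigrams and the OSA (optimal string alignment) distance. The example titles are made up.

      import math
      from collections import Counter

      def cosine_similarity(a: str, b: str, n: int = 3) -> float:
          """Cosine similarity over character n-grams (default: trigrams)."""
          grams = lambda s: Counter(s[i:i + n] for i in range(max(len(s) - n + 1, 1)))
          ga, gb = grams(a.lower()), grams(b.lower())
          dot = sum(ga[g] * gb[g] for g in ga)
          norm = math.sqrt(sum(v * v for v in ga.values())) * math.sqrt(sum(v * v for v in gb.values()))
          return dot / norm if norm else 0.0

      def osa_distance(a: str, b: str) -> int:
          """Optimal string alignment (restricted Damerau-Levenshtein):
          edits plus adjacent transpositions, tolerant of typos."""
          d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
          for i in range(len(a) + 1):
              d[i][0] = i
          for j in range(len(b) + 1):
              d[0][j] = j
          for i in range(1, len(a) + 1):
              for j in range(1, len(b) + 1):
                  cost = 0 if a[i - 1] == b[j - 1] else 1
                  d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
                  if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                      d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
          return d[len(a)][len(b)]

      # A candidate IMDb title can be scored against a core-dataset title with both measures
      title, candidate = "The Souvenir", "The Souvenier"   # made-up example pair
      print(cosine_similarity(title, candidate), osa_distance(title, candidate))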

    The script “r_3_matching” creates a dataset with the matches for a manual check. Each pair of films (the original film from the core dataset and the suggested match from the IMDb website) was assigned to one of five categories: a) 100% match, i.e. a perfect match on title, year, and director; b) likely good match; c) maybe match; d) unlikely match; and e) no match. The script also checks for possible doubles in the dataset and identifies them for a manual check.

    The script “r_4_scraping_functions” creates a function for scraping the data from the identified matches (based on the scripts described above and manually checked). These functions are used for scraping the data in the next script.

    The script “r_5a_extracting_info_sample” uses the functions defined in “r_4_scraping_functions” to scrape the IMDb data for the identified matches. This script does so for the first 100 films only, as a check that everything works; scraping the entire dataset took a few hours, so a test with a subsample of 100 films is advisable.

    The script “r_5b_extracting_info_all” extracts the data for the entire dataset of the identified matches.

    The script “r_5c_extracting_info_skipped” checks the films with missing data (where data was not scraped) and tries to extract the data one more time, to make sure that the earlier errors were not caused by disruptions in the internet connection or other technical issues.

    The script “r_check_logs” is used for troubleshooting and tracking the progress of all of the R scripts used. It gives information on the amount of missing values and errors.

    4 Festival Library Dataset

    The Festival Library Dataset consists of a data scheme image file, one codebook and one dataset, all in csv format.

    The codebook (csv file “4_codebook_festival-library_dataset”) offers a detailed description of all variables within the Library Dataset. It lists the definitions of variables, such as location, festival name, and festival categories, along with units of measurement, data sources, coding, and information on missing data.

    The csv file “4_festival-library_dataset_imdb-and-survey” contains data on all unique festivals collected from both IMDb and survey sources. This dataset appears in wide format, i.e. all information for each festival is listed in one row. This

  7. Supplementary materials for: "Comparing Internet experiences and...

    • dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Hargittai, Eszter; Shaw, Aaron (2023). Supplementary materials for: "Comparing Internet experiences and prosociality in Amazon Mechanical Turk and population-based survey samples" [Dataset]. http://doi.org/10.7910/DVN/UFL6MI
    Explore at:
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Hargittai, Eszter; Shaw, Aaron
    Description

    Overview

    Supplementary materials for the paper "Comparing Internet experiences and prosociality in Amazon Mechanical Turk and population-based survey samples" by Eszter Hargittai and Aaron Shaw, published in Socius in 2020 (https://doi.org/10.1177/2378023119889834).

    License

    The materials provided here are issued under the same (Creative Commons Attribution Non-Commercial 4.0) license as the paper. Details and a copy of the license are available at: http://creativecommons.org/licenses/by-nc/4.0/.

    Manifest

    The files included are:

    • Hargittai-Shaw-AMT-NORC-2019.rds and Hargittai-Shaw-AMT-NORC-2019.tsv: Two (identical) versions of the dataset used for the analysis. The tsv file is provided to facilitate import into software other than R.

    • R analysis code files: 01-import.R imports the dataset and creates a mapping of dependent variables and variable names used elsewhere in the figure and analysis; 02-gen_figure.R generates Figure 1 in PDF and PNG formats and saves them in the "figures" directory; 03-gendescriptivestats.R generates the results reported in Table 1; 04-gen_models.R fits the models reported in Tables 2-4; 05-alternative_specifications.R fits models using a log-transformed version of the income variable.

    • Makefile: Executes all of the R files in sequence and produces corresponding .log files in the "log" directory that contain the full R session from each file, as well as separate error log files (also in the "log" directory) that capture any error messages and warnings generated by R along the way.

    • HargittaiShaw2019Socius-Instrument.pdf: The questions distributed to both the NORC and AMT survey participants used in the analysis reported in this paper.

    How to reproduce the analysis presented in the paper

    Depending on your computing environment, reproducing the analysis presented in the paper may be as easy as invoking "make all" or "make" in the directory containing this file on a system that has the appropriate software installed. Once compilation is complete, you can review the log files in a text editor. See below for more on software and dependencies. If calling the makefile fails, the individual R scripts can also be run interactively or in batch mode.

    Software and dependencies

    The R and compilation materials provided here were created and tested on a 64-bit laptop PC running Ubuntu 18.04.3 LTS, R version 3.6.1, ggplot2 version 3.2.1, reshape2 version 1.4.3, forcats version 0.4.0, pscl version 1.5.2, and stargazer version 5.2.2 (the last five are R packages called in specific .R files). As with all software, your mileage may vary and the authors provide no warranties.

    Codebook

    The dataset consists of 36 variables (columns) and 2,716 participants (rows). The variable names and brief descriptions follow below; additional details of measurement are provided in the paper and survey instrument. All dichotomous indicators are coded 0/1, where 1 is the affirmative response implied by the variable name:

    • id: Index to identify individual units (participants).
    • svy_raked_wgt: Raked survey weights provided by NORC.
    • amtsample: Data source coded 0 (NORC) or 1 (AMT).
    • age: Participant age in years.
    • female: Participant selected "female" gender.
    • incomecont: Income in USD (continuous), coded from the center-points of the categories reported in the instruments.
    • incomediv: Income in $1,000s USD (=incomecont/1000).
    • incomesqrt: Square root of incomecont.
    • lincome: Natural logarithm of incomecont.
    • rural: Participant resides in a rural area.
    • employed: Participant is fully or partially employed.
    • eduhsorless: Highest education level is high school or less.
    • edusc: Highest education level is completed some college.
    • edubaormore: Highest education level is BA or more.
    • white: Race = white.
    • black: Race = black.
    • nativeam: Race = native american.
    • hispanic: Ethnicity = hispanic.
    • asian: Race = asian.
    • raceother: Race = other.
    • skillsmean: Internet use skills index (described in paper).
    • accesssum: Internet use autonomy (described in paper).
    • webweekhrs: Internet use frequency (described in paper).
    • do_sum: Participatory online activities (described in paper).
    • snssumcompare: Social network site activities (described in paper).
    • altru_scale: Generous behaviors (described in paper).
    • trust_scale: Trust scale score (described in paper).
    • pts_give: Points donated in unilateral dictator game (described in paper).
    • std_accesssum, std_webweekhrs, std_skillsmean, std_do_sum, std_snssumcompare, std_trust_scale, std_altru_scale, std_pts_give: Standardized (z-score) versions of the corresponding variables above.
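
    Since the authors provide the .tsv specifically for import outside R, a minimal Python sketch (not the authors' code) that loads it with pandas and re-derives the income transformations defined in the codebook:

      import numpy as np
      import pandas as pd

      # Load the tab-separated version of the dataset
      df = pd.read_csv("Hargittai-Shaw-AMT-NORC-2019.tsv", sep="\t")

      # Re-derive the income transformations defined in the codebook
      df["incomediv_check"]  = df["incomecont"] / 1000    # income in $1,000s USD
      df["incomesqrt_check"] = np.sqrt(df["incomecont"])  # square root of incomecont
      df["lincome_check"]    = np.log(df["incomecont"])   # natural log of incomecont

      # amtsample: data source coded 0 (NORC) or 1 (AMT)
      print(df.groupby("amtsample")["age"].describe())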

  8. Vertical distribution of heterotrophic nanoflagellates in the Baltic Proper

    • data.niaid.nih.gov
    Updated Nov 25, 2024
    Cite
    Piwosz, Kasia (2024). Vertical distribution of heterotrophic nanoflagellates in the Baltic Proper [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13945311
    Explore at:
    Dataset updated
    Nov 25, 2024
    Dataset provided by
    National Marine Fisheries Research Institute
    Authors
    Piwosz, Kasia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains data on the abundance of prokaryotes, heterotrophic nanoflagellates (HNF), specific lineages of HNF and environmental factors in the Baltic Sea collected during four cruises of r/v Baltica (National Fisheries Research Institute) in 2021. The Excel file includes six sheets:

    The "Parameters-Data" sheet lists all parameters for data presented in the "Data" sheet. Column A (Name) contains the variables names, column B (Unit) contains units in which they were measured, column C (Method/Device) contains information on the methodology, and column D (Comments) contains additional information

    The "Data" sheet contains data in a wide format for all variables listed in the "Parameters-Data" sheet measured at sampling depths. The first row contains variable names (listed in Column A of the Parameters-Data sheet) with units in square brackets

    The "Parameter-Size" sheet lists parameters for data presented in the "Size" sheet in the same format as described for the "Parameters-Data" sheet. Starting from row 5 in columns A and B, the number of measured HNF cells for each sample is given

    The "Size" sheet contains size measurements of HNF in the samples in a long format. The number of cells measured in each sample is provided in the "Parameter-Size" sheet

    The "Parameters-CTD depth profiles" sheet lists parameters for data presented in the "CTD depth profiles" sheet in the same format as described for the "Parameters-Data" sheet.

    The "CTD depth profiles" sheet contains full-depth profiles of variables measured with a CTD probe with 1 m resolution.

  9. Parameters for the logistic regression model to predict Name Generator ties....

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Feb 19, 2013
    Cite
    Matthews, Luke J.; DeWan, Peter; Rula, Elizabeth Y. (2013). Parameters for the logistic regression model to predict Name Generator ties. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001669184
    Explore at:
    Dataset updated
    Feb 19, 2013
    Authors
    Matthews, Luke J.; DeWan, Peter; Rula, Elizabeth Y.
    Description

    * This field indicates a dummy variable was also included. If a data point for the row variable was a 0, the dummy took on a value of 1; otherwise the dummy was 0. Row variables with blank entries did not exhibit over-dispersion of zeros and so did not require dummy variables.

    † Variable was log transformed to better meet generalized linear model assumptions.
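
    A hedged illustration of the dummy-coding rule described above: for a predictor with an excess of zeros, add an indicator that is 1 exactly where the predictor is 0 (the variable name here is made up):

      import pandas as pd

      df = pd.DataFrame({"n_named_ties": [3, 0, 1, 0, 5]})  # hypothetical predictor

      # Dummy is 1 when the row variable is 0, otherwise 0
      df["n_named_ties_is0"] = (df["n_named_ties"] == 0).astype(int)
      print(df)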

  10. Replication data for "High life satisfaction reported among small-scale...

    • dataverse.csuc.cat
    csv, txt
    Updated Feb 7, 2024
    Cite
    Eric Galbraith; Eric Galbraith; Victoria Reyes Garcia; Victoria Reyes Garcia (2024). Replication data for "High life satisfaction reported among small-scale societies with low incomes" [Dataset]. http://doi.org/10.34810/data904
    Explore at:
    csv(1620), csv(7829), txt(7017), csv(227502) (available download formats)
    Dataset updated
    Feb 7, 2024
    Dataset provided by
    CORA.Repositori de Dades de Recerca
    Authors
    Eric Galbraith; Eric Galbraith; Victoria Reyes Garcia; Victoria Reyes Garcia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2021 - Oct 24, 2023
    Area covered
    Kumbungu, Ghana; Mafia Island, United Republic of Tanzania; Laprak, Nepal; Ba, Fiji; Shangri-la, China; Puna, Argentina; Bassari country, Senegal; Darjeeling, India; Bulgan soum, Mongolia; Western highlands, Guatemala
    Dataset funded by
    European Commission
    Description

    This dataset was created in order to document self-reported life evaluations among small-scale societies that exist on the fringes of mainstream industrialized societies. The data were produced as part of the LICCI project, through fieldwork carried out by LICCI partners. The data include individual responses to a life satisfaction question, and household asset values. Data from the Gallup World Poll and the World Values Survey are also included, as used for comparison.

    TABULAR DATA-SPECIFIC INFORMATION

    1. File name: LICCI_individual.csv
    Number of rows and columns: 2814, 7
    Variable list:
    • User, Site, village: identification of investigator and location
    • Well.being.general: numerical score for the life satisfaction question
    • HH_Assets_US, HH_Assets_USD_capita: estimated value of representative assets in the household of the respondent, total and per capita (accounting for the number of household inhabitants)

    2. File name: LICCI_bySite.csv
    Number of rows and columns: 19, 8
    Variable list:
    • Site, N: site name and number of respondents at the site
    • SWB_mean, SWB_SD: mean and standard deviation of the life satisfaction score
    • HHAssets_USD_mean, HHAssets_USD_sd: site mean and standard deviation of household asset value
    • PerCapAssets_USD_mean, PerCapAssets_USD_sd: site mean and standard deviation of per capita asset value

    3. File name: gallup_WVS_GDP_pk.csv
    Number of rows and columns: 146, 8
    Variable list:
    • Happiness Score, Whisker-high, Whisker-low: from the Gallup World Poll as documented in the World Happiness Report 2022
    • GDP-PPP2017: Gross Domestic Product per capita for year 2020 at PPP (constant 2017 international $); accessed May 2022
    • pk: produced capital per capita for year 2018 (in 2018 US$) for available countries, as estimated by the World Bank (accessed February 2022)
    • WVS7_mean, WVS7_std: results of Question 49 in the World Values Survey, Wave 7
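
    A minimal Python sketch (file and column names quoted from the description above) that reproduces the site-level summaries in LICCI_bySite.csv from the individual-level file:

      import pandas as pd

      ind = pd.read_csv("LICCI_individual.csv")

      by_site = ind.groupby("Site").agg(
          N=("Well.being.general", "size"),              # respondents per site
          SWB_mean=("Well.being.general", "mean"),       # mean life satisfaction score
          SWB_SD=("Well.being.general", "std"),          # and its standard deviation
          HHAssets_USD_mean=("HH_Assets_US", "mean"),    # mean household asset value
          HHAssets_USD_sd=("HH_Assets_US", "std"),
      )
      print(by_site.head())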

  11. Data from: Trade-offs between growth rate, tree size and lifespan of...

    • datadryad.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated May 26, 2016
    Cite
    Christof Bigler (2016). Trade-offs between growth rate, tree size and lifespan of mountain pine (Pinus montana) in the Swiss National Park [Dataset]. http://doi.org/10.5061/dryad.d2680
    Explore at:
    zip (available download formats)
    Dataset updated
    May 26, 2016
    Dataset provided by
    Dryad
    Authors
    Christof Bigler
    Time period covered
    Jul 6, 2015
    Area covered
    Switzerland, Swiss National Park, canton of Grisons
    Description

    A within-species trade-off between growth rates and lifespan has been observed across different taxa of trees; however, it is uncertain whether this trade-off also applies to shade-intolerant tree species. The main objective of this study was to investigate the relationships between radial growth, tree size and lifespan of shade-intolerant mountain pines. For 200 dead standing mountain pines (Pinus montana) located along gradients of aspect, slope steepness and elevation in the Swiss National Park, radial annual growth rates and lifespan were reconstructed. While early growth (i.e. mean tree-ring width over the first 50 years) correlated positively with diameter at the time of tree death, it correlated negatively with lifespan, i.e. rapidly growing mountain pines face a trade-off: they reach a large diameter at the cost of early tree death. Slowly growing mountain pines may reach a large diameter and a long lifespan, but risk dying young at a small size. Early gro...

  12. Experimental Dataset on the Impact of Unfair Behavior by AI and Humans on...

    • scidb.cn
    Updated Apr 30, 2025
    Cite
    Yang Luo (2025). Experimental Dataset on the Impact of Unfair Behavior by AI and Humans on Trust: Evidence from Six Experimental Studies [Dataset]. http://doi.org/10.57760/sciencedb.psych.00565
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant.
    Dataset updated
    Apr 30, 2025
    Dataset provided by
    Science Data Bank
    Authors
    Yang Luo
    Description

    This dataset originates from a series of experimental studies titled “Tough on People, Tolerant to AI? Differential Effects of Human vs. AI Unfairness on Trust”. The project investigates how individuals respond to unfair behavior (distributive, procedural, and interactional unfairness) enacted by artificial intelligence versus human agents, and how such behavior affects cognitive and affective trust.

    1 Experiment 1a: The Impact of AI vs. Human Distributive Unfairness on Trust

    Overview: This dataset comes from an experimental study aimed at examining how individuals respond in terms of cognitive and affective trust when distributive unfairness is enacted by either an artificial intelligence (AI) agent or a human decision-maker. Experiment 1a specifically focuses on the main effect of the “type of decision-maker” on trust.

    Data Generation and Processing: The data were collected through Credamo, an online survey platform. Initially, 98 responses were gathered from students at a university in China; additional student participants were recruited via Credamo to supplement the sample. Attention check items were embedded in the questionnaire, and participants who failed were automatically excluded in real time. Data collection continued until 202 valid responses were obtained. SPSS software was used for data cleaning and analysis.

    Data Structure and Format: The data file is named “Experiment1a.sav” and is in SPSS format. It contains 28 columns and 202 rows, where each row corresponds to one participant. Columns represent measured variables, including: grouping and randomization variables, one manipulation check item, four items measuring distributive fairness perception, six items on cognitive trust, five items on affective trust, three items for honesty checks, and four demographic variables (gender, age, education, and grade level). The final three columns contain computed means for distributive fairness, cognitive trust, and affective trust.

    Additional Notes: No missing data are present. All variable names are labeled with English abbreviations to facilitate further analysis. The dataset can be opened directly in SPSS or exported to other formats.

    2 Experiment 1b: The Mediating Role of Perceived Ability and Benevolence (Distributive Unfairness)

    Overview: This dataset originates from an experimental study designed to replicate the findings of Experiment 1a and further examine the potential mediating role of perceived ability and perceived benevolence.

    Data Generation and Processing: Participants were recruited via the Credamo online platform. Attention check items were embedded in the survey to ensure data quality. Data were collected using a rolling recruitment method, with invalid responses removed in real time. A total of 228 valid responses were obtained.

    Data Structure and Format: The dataset is stored in a file named Experiment1b.sav in SPSS format and can be opened directly in SPSS software. It consists of 228 rows and 40 columns. Each row represents one participant’s data record, and each column corresponds to a different measured variable. Specifically, the dataset includes: random assignment and grouping variables; one manipulation check item; four items measuring perceived distributive fairness; six items on perceived ability; five items on perceived benevolence; six items on cognitive trust; five items on affective trust; three items for attention check; and three demographic variables (gender, age, and education). The last five columns contain the computed mean scores for perceived distributive fairness, ability, benevolence, cognitive trust, and affective trust.

    Additional Notes: There are no missing values in the dataset. All variables are labeled using standardized English abbreviations to facilitate reuse and secondary analysis. The file can be analyzed directly in SPSS or exported to other formats as needed.

    3 Experiment 2a: Differential Effects of AI vs. Human Procedural Unfairness on Trust

    Overview: This dataset originates from an experimental study aimed at examining whether individuals respond differently in terms of cognitive and affective trust when procedural unfairness is enacted by artificial intelligence versus human decision-makers. Experiment 2a focuses on the main effect of the decision agent on trust outcomes.

    Data Generation and Processing: Participants were recruited via the Credamo online survey platform from two universities located in different regions of China. A total of 227 responses were collected. After excluding those who failed the attention check items, 204 valid responses were retained for analysis. Data were processed and analyzed using SPSS software.

    Data Structure and Format: The dataset is stored in a file named Experiment2a.sav in SPSS format and can be opened directly in SPSS software. It contains 204 rows and 30 columns. Each row represents one participant’s response record, while each column corresponds to a specific variable. Variables include: random assignment and grouping; one manipulation check item; seven items measuring perceived procedural fairness; six items on cognitive trust; five items on affective trust; three attention check items; and three demographic variables (gender, age, and education). The final three columns contain computed average scores for procedural fairness, cognitive trust, and affective trust.

    Additional Notes: The dataset contains no missing values. All variables are labeled using standardized English abbreviations to facilitate reuse and secondary analysis. The file can be analyzed directly in SPSS or exported to other formats as needed.

    4 Experiment 2b: Mediating Role of Perceived Ability and Benevolence (Procedural Unfairness)

    Overview: This dataset comes from an experimental study designed to replicate the findings of Experiment 2a and to further examine the potential mediating roles of perceived ability and perceived benevolence in shaping trust responses under procedural unfairness.

    Data Generation and Processing: Participants were working adults recruited through the Credamo online platform. A rolling data collection strategy was used, where responses failing attention checks were excluded in real time. The final dataset includes 235 valid responses. All data were processed and analyzed using SPSS software.

    Data Structure and Format: The dataset is stored in a file named Experiment2b.sav, which is in SPSS format and can be opened directly using SPSS software. It contains 235 rows and 43 columns. Each row corresponds to a single participant, and each column represents a specific measured variable. These include: random assignment and group labels; one manipulation check item; seven items measuring procedural fairness; six items for perceived ability; five items for perceived benevolence; six items for cognitive trust; five items for affective trust; three attention check items; and three demographic variables (gender, age, education). The final five columns contain the computed average scores for procedural fairness, perceived ability, perceived benevolence, cognitive trust, and affective trust.

    Additional Notes: There are no missing values in the dataset. All variables are labeled using standardized English abbreviations to support future reuse and secondary analysis. The dataset can be analyzed directly in SPSS and easily converted into other formats if needed.

    5 Experiment 3a: Effects of AI vs. Human Interactional Unfairness on Trust

    Overview: This dataset comes from an experimental study that investigates how interactional unfairness, when enacted by either artificial intelligence or human decision-makers, influences individuals’ cognitive and affective trust. Experiment 3a focuses on the main effect of the “decision-maker type” under interactional unfairness conditions.

    Data Generation and Processing: Participants were college students recruited from two universities in different regions of China through the Credamo survey platform. After excluding responses that failed attention checks, a total of 203 valid cases were retained from an initial pool of 223 responses. All data were processed and analyzed using SPSS software.

    Data Structure and Format: The dataset is stored in the file named Experiment3a.sav, in SPSS format and compatible with SPSS software. It contains 203 rows and 27 columns. Each row represents a single participant, while each column corresponds to a specific measured variable. These include: random assignment and condition labels; one manipulation check item; four items measuring interactional fairness perception; six items for cognitive trust; five items for affective trust; three attention check items; and three demographic variables (gender, age, education). The final three columns contain computed average scores for interactional fairness, cognitive trust, and affective trust.

    Additional Notes: There are no missing values in the dataset. All variable names are provided using standardized English abbreviations to facilitate secondary analysis. The data can be analyzed directly using SPSS and exported to other formats as needed.

    6 Experiment 3b: The Mediating Role of Perceived Ability and Benevolence (Interactional Unfairness)

    Overview: This dataset comes from an experimental study designed to replicate the findings of Experiment 3a and further examine the potential mediating roles of perceived ability and perceived benevolence under conditions of interactional unfairness.

    Data Generation and Processing: Participants were working adults recruited via the Credamo platform. Attention check questions were embedded in the survey, and responses that failed these checks were excluded in real time. Data collection proceeded in a rolling manner until a total of 227 valid responses were obtained. All data were processed and analyzed using SPSS software.

    Data Structure and Format: The dataset is stored in the file named Experiment3b.sav, in SPSS format and compatible with SPSS software. It includes 227 rows and
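
    The .sav files described above can also be read outside SPSS; a minimal sketch with pandas (whose SPSS support relies on the pyreadstat package), assuming Experiment1a.sav is in the working directory:

      import pandas as pd

      exp1a = pd.read_spss("Experiment1a.sav")  # 202 rows x 28 columns, per the description
      print(exp1a.shape)

      # The final three columns hold the computed scale means
      # (distributive fairness, cognitive trust, affective trust)
      print(exp1a.iloc[:, -3:].describe())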

  13. 👗 Women's E-Commerce Clothing Reviews

    • kaggle.com
    zip
    Updated Jun 10, 2024
    Cite
    mexwell (2024). 👗 Women's E-Commerce Clothing Reviews [Dataset]. https://www.kaggle.com/datasets/mexwell/womens-e-commerce-clothing-reviews
    Explore at:
    zip (2922113 bytes) (available download formats)
    Dataset updated
    Jun 10, 2024
    Authors
    mexwell
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Description

    Welcome. This is a Women’s Clothing E-Commerce dataset revolving around the reviews written by customers. Its nine supportive features offer a great environment to parse out the text through its multiple dimensions. Because this is real commercial data, it has been anonymized, and references to the company in the review text and body have been replaced with “retailer”.

    Variables

    This dataset includes 23486 rows and 10 feature variables. Each row corresponds to a customer review, and includes the variables:

    • Clothing ID: Integer categorical variable that refers to the specific piece being reviewed.
    • Age: Positive integer variable of the reviewer's age.
    • Title: String variable for the title of the review.
    • Review Text: String variable for the review body.
    • Rating: Positive ordinal integer variable for the product score granted by the customer, from 1 (Worst) to 5 (Best).
    • Recommended IND: Binary variable stating whether the customer recommends the product, where 1 is recommended and 0 is not recommended.
    • Positive Feedback Count: Positive integer documenting the number of other customers who found this review positive.
    • Division Name: Categorical name of the product's high-level division.
    • Department Name: Categorical name of the product's department.
    • Class Name: Categorical name of the product's class.
    Acknowledgement

    Photo by Fujiphilm on Unsplash

  14. CollegeScorecard US College Graduation and Opportunity Data

    • kaggle.com
    zip
    Updated Jan 12, 2023
    Cite
    The Devastator (2023). CollegeScorecard US College Graduation and [Dataset]. https://www.kaggle.com/datasets/thedevastator/collegescorecard-us-college-graduation-and-oppor/discussion
    Explore at:
    zip (6248358 bytes) (available download formats)
    Dataset updated
    Jan 12, 2023
    Authors
    The Devastator
    Description

    CollegeScorecard US College Graduation and Opportunity Data

    Exploring Student Success and Outcomes

    By Noah Rippner [source]

    About this dataset

    This dataset provides an in-depth look at the data elements for the US College CollegeScorecard Graduation and Opportunity Project use case. It contains information on the variables used to create a comprehensive report, including Year, dev-category, developer-friendly name, VARIABLE NAME, API data type, LABEL, VALUE, SCORECARD? Y/N, SOURCE, and NOTES. The data is provided by the U.S. Department of Education and allows parents, students, and policymakers to take meaningful action to improve outcomes. The dataset contains more than enough information to let someone like Maria, a 25-year-old recent US Army veteran who wants a degree in Management Systems and Information Technology, distinguish between her school options: access to services, affordable housing near high-quality schools in safe neighborhoods, transport links, and nearby employment opportunities. This highly useful dataset provides detailed analysis of all these criteria so that users can make an informed decision about which school is best for them.

    How to use the dataset

    This dataset contains data related to college students, including their college graduation rates, access to opportunity indicators such as geographic mobility and career readiness, and other important indicators of the overall learning experience in the United States. This guide will show you how to use this dataset to draw meaningful conclusions about higher education in America.

    First, you will need to be familiar with the different fields included in this CollegeScorecard US College Graduation and Opportunity data set. Each record comprises several data elements, defined by concise labels at the left of each observation row. These include Name of Data Element, Year, dev-category (i.e., developmental category), Variable Name, API data type (type information for programmatic interfaces), Label and Value (descriptive content and value labeling for visual reporting), SCORECARD? Y/N (whether or not a field pertains to the U.S. Department of Education's College Scorecard program), SOURCE (where the source of the variable can be found), and NOTES (minor additional details about each variable, useful for further analysis or comparison between elements captured across observations).

    Now that you understand the components associated with each element and label in the observation rows, here are some key steps you can take when working with this particular dataset:

    • Apply year-specific filters on specified fields where needed, e.g., Year = 2020 & API Data Type = Character.

    • Look up any "NCalPlaceHolder" values where applicable: these are placeholders indicating values withheld from Scorecard display because of formatting requirements, or values that have not yet been updated; re-check after later API returns incorporate the latest results.

    • Pivot data points into custom tabular outputs, distilling complex unstructured raw sources into more digestible mid-level datasets, e.g., PowerBI/Tableau-compatible snapshots that expand on the delimited text exports provided.

    • Explore correlations between the education metrics and third-party indicators, such as measures of educational adherence, ROI, and growth potential, looking beyond campus recognition metrics.

    Research Ideas

    • Creating an interactive dashboard to compare school performance in terms of safety, entrepreneurship and other criteria.
    • Using the data to create a heat map visualization that shows which cities are most conducive to a successful educational experience for students like Maria.
    • Gathering information about average course costs at different universities and mapping them against US unemployment rates, to indicate which states might offer the best value for money in higher education.

    Ack...

  15. myview

    • data.wu.ac.at
    Updated Dec 16, 2015
    Cite
    Sindhu (2015). myview [Dataset]. https://data.wu.ac.at/schema/data_kcmo_org/aG11ay1qdGk3
    Explore at:
    Dataset updated
    Dec 16, 2015
    Dataset provided by
    Sindhu
    Description

    This dataset contains basic data for each page on kcmo.gov. The data is monthly aggregate data and contains every page on the kcmo.gov domain.

    This data is pulled directly from Google Analytics into R via the RGoogleAnalytics package (https://github.com/Tatvic/RGoogleAnalytics). The data is then manipulated to rename variables (column headers), assign a row ID, and sort the rows by page title, then year-month.

  16. doi:10.1038/s41598-017-17553-1: Radar tracks, raw data

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Feb 26, 2018
    + more versions
    Cite
    Makinson, James; Woodgate, Joseph; Chittka, Lars; Lim, Ka; Reynolds, Andrew (2018). doi:10.1038/s41598-017-17553-1: Radar tracks, raw data [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000629575
    Explore at:
    Dataset updated
    Feb 26, 2018
    Authors
    Makinson, James; Woodgate, Joseph; Chittka, Lars; Lim, Ka; Reynolds, Andrew
    Description

    206 text files (comma-separated values). These files contain the raw output data from the harmonic radar for bee flights that underpin the publication: Woodgate JL, Makinson JC, Lim KS, Reynolds AM, Chittka L. Continuous Radar Tracking Illustrates the Development of Multi-destination Routes of Bumblebees. Scientific Reports. 2017 Dec 11;7(1):17323. doi:10.1038/s41598-017-17553-1

    These files are the output from the radar. For most uses, the processed data from these files (provided in the linked OSF archive, section ‘Coordinate data from each track’) will be preferable, since they have been converted from radar-centric to geocentric coordinates and so can be compared to one another and aligned with features in the real world.

    Each file contains tracking data from one flight by one bee. Most files contain an entire flight, but in some cases a single flight has been broken up into several sections, denoted ‘a’, ‘b’, etc. Flights were defined as the movements undertaken by a bee between leaving the nest and subsequently returning again. The radar revolves once every 3 s and records a positional fix if a signal from a transponder is detected during each rotation.

    Filenames are of the form ‘YYMMDD_BID_BoutXX’, where:

    • YYMMDD: date on which the track was recorded.
    • BID: a unique identifier for each individual bee (the same for every datapoint within a file). The bees were renamed for clarity in the publication according to the following scheme: B56 = Bee 1, B61 = Bee 2, B74 = Bee 3, Y01 = Bee 4, O72 = Bee 5, G20 = Bee 6.
    • BoutXX: number identifying how many flights each bee had made (a single bout encompasses all the activity between the bee leaving the nest and returning again).

    Each file is a text file with a 3-line header identifying the radar software, the date on which the tracks were extracted, and variable names for the subsequent rows. The rest of the file consists of comma-delimited columns with a single row for each positional fix recorded. The variables are:

    • Date: date on which the recording was made.
    • Time: timestamp for each positional fix.
    • Range: distance in m of the bee from the radar.
    • Azimuth: angle of the bee from the axis of the radar. (NB: the radar records bees’ positions in polar coordinates relative to its own position. The radar is mobile and cannot be guaranteed to be in the same position on different days; to convert the track data to geocentric coordinates, the position of several fixed landmarks is recorded at the beginning of each day’s tracking and used to triangulate the position of the radar.)
    • Elev_angle, Elevation, V1, V2: data recorded by the radar on its own state. Never used in any analysis for this paper but included for completeness.
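
    A hedged Python sketch of the two mechanical steps described above: parsing the ‘YYMMDD_BID_BoutXX’ file names, and converting a (Range, Azimuth) fix into x/y offsets relative to the radar. The example file name and the clockwise-from-north azimuth convention are assumptions; the full geocentric conversion additionally needs the triangulated radar position, which is not shown here.

      import math
      import re

      def parse_filename(name: str):
          """Split 'YYMMDD_BID_BoutXX' (with optional section letter) into parts."""
          m = re.match(r"^(\d{6})_([A-Z]\d{2})_Bout(\d+)([a-z]?)", name)
          date, bee_id, bout, part = m.groups()
          return date, bee_id, int(bout), part or None

      print(parse_filename("150706_B56_Bout03a.txt"))  # made-up example name

      def polar_to_xy(range_m: float, azimuth_deg: float):
          """Offset east/north of the radar, assuming azimuth is measured
          clockwise from north (an assumption about the radar convention)."""
          theta = math.radians(azimuth_deg)
          return range_m * math.sin(theta), range_m * math.cos(theta)

      print(polar_to_xy(250.0, 45.0))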

  17. Data from: Impact of delayed response on Wearable Cognitive Assistance

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 2, 2021
    Cite
    Olguín Muñoz, Manuel; Klatzky, Roberta; Wang, Junjue; Pillai, Padmanabhan; Satyanarayanan, Mahadev; Gross, James (2021). Impact of delayed response on Wearable Cognitive Assistance [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4489265
    Explore at:
    Dataset updated
    Feb 2, 2021
    Dataset provided by
    Intel Labs Pittsburgh
    School of Computer Science, Carnegie Mellon University
    School of Electrical Engineering & Computer Science, KTH Royal Institute of Technology
    Department of Psychology, Carnegie Mellon University
    Authors
    Olguín Muñoz, Manuel; Klatzky, Roberta; Wang, Junjue; Pillai, Padmanabhan; Satyanarayanan, Mahadev; Gross, James
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the data associated with our research project titled Impact of delayed response on Wearable Cognitive Assistance. A preprint of the associated paper can be found at https://arxiv.org/abs/2011.02555.

    GENERAL INFORMATION

    1. Title of Dataset: Impact of delayed response on Wearable Cognitive Assistance

    2. Author Information

    First Author Contact Information
    Name: Manuel Olguín Muñoz
    Institution: School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology
    Address: Malvinas väg 10, Stockholm 11428, Sweden
    Email: molguin@kth.se
    Phone Number: +46 73 652 7628

    Author Contact Information
    Name: Roberta L. Klatzky
    Institution: Department of Psychology, Carnegie Mellon University
    Address: 5000 Forbes Ave, Pittsburgh, PA 15213
    Email: klatzky@cmu.edu
    Phone Number: +1 412 268 8026

    Author Contact Information
    Name: Mahadev Satyanarayanan
    Institution: School of Computer Science, Carnegie Mellon University
    Address: 5000 Forbes Ave, Pittsburgh, PA 15213
    Email: satya@cs.cmu.edu
    Phone Number: +1 412 268 3743

    Author Contact Information
    Name: James R. Gross
    Institution: School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology
    Address: Malvinas väg 10, Stockholm 11428, Sweden
    Email: jamesgr@kth.se
    Phone Number: +46 8 790 8819

    DATA & FILE OVERVIEW

    Directory of Files:

    A. Filename: accelerometer_data.csv
      Short description: Time-series accelerometer data. Each row corresponds to a sample.

    B. Filename: block_aggregate.csv
      Short description: Contains the block- and slice-level aggregates for each of the metrics and statistics present in this dataset. Each row corresponds to either a full block or a slice of a block, see below for details.
    
    
    C. Filename: block_metadata.csv
      Short description: Contains the metadata for each block in the task for each participant. Each row corresponds to a block.
    
    D. Filename: bvp_data.csv
      Short description: Time-series blood-volume-pulse data. Each row corresponds to a sample.
    
    
    E. Filename: eeg_data.csv
      Short description: Time-series electroencephalogram data, represented as power per band. Each row corresponds to a sample; power was calculated in 0.5 second intervals.
    
    
    F. Filename: frame_metadata.csv
      Short description: Contains the metadata for each video frame processed by the cognitive assistant. Each row corresponds to a processed frame.
    
    
    G. Filename: gsr_data.csv
      Short description: Time-series galvanic skin response data. Each row corresponds to a sample.
    
    
    H. Filename: task_step_metadata.csv
      Short description: Contains the metadata for each step in the task for each participant. Each row corresponds to a step in the task.
    
    
    I. Filename: temperature_data.csv
      Short description: Time-series thermometer data. Each row corresponds to a sample.
    

    Additional Notes on File Relationships, Context, or Content (for example, if a user wants to reuse and/or cite your data, what information would you want them to know?):

    • The data contained in these CSVs was obtained from 40 participants in a study performed with approval from the Carnegie Mellon University Institutional Review Board. In this study, participants were asked to interact with a Cognitive Assistant while wearing an array of physiological sensors. The data contained in this dataset corresponds to the actual collected data, after some preliminary preprocessing to convert raw sensor readings into meaningful values.

    • Participants have been anonymized using random integer identifiers.

    • block_aggregate.csv can be replicated by cross-referencing the start and end timestamps of each block in block_metadata.csv with the timestamps for each desired metric (see the sketch after this list).

    • The actual video frames mentioned in frame_metadata.csv are not included in the dataset since their contents were not relevant to the research.
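
    As an illustration of that cross-referencing, a minimal pandas sketch (assumptions: timestamps in both files share one clock and parse with parse_dates; the GSR value column name below is hypothetical, since it is not documented here):

        import pandas as pd

        blocks = pd.read_csv("block_metadata.csv", parse_dates=["start", "end"])
        gsr = pd.read_csv("gsr_data.csv", parse_dates=["timestamp"])

        rows = []
        for _, b in blocks.iterrows():
            # Samples from this participant that fall inside the block window.
            mask = ((gsr["participant"] == b["participant"])
                    & (gsr["timestamp"] >= b["start"])
                    & (gsr["timestamp"] <= b["end"]))
            duration_s = (b["end"] - b["start"]).total_seconds()
            rows.append({
                "participant": b["participant"],
                "block_seq": b["seq"],
                # gsr_per_second: summed microsiemens normalised by duration.
                "gsr_per_second": gsr.loc[mask, "gsr"].sum() / duration_s,  # 'gsr' column name is assumed
            })

        block_gsr = pd.DataFrame(rows)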

    File Naming Convention: N/A

    DATA DESCRIPTION FOR: accelerometer_data.csv

    1. Number of variables: 7

    2. Number of cases/rows: 1844688

    3. Missing data codes: N/A

    4. Variable list:

      A. Name: timestamp Description: Timestamp of the sample.

      B. Name: x Description: Acceleration reading from the x-axis of the accelerometer in g-forces [g].

      C. Name: y Description: Acceleration reading from the y-axis of the accelerometer in g-forces [g].

      D. Name: z Description: Acceleration reading from the z-axis of the accelerometer in g-forces [g].

      E. Name: ts Description: Time difference with respect to first sample.

      F. Name: participant Description: Denotes the numeric ID representing each individual participant.

      G. Name: delay Description: Delay that was being applied on the task when this reading was obtained in time delta format.

    DATA DESCRIPTION FOR: block_aggregate.csv

    1. Number of variables: 16

    2. Number of cases/rows: 2520

    3. Missing data codes:

      • Except for the 'slice' column, empty cells mean that the data is not applicable or was removed from the dataset due to noise or instrument failure.
      • For the 'slice' column, a missing value indicates that the row corresponds to the whole block as opposed to a slice of it.
    4. Variable List:

      A. Name: participant Description: Denotes the numeric ID representing each individual participant.

      B. Name: block_seq Description: Denotes the position of the block in the task. Ranges from 1 to 21.

      C. Name: slice Description: Index of the 4-step slice of the block over which the data was aggregated. Ranges from 0 to 2; higher values apply only to blocks of sufficient length (blocks of length 4 have only slice 0, blocks of length 8 have slices 0 and 1, and blocks of length 12 have slices 0 to 2). A missing value indicates that this row instead contains aggregate values for the whole block.

      D. Name: block_length Description: Length of the block. Valid values are 4, 8 and 12.

      E. Name: block_delay Description: Delay applied to the block, in seconds.

      F. Name: start Description: Timestamp marking the start of the block or slice.

      G. Name: end Description: Timestamp marking the end of the block or slice.

      H. Name: duration Description: Duration of the block or slice, in seconds.

      I. Name: exec_time_per_step_mean Description: Mean execution time for each step in the block or slice.

      J. Name: bpm_mean Description: Mean heart rate, in beats-per-minute, for the block or slice.

      K. Name: bpm_std Description: Standard deviation of the heart rate, in beats-per-minute, for the block or slice.

      L. Name: gsr_per_second Description: Galvanic skin response in microsiemens, summed and then normalized by block or slice duration.

      M. Name: movement_score Description: Movement score for the block or slice, calculated as the sum of the magnitudes of all acceleration vectors in the block or slice, divided by its duration in seconds (see the sketch after this list).

      N. Name: eeg_alpha_log_mean Description: Log of the average EEG power for the alpha band, for the block or slice.

      O. Name: eeg_beta_log_mean Description: Log of the average EEG power for the beta band, for the block or slice.

      P. Name: eeg_total_log_mean Description: Log of the average EEG power for the complete EEG signal, for the block or slice.
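
    For concreteness, a minimal sketch of the movement score computation described above (assumption: 'acc' already holds the accelerometer rows for a single block or slice, with the x/y/z columns documented for accelerometer_data.csv):

        import numpy as np
        import pandas as pd

        def movement_score(acc: pd.DataFrame, duration_s: float) -> float:
            # Magnitude of each acceleration vector, in g.
            magnitudes = np.sqrt(acc["x"]**2 + acc["y"]**2 + acc["z"]**2)
            # Sum of magnitudes, normalised by block/slice duration in seconds.
            return magnitudes.sum() / duration_s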

    DATA DESCRIPTION FOR: block_metadata.csv

    1. Number of variables: 8

    2. Number of cases/rows: 880

    3. Missing data codes: N/A

    4. Variable list:

      A. Name: participant Description: Denotes the numeric ID representing each individual participant.

      B. Name: seq Description: Index of the block in the task, ranging from 0 to 21. Note that block 0 is not to be included in aggregate calculations.

      C. Name: length Description: Length of the block in number of steps.

      D. Name: delay Description: Delay applied to the block.

      E. Name: start Description: Timestamp marking the start of the block.

      F. Name: end Description: Timestamp marking the end of the block.

      G. Name: duration Description: Duration of the block as a timedelta.

      H. Name: exec_time Description: Execution time of the block as a timedelta.

    DATA DESCRIPTION FOR: bvp_data.csv

    1. Number of variables: 8

    2. Number of cases/rows: 3683504

    3. Missing data codes: Columns bpm and ibi only contain values for rows corresponding to a sample taken at a heartbeat (see the sketch after this variable list).

    4. Variable list:

      A. Name: ts Description: Time difference with respect to first sample.

      B. Name: timestamp Description: Timestamp of the sample.

      C. Name: bvp Description: Blood-volume-pulse reading, in millivolts.

      D. Name: onset Description: Boolean indicating if this sample corresponds to the onset of a pulse.

      E. Name: bpm Description: Instantaneous beat-per-minute value.

      F. Name: ibi Description: Instantaneous inter-beat-interval value.

      G. Name: delay Description: Delay that was being applied on the task when this reading was obtained in time delta format.

      H. Name: participant Description: Denotes the numeric ID representing each individual participant.
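
    A minimal sketch of working with the sparse bpm/ibi columns noted above (assumption: a per-condition mean heart rate is taken over heartbeat rows only, in the spirit of the bpm_mean aggregate):

        import pandas as pd

        bvp = pd.read_csv("bvp_data.csv", parse_dates=["timestamp"])
        # Keep only rows sampled at a heartbeat; elsewhere bpm/ibi are empty.
        beats = bvp.dropna(subset=["bpm", "ibi"])
        # Mean heart rate per participant and delay condition.
        bpm_mean = beats.groupby(["participant", "delay"])["bpm"].mean()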

  18. Experimental auction to assess consumer perception and willingness to pay for ready-to-eat fresh salads with more sustainable packaging

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    • +1more
    Updated May 27, 2024
    Cite
    Gil, Jose María; Rahmani, Djamel; Goumeida, Kenza (2024). Experimental auction to assess consumer perception and willingness to pay for ready-to-eat fresh salads with more sustainable packaging [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_11242500
    Explore at:
    Dataset updated
    May 27, 2024
    Dataset provided by
    Universitat Politècnica de Catalunya
    Authors
    Gil, Jose María; Rahmani, Djamel; Goumeida, Kenza
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The provided Excel sheet contains a dataset generated from a face-to-face experiment with consumers of ready-to-eat fresh salads, conducted on December 19th and 20th, 2023, using a questionnaire. It was an auction aimed at understanding consumers' preferences and willingness to pay for 100% recyclable and 100% biodegradable packaging.

    Each column in the table represents a particular variable, and each row corresponds to a specific record in the dataset. The dataset comprises 70 columns and 306 rows, where the first row contains the variable names and the second row describes the meaning or question associated with each variable. To enhance readability, each variable is color-coded differently: as you move from one color to another, you are transitioning from one variable to the next.

    Most variables are derived from validated scales, such as General Shopping Behavior, General Personal Values and Attitudes, the Environmental Consciousness Scale, the Openness to Innovation Scale, the NEP Scale, and Green Consumption Values. These scales are generally assessed using a 5- or 7-point Likert scale for each statement or item included in the scale. Each statement is in a separate column, and the combined columns of statements form the respective variable (the scale, in this context). Therefore, the total number of variables included in this dataset is 24.
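
    Given the two header rows described above, a minimal loading sketch (the filename is hypothetical):

        import pandas as pd

        # Row 1 of the sheet becomes the column names; row 2 holds the item wording.
        raw = pd.read_excel("salad_auction.xlsx", header=0)
        item_wording = raw.iloc[0]                        # question text per variable
        responses = raw.iloc[1:].reset_index(drop=True)   # the actual records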

  19. Station name frequencies

    • redivis.com
    Updated Apr 16, 2023
    Cite
    Environmental Impact Data Collaborative (2023). Station name frequencies [Dataset]. https://redivis.com/datasets/5ax9-92h8psyqx
    Explore at:
    Dataset updated
    Apr 16, 2023
    Dataset authored and provided by
    Environmental Impact Data Collaborative
    Description

    The table Station name frequencies is part of the dataset Capital Bikeshare, available at https://redivis.com/datasets/5ax9-92h8psyqx. It contains 1743 rows across 2 variables.

  20. COBRE preprocessed with NIAK 0.17 - lightweight release

    • figshare.com
    application/gzip
    Updated Nov 3, 2016
    Cite
    Pierre Bellec; Pierre Bellec (2016). COBRE preprocessed with NIAK 0.17 - lightweight release [Dataset]. http://doi.org/10.6084/m9.figshare.4197885.v1
    Explore at:
    application/gzip (available download formats)
    Dataset updated
    Nov 3, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Pierre Bellec; Pierre Bellec
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Content

    This work is a derivative of the COBRE sample found in the International Neuroimaging Data-sharing Initiative (INDI), originally released under Creative Commons Attribution Non-Commercial. It includes preprocessed resting-state functional magnetic resonance images for 72 patients diagnosed with schizophrenia (58 males, age range = 18-65 yrs) and 74 healthy controls (51 males, age range = 18-65 yrs). The fMRI dataset for each subject is a single nifti file (.nii.gz) featuring 150 EPI blood-oxygenation level dependent (BOLD) volumes obtained in 5 min (TR = 2 s, TE = 29 ms, FA = 75°, 32 slices, voxel size = 3x3x4 mm3, matrix size = 64x64, FOV = 192x192 mm2). The data processing as well as packaging was implemented by Pierre Bellec, CRIUGM, Department of Computer Science and Operations Research, University of Montreal, 2016.

    The COBRE preprocessed fMRI release more specifically contains the following files:

    README.md: a markdown (text) description of the release.
    phenotypic_data.tsv.gz: a gzipped tab-separated value file, with each column representing a phenotypic variable as well as measures of data quality (related to motion). Each row corresponds to one participant, except the first row, which contains the names of the variables (see the file below for a description).
    keys_phenotypic_data.json: a json file describing each variable found in phenotypic_data.tsv.gz.
    fmri_XXXXXXX.tsv.gz: a gzipped tab-separated value file, with each column representing a confounding variable for the time series of participant XXXXXXX (the same participant ID found in phenotypic_data.tsv.gz). Each row corresponds to a time frame, except the first row, which contains the names of the variables (see the file below for a definition).
    keys_confounds.json: a json file describing each variable found in the files fmri_XXXXXXX.tsv.gz.
    fmri_XXXXXXX.nii.gz: a 3D+t nifti volume at 6 mm isotropic resolution, stored as short (16-bit) integers, in the MNI non-linear 2009a symmetric space (http://www.bic.mni.mcgill.ca/ServicesAtlases/ICBM152NLin2009). Each fMRI dataset features 150 volumes.

    Usage recommendations

    Individual analyses: You may want to remove some time frames with excessive motion for each subject; see the confounding variable called scrub in fmri_XXXXXXX.tsv.gz. After removing these time frames, there may not be enough usable data; we recommend a minimum of 60 time frames. A fairly large number of confounds have been made available as part of the release (slow time drifts, motion parameters, frame displacement, scrubbing, average WM/Vent signal, COMPCOR, global signal). We strongly recommend regression of slow time drifts; everything else is optional.

    Group analyses: There will also be some residual effects of motion, which you may want to regress out from connectivity measures at the group level. The number of acceptable time frames, as well as a measure of residual motion (called frame displacement, as described by Power et al., 2012), can be found in the variables Frames OK and FD scrubbed in phenotypic_data.tsv.gz. Finally, the simplest use case with these data is to predict the overall presence of a diagnosis of schizophrenia (values Control or Patient in the phenotypic variable Subject Type). You may want to match the control and patient samples in terms of amount of motion, as well as age and sex. Note that more detailed diagnostic categories are available in the variable Diagnosis.

    Preprocessing

    The datasets were analysed using the NeuroImaging Analysis Kit (NIAK, https://github.com/SIMEXP/niak) version 0.17, under CentOS version 6.3 with Octave (http://gnu.octave.org) version 4.0.2 and the Minc toolkit (http://www.bic.mni.mcgill.ca/ServicesSoftware/ServicesSoftwareMincToolKit) version 0.3.18. Each fMRI dataset was corrected for inter-slice differences in acquisition time, and the parameters of a rigid-body motion were estimated for each time frame. Rigid-body motion was estimated within as well as between runs, using the median volume of the first run as a target. The median volume of one selected fMRI run for each subject was coregistered with a T1 individual scan using Minctracc (Collins and Evans, 1997), which was itself non-linearly transformed to the Montreal Neurological Institute (MNI) template (Fonov et al., 2011) using the CIVET pipeline (Ad-Dab'bagh et al., 2006). The MNI symmetric template was generated from the ICBM152 sample of 152 young adults, after 40 iterations of non-linear coregistration. The rigid-body transform, fMRI-to-T1 transform and T1-to-stereotaxic transform were all combined, and the functional volumes were resampled in the MNI space at a 6 mm isotropic resolution.

    Note that a number of confounding variables were estimated and are made available as part of the release. WARNING: no confounds were actually regressed from the data, so this can be done interactively by the user, who will be able to explore different analytical paths easily. The "scrubbing" method of Power et al. (2012) was used to identify the volumes with excessive motion (frame displacement greater than 0.5 mm). A minimum number of 60 unscrubbed volumes per run, corresponding to ~120 s of acquisition, is recommended for further analysis. The following nuisance parameters were estimated: slow time drifts (basis of discrete cosines with a 0.01 Hz high-pass cut-off), average signals in conservative masks of the white matter and the lateral ventricles, the six rigid-body motion parameters (Giove et al., 2009), anatomical COMPCOR signal in the ventricles and white matter (Chai et al., 2012), and a PCA-based estimator of the global signal (Carbonell et al., 2011). The fMRI volumes were not spatially smoothed.

    References

    Ad-Dab'bagh, Y., Einarson, D., Lyttelton, O., Muehlboeck, J. S., Mok, K., Ivanov, O., Vincent, R. D., Lepage, C., Lerch, J., Fombonne, E., Evans, A. C., 2006. The CIVET Image-Processing Environment: A Fully Automated Comprehensive Pipeline for Anatomical Neuroimaging Research. In: Corbetta, M. (Ed.), Proceedings of the 12th Annual Meeting of the Human Brain Mapping Organization. Neuroimage, Florence, Italy.

    Bellec, P., Rosa-Neto, P., Lyttelton, O. C., Benali, H., Evans, A. C., Jul. 2010. Multi-level bootstrap analysis of stable clusters in resting-state fMRI. NeuroImage 51 (3), 1126-1139. http://dx.doi.org/10.1016/j.neuroimage.2010.02.082

    Carbonell, F., Bellec, P., Shmuel, A., 2011. Validation of a superposition model of global and system-specific resting state activity reveals anti-correlated networks. Brain Connectivity 1 (6), 496-510. doi:10.1089/brain.2011.0065

    Chai, X. J., Castañón, A. N., Öngür, D., Whitfield-Gabrieli, S., Jan. 2012. Anticorrelations in resting state networks without global signal regression. NeuroImage 59 (2), 1420-1428. http://dx.doi.org/10.1016/j.neuroimage.2011.08.048

    Collins, D. L., Evans, A. C., 1997. Animal: validation and applications of nonlinear registration-based segmentation. International Journal of Pattern Recognition and Artificial Intelligence 11, 1271-1294.

    Fonov, V., Evans, A. C., Botteron, K., Almli, C. R., McKinstry, R. C., Collins, D. L., Jan. 2011. Unbiased average age-appropriate atlases for pediatric studies. NeuroImage 54 (1), 313-327. http://dx.doi.org/10.1016/j.neuroimage.2010.07.033

    Giove, F., Gili, T., Iacovella, V., Macaluso, E., Maraviglia, B., Oct. 2009. Images-based suppression of unwanted global signals in resting-state functional connectivity studies. Magnetic Resonance Imaging 27 (8), 1058-1064. http://dx.doi.org/10.1016/j.mri.2009.06.004

    Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L., Petersen, S. E., Feb. 2012. Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. NeuroImage 59 (3), 2142-2154. http://dx.doi.org/10.1016/j.neuroimage.2011.10.018
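
    To illustrate the recommended screening, a minimal pandas sketch (assumptions: the subject ID below is hypothetical, and the scrub column is taken to flag frames to censor with 1; consult keys_confounds.json for the authoritative coding):

        import pandas as pd

        # Screen participants: keep those with at least 60 usable time frames.
        pheno = pd.read_csv("phenotypic_data.tsv.gz", sep="\t")
        usable = pheno[pheno["Frames OK"] >= 60]

        # Censor high-motion frames for one (hypothetical) participant.
        confounds = pd.read_csv("fmri_0040000.tsv.gz", sep="\t")
        keep = confounds["scrub"] == 0   # assumption: 1 marks frames to remove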
