13 datasets found
  1. Current Population Survey (CPS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
    Explore at:
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the current population survey (cps) annual social and economic supplement (asec) with r

    the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

    despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

    the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

    this new github repository contains three scripts:

    2005-2012 asec - download all microdata.R
    • download the fixed-width file containing household, family, and person records
    • import by separating this file into three tables, then merge 'em together at the person-level
    • download the fixed-width file containing the person-level replicate weights
    • merge the rectangular person-level file with the replicate weights, then store it in a sql database
    • create a new variable - one - in the data table

    2012 asec - analysis examples.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • perform a boatload of analysis examples

    replicate census estimates - 2011.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • match the sas output shown in the png file below

    2011 asec replicate weight sas output.png: statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

    for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
    • the census bureau's current population survey page
    • the bureau of labor statistics' current population survey page
    • the current population survey's wikipedia article

    notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

    confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
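
    The import steps described above (split the file into household, family, and person tables, merge them at the person level, attach the replicate weights, store the result in a sql database, add a variable called one) can be sketched roughly as follows. This is a minimal Python illustration, not the repository's R code; the table names, column names (h_seq, ffpos, pppos, pwgt, repwgt1), and toy rows are all hypothetical.

```python
import sqlite3

# Hypothetical miniature of the three record types (made-up values).
households = [(1, "CA"), (2, "NY")]                     # (h_seq, state)
families = [(1, 1, 52000), (2, 1, 31000)]               # (h_seq, ffpos, family_income)
persons = [                                             # (h_seq, ffpos, pppos, age, pwgt)
    (1, 1, 1, 34, 1500.0),
    (1, 1, 2, 36, 1450.0),
    (2, 1, 1, 61, 2100.0),
]
repwgts = [(1, 1, 1490.0), (1, 2, 1475.0), (2, 1, 2080.0)]  # (h_seq, pppos, repwgt1)


def build_rectangular(conn):
    """Load the four tables, then build one person-level ('rectangular') table."""
    conn.execute("CREATE TABLE household (h_seq INTEGER, state TEXT)")
    conn.execute("CREATE TABLE family (h_seq INTEGER, ffpos INTEGER, family_income INTEGER)")
    conn.execute("CREATE TABLE person (h_seq INTEGER, ffpos INTEGER, pppos INTEGER, age INTEGER, pwgt REAL)")
    conn.execute("CREATE TABLE repwgt (h_seq INTEGER, pppos INTEGER, repwgt1 REAL)")
    conn.executemany("INSERT INTO household VALUES (?, ?)", households)
    conn.executemany("INSERT INTO family VALUES (?, ?, ?)", families)
    conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?, ?)", persons)
    conn.executemany("INSERT INTO repwgt VALUES (?, ?, ?)", repwgts)
    # One row per person, with household and family information attached,
    # the replicate weight merged on, and a constant column named one.
    conn.execute("""
        CREATE TABLE asec AS
        SELECT p.h_seq, p.pppos, p.age, p.pwgt,
               h.state, f.family_income, r.repwgt1, 1 AS one
        FROM person AS p
        JOIN family AS f ON f.h_seq = p.h_seq AND f.ffpos = p.ffpos
        JOIN household AS h ON h.h_seq = p.h_seq
        JOIN repwgt AS r ON r.h_seq = p.h_seq AND r.pppos = p.pppos
    """)
    conn.commit()


conn = sqlite3.connect(":memory:")
build_rectangular(conn)
rows = conn.execute(
    "SELECT h_seq, pppos, state, family_income, one FROM asec ORDER BY h_seq, pppos"
).fetchall()
# Summing one * weight gives a weighted person count.
weighted_total = conn.execute("SELECT SUM(one * pwgt) FROM asec").fetchone()[0]
```

    A constant column like one is convenient because a weighted total of it is simply the estimated population size for whatever subgroup is selected.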

  2. Merger of BNV-D data (2008 to 2019) and enrichment

    • data.europa.eu
    zip
    Updated Jan 16, 2025
    Cite
    Patrick VINCOURT (2025). Merger of BNV-D data (2008 to 2019) and enrichment [Dataset]. https://data.europa.eu/data/datasets/5f1c3eca9d149439e50c740f?locale=en
    Explore at:
    zip(18530465)Available download formats
    Dataset updated
    Jan 16, 2025
    Dataset authored and provided by
    Patrick VINCOURT
    Description

    Merge (carried out in R) of the data published at https://www.data.gouv.fr/fr/datasets/ventes-de-pesticides-par-departement/, joined with two other sources of information associated with marketing authorisations (MAs; AMM in French):
    • uses: https://www.data.gouv.fr/fr/datasets/usages-des-produits-phytosanitaires/
    • information on the “Biocontrol” status of the product, from document DGAL/SDQSPV/2020-784 published on 18/12/2020 at https://agriculture.gouv.fr/quest-ce-que-le-biocontrole

    All the initial files (.csv converted to .txt), the R code used to merge the data, and the various output files are collected in a zip.

    NB:
    1) “YASCUB” stands for {year,AMM,Substance_active,Classification,Usage,Statut_“BioConttrol”}; substances not on the DGAL/SDQSPV list are coded NA.
    2) The file of biocontrol products must be cleaned of the duplicates generated by marketing authorisations that lead to several trade names.
    3) The BNVD_BioC_DY3 table and the output file BNVD_BioC_DY3.txt contain the fields {Code_Region,Region,Dept,Code_Dept,Anne,Usage,Classification,Type_BioC,Quantite_substance}
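
    The merge logic described above can be illustrated with a small sketch: sales records are joined to usages and to the biocontrol list by marketing authorisation number, products absent from the biocontrol list get a missing value, and duplicate biocontrol entries caused by several trade names are collapsed first. This is a Python illustration under assumed record layouts, not the R code shipped in the zip; the function names and all toy values are hypothetical.

```python
def dedupe_biocontrol(products):
    """Keep one biocontrol status per marketing authorisation (AMM).

    `products` holds (amm, trade_name, status) tuples; one AMM can appear
    under several trade names, which would otherwise duplicate joined rows.
    """
    status_by_amm = {}
    for amm, _trade_name, status in products:
        status_by_amm.setdefault(amm, status)
    return status_by_amm


def merge_sales(sales, usages, biocontrol):
    """Left-join sales with usage and biocontrol status, keyed on AMM."""
    usage_by_amm = dict(usages)
    status_by_amm = dedupe_biocontrol(biocontrol)
    merged = []
    for year, amm, substance, quantity in sales:
        merged.append({
            "year": year,
            "amm": amm,
            "substance": substance,
            "quantity": quantity,
            "usage": usage_by_amm.get(amm),
            # None plays the role of NA for products not on the list.
            "biocontrol": status_by_amm.get(amm),
        })
    return merged


# Made-up example rows.
sales = [(2018, "AMM001", "substance_x", 120.0), (2018, "AMM002", "substance_y", 45.5)]
usages = [("AMM001", "Herbicide"), ("AMM002", "Fongicide")]
biocontrol = [("AMM002", "TradeNameA", "Biocontrol"), ("AMM002", "TradeNameB", "Biocontrol")]

merged = merge_sales(sales, usages, biocontrol)
```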

  3. Metrical, morphosyntactic, and syntactic analysis of the Rigveda

    • swissubase.ch
    Updated Sep 22, 2025
    Cite
    (2025). Metrical, morphosyntactic, and syntactic analysis of the Rigveda [Dataset]. http://doi.org/10.48656/yc4z-sa04
    Explore at:
    Dataset updated
    Sep 22, 2025
    Description

    The dataset contains: • the main data table, RV_data.csv, with morphosyntactic, syntactic and metrical information on each Rigvedic word form, and • a script, disticha.rmd, for the analysis of disticha in the main types of Rigvedic stanzas which were studied as an example for the application of the data table, resulting in the published article: Salvatore Scarlata and Paul Widmer, Syntactic evidence for metrical structure in Rigvedic stanzas, Indo-European Linguistics 13 (2025), 1-21, doi:10.1163/22125892-bja10041, issn: 2212-5892.

    In addition the dataset contains: • a further data table, RV-polylex.csv, in which all compounded word forms are analyzed, and • some ancillary basic scripts: join.r for linking the two tables, and pivot01–03.r for simplified representations.

    Finally, the dataset contains: • a data table, RV-polylexREJECTS.csv, containing words that could not be assessed as compounds.

  4. Data from: Optimized SMRT-UMI protocol produces highly accurate sequence...

    • data.niaid.nih.gov
    • zenodo.org
    • +1more
    zip
    Updated Dec 7, 2023
    + more versions
    Cite
    Dylan Westfall; Mullins James (2023). Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies [Dataset]. http://doi.org/10.5061/dryad.w3r2280w0
    Explore at:
    zipAvailable download formats
    Dataset updated
    Dec 7, 2023
    Dataset provided by
    HIV Prevention Trials Network (http://www.hptn.org/)
    HIV Vaccine Trials Network (http://www.hvtn.org/)
    National Institute of Allergy and Infectious Diseases (http://www.niaid.nih.gov/)
    PEPFAR
    Authors
    Dylan Westfall; Mullins James
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Pathogen diversity resulting in quasispecies can enable persistence and adaptation to host defenses and therapies. However, accurate quasispecies characterization can be impeded by errors introduced during sample handling and sequencing which can require extensive optimizations to overcome. We present complete laboratory and bioinformatics workflows to overcome many of these hurdles. The Pacific Biosciences single molecule real-time platform was used to sequence PCR amplicons derived from cDNA templates tagged with universal molecular identifiers (SMRT-UMI). Optimized laboratory protocols were developed through extensive testing of different sample preparation conditions to minimize between-template recombination during PCR, and the use of UMI allowed accurate template quantitation as well as removal of point mutations introduced during PCR and sequencing to produce a highly accurate consensus sequence from each template. Handling of the large datasets produced from SMRT-UMI sequencing was facilitated by a novel bioinformatic pipeline, Probabilistic Offspring Resolver for Primer IDs (PORPIDpipeline), that automatically filters and parses reads by sample, identifies and discards reads with UMIs likely created from PCR and sequencing errors, generates consensus sequences, checks for contamination within the dataset, and removes any sequence with evidence of PCR recombination or early cycle PCR errors, resulting in highly accurate sequence datasets. The optimized SMRT-UMI sequencing method presented here represents a highly adaptable and established starting point for accurate sequencing of diverse pathogens. These methods are illustrated through characterization of human immunodeficiency virus (HIV) quasispecies.

    Methods

    This serves as an overview of the analysis performed on PacBio sequence data that is summarized in Analysis Flowchart.pdf and was used as primary data for the paper by Westfall et al., "Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies".

    Five different PacBio sequencing datasets were used for this analysis: M027, M2199, M1567, M004, and M005.

    For the datasets which were indexed (M027, M2199), CCS reads from PacBio sequencing files and the chunked_demux_config files were used as input for the chunked_demux pipeline. Each config file lists the different Index primers added during PCR to each sample. The pipeline produces one fastq file for each Index primer combination in the config. For example, in dataset M027 there were 3–4 samples using each Index combination. The fastq files from each demultiplexed read set were moved to the sUMI_dUMI_comparison pipeline fastq folder for further demultiplexing by sample and consensus generation with that pipeline. More information about the chunked_demux pipeline can be found in the README.md file on GitHub.

    The demultiplexed read collections from the chunked_demux pipeline or CCS read files from datasets which were not indexed (M1567, M004, M005) were each used as input for the sUMI_dUMI_comparison pipeline along with each dataset's config file. Each config file contains the primer sequences for each sample (including the sample ID block in the cDNA primer) and further demultiplexes the reads to prepare data tables summarizing all of the UMI sequences and counts for each family (tagged.tar.gz) as well as consensus sequences from each sUMI and rank 1 dUMI family (consensus.tar.gz). More information about the sUMI_dUMI_comparison pipeline can be found in the paper and the README.md file on GitHub.

    The consensus.tar.gz and tagged.tar.gz files were moved from sUMI_dUMI_comparison pipeline directory on the server to the Pipeline_Outputs folder in this analysis directory for each dataset and appended with the dataset name (e.g. consensus_M027.tar.gz).
    Also in this analysis directory is a Sample_Info_Table.csv containing information about how each of the samples was prepared, such as purification methods and number of PCRs. There are also three other folders: Sequence_Analysis, Indentifying_Recombinant_Reads, and Figures. Each has an .Rmd file with the same name inside which is used to collect, summarize, and analyze the data. All of these collections of code were written and executed in RStudio to track notes and summarize results.

    Sequence_Analysis.Rmd has instructions to decompress all of the consensus.tar.gz files, combine them, and create two fasta files, one with all sUMI and one with all dUMI sequences. Using these as input, two data tables were created that summarize all sequences and read counts for each sample that pass various criteria. These are used to help create Table 2 and as input for Indentifying_Recombinant_Reads.Rmd and Figures.Rmd. Next, two fasta files containing all of the rank 1 dUMI sequences and the matching sUMI sequences were created. These were used as input for the Python script compare_seqs.py, which identifies any matched sequences that are different between sUMI and dUMI read collections. This information was also used to help create Table 2. Finally, to populate the table with the number of sequences and bases in each sequence subset of interest, different sequence collections were saved and viewed in the Geneious program.

    To investigate the cause of sequences where the sUMI and dUMI sequences do not match, tagged.tar.gz was decompressed and, for each family with discordant sUMI and dUMI sequences, the reads from the UMI1_keeping directory were aligned using Geneious. Reads from dUMI families failing the 0.7 filter were also aligned in Geneious. The uncompressed tagged folder was then removed to save space. These read collections contain all of the reads in a UMI1 family and still include the UMI2 sequence. By examining the alignment and specifically the UMI2 sequences, the site of the discordance and its cause were identified for each family as described in the paper. These alignments were saved as "Sequence Alignments.geneious". The counts of how many families were the result of PCR recombination were used in the body of the paper.

    Using Identifying_Recombinant_Reads.Rmd, the dUMI_ranked.csv file from each sample was extracted from all of the tagged.tar.gz files, combined, and used as input to create a single dataset containing all UMI information from all samples. This file, dUMI_df.csv, was used as input for Figures.Rmd.

    Figures.Rmd used dUMI_df.csv, sequence_counts.csv, and read_counts.csv as input to create draft figures and then individual datasets for each figure. These were copied into Prism software to create the final figures for the paper.
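
    The general idea of producing one consensus sequence per UMI family, mentioned in the description above, can be sketched as a simple per-position majority vote. This is a deliberately simplified Python illustration and not the PORPIDpipeline or sUMI_dUMI_comparison implementation: it assumes reads are already grouped by UMI and have equal length, and the `min_family_size` threshold is a hypothetical parameter.

```python
from collections import Counter, defaultdict


def consensus_by_umi(reads, min_family_size=2):
    """Return {umi: consensus} from (umi, sequence) pairs.

    Families smaller than `min_family_size`, or containing reads of
    differing lengths, are skipped in this sketch.
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        if len({len(s) for s in seqs}) != 1:
            continue  # a real pipeline would align reads first
        # Majority base at each position across the family.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Made-up reads: one family of three (with a single point error) and a singleton.
reads = [
    ("AAAT", "ACGT"),
    ("AAAT", "ACGT"),
    ("AAAT", "ACTT"),
    ("CCCG", "GGGG"),
]
result = consensus_by_umi(reads)
```

    In this toy input the single mismatching base in the third read is outvoted, and the singleton family is dropped because it offers no redundancy for error correction.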

  5. Sloan Digital Sky Survey DR16

    • kaggle.com
    zip
    Updated Dec 30, 2019
    Cite
    Mukharbek Organokov (2019). Sloan Digital Sky Survey DR16 [Dataset]. https://www.kaggle.com/muhakabartay/sloan-digital-sky-survey-dr16
    Explore at:
    zip(6728394 bytes)Available download formats
    Dataset updated
    Dec 30, 2019
    Authors
    Mukharbek Organokov
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Feedback: Mukharbek Organokov organokov.m@gmail.com

    Context

    Sloan Digital Sky Survey current DR16 Server Data release with Galaxies, Stars and Quasars.

    License: Creative Commons Attribution license (CC-BY).

    Content

    The table results from a query which joins two tables:
    - "PhotoObj" which contains photometric data
    - "SpecObj" which contains spectral data.

    16 variables (double) and 1 additional variable (char) 'class'. A class object can be predicted from the other 16 variables.

    Variables description:
    objid = Object Identifier
    ra = J2000 Right Ascension (r-band)
    dec = J2000 Declination (r-band)
    u = better of deV/Exp magnitude fit (u-band)
    g = better of deV/Exp magnitude fit (g-band)
    r = better of deV/Exp magnitude fit (r-band)
    i = better of deV/Exp magnitude fit (i-band)
    z = better of deV/Exp magnitude fit (z-band)
    run = Run Number
    rerun = Rerun Number
    camcol = Camera column
    field = Field number
    specobjid = Spectroscopic Object Identifier
    class = object class (galaxy, star or quasar object)
    redshift = Final Redshift
    plate = plate number
    mjd = MJD of observation
    fiberid = fiberID

    Comments

    • A four-color UVGR intermediate-band photometric system (Thuan-Gunn astronomical magnitude system) is discussed in [1]. The Sloan Digital Sky Survey (SDSS) photometric system, a five-color (u′ g′ r′ i′ z′) wide-band CCD system, is described in [2].
    • The variables 'run', 'rerun', 'camcol' and 'field' describe a field within an image taken by the SDSS. A field is basically a part of the entire image corresponding to 2048 by 1489 pixels. A field can be identified by:
      • the run number, which identifies the specific scan;
      • the camera column, or "camcol", a number from 1 to 6, identifying the scanline within the run;
      • the field number, which typically starts at 11 (after an initial ramp-up time) and can be as large as 800 for particularly long runs.
      An additional number, rerun, specifies how the image was processed.
    • The variable 'class' identifies an object to be either a galaxy (GALAXY), star (STAR) or quasar (QSO).
      References:
      [1] Thuan & Gunn (1976, PASP, 88,543)
      [2] Fukugita, M. et al, Astronomical J. v.111, p.1748

    Data server

    Data can be obtained using SkyServer SQL Search with the command below:
    -- This query does a table JOIN between the imaging (PhotoObj) and spectra
    -- (SpecObj) tables and includes the necessary columns in the SELECT to upload
    -- the results to the SAS (Science Archive Server) for FITS file retrieval.
    SELECT TOP 100000
    p.objid,p.ra,p.dec,p.u,p.g,p.r,p.i,p.z,
    p.run, p.rerun, p.camcol, p.field,
    s.specobjid, s.class, s.z as redshift,
    s.plate, s.mjd, s.fiberid
    FROM PhotoObj AS p
    JOIN SpecObj AS s ON s.bestobjid = p.objid
    WHERE
    p.u BETWEEN 0 AND 19.6
    AND p.g BETWEEN 0 AND 20
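
    The shape of the join above can be tried locally on a few toy rows, for example with SQLite from Python. This is only a sketch: the rows are made up, only a handful of columns are included, and SQLite uses LIMIT where SkyServer's SQL dialect uses TOP.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PhotoObj (objid INTEGER, u REAL, g REAL);
    CREATE TABLE SpecObj (specobjid INTEGER, bestobjid INTEGER, class TEXT, z REAL);
    -- Made-up rows for illustration only.
    INSERT INTO PhotoObj VALUES (1, 19.4, 17.0), (2, 21.0, 19.5), (3, 18.6, 17.4);
    INSERT INTO SpecObj VALUES (10, 1, 'STAR', 0.0), (20, 2, 'GALAXY', 0.12), (30, 3, 'QSO', 1.5);
""")

# Same join and magnitude cuts as the SkyServer query, with LIMIT instead of TOP.
rows = conn.execute("""
    SELECT p.objid, s.specobjid, s.class, s.z AS redshift
    FROM PhotoObj AS p
    JOIN SpecObj AS s ON s.bestobjid = p.objid
    WHERE p.u BETWEEN 0 AND 19.6
      AND p.g BETWEEN 0 AND 20
    ORDER BY p.objid
    LIMIT 100000
""").fetchall()
```

    Object 2 is filtered out here because its u magnitude falls outside the 0 to 19.6 range, so only two joined rows remain.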

    Or perform a complicated, CPU-intensive query of SDSS catalog data using CasJobs, the SQL-based interface to the CAS.

    Acknowledgements

    SDSS collaboration.

    Inspiration

    The Sloan Digital Sky Survey has created the most detailed three-dimensional maps of the Universe ever made, with deep multi-color images of one-third of the sky, and spectra for more than three million astronomical objects. It allows users to learn about and explore all phases and surveys - past, present, and future - of the SDSS.

  6. Table 2

    • hepdata.net
    csv +3
    Updated 2015
    + more versions
    Cite
    HEPData (2015). Table 2 [Dataset]. http://doi.org/10.17182/hepdata.68951.v1/t2
    Explore at:
    Available download formats: YAML (https://yaml.org), csv, ROOT (https://root.cern), YODA (https://yoda.hepforge.org)
    Dataset updated
    2015
    Dataset provided by
    HEPData
    Description

    HERA combined reduced cross sections $\sigma_{r,\rm NC}^{+}$ for NC $e^{+}p$ scattering at $\sqrt{s} = 300$ GeV; $\delta_{\rm stat}$, $\delta_{\rm uncor}$...

  7. Table 2

    • hepdata.net
    csv +3
    Updated 2017
    + more versions
    Cite
    HEPData (2017). Table 2 [Dataset]. http://doi.org/10.17182/hepdata.77815.v1/t2
    Explore at:
    Available download formats: YODA (https://yoda.hepforge.org), csv, ROOT (https://root.cern), YAML (https://yaml.org)
    Dataset updated
    2017
    Dataset provided by
    HEPData
    Description

    Distributions of the $R(K^{*0})$ delta log-likelihood, $-(\ln L - \ln L_{best})$, for the three trigger categories combined in the central-q2...

  8. Black Jack - Interactive Card Game

    • kaggle.com
    zip
    Updated Dec 21, 2024
    Cite
    Patrick L Ford (2024). Black Jack - Interactive Card Game [Dataset]. https://www.kaggle.com/datasets/patricklford/black-jack-interactive-card-game/code
    Explore at:
    zip(4873009 bytes)Available download formats
    Dataset updated
    Dec 21, 2024
    Authors
    Patrick L Ford
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Introduction

    Blackjack, also known as 21, is one of the most popular card games worldwide. Blackjack remains a favourite due to its mix of simplicity, luck, strategy, and fast paced game play, making it a staple in casinos.

    Objective of Blackjack:

    • The goal of Blackjack is to have a hand value closer to 21 than the dealer's hand, without exceeding 21. If a player's hand exceeds 21, they "bust" and lose the round.

    Card Values:

    • Number cards (2-10): These are worth their face value.
    • Face cards (Jack, Queen, King): Each is worth 10 points.
    • Ace: Can be worth either 1 or 11, depending on which value benefits the hand more without exceeding 21.
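
    The card values above, including the flexible Ace, can be expressed as a short function. This is a Python sketch for illustration (the dataset's own code is in R); cards are represented as strings such as "A", "K", or "7", which is an assumed encoding.

```python
def hand_value(cards):
    """Return the best total for a blackjack hand.

    Each Ace starts at 11 and is downgraded to 1, one at a time,
    while the total would otherwise exceed 21.
    """
    total = 0
    aces = 0
    for card in cards:
        if card == "A":
            aces += 1
            total += 11
        elif card in ("J", "Q", "K"):
            total += 10
        else:
            total += int(card)  # number cards 2-10 at face value
    while total > 21 and aces:
        total -= 10
        aces -= 1
    return total
```

    For example, hand_value(["A", "A", "9"]) counts one Ace as 11 and the other as 1, while a hand such as ["K", "Q", "5"] has no Ace to downgrade and busts.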

    Setup:

    • Deck: Blackjack is typically played with one to eight standard decks of 52 cards.
    • Players: One or more players compete against the dealer. Each player is dealt a separate hand, and players do not compete against each other.
    • Table Layout: The table features spaces for player bets, cards, and chips.

    Game Play:

    • Initial Bets:
      • Players place their bets in designated areas on the table.
    • Dealing Cards:
      • Each player and the dealer receive two cards.
      • Players' cards are dealt face-up, while the dealer gets one face-up card (up card) and one face-down card (hole card).
    • Player Options:
      • Hit: Request another card to add to their hand. Players can keep hitting until they are satisfied or bust.
      • Stand: Keep the current hand and end their turn.
      • Double Down: Double the initial bet and receive exactly one more card. Commonly allowed only on the first two cards.
      • Split: If the first two cards have the same rank, the player can split them into two separate hands by placing an additional bet equal to the original. Each hand is played separately.
      • Surrender (Optional Rule): Forfeit half the bet and end the turn. This is usually allowed only on the first two cards.
      • Insurance (Optional Rule): If the dealer's up card is an Ace, players may place a side bet (half the original bet) that the dealer has Blackjack. If the dealer has Blackjack, the insurance bet pays 2:1; otherwise, the player loses the insurance bet.
    • Dealer's Turn:
      • Hit until the hand value is 17 or higher.
      • Stand on 17 or higher (including "soft 17" in some variations).
      • The dealer does not have options; actions are automatic.
    • Winning:
      • Player Wins: The player's hand value is closer to 21 than the dealer's hand, or the dealer busts.
      • Dealer Wins: Dealer's hand value is closer to 21, or the player busts.
      • Push (Tie): Both hands have the same value; the player keeps their bet.
    • Blackjack (Natural):
      • If the player's initial two cards are an Ace and a 10-point card (Jack, Queen, King, or 10), they have a "Blackjack."
      • Blackjack typically pays 3:2 (e.g., a $10 bet wins $15).
      • If both the player and the dealer have Blackjack, it's a push.
    • House Edge and Strategy:
      • The casino typically has a small edge due to rules favouring the dealer (e.g., the player acts first, so they can bust before the dealer plays).
      • Basic strategy can minimise the house edge: strategy charts show the optimal play based on the player's hand and the dealer's up card.
      • Advanced players use card counting to track high value cards remaining in the deck, gaining an advantage.
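
    The dealer's automatic play and the payout rules listed under Game Play can be sketched as follows. This is a Python illustration rather than the dataset's R code; it ignores splits, doubling, surrender, and insurance, and the `hit_soft_17` flag and string card encoding are assumptions made for the example.

```python
def hand_total(cards):
    """Return (total, is_soft); soft means an Ace is being counted as 11."""
    total = 0
    for card in cards:
        if card == "A":
            total += 1
        elif card in ("J", "Q", "K"):
            total += 10
        else:
            total += int(card)
    soft = "A" in cards and total + 10 <= 21
    return (total + 10 if soft else total), soft


def dealer_play(hand, deck, hit_soft_17=False):
    """Dealer draws from the front of `deck` until reaching 17 or higher."""
    hand, deck = list(hand), list(deck)
    while True:
        total, soft = hand_total(hand)
        if total < 17 or (hit_soft_17 and total == 17 and soft):
            hand.append(deck.pop(0))
        else:
            return hand


def settle(player, dealer, bet):
    """Net winnings for the player: positive is a win, 0 is a push."""
    p, _ = hand_total(player)
    d, _ = hand_total(dealer)
    player_bj = len(player) == 2 and p == 21
    dealer_bj = len(dealer) == 2 and d == 21
    if p > 21:
        return -bet            # player busts first and loses
    if player_bj and dealer_bj:
        return 0               # both naturals: push
    if player_bj:
        return 1.5 * bet       # natural pays 3:2
    if dealer_bj:
        return -bet
    if d > 21 or p > d:
        return bet
    if p < d:
        return -bet
    return 0
```

    Checking the player's bust before anything else mirrors the source of the house edge noted above: the player acts first and can lose before the dealer plays.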

    Common Variations:

    • European Blackjack: Dealer receives only one card initially; no hole card until players complete their turns.
    • Spanish 21: Played with 48-card decks (no 10's), with bonuses for certain hands.
    • Pontoon: A British variation where "Five Card Trick" (five cards totalling 21 or less) is a winning hand.
    • Blackjack Switch: Players play two hands and can swap the second card between them.

    Etiquette and Tips:

    • Use hand signals to indicate actions (e.g., tapping for "hit," waving for "stand").
    • Avoid touching chips after the deal starts.
    • Familiarise yourself with table-specific rules and variations.

    Visualisation

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2Faa4b5d8819430e46c3203b3597666578%2FScreenshot%202024-12-21%2010.36.57.png?generation=1734781714095911&alt=media
    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F86038e4d98f429825106bb2e8b5f74e8%2FScreenshot%202024-12-21%2010.38.18.png?generation=1734781738030008&alt=media
    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F5b634959e2292840ce454745ca80062f%2FScreenshot%202024-12-21%2010.39.12.png?generation=1734781761032959&alt=media

    A Markdown document with the R code for the game of Black Jack.

    R Code

    The provided R code implements a simplified version of the game Blackjack. It includes f...

  9. Physiological data and R script for running physiology combined model for...

    • datadryad.org
    • search.dataone.org
    zip
    Updated Jul 12, 2021
    Cite
    Gengping Zhu; Javier Gutierrez Illan; David W. Crowder (2021). Physiological data and R script for running physiology combined model for Drosophila suzukii [Dataset]. http://doi.org/10.5061/dryad.gxd2547mj
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jul 12, 2021
    Dataset provided by
    Dryad
    Authors
    Gengping Zhu; Javier Gutierrez Illan; David W. Crowder
    Time period covered
    Jul 8, 2021
    Description

    This dataset accompanies the article "The use of insect life tables in optimizing invasive pest distributional models", to be published in Ecography. It includes two R scripts, used to generate the physical model and the physiology combined model respectively. Our paper shows that the physiology combined model performs well when ecological niche models are applied to risk assessment. We addressed this by determining whether incorporating physiological data from life table analyses of an invasive insect, Drosophila suzukii, improved predictions of ecological niche models. The dataset also includes the D. suzukii physiology data that we assembled for running the physiology combined model.

  10. Slakestable : R package to explore raw data from the Slakes app

    • entrepot.recherche.data.gouv.fr
    application/x-gzip
    Updated May 25, 2022
    Cite
    Thomas Chalaux; Marine Lacoste; Nicolas Saby (2022). Slakestable : R package to explore raw data from the Slakes app [Dataset]. http://doi.org/10.15454/BGSMUE
    Explore at:
    application/x-gzip(18335)Available download formats
    Dataset updated
    May 25, 2022
    Dataset provided by
    Recherche Data Gouv
    Authors
    Thomas Chalaux; Marine Lacoste; Nicolas Saby
    License

    https://spdx.org/licenses/etalab-2.0.html

    Description

    The "slakestable" package quickly formats raw data from the "Slakes" smartphone app (Fajardo et al., 2016). The "tablecourbe" function creates a single table containing the coefficients a, b, c from the Gompertz fit of the raw data, along with the SI600 for each aggregate. The data can also be aggregated by site/location, using a mean or a median, before or after the Gompertz fit; in that case two independent tables are created, which can be combined with the "jointurefeuilles" function.

  11. Data from: Table 2 in Consistent patterns of common species across tropical...

    • scholarship.miami.edu
    Updated Jan 10, 2024
    + more versions
    Cite
    Declan L. M. Cooper; Kenneth J. Feeley; Simon L. Lewis; Martin J. P. Sullivan; Paulo I. Prado; Hans ter Steege; Nicolas Barbier; Ferry Slik; Bonaventure Sonké; Corneille E. N. Ewango; Stephen Adu-Bredu; Kofi Affum-Baffoe; Daniel P. P. de Aguiar; Manuel Augusto Ahuite Reategui; Shin-Ichiro Aiba; Bianca Weiss Albuquerque; Francisca Dionízia de Almeida Matos; Alfonso Alonso; Christian A. Amani; Dário Dantas do Amaral; Iêda Leão do Amaral; Ana Andrade; Ires Paula de Andrade Miranda; Ilondea B. Angoboy; Alejandro Araujo-Murakami; Nicolás Castaño Arboleda; Luzmila Arroyo; Peter Ashton; Gerardo A. Aymard C; Cláudia Baider; Timothy R. Baker; Michael Philippe Bessike Balinga; Henrik Balslev; Lindsay F. Banin; Olaf S. Bánki; Chris Baraloto; Edelcilio Marques Barbosa; Flávia Rodrigues Barbosa; Jos Barlow; Jean-Francois Bastin; Hans Beeckman; Serge Begne; Natacha Nssi Bengone; Erika Berenguer; Nicholas Berry; Robert Bitariho; Pascal Boeckx; Jan Bogaert; Bernard Bonyoma; Patrick Boundja; Nils Bourland; Faustin Boyemba Bosela; Fabian Brambach; Roel Brienen; David F. R. P. Burslem; José Luís Camargo; Wegliane Campelo; Angela Cano; Sasha Cárdenas; Dairon Cárdenas López; Rainiellen de Sá Carpanedo; Yrma Andreina Carrero Márquez; Fernanda Antunes Carvalho; Luisa Fernanda Casas; Hernán Castellanos; Carolina V. Castilho; Carlos Cerón; Colin A. Chapman; Jerome Chave; Phourin Chhang; Wanlop Chutipong; George B. Chuyong; Bruno Barçante Ladvocat Cintra; Connie J. Clark; Fernanda Coelho de Souza; James A. Comiskey; David A. Coomes; Fernando Cornejo Valverde; Diego F. Correa; Flávia R. C. Costa; Janaina Barbosa Pedrosa Costa; Pierre Couteron; Heike Culmsee; Aida Cuni-Sanchez; Francisco Dallmeier; Gabriel Damasco; Gilles Dauby; Nállarett Dávila; Hilda Paulette Dávila Doza; Jose Don T. De Alban; Rafael L. de Assis; Charles De Canniere; Thales De Haulleville; Marcelo de Jesus Veiga Carim; Layon O. Demarchi; Kyle G. Dexter; Anthony Di Fiore; Hazimah Haji Mohammad Din; Mathias I. 
Disney; Brice Yannick Djiofack; Marie-Noël K. Djuikouo; Tran Van Do; Jean-Louis Doucet; Freddie C. Draper; Vincent Droissart; Joost F. Duivenvoorden; Julien Engel; Vittoria Estienne; William Farfan-Rios; Sophie Fauset; Yuri Oliveira Feitosa; Ted R. Feldpausch; Cid Ferreira; Joice Ferreira; Leandro Valle Ferreira; Christine D. Fletcher; Bernardo Monteiro Flores; Alusine Fofanah; Ernest G. Foli; Émile Fonty; Gabriella M. Fredriksson; Alfredo Fuentes; David Galbraith; George Pepe Gallardo Gonzales; Karina Garcia-Cabrera; Roosevelt García-Villacorta; Vitor H. F. Gomes; Ricardo Zárate Gómez; Therany Gonzales; Rogerio Gribel; Marcelino Carneiro Guedes; Juan Ernesto Guevara; Khalid Rehman Hakeem; Jefferson S. Hall; Keith C. Hamer; Alan C. Hamilton; David J. Harris; Rhett D. Harrison; Terese B. Hart; Andy Hector; Terry W. Henkel; John Herbohn; Mireille B. N. Hockemba; Bruce Hoffman; Milena Holmgren; Euridice N. Honorio Coronado; Isau Huamantupa-Chuquimaco; Wannes Hubau; Nobuo Imai; Mariana Victória Irume; Patrick A. Jansen; Kathryn J. Jeffery; Eliana M. Jimenez; Tommaso Jucker; André Braga Junqueira; Michelle Kalamandeen; Narcisse G. Kamdem; Kuswata Kartawinata; Emmanuel Kasongo Yakusu; John M. Katembo; Elizabeth Kearsley; David Kenfack; Michael Kessler; Thiri Toe Khaing; Timothy J. Killeen; Kanehiro Kitayama; Bente Klitgaard; Nicolas Labrière; Yves Laumonier; Susan G. W. Laurance; William F. Laurance; Félix Laurent; Tinh Cong Le; Trai Trong Le; Miguel E. Leal; Evlyn Márcia Leão de Moraes Novo; Aurora Levesley; Moses B. Libalah; Juan Carlos Licona; Diógenes de Andrade Lima Filho; Jeremy A. Lindsell; Aline Lopes; Maria Aparecida Lopes; Jon C. Lovett; Richard Lowe; José Rafael Lozada; Xinghui Lu; Nestor K. Luambua; Bruno Garcia Luize; Paul Maas; José Leonardo Lima Magalhães; William E. Magnusson; Ni Putu Diana Mahayani; Jean-Remy Makana; Yadvinder Malhi; Lorena Maniguaje Rincón; Asyraf Mansor; Angelo Gilberto Manzatto; Beatriz S. 
Marimon; Ben Hur Marimon-Junior; Andrew R Marshall; Maria Pires Martins; Faustin M. Mbayu; Marcelo Brilhante de Medeiros; Italo Mesones; Faizah Metali; Vianet Mihindou; Jerome Millet; William Milliken; Hugo F. Mogollón; Jean-François Molino; Mohd. Nizam Mohd. Said; Abel Monteagudo Mendoza; Juan Carlos Montero; Sam Moore; Bonifacio Mostacedo; Linder Felipe Mozombite Pinto; Sharif Ahmed Mukul; Pantaleo K. T. Munishi; Hidetoshi Nagamasu; Henrique Eduardo Mendonça Nascimento; Marcelo Trindade Nascimento; David Neill; Reuben Nilus; Janaína Costa Noronha; Laurent Nsenga; Percy Núñez Vargas; Lucas Ojo; Alexandre A. Oliveira; Edmar Almeida de Oliveira; Fidèle Evouna Ondo; Walter Palacios Cuenca; Susamar Pansini; Marcelo Petratti Pansonato; Marcos Ríos Paredes; Ekananda Paudel; Daniela Pauletto; Richard G. Pearson; José Luis Marcelo Pena; R. Toby Pennington; Carlos A. Peres; Andrea Permana; Pascal Petronelli; Maria Cristina Peñuela Mora; Juan Fernando Phillips; Oliver L. Phillips; Georgia Pickavance; Maria Teresa Fernandez Piedade; Nigel C. A. Pitman; Pierre Ploton; Andreas Popelier; John R. Poulsen; Adriana Prieto; Richard B. Primack; Hari Priyadi; Lan Qie; Adriano Costa Quaresma; Helder Lima de Queiroz; Hirma Ramirez-Angulo; José Ferreira Ramos; Neidiane Farias Costa Reis; Jan Reitsma; Juan David Cardenas Revilla; Terhi Riutta; Gonzalo Rivas-Torres; Iyan Robiansyah; Maira Rocha; Domingos de Jesus Rodrigues; M. Elizabeth Rodriguez-Ronderos; Francesco Rovero; Andes H. Rozak; Agustín Rudas; Ervan Rutishauser; Daniel Sabatier; Le Bienfaiteur Sagang; Adeilza Felipe Sampaio; Ismayadi Samsoedin; Manichanh Satdichanh; Juliana Schietti; Jochen Schöngart; Veridiana Vizoni Scudeller; Naret Seuaturien; Douglas Sheil; Rodrigo Sierra; Miles R. Silman; Thiago Sanna Freire Silva; José Renan da Silva Guimarães; Murielle Simo-Droissart; Marcelo Fragomeni Simon; Plinio Sist; Thaiane R. Sousa; Emanuelle de Sousa Farias; Luiz de Souza Coelho; Dominick V. Spracklen; Suzanne M. 
Stas; Robert Steinmetz; Pablo R. Stevenson; Juliana Stropp; Rahayu S. Sukri; Terry C. H. Sunderland; Eizi Suzuki; Michael D. Swaine; Jianwei Tang; James Taplin; David M. Taylor; J. Sebastián Tello; John Terborgh; Nicolas Texier; Ida Theilade; Duncan W. Thomas; Raquel Thomas; Sean C. Thomas; Milton Tirado; Benjamin Toirambe; José Julio de Toledo; Kyle W. Tomlinson; Armando Torres-Lezama; Hieu Dang Tran; John Tshibamba Mukendi; Roven D. Tumaneng; Maria Natalia Umaña; Peter M. Umunay; Ligia Estela Urrego Giraldo; Elvis H. Valderrama Sandoval; Luis Valenzuela Gamarra; Tinde R. Van Andel; Martin van de Bult; Jaqueline van de Pol; Geertje van der Heijden; Rodolfo Vasquez; César I. A. Vela; Eduardo Martins Venticinque; Hans Verbeeck; Rizza Karen A. Veridiano; Alberto Vicentini; Ima Célia Guimarães Vieira; Emilio Vilanova Torre; Daniel Villarroel; Boris Eduardo Villa Zegarra; Jason Vleminckx; Patricio von Hildebrand; Vincent Antoine Vos; Corine Vriesendorp; Edward L. Webb; Lee J. T. White; Serge Wich; Florian Wittmann; Roderick Zagt; Runguo Zang; Charles Eugene Zartman; Lise Zemagho; Egleé L. Zent; Stanford Zent (2024). Table 2 in Consistent patterns of common species across tropical tree communities [Dataset]. https://scholarship.miami.edu/esploro/outputs/dataset/Table-2-in-Consistent-patterns-of/991032667338602976
    Explore at:
    Dataset updated
    Jan 10, 2024
    Dataset provided by
    Zenodo
    Authors
    Declan L. M. Cooper; Kenneth J. Feeley; Simon L. Lewis; Martin J. P. Sullivan; Paulo I. Prado; Hans ter Steege; Nicolas Barbier; Ferry Slik; Bonaventure Sonké; Corneille E. N. Ewango; Stephen Adu-Bredu; Kofi Affum-Baffoe; Daniel P. P. de Aguiar; Manuel Augusto Ahuite Reategui; Shin-Ichiro Aiba; Bianca Weiss Albuquerque; Francisca Dionízia de Almeida Matos; Alfonso Alonso; Christian A. Amani; Dário Dantas do Amaral; Iêda Leão do Amaral; Ana Andrade; Ires Paula de Andrade Miranda; Ilondea B. Angoboy; Alejandro Araujo-Murakami; Nicolás Castaño Arboleda; Luzmila Arroyo; Peter Ashton; Gerardo A. Aymard C; Cláudia Baider; Timothy R. Baker; Michael Philippe Bessike Balinga; Henrik Balslev; Lindsay F. Banin; Olaf S. Bánki; Chris Baraloto; Edelcilio Marques Barbosa; Flávia Rodrigues Barbosa; Jos Barlow; Jean-Francois Bastin; Hans Beeckman; Serge Begne; Natacha Nssi Bengone; Erika Berenguer; Nicholas Berry; Robert Bitariho; Pascal Boeckx; Jan Bogaert; Bernard Bonyoma; Patrick Boundja; Nils Bourland; Faustin Boyemba Bosela; Fabian Brambach; Roel Brienen; David F. R. P. Burslem; José Luís Camargo; Wegliane Campelo; Angela Cano; Sasha Cárdenas; Dairon Cárdenas López; Rainiellen de Sá Carpanedo; Yrma Andreina Carrero Márquez; Fernanda Antunes Carvalho; Luisa Fernanda Casas; Hernán Castellanos; Carolina V. Castilho; Carlos Cerón; Colin A. Chapman; Jerome Chave; Phourin Chhang; Wanlop Chutipong; George B. Chuyong; Bruno Barçante Ladvocat Cintra; Connie J. Clark; Fernanda Coelho de Souza; James A. Comiskey; David A. Coomes; Fernando Cornejo Valverde; Diego F. Correa; Flávia R. C. Costa; Janaina Barbosa Pedrosa Costa; Pierre Couteron; Heike Culmsee; Aida Cuni-Sanchez; Francisco Dallmeier; Gabriel Damasco; Gilles Dauby; Nállarett Dávila; Hilda Paulette Dávila Doza; Jose Don T. De Alban; Rafael L. de Assis; Charles De Canniere; Thales De Haulleville; Marcelo de Jesus Veiga Carim; Layon O. Demarchi; Kyle G. Dexter; Anthony Di Fiore; Hazimah Haji Mohammad Din; Mathias I. 
Disney; Brice Yannick Djiofack; Marie-Noël K. Djuikouo; Tran Van Do; Jean-Louis Doucet; Freddie C. Draper; Vincent Droissart; Joost F. Duivenvoorden; Julien Engel; Vittoria Estienne; William Farfan-Rios; Sophie Fauset; Yuri Oliveira Feitosa; Ted R. Feldpausch; Cid Ferreira; Joice Ferreira; Leandro Valle Ferreira; Christine D. Fletcher; Bernardo Monteiro Flores; Alusine Fofanah; Ernest G. Foli; Émile Fonty; Gabriella M. Fredriksson; Alfredo Fuentes; David Galbraith; George Pepe Gallardo Gonzales; Karina Garcia-Cabrera; Roosevelt García-Villacorta; Vitor H. F. Gomes; Ricardo Zárate Gómez; Therany Gonzales; Rogerio Gribel; Marcelino Carneiro Guedes; Juan Ernesto Guevara; Khalid Rehman Hakeem; Jefferson S. Hall; Keith C. Hamer; Alan C. Hamilton; David J. Harris; Rhett D. Harrison; Terese B. Hart; Andy Hector; Terry W. Henkel; John Herbohn; Mireille B. N. Hockemba; Bruce Hoffman; Milena Holmgren; Euridice N. Honorio Coronado; Isau Huamantupa-Chuquimaco; Wannes Hubau; Nobuo Imai; Mariana Victória Irume; Patrick A. Jansen; Kathryn J. Jeffery; Eliana M. Jimenez; Tommaso Jucker; André Braga Junqueira; Michelle Kalamandeen; Narcisse G. Kamdem; Kuswata Kartawinata; Emmanuel Kasongo Yakusu; John M. Katembo; Elizabeth Kearsley; David Kenfack; Michael Kessler; Thiri Toe Khaing; Timothy J. Killeen; Kanehiro Kitayama; Bente Klitgaard; Nicolas Labrière; Yves Laumonier; Susan G. W. Laurance; William F. Laurance; Félix Laurent; Tinh Cong Le; Trai Trong Le; Miguel E. Leal; Evlyn Márcia Leão de Moraes Novo; Aurora Levesley; Moses B. Libalah; Juan Carlos Licona; Diógenes de Andrade Lima Filho; Jeremy A. Lindsell; Aline Lopes; Maria Aparecida Lopes; Jon C. Lovett; Richard Lowe; José Rafael Lozada; Xinghui Lu; Nestor K. Luambua; Bruno Garcia Luize; Paul Maas; José Leonardo Lima Magalhães; William E. Magnusson; Ni Putu Diana Mahayani; Jean-Remy Makana; Yadvinder Malhi; Lorena Maniguaje Rincón; Asyraf Mansor; Angelo Gilberto Manzatto; Beatriz S. 
Marimon; Ben Hur Marimon-Junior; Andrew R Marshall; Maria Pires Martins; Faustin M. Mbayu; Marcelo Brilhante de Medeiros; Italo Mesones; Faizah Metali; Vianet Mihindou; Jerome Millet; William Milliken; Hugo F. Mogollón; Jean-François Molino; Mohd. Nizam Mohd. Said; Abel Monteagudo Mendoza; Juan Carlos Montero; Sam Moore; Bonifacio Mostacedo; Linder Felipe Mozombite Pinto; Sharif Ahmed Mukul; Pantaleo K. T. Munishi; Hidetoshi Nagamasu; Henrique Eduardo Mendonça Nascimento; Marcelo Trindade Nascimento; David Neill; Reuben Nilus; Janaína Costa Noronha; Laurent Nsenga; Percy Núñez Vargas; Lucas Ojo; Alexandre A. Oliveira; Edmar Almeida de Oliveira; Fidèle Evouna Ondo; Walter Palacios Cuenca; Susamar Pansini; Marcelo Petratti Pansonato; Marcos Ríos Paredes; Ekananda Paudel; Daniela Pauletto; Richard G. Pearson; José Luis Marcelo Pena; R. Toby Pennington; Carlos A. Peres; Andrea Permana; Pascal Petronelli; Maria Cristina Peñuela Mora; Juan Fernando Phillips; Oliver L. Phillips; Georgia Pickavance; Maria Teresa Fernandez Piedade; Nigel C. A. Pitman; Pierre Ploton; Andreas Popelier; John R. Poulsen; Adriana Prieto; Richard B. Primack; Hari Priyadi; Lan Qie; Adriano Costa Quaresma; Helder Lima de Queiroz; Hirma Ramirez-Angulo; José Ferreira Ramos; Neidiane Farias Costa Reis; Jan Reitsma; Juan David Cardenas Revilla; Terhi Riutta; Gonzalo Rivas-Torres; Iyan Robiansyah; Maira Rocha; Domingos de Jesus Rodrigues; M. Elizabeth Rodriguez-Ronderos; Francesco Rovero; Andes H. Rozak; Agustín Rudas; Ervan Rutishauser; Daniel Sabatier; Le Bienfaiteur Sagang; Adeilza Felipe Sampaio; Ismayadi Samsoedin; Manichanh Satdichanh; Juliana Schietti; Jochen Schöngart; Veridiana Vizoni Scudeller; Naret Seuaturien; Douglas Sheil; Rodrigo Sierra; Miles R. Silman; Thiago Sanna Freire Silva; José Renan da Silva Guimarães; Murielle Simo-Droissart; Marcelo Fragomeni Simon; Plinio Sist; Thaiane R. Sousa; Emanuelle de Sousa Farias; Luiz de Souza Coelho; Dominick V. Spracklen; Suzanne M. 
Stas; Robert Steinmetz; Pablo R. Stevenson; Juliana Stropp; Rahayu S. Sukri; Terry C. H. Sunderland; Eizi Suzuki; Michael D. Swaine; Jianwei Tang; James Taplin; David M. Taylor; J. Sebastián Tello; John Terborgh; Nicolas Texier; Ida Theilade; Duncan W. Thomas; Raquel Thomas; Sean C. Thomas; Milton Tirado; Benjamin Toirambe; José Julio de Toledo; Kyle W. Tomlinson; Armando Torres-Lezama; Hieu Dang Tran; John Tshibamba Mukendi; Roven D. Tumaneng; Maria Natalia Umaña; Peter M. Umunay; Ligia Estela Urrego Giraldo; Elvis H. Valderrama Sandoval; Luis Valenzuela Gamarra; Tinde R. Van Andel; Martin van de Bult; Jaqueline van de Pol; Geertje van der Heijden; Rodolfo Vasquez; César I. A. Vela; Eduardo Martins Venticinque; Hans Verbeeck; Rizza Karen A. Veridiano; Alberto Vicentini; Ima Célia Guimarães Vieira; Emilio Vilanova Torre; Daniel Villarroel; Boris Eduardo Villa Zegarra; Jason Vleminckx; Patricio von Hildebrand; Vincent Antoine Vos; Corine Vriesendorp; Edward L. Webb; Lee J. T. White; Serge Wich; Florian Wittmann; Roderick Zagt; Runguo Zang; Charles Eugene Zartman; Lise Zemagho; Egleé L. Zent; Stanford Zent
    Time period covered
    Jan 10, 2024
    Description

    Table 2 | Extrapolated tree species hyperdominance results for African, Amazonian, Southeast Asian tropical forests at the regional scale

    Region           Number of hyperdominants   Total species              Hyperdominant percentage
    Africa           104 [101, 107]             4,638 [4,511, 4,764]       2.23
    Amazonia         299 [295, 304]             13,826 [13,615, 14,036]    2.16
    Southeast Asia   278 [268, 289]             11,963 [11,451, 12,475]    2.32
    Total (a)        681 [664, 700]             30,427 [29,577, 31,275]    2.24

    (a) Calculated as the sum of the number of hyperdominants and total species across the three major tropical forest regions with hyperdominance percentage derived therefrom.
    Prediction intervals (in brackets) combine uncertainty from the standard error of predicted means and the residual s.d. of the regression of the bias correction fit.

  12. Dataset — Make Reddit Great Again: Assessing Community Effects of Moderation...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 10, 2023
    Cite
    Trujillo, Amaury; Cresci, Stefano (2023). Dataset — Make Reddit Great Again: Assessing Community Effects of Moderation Interventions on r/The_Donald [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6250576
    Explore at:
    Dataset updated
    Jan 10, 2023
    Dataset provided by
    IIT-CNR
    Authors
    Trujillo, Amaury; Cresci, Stefano
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Reddit contents and complementary data regarding the r/The_Donald community and its main moderation interventions, used for the corresponding article indicated in the title.

    An accompanying R notebook can be found in: https://github.com/amauryt/make_reddit_great_again

    If you use this dataset please cite the related article.

    The dataset timeframe of the Reddit contents (submissions and comments) spans from 30 weeks before Quarantine (2018-11-28) to 30 weeks after Restriction (2020-09-23). The original Reddit content was collected from the Pushshift monthly data files, transformed, and loaded into two SQLite databases.

    The first database, the_donald.sqlite, contains all the available content from r/The_Donald created during the dataset timeframe, with the last content being posted several weeks before the timeframe upper limit. It has only two tables: submissions and comments. Note that content IDs are stored in base 10 (numeric integer), unlike the original base 36 (alphanumeric) used on Reddit and Pushshift. This allows efficient storage and processing. If necessary, many programming languages or libraries can easily convert IDs from one base to another.
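
    As an illustration of the base conversion mentioned above, here is a minimal Python sketch. The helper names to_base36 and from_base36 are hypothetical and are not part of the dataset or its accompanying notebook; the sketch assumes lowercase base-36 IDs.

```python
import string

# Digits 0-9 followed by a-z: the 36 symbols of a lowercase base-36 ID.
ALPHABET = string.digits + string.ascii_lowercase


def to_base36(n: int) -> str:
    """Convert a non-negative base-10 integer ID to its base-36 string."""
    if n < 0:
        raise ValueError("IDs must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(ALPHABET[rem])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(digits))


def from_base36(s: str) -> int:
    """Convert a base-36 string ID to a base-10 integer."""
    return int(s, 36)
```

    Python's built-in int(s, 36) handles the base-36 to base-10 direction; only the reverse direction needs a small loop.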

    The second database, core_the_donald.sqlite, contains all the available content that core users of r/The_Donald posted platform-wide (i.e., both inside and outside the subreddit) during the dataset timeframe. Core users are defined as those who authored either a submission or a comment each week in r/The_Donald during the 30 weeks prior to the subreddit's Quarantine. The database has four tables: submissions, comments, subreddits, and perspective_scores. The subreddits table contains the names of the subreddits to which submissions and comments were made (their IDs are also in base 10). The perspective_scores table contains comment toxicity scores.

    The Perspective API was used to score comments based on the attributes toxicity and severe_toxicity. It should be noted that not all of the comments in core_the_donald have a score because the comment body was blank or because the Perspective API returned a request error (after three tries). However, the percentage of missing scores is minuscule.

    A third file, mbfc_scores.csv, contains the bias and factual reporting accuracy collected in October 2021 from Media Bias / Fact Check (MBFC). Both attributes are scored in a Likert-like manner. Submissions can be associated with MBFC scores by joining on the domain column.
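
    The join on the domain column could look roughly like the Python sketch below. Only the submissions table and the domain column come from the description; the id column, the bias and factual column names, the function name, and the toy rows are assumptions made for illustration. An in-memory database stands in for the real SQLite file.

```python
import csv
import io
import sqlite3


def join_submissions_to_mbfc(conn, mbfc_csv_text):
    """Load MBFC scores into a temp table and left-join them to submissions."""
    reader = csv.DictReader(io.StringIO(mbfc_csv_text))
    # Column names "bias" and "factual" are hypothetical placeholders.
    rows = [(r["domain"], int(r["bias"]), int(r["factual"])) for r in reader]
    conn.execute(
        "CREATE TEMP TABLE mbfc (domain TEXT PRIMARY KEY, bias INTEGER, factual INTEGER)"
    )
    conn.executemany("INSERT INTO mbfc VALUES (?, ?, ?)", rows)
    # LEFT JOIN keeps submissions whose domain has no MBFC rating.
    return conn.execute(
        "SELECT s.id, s.domain, m.bias, m.factual "
        "FROM submissions AS s LEFT JOIN mbfc AS m ON m.domain = s.domain "
        "ORDER BY s.id"
    ).fetchall()


# Toy stand-in for the real database and CSV file (made-up values).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submissions (id INTEGER PRIMARY KEY, domain TEXT)")
conn.executemany(
    "INSERT INTO submissions VALUES (?, ?)",
    [(1, "example.com"), (2, "unrated.example")],
)
toy_csv = "domain,bias,factual\nexample.com,2,4\n"
result = join_submissions_to_mbfc(conn, toy_csv)
```

    A left join is used here so that submissions linking to unrated domains are retained with empty scores; an inner join would drop them.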

  13. MOESM2 of OmicsARules: a R package for integration of multi-omics datasets...

    • springernature.figshare.com
    xlsx
    Updated Feb 16, 2024
    Cite
    Danze Chen; Fan Zhang; Qianqian Zhao; Jianzhen Xu (2024). MOESM2 of OmicsARules: a R package for integration of multi-omics datasets via association rules mining [Dataset]. http://doi.org/10.6084/m9.figshare.10278410.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Feb 16, 2024
    Dataset provided by
    figshare (http://figshare.com/)
    Authors
    Danze Chen; Fan Zhang; Qianqian Zhao; Jianzhen Xu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 2: Table S1. General information of three real datasets downloaded from TCGA. Table S2. Top 20 rules identified from BRCA mRNA dataset. Table S3. Top 20 rules identified from BRCA DNA methylation. Table S4. Top 20 rules identified from ESCA mRNA dataset. Table S5. Top 20 rules identified from ESCA DNA methylation dataset. Table S6. Top 20 rules identified from LUAD mRNA dataset. Table S7. Top 20 rules identified from LUAD DNA methylation dataset. Table S8. Top 20 rules identified from the combined BRCA mRNA and DNA methylation datasets. Table S9. Top 20 rules identified from the combined ESCA mRNA and DNA methylation datasets. Table S10. Top 20 rules identified from the combined LUAD mRNA and DNA methylation datasets.


Cite
Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD

Current Population Survey (CPS)

Explore at:
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Damico, Anthony
Description

analyze the current population survey (cps) annual social and economic supplement (asec) with r

the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
• download the fixed-width file containing household, family, and person records
• import by separating this file into three tables, then merge 'em together at the person-level
• download the fixed-width file containing the person-level replicate weights
• merge the rectangular person-level file with the replicate weights, then store it in a sql database
• create a new variable - one - in the data table

2012 asec - analysis examples.R
• connect to the sql database created by the 'download all microdata' program
• create the complex sample survey object, using the replicate weights
• perform a boatload of analysis examples

replicate census estimates - 2011.R
• connect to the sql database created by the 'download all microdata' program
• create the complex sample survey object, using the replicate weights
• match the sas output shown in the png file below

2011 asec replicate weight sas output.png
statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

click here to view these three scripts

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
• the census bureau's current population survey page
• the bureau of labor statistics' current population survey page
• the current population survey's wikipedia article

notes:

interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.

as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

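
to make the replicate-weight step a little more concrete, here is a rough sketch of how a standard error can be computed from replicate weights. it is written in python purely for illustration; the scripts described above are in r. the function names and toy numbers are made up, and the 4 / (number of replicates) multiplier is an assumption based on the census bureau's replicate-weight guidance for the cps-asec, so check it against the usage instructions document before relying on it.

```python
import math


def weighted_total(values, weights):
    """Weighted sum of a variable, e.g. a 0/1 indicator times person weights."""
    return sum(v * w for v, w in zip(values, weights))


def replicate_se(values, full_weights, replicate_weights, scale=4.0):
    """Return (estimate, standard error) using replicate weights.

    Assumed variance formula: scale / R * sum((theta_r - theta_0) ** 2),
    where R is the number of replicate weight sets.
    """
    theta0 = weighted_total(values, full_weights)
    reps = [weighted_total(values, w) for w in replicate_weights]
    variance = scale / len(reps) * sum((t - theta0) ** 2 for t in reps)
    return theta0, math.sqrt(variance)


# toy data: three people, four replicate weight sets (not real cps values)
values = [1, 0, 1]
full_weights = [100, 200, 300]
replicate_weights = [
    [110, 190, 300],
    [90, 210, 300],
    [100, 200, 310],
    [100, 200, 290],
]
estimate, se = replicate_se(values, full_weights, replicate_weights)
```

the estimate is computed once with the full-sample weight and once per replicate weight; the spread of the replicate estimates around the full-sample estimate drives the standard error.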
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
