100+ datasets found
  1. 6. Definitions and examples of the moves of the UPOCS genre

    • data.mendeley.com
    Updated Nov 5, 2021
    Cite
    Author Anonym (2021). 6. Definitions and examples of the moves of the UPOCS genre [Dataset]. http://doi.org/10.17632/7yg2y4sdkn.1
    Explore at:
    Dataset updated
    Nov 5, 2021
    Authors
    Author Anonym
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Definitions and examples of the moves of the UPOCS genre

  2. 2021 Methodological Summary And Definitions

    • catalog.data.gov
    • odgavaprod.ogopendata.com
    • +1more
    Updated Sep 6, 2025
    Cite
    Substance Abuse and Mental Health Services Administration (2025). 2021 Methodological Summary And Definitions [Dataset]. https://catalog.data.gov/dataset/2021-methodological-summary-and-definitions
    Explore at:
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    Substance Abuse and Mental Health Services Administration: https://www.samhsa.gov/
    Description

    Use this summary report to properly interpret 2021 NSDUH estimates of substance use and mental health issues. The report accompanies the annual detailed tables and covers overall methodology, key definitions for measures and terms used in 2021 NSDUH reports and tables, and selected analyses of the measures and how they should be interpreted.

    The report is organized into six chapters:

    1. Introduction.

    2. Description of the survey, including information about the sample design, data collection procedures, and key aspects of data processing such as development of the analysis weights. The report also includes methodological changes and related issues in the 2021 NSDUH due to COVID-19.

    3. Technical details on the statistical methods and measurement, such as suppression criteria for unreliable estimates, statistical testing procedures, issues around selected substance use and mental health measures, and the impact of methodological changes on response rates.

    4. Special topics related to prescription psychotherapeutic drugs.

    5. A comparison between NSDUH and other sources of data on substance use and mental health issues, including data sources for populations outside the NSDUH target population.

    6. A more in-depth view of special methodological issues for the 2021 NSDUH, including the results of special analyses that led SAMHSA to not compare estimates from 2021 to estimates from previous years.

    An appendix covers key definitions used in NSDUH reports and tables.

  3. Definitions of independent variables used in the statistical analysis.

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Feb 20, 2013
    Cite
    McConway, Kevin; Cameron, Robert; Bertorelle, Giorgio; Sattmann, Helmut; Cook, Laurence; Juan, Xavier; Anton, Christian; Fontaine, Benoît; Dodd, Mike; Skelton, Peter; Stalažs, Arturs; Féher, Zoltan; Schilthuizen, Menno; Rammul, Üllar; Oliveira, Cristina; Ożgo, Małgorzata; Pokryszko, Beata; Silvertown, Jonathan; Baur, Bruno; Bossdorf, Oliver; Sólymos, Péter; Correia, Maria; Worthington, Jenny; Gill, Eoin (2013). Definitions of independent variables used in the statistical analysis. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001733024
    Explore at:
    Dataset updated
    Feb 20, 2013
    Authors
    McConway, Kevin; Cameron, Robert; Bertorelle, Giorgio; Sattmann, Helmut; Cook, Laurence; Juan, Xavier; Anton, Christian; Fontaine, Benoît; Dodd, Mike; Skelton, Peter; Stalažs, Arturs; Féher, Zoltan; Schilthuizen, Menno; Rammul, Üllar; Oliveira, Cristina; Ożgo, Małgorzata; Pokryszko, Beata; Silvertown, Jonathan; Baur, Bruno; Bossdorf, Oliver; Sólymos, Péter; Correia, Maria; Worthington, Jenny; Gill, Eoin
    Description

    Definitions of independent variables used in the statistical analysis.

  4. Social Media PII Disclosure Analyses

    • kaggle.com
    zip
    Updated Jul 30, 2024
    Cite
    Eidan Rosado (2024). Social Media PII Disclosure Analyses [Dataset]. https://www.kaggle.com/datasets/edyvision/social-media-pii-disclosure-analyses
    Explore at:
    Available download formats: zip (29813203 bytes)
    Dataset updated
    Jul 30, 2024
    Authors
    Eidan Rosado
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Privacy vs. Social Capital: Social Media PII Disclosure Analyses

    This data was collected and analyzed as part of a study on PII disclosures in social media conversations with special attention to influencer characteristics in the interactions in the dissertation titled Privacy vs. Social Capital: Examining Information Disclosure Patterns within Social Media Influencer Networks and the research paper titled Unveiling Influencer-Driven Personal Data Sharing in Social Media Discourse.

    Each study phase is different, with X (Twitter) data used in the pilot analysis and Reddit data used in the main study. Both folders will have the analyzed_posts and cluster summary csv files broken down by collection (either based on trend or collection date).

    Note: Raw data is not made available in these datasets due to the nature of the study and to protect the original authors.

    Notable Data Elements

    Post Data

    Column name | Type | Description
    Node ID | UUID | Unique identifier for post (replaces original platform identifier)
    User ID | UUID | Unique identifier assigned for user (replaces original platform identifier)
    Cluster Name | Str | Composite ID for subgraph using collection name and subgraph index
    Influence Power | Float | Eigenvector centrality
    Influencer Tier | Str | Categorical label calculated by follower count
    Collection Name | Str | Trend collection assigned based on search query
    Hashtags | Set(str) | The set of hashtags included in the node
    PII Disclosed | Bool | Whether or not PII was disclosed
    PII Detected | Set(str) | The detected token types in post
    PII Risk Score | Float | The PII score for all tokens in a post
    Is Comment | Bool | Whether or not the post is a comment or reply
    Is Text Starter | Bool | Whether or not the post has text content
    Community | Str | The group, community, channel, etc. associated with
    Timestamp | Timestamp | Creation timestamp (provided by social media API)
    Time Elapsed | Int | Time elapsed (seconds) from original influencer’s post

    Cluster Data

    Column name | Type | Description
    Cluster Name | Str | Composite ID for subgraph using collection name and subgraph index
    Influencer Tiers Frequencies | List[dict] | Frequency of influencer tiers of all users in the cluster
    Top Influence Power Score | Float | Eigenvector centrality of top influencer
    Top Influencer Tier | Str | Size tier of top influencer
    Collection Name | Str | Trend collection assigned based on search query
    Hashtags | Set(str) | The set of hashtags included in the cluster
    PII Detection Frequencies | List[dict] | The detected token types in post with frequencies
    Node Count | Int | Count of all nodes in the influencer cluster
    Node Disclosures | Int | Count of all nodes with mean_risk_score > 1*
    Disclosure Ratio | Float | Sum of nodes with confirmed disclosed PII divided by overall cluster size (count of nodes in the cluster)
    Mean Risk Score | Float | The mean risk score for an entire network cluster
    Median Risk Score | Float | The median risk score for an entire network cluster
    Min Risk Score | Float | The min risk score for an entire network cluster
    Max Risk Score | Float | The max risk score for an entire network cluster
    Time Span | Float | Total Time Elapsed
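
    As a rough illustration of how the cluster-level columns relate to the post-level ones, the sketch below groups a few made-up post rows by cluster and derives the node count, disclosure ratio, and risk-score statistics. The dictionary keys, the sample rows, and the function name are hypothetical and are not taken from the dataset files. The sketch also assumes that the disclosure ratio counts posts whose PII Disclosed flag is true, which is one reading of the column description.

```python
from statistics import mean, median

# Hypothetical post rows; the keys loosely mirror a few Post Data columns.
posts = [
    {"cluster_name": "trendA_0", "pii_disclosed": True, "pii_risk_score": 2.0},
    {"cluster_name": "trendA_0", "pii_disclosed": False, "pii_risk_score": 0.0},
    {"cluster_name": "trendA_0", "pii_disclosed": True, "pii_risk_score": 1.0},
    {"cluster_name": "trendB_1", "pii_disclosed": False, "pii_risk_score": 0.5},
]


def summarize_clusters(rows):
    """Group posts by cluster and derive per-cluster summary statistics."""
    by_cluster = {}
    for row in rows:
        by_cluster.setdefault(row["cluster_name"], []).append(row)

    summary = {}
    for name, members in by_cluster.items():
        scores = [m["pii_risk_score"] for m in members]
        # Assumption: a "disclosure" is a post flagged pii_disclosed == True.
        disclosed = sum(1 for m in members if m["pii_disclosed"])
        summary[name] = {
            "node_count": len(members),
            "disclosure_ratio": disclosed / len(members),
            "mean_risk_score": mean(scores),
            "median_risk_score": median(scores),
            "min_risk_score": min(scores),
            "max_risk_score": max(scores),
        }
    return summary


summary = summarize_clusters(posts)
for cluster, stats in summary.items():
    print(cluster, stats)
```

    The published cluster files may compute these values differently (for example, Node Disclosures is described in terms of a risk-score threshold), so treat this only as a reading aid for the column definitions.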

  5. Tabular statistical summay of data analysis - Calawah River Riverscape Study...

    • catalog.data.gov
    • s.cnmilf.com
    • +1more
    Updated May 24, 2025
    Cite
    (Point of Contact, Custodian) (2025). Tabular statistical summay of data analysis - Calawah River Riverscape Study [Dataset]. https://catalog.data.gov/dataset/tabular-statistical-summay-of-data-analysis-calawah-river-riverscape-study3
    Explore at:
    Dataset updated
    May 24, 2025
    Dataset provided by
    (Point of Contact, Custodian)
    Area covered
    Calawah River
    Description

    The objective of this study was to identify the patterns of juvenile salmonid distribution and relative abundance in relation to habitat correlates. It is the first dataset of its kind because the entire river was snorkeled by one person in multiple years. During two consecutive summers, we completed a census of juvenile salmonids and stream habitat across a stream network. We used the data to test the ability of habitat models to explain the distribution of juvenile coho salmon (Oncorhynchus kisutch), young-of-the-year (age 0) steelhead (Oncorhynchus mykiss), and steelhead parr (= age 1) for a network consisting of several different sized streams. Our network-scale models, which included five stream habitat variables, explained 27%, 11%, and 19% of the variation in the density of juvenile coho salmon, age 0 steelhead, and steelhead parr, respectively. We found weak to strong levels of spatial auto-correlation in the model residuals (Moran's I values ranging from 0.25 - 0.71). Explanatory power of base habitat models increased substantially and the level of spatial auto-correlation decreased with sequential inclusion of variables accounting for stream size, year, stream, and reach location. The models for specific streams underscored the variability that was implied in the network-scale models. Associations between juvenile salmonids and individual habitat variables were rarely linear and ranged from negative to positive, and the variable accounting for location of the habitat within a stream was often more important than any individual habitat variable. The limited success in predicting the summer distribution and density of juvenile coho salmon and steelhead with our network-scale models was apparently related to variation in the strength and shape of fish-habitat associations across and within streams and years. Summary of statistical analysis of the Calawah Riverscape data. NOAA was not involved and did not pay for the collection of this data. 
    This data represents the statistical analysis carried out by Martin Liermann as a NOAA employee.
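
    The description reports Moran's I for the model residuals. For readers unfamiliar with the statistic, here is a minimal sketch of the standard global Moran's I, I = (N / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The residual values and the neighbor weights below are invented for illustration; the study's actual spatial weighting scheme is not described here, so the simple adjacency matrix is an assumption.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # W: total of all spatial weights.
    w_sum = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between every pair of locations.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    denom = sum(d * d for d in dev)
    return (n / w_sum) * cross / denom


# Hypothetical residuals at four consecutive reaches along one stream.
residuals = [1.0, 2.0, 3.0, 4.0]

# Assumed weights: adjacent reaches are neighbors (1), all other pairs are not (0).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(morans_i(residuals, adjacency))
```

    Positive values indicate that neighboring locations tend to have similar residuals, and negative values indicate that neighbors tend to differ.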

  6. Data_Sheet_1_NeuroDecodeR: a package for neural decoding in R.docx

    • frontiersin.figshare.com
    docx
    Updated Jan 3, 2024
    Cite
    Ethan M. Meyers (2024). Data_Sheet_1_NeuroDecodeR: a package for neural decoding in R.docx [Dataset]. http://doi.org/10.3389/fninf.2023.1275903.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    Jan 3, 2024
    Dataset provided by
    Frontiers Media: http://www.frontiersin.org/
    Authors
    Ethan M. Meyers
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Neural decoding is a powerful method to analyze neural activity. However, the code needed to run a decoding analysis can be complex, which can present a barrier to using the method. In this paper we introduce a package that makes it easy to perform decoding analyses in the R programming language. We describe how the package is designed in a modular fashion, which allows researchers to easily implement a range of different analyses. We also discuss how to format data to be able to use the package, and we give two examples of how to use the package to analyze real data. We believe that this package, combined with the rich data analysis ecosystem in R, will make it significantly easier for researchers to create reproducible decoding analyses, which should help increase the pace of neuroscience discoveries.

  7. The definitions of slums and favelas and its implication on population data:...

    • scielo.figshare.com
    jpeg
    Updated May 30, 2023
    Cite
    Alfredo Pereira de Queiroz Filho (2023). The definitions of slums and favelas and its implication on population data: a content analysis approach [Dataset]. http://doi.org/10.6084/m9.figshare.7506944.v1
    Explore at:
    Available download formats: jpeg
    Dataset updated
    May 30, 2023
    Dataset provided by
    SciELO: http://www.scielo.org/
    Authors
    Alfredo Pereira de Queiroz Filho
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: This article aimed to discuss the different definitions of slums and favelas and their implication on population data. The definitions discussed were extracted from research related to the United Nations Human Settlements Programme (UN-Habitat) and the Instituto Brasileiro de Geografia e Estatística (IBGE). The data manipulation was performed according to the content analysis (CA) approach. The quantification performed with Iramuteq software was based on word frequency and factorial correspondence analysis (FCA). Qualitative and quantitative analyses highlighted two major differences: in the object characterization (area, building, and both) and in the qualification type (legal aspects, construction standards, infrastructure deficiency, land property, population density, geographic references, and residents typing). With the high number of qualifications and diverse content, the population data aggregate different information, making their comparison less accurate. This imprecision tends to expand due to the area growth and the number of countries analyzed.

  8. Introduction to Time Series Analysis for Hydrologic Data

    • hydroshare.org
    • hydroshare.cuahsi.org
    zip
    Updated Jan 29, 2021
    Cite
    Gabriela Garcia; Kateri Salk (2021). Introduction to Time Series Analysis for Hydrologic Data [Dataset]. https://www.hydroshare.org/resource/ee2a4c2151f24115a12e34d4d22d96fe
    Explore at:
    Available download formats: zip (1.1 MB)
    Dataset updated
    Jan 29, 2021
    Dataset provided by
    HydroShare
    Authors
    Gabriela Garcia; Kateri Salk
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Oct 1, 1974 - Jan 27, 2021
    Description

    This lesson was adapted from educational material written by Dr. Kateri Salk for her Fall 2019 Hydrologic Data Analysis course at Duke University. This is the first part of a two-part exercise focusing on time series analysis.

    Introduction

    Time series are a special class of dataset, where a response variable is tracked over time. The frequency of measurement and the timespan of the dataset can vary widely. At its most simple, a time series model includes an explanatory time component and a response variable. Mixed models can include additional explanatory variables (check out the nlme and lme4 R packages). We will be covering a few simple applications of time series analysis in these lessons.

    Opportunities

    Analysis of time series presents several opportunities. In aquatic sciences, some of the most common questions we can answer with time series modeling are:

    • Has there been an increasing or decreasing trend in the response variable over time?
    • Can we forecast conditions in the future?

    Challenges

    Time series datasets come with several caveats, which need to be addressed in order to effectively model the system. A few common challenges that arise (and can occur together within a single dataset) are:

    • Autocorrelation: Data points are not independent from one another (i.e., the measurement at a given time point is dependent on previous time point(s)).

    • Data gaps: Data are not collected at regular intervals, necessitating interpolation between measurements. There are often gaps between monitoring periods. For many time series analyses, we need equally spaced points.

    • Seasonality: Cyclic patterns in variables occur at regular intervals, impeding clear interpretation of a monotonic (unidirectional) trend. Ex. We can assume that summer temperatures are higher.

    • Heteroscedasticity: The variance of the time series is not constant over time.

    • Covariance: The covariance of the time series is not constant over time. Many of these models assume that the variance and covariance are similar over time; when that assumption fails, the series is heteroscedastic.
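
    Two of the challenges above, autocorrelation and data gaps, can be made concrete with a short sketch. The code below is not part of the original lesson (which uses R); it is a plain Python illustration with hypothetical function names. It computes the sample autocorrelation at a given lag and fills interior gaps by linear interpolation, assuming equally spaced time steps with missing values marked as None.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    # Products of deviations for observations that are `lag` steps apart.
    num = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return num / denom


def fill_gaps(series):
    """Linearly interpolate interior None values between observed neighbors."""
    filled = list(series)
    known = [i for i, x in enumerate(filled) if x is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Hypothetical record with missing measurements.
observed = [1.0, None, 3.0, None, None, 6.0]
complete = fill_gaps(observed)
print(complete)
print(autocorrelation(complete, lag=1))
```

    A lag-1 value near 1 suggests that each measurement depends strongly on the previous one. Note that interpolation itself tends to inflate autocorrelation, so long gaps filled this way should be interpreted with care.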

    Learning Objectives

    After successfully completing this notebook, you will be able to:

    1. Choose appropriate time series analyses for trend detection and forecasting

    2. Discuss the influence of seasonality on time series analysis

    3. Interpret and communicate results of time series analyses

  9. Data from: THE ADVANCED ANALYTICS JUMPSTART: DEFINITION, PROCESS MODEL, BEST...

    • scielo.figshare.com
    jpeg
    Updated Jun 1, 2023
    Cite
    Jeremy Rose; Mikael Berndtsson; Gunnar Mathiason; Peter Larsson (2023). THE ADVANCED ANALYTICS JUMPSTART: DEFINITION, PROCESS MODEL, BEST PRACTICES [Dataset]. http://doi.org/10.6084/m9.figshare.5862411.v1
    Explore at:
    Available download formats: jpeg
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    SciELO: http://www.scielo.org/
    Authors
    Jeremy Rose; Mikael Berndtsson; Gunnar Mathiason; Peter Larsson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: Companies are encouraged by the big data trend to experiment with advanced analytics, and many turn to specialist consultancies to help them get started where they lack the necessary competences. We investigate the program of one such consultancy, Advectas, and in particular the advanced analytics Jumpstart. Using qualitative techniques including semi-structured interviews and content analysis, we investigate the nature and value of the Jumpstart concept through five cases in different companies. We provide a definition, a process model, and a set of thirteen best practices derived from these experiences, and discuss the distinctive qualities of this approach.

  10. 2022 Methodological Summary And Definitions

    • data.virginia.gov
    • gimi9.com
    • +1more
    html
    Updated Sep 6, 2025
    Cite
    Substance Abuse and Mental Health Services Administration (2025). 2022 Methodological Summary And Definitions [Dataset]. https://data.virginia.gov/dataset/2022-methodological-summary-and-definitions
    Explore at:
    Available download formats: html
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    Substance Abuse and Mental Health Services Administration: https://www.samhsa.gov/
    Description

    Use this summary report to properly interpret 2022 NSDUH estimates related to substance use, mental health, and treatment. The report accompanies the annual detailed tables and covers overall methodology, key definitions for measures and terms used in 2022 NSDUH reports and tables, and selected analyses of the measures and how they should be interpreted.

    The report is organized into five chapters:

    1. Introduction.

    2. Description of the survey, including information about the sample design, data collection procedures and questionnaire changes, and key aspects of data processing such as development of the analysis weights.

    3. Technical details on the statistical methods and measurement, such as suppression criteria for unreliable estimates, statistical testing procedures, revised estimates for 2021 to account for data collection mode, and issues around selected substance use and mental health measures.

    4. Special topics related to prescription psychotherapeutic drugs.

    5. Description of other sources of data on substance use and mental health issues in the United States, including data sources for populations outside the NSDUH target population.

    An appendix covers key definitions used in NSDUH reports and tables.

  11. 2019 Methodological Summary and Definitions

    • gimi9.com
    • data.virginia.gov
    • +1more
    Cite
    2019 Methodological Summary and Definitions [Dataset]. https://gimi9.com/dataset/data-gov_2019-methodological-summary-and-definitions/
    Explore at:
    Description

    Use this summary report to properly interpret 2019 NSDUH estimates of substance use and mental health issues. The report accompanies the annual detailed tables and covers overall methodology, key definitions for measures and terms used in 2019 NSDUH reports and tables, and selected analyses of the measures and how they should be interpreted.

    The report is organized into five chapters:

    1. Introduction.

    2. Description of the survey, including information about the sample design, data collection procedures, and key aspects of data processing such as development of the analysis weights.

    3. Technical details on the statistical methods and measurement, such as suppression criteria for unreliable estimates, statistical testing procedures, issues around data accuracy, and measurement issues for selected substance use and mental health measures.

    4. Special topics related to prescription psychotherapeutic drugs.

    5. A comparison between NSDUH and other sources of data on substance use and mental health issues, including data sources for populations outside the NSDUH target population.

    An appendix covers key definitions used in NSDUH reports and tables.

  12. Conceptualization of public data ecosystems

    • data.niaid.nih.gov
    Updated Sep 26, 2024
    Cite
    Anastasija, Nikiforova; Martin, Lnenicka (2024). Conceptualization of public data ecosystems [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13842001
    Explore at:
    Dataset updated
    Sep 26, 2024
    Dataset provided by
    University of Tartu
    University of Hradec Králové
    Authors
    Anastasija, Nikiforova; Martin, Lnenicka
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains data collected during a study "Understanding the development of public data ecosystems: from a conceptual model to a six-generation model of the evolution of public data ecosystems" conducted by Martin Lnenicka (University of Hradec Králové, Czech Republic), Anastasija Nikiforova (University of Tartu, Estonia), Mariusz Luterek (University of Warsaw, Warsaw, Poland), Petar Milic (University of Pristina - Kosovska Mitrovica, Serbia), Daniel Rudmark (Swedish National Road and Transport Research Institute, Sweden), Sebastian Neumaier (St. Pölten University of Applied Sciences, Austria), Karlo Kević (University of Zagreb, Croatia), Anneke Zuiderwijk (Delft University of Technology, Delft, the Netherlands), Manuel Pedro Rodríguez Bolívar (University of Granada, Granada, Spain).

    As there is a lack of understanding of the elements that constitute different types of value-adding public data ecosystems and how these elements form and shape the development of these ecosystems over time, which can lead to misguided efforts to develop future public data ecosystems, the aim of the study is: (1) to explore how public data ecosystems have developed over time and (2) to identify the value-adding elements and formative characteristics of public data ecosystems. Using an exploratory retrospective analysis and a deductive approach, we systematically review 148 studies published between 1994 and 2023. Based on the results, this study presents a typology of public data ecosystems and develops a conceptual model of elements and formative characteristics that contribute most to value-adding public data ecosystems, and develops a conceptual model of the evolutionary generation of public data ecosystems represented by six generations called Evolutionary Model of Public Data Ecosystems (EMPDE). Finally, three avenues for a future research agenda are proposed.

    This dataset is being made public both as supplementary data for "Understanding the development of public data ecosystems: from a conceptual model to a six-generation model of the evolution of public data ecosystems", Telematics and Informatics, and for its Systematic Literature Review component that informs the study.

    Description of the data in this data set

    PublicDataEcosystem_SLR provides the structure of the protocol

    Spreadsheet #1 provides the list of results after the search over three indexing databases and filtering out irrelevant studies.

    Spreadsheet #2 provides the protocol structure.

    Spreadsheet #3 provides the filled protocol for relevant studies.

    The information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design-related information, (3) quality-related information, (4) HVD determination-related information

    Descriptive Information

    Article number

    A study number, corresponding to the study number assigned in an Excel worksheet

    Complete reference

    The complete source information to refer to the study (in APA style), including the author(s) of the study, the year in which it was published, the study's title and other source information.

    Year of publication

    The year in which the study was published.

    Journal article / conference paper / book chapter

    The type of the paper, i.e., journal article, conference paper, or book chapter.

    Journal / conference / book

    The journal, conference, or book in which the paper is published.

    DOI / Website

    A link to the website where the study can be found.

    Number of words

    The number of words in the study.

    Number of citations in Scopus and WoS

    The number of citations of the paper in Scopus and WoS digital libraries.

    Availability in Open Access

    Availability of a study in the Open Access or Free / Full Access.

    Keywords

    Keywords of the paper as indicated by the authors (in the paper).

    Relevance for our study (high / medium / low)

    What is the relevance level of the paper for our study?

    Approach- and research design-related information

    Objective / Aim / Goal / Purpose & Research Questions

    The research objective and established RQs.

    Research method (including unit of analysis)

    The methods used to collect data in the study, including the unit of analysis that refers to the country, organisation, or other specific unit that has been analysed such as the number of use-cases or policy documents, number and scope of the SLR etc.

    Study’s contributions

    The study’s contribution as defined by the authors

    Qualitative / quantitative / mixed method

    Whether the study uses a qualitative, quantitative, or mixed methods approach?

    Availability of the underlying research data

    Whether the paper has a reference to the public availability of the underlying research data e.g., transcriptions of interviews, collected data etc., or explains why these data are not openly shared?

    Period under investigation

    Period (or moment) in which the study was conducted (e.g., January 2021-March 2022)

    Use of theory / theoretical concepts / approaches? If yes, specify them

    Does the study mention any theory / theoretical concepts / approaches? If yes, what theory / concepts / approaches? If any theory is mentioned, how is theory used in the study? (e.g., mentioned to explain a certain phenomenon, used as a framework for analysis, tested theory, theory mentioned in the future research section).

    Quality-related information

    Quality concerns

    Whether there are any quality concerns (e.g., limited information about the research methods used)?

    Public Data Ecosystem-related information

    Public data ecosystem definition

    How is the public data ecosystem defined in the paper, and is any equivalent term used (most commonly "infrastructure")? If an alternative term is used, what is the public data ecosystem called in the paper?

    Public data ecosystem evolution / development

    Does the paper define the evolution of the public data ecosystem? If yes, how is it defined and what factors affect it?

    What constitutes a public data ecosystem?

    What constitutes a public data ecosystem (components & relationships) - their "FORM / OUTPUT" presented in the paper (general description with more detailed answers to further additional questions).

    Components and relationships

    What components does the public data ecosystem consist of and what are the relationships between these components? Alternative names for components - element, construct, concept, item, helix, dimension etc. (detailed description).

    Stakeholders

    What stakeholders (e.g., governments, citizens, businesses, Non-Governmental Organisations (NGOs) etc.) does the public data ecosystem involve?

    Actors and their roles

    What actors does the public data ecosystem involve? What are their roles?

    Data (data types, data dynamism, data categories etc.)

    What data does the public data ecosystem cover (or is intended / designed for)? Refer to all data-related aspects, including but not limited to data types, data dynamism (static data, dynamic, real-time data, stream), prevailing data categories / domains / topics etc.

    Processes / activities / dimensions, data lifecycle phases

    What processes, activities, dimensions and data lifecycle phases (e.g., locate, acquire, download, reuse, transform, etc.) does the public data ecosystem involve or refer to?

    Level (if relevant)

    What is the level of the public data ecosystem covered in the paper? (e.g., city, municipal, regional, national (=country), supranational, international).

    Other elements or relationships (if any)

    What other elements or relationships does the public data ecosystem consist of?

    Additional comments

    Additional comments (e.g., what other topics affected the public data ecosystems and their elements, what is expected to affect the public data ecosystems in the future, what were important topics by which the period was characterised etc.).

    New papers

    Does the study refer to any other potentially relevant papers?

    Additional references to potentially relevant papers that were found in the analysed paper (snowballing).

    Format of the file: .xls, .csv (for the first spreadsheet only), .docx

    Licenses or restrictions: CC-BY

    For more info, see README.txt

  13. Dictionary of Algorithms and Data Structures (DADS)

    • gimi9.com
    • data.nist.gov
    • +3more
    Cite
    Dictionary of Algorithms and Data Structures (DADS) [Dataset]. https://gimi9.com/dataset/data-gov_dictionary-of-algorithms-and-data-structures-dads/
    Description

    The Dictionary of Algorithms and Data Structures (DADS) is an online, publicly accessible dictionary of generally useful algorithms, data structures, algorithmic techniques, archetypal problems, and related definitions. In addition to brief definitions, some entries have links to related entries, links to implementations, and additional information. DADS is meant to be a resource for the practicing programmer, although students and researchers may find it a useful starting point. DADS has fundamental entries in areas such as theory, cryptography and compression, graphs, trees, and searching, for instance, Ackermann's function, quick sort, traveling salesman, big O notation, merge sort, AVL tree, hash table, and Byzantine generals. DADS also has index pages that list entries by area and by type. Currently DADS does not include algorithms particular to business data processing, communications, operating systems or distributed algorithms, programming languages, AI, graphics, or numerical analysis.

  14. f

    Statistical analyses.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 8, 2024
    Menze, Bjoern H.; Schmitz, Désirée A.; Li, Hongwei Bran; Kümmerli, Rolf; Wechsler, Tobias (2024). Statistical analyses. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001429748
    Dataset updated
    Aug 8, 2024
    Authors
    Menze, Bjoern H.; Schmitz, Désirée A.; Li, Hongwei Bran; Kümmerli, Rolf; Wechsler, Tobias
    Description

    The zebrafish Danio rerio has become a popular model host to explore disease pathology caused by infectious agents. A main advantage is its transparency at an early age, which enables live imaging of infection dynamics. While multispecies infections are common in patients, the zebrafish model is rarely used to study them, although the model would be ideal for investigating pathogen-pathogen and pathogen-host interactions. This may be due to the absence of an established multispecies infection protocol for a defined organ and the lack of suitable image analysis pipelines for automated image processing. To address these issues, we developed a protocol for establishing and tracking single and multispecies bacterial infections in the inner ear structure (otic vesicle) of the zebrafish by imaging. Subsequently, we generated an image analysis pipeline that involved deep learning for the automated segmentation of the otic vesicle, and scripts for quantifying pathogen frequencies through fluorescence intensity measures. We used Pseudomonas aeruginosa, Acinetobacter baumannii, and Klebsiella pneumoniae, three of the difficult-to-treat ESKAPE pathogens, to show that our infection protocol and image analysis pipeline work both for single pathogens and pairwise pathogen combinations. Thus, our protocols provide a comprehensive toolbox for studying single and multispecies infections in real-time in zebrafish.

  15. Lifestyle_and_Health_Risk_Prediction_Dataset

    • kaggle.com
    zip
    Updated Oct 23, 2025
    Zahra Nusrat (2025). Lifestyle_and_Health_Risk_Prediction_Dataset [Dataset]. https://www.kaggle.com/datasets/zahranusrat/lifestyle-and-health-risk-prediction-dataset
    Available download formats: zip (61147 bytes)
    Dataset updated
    Oct 23, 2025
    Authors
    Zahra Nusrat
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    🧩 About Dataset

    This dataset provides a detailed collection of information related to [your topic], offering valuable insights for data analysis, visualization, and model development. It consists of multiple features such as [list of important columns], which capture various dimensions of the subject in a structured and measurable way.

    The purpose of this dataset is to support exploratory data analysis (EDA) and predictive modeling by allowing users to identify trends, patterns, and relationships among variables. It can serve as a foundation for building machine learning models, performing statistical studies, or generating data-driven visual reports.

    Researchers, data enthusiasts, and students can use this dataset to enhance their analytical understanding, practice preprocessing techniques, and improve their ability to draw meaningful conclusions from real-world data.

    Additionally, this dataset can be explored to uncover correlations, test hypotheses, and visualize behavioral or performance patterns. Its clean structure and well-defined variables make it suitable for both beginners learning EDA and experienced professionals developing predictive insights.

  16. Supporting Data for Method Assessment for Non-Targeted Analyses (MANTA)...

    • data.nist.gov
    • datasets.ai
    • +2more
    Updated May 24, 2021
    Benjamin Place (2021). Supporting Data for Method Assessment for Non-Targeted Analyses (MANTA) Program: Interlaboratory Study 1 Results [Dataset]. http://doi.org/10.18434/mds2-2412
    Dataset updated
    May 24, 2021
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Authors
    Benjamin Place
    License

    https://www.nist.gov/open/license

    Description

    Supporting data for the results of Interlaboratory 1 of the Method Assessment for Non-Targeted Analyses. The datasets include the chemical compound descriptions, laboratory mean responses, and the tools for the principal components analysis of the datasets. In addition, a Microsoft Excel file, which was given to all participants, allowed for the analysis of the metadata.

  17. Exploratory Data Analysis (EDA) for COVIND-19

    • kaggle.com
    zip
    Updated Apr 8, 2024
    Badea-Matei Iuliana (2024). Exploratory Data Analysis (EDA) for COVIND-19 [Dataset]. https://www.kaggle.com/datasets/mateiiuliana/exploratory-data-analysis-eda-for-covind-19
    Available download formats: zip (26972 bytes)
    Dataset updated
    Apr 8, 2024
    Authors
    Badea-Matei Iuliana
    Description

    Description: The COVID-19 dataset used for this EDA project encompasses comprehensive data on COVID-19 cases, deaths, and recoveries worldwide. It includes information gathered from authoritative sources such as the World Health Organization (WHO), the Centers for Disease Control and Prevention (CDC), and national health agencies. The dataset covers global, regional, and national levels, providing a holistic view of the pandemic's impact.

    Purpose: This dataset is instrumental in understanding the multifaceted impact of the COVID-19 pandemic through data exploration. It aligns with the objectives of the EDA project, which aims to unveil insights, patterns, and trends related to COVID-19. The key objectives are:

    1. Data Collection and Cleaning:
    • Gather reliable COVID-19 datasets from authoritative sources (such as WHO, CDC, or national health agencies).
    • Clean and preprocess the data to ensure accuracy and consistency.
    2. Descriptive Statistics:
    • Summarize key statistics: total cases, recoveries, deaths, and testing rates.
    • Visualize temporal trends using line charts, bar plots, and heat maps.
    3. Geospatial Analysis:
    • Map COVID-19 cases across countries, regions, or cities.
    • Identify hotspots and variations in infection rates.
    4. Demographic Insights:
    • Explore how age, gender, and pre-existing conditions impact vulnerability.
    • Investigate disparities in infection rates among different populations.
    5. Healthcare System Impact:
    • Analyze hospitalization rates, ICU occupancy, and healthcare resource allocation.
    • Assess the strain on medical facilities.
    6. Economic and Social Effects:
    • Investigate the relationship between lockdown measures, economic indicators, and infection rates.
    • Explore behavioral changes (e.g., mobility patterns, remote work) during the pandemic.
    7. Predictive Modeling (Optional):
    • If data permits, build simple predictive models (e.g., time series forecasting) to estimate future cases.
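
    As a rough illustration of the descriptive-statistics objective, the sketch below totals cases, deaths, and recoveries from a small CSV and derives a case fatality ratio. The column names (country, cases, deaths, recoveries) and the two sample rows are made up for the example and are not taken from the dataset; the real files may be organised differently.

```python
import csv
import io

# Made-up rows with hypothetical column names, for illustration only.
SAMPLE = """country,cases,deaths,recoveries
A,100,2,90
B,50,1,40
"""


def summarize(csv_text):
    """Return total cases, deaths, recoveries and the case fatality ratio."""
    totals = {"cases": 0, "deaths": 0, "recoveries": 0}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for key in totals:
            totals[key] += int(row[key])
    # Share of reported cases that ended in death; guard against empty input.
    ratio = totals["deaths"] / totals["cases"] if totals["cases"] else 0.0
    totals["case_fatality_ratio"] = ratio
    return totals


summary = summarize(SAMPLE)
print(summary)
```

    The same loop could be pointed at a real file by passing its contents as text, once the actual column names are known.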

    Data Sources: The primary sources of the COVID-19 dataset include the Johns Hopkins CSSE COVID-19 Data Repository, Google Health’s COVID-19 Open Data, and the U.S. Economic Development Administration (EDA). These sources provide reliable and up-to-date information on COVID-19 cases, deaths, testing rates, and other relevant variables. Additionally, GitHub repositories and platforms like Medium host supplementary datasets and analyses, enriching the available data resources.

    Data Format: The dataset is available in various formats, such as CSV and JSON, facilitating easy access and analysis. Before conducting the EDA, the data underwent preprocessing steps to ensure accuracy and consistency. Data cleaning procedures were performed to address missing values, inconsistencies, and outliers, enhancing the quality and reliability of the dataset.

    License: The COVID-19 dataset may be subject to specific usage licenses or restrictions imposed by the original data sources. Proper attribution is essential to acknowledge the contributions of the WHO, CDC, national health agencies, and other entities providing the data. Users should adhere to any licensing terms and usage guidelines associated with the dataset.

    Attribution: We acknowledge the invaluable contributions of the World Health Organization (WHO), the Centers for Disease Control and Prevention (CDC), national health agencies, and other authoritative sources in compiling and disseminating the COVID-19 data used for this EDA project. Their efforts in collecting, curating, and sharing data have been instrumental in advancing our understanding of the pandemic and guiding public health responses globally.

  18. The Canada Trademarks Dataset

    • zenodo.org
    pdf, zip
    Updated Jul 19, 2024
    Jeremy Sheff; Jeremy Sheff (2024). The Canada Trademarks Dataset [Dataset]. http://doi.org/10.5281/zenodo.4999655
    Available download formats: zip, pdf
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jeremy Sheff; Jeremy Sheff
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Canada Trademarks Dataset

    18 Journal of Empirical Legal Studies 908 (2021), prepublication draft available at https://papers.ssrn.com/abstract=3782655, published version available at https://onlinelibrary.wiley.com/share/author/CHG3HC6GTFMMRU8UJFRR?target=10.1111/jels.12303

    Dataset Selection and Arrangement (c) 2021 Jeremy Sheff

    Python and Stata Scripts (c) 2021 Jeremy Sheff

    Contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office.

    This individual-application-level dataset includes records of all applications for registered trademarks in Canada since approximately 1980, and of many preserved applications and registrations dating back to the beginning of Canada’s trademark registry in 1865, totaling over 1.6 million application records. It includes comprehensive bibliographic and lifecycle data; trademark characteristics; goods and services claims; identification of applicants, attorneys, and other interested parties (including address data); detailed prosecution history event data; and data on application, registration, and use claims in countries other than Canada. The dataset has been constructed from public records made available by the Canadian Intellectual Property Office. Both the dataset and the code used to build and analyze it are presented for public use on open-access terms.

    Scripts are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/. Data files are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/, and also subject to additional conditions imposed by the Canadian Intellectual Property Office (CIPO) as described below.

    Terms of Use:

    As per the terms of use of CIPO's government data, all users are required to include the above-quoted attribution to CIPO in any reproductions of this dataset. They are further required to cease using any record within the datasets that has been modified by CIPO and for which CIPO has issued a notice on its website in accordance with its Terms and Conditions, and to use the datasets in compliance with applicable laws. These requirements are in addition to the terms of the CC-BY-4.0 license, which require attribution to the author (among other terms). For further information on CIPO’s terms and conditions, see https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html. For further information on the CC-BY-4.0 license, see https://creativecommons.org/licenses/by/4.0/.

    The following attribution statement, if included by users of this dataset, is satisfactory to the author, but the author makes no representations as to whether it may be satisfactory to CIPO:

    The Canada Trademarks Dataset is (c) 2021 by Jeremy Sheff and licensed under a CC-BY-4.0 license, subject to additional terms imposed by the Canadian Intellectual Property Office. It contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office. For further information, see https://creativecommons.org/licenses/by/4.0/ and https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html.

    Details of Repository Contents:

    This repository includes a number of .zip archives which expand into folders containing either scripts for construction and analysis of the dataset or data files comprising the dataset itself. These folders are as follows:

    • /csv: contains the .csv versions of the data files
    • /do: contains Stata do-files used to convert the .csv files to .dta format and perform the statistical analyses set forth in the paper reporting this dataset
    • /dta: contains the .dta versions of the data files
    • /py: contains the python scripts used to download CIPO’s historical trademarks data via SFTP and generate the .csv data files

    If users wish to construct rather than download the datafiles, the first script that they should run is /py/sftp_secure.py. This script will prompt the user to enter their IP Horizons SFTP credentials; these can be obtained by registering with CIPO at https://ised-isde.survey-sondage.ca/f/s.aspx?s=59f3b3a4-2fb5-49a4-b064-645a5e3a752d&lang=EN&ds=SFTP. The script will also prompt the user to identify a target directory for the data downloads. Because the data archives are quite large, users are advised to create a target directory in advance and ensure they have at least 70GB of available storage on the media in which the directory is located.

    The sftp_secure.py script will generate a new subfolder in the user’s target directory called /XML_raw. Users should note the full path of this directory, which they will be prompted to provide when running the remaining python scripts. Each of the remaining scripts, the filenames of which begin with “iterparse”, corresponds to one of the data files in the dataset, as indicated in the script’s filename. After running one of these scripts, the user’s target directory should include a /csv subdirectory containing the data file corresponding to the script; after running all the iterparse scripts the user’s /csv directory should be identical to the /csv directory in this repository. Users are invited to modify these scripts as they see fit, subject to the terms of the licenses set forth above.
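
    The script names suggest a streaming XML parse followed by a CSV write. A minimal sketch of that general pattern, using only Python's standard library, might look like the following. It is not the repository's code: the function name xml_to_csv, the record tag "Application", and the fields "Number" and "Status" are hypothetical, and CIPO's actual XML schema is described in its data dictionary.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_csv(xml_source, csv_target, record_tag, fields):
    """Stream records out of an XML source and write one CSV row per record."""
    writer = csv.DictWriter(csv_target, fieldnames=fields)
    writer.writeheader()
    count = 0
    # iterparse yields each element once its closing tag has been read,
    # so the document is processed incrementally rather than loaded up front.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == record_tag:
            # Missing child elements become empty cells.
            writer.writerow({f: elem.findtext(f, default="") for f in fields})
            count += 1
            elem.clear()  # drop the record's children to limit memory use
    return count


# Hypothetical record layout, for illustration only.
sample = (
    b"<Records>"
    b"<Application><Number>1</Number><Status>Registered</Status></Application>"
    b"<Application><Number>2</Number></Application>"
    b"</Records>"
)
out = io.StringIO()
n = xml_to_csv(io.BytesIO(sample), out, "Application", ["Number", "Status"])
print(n)
print(out.getvalue())
```

    Clearing each record after it is written is what keeps memory use modest on very large archives, which matters given the storage sizes mentioned above.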

    With respect to the Stata do-files, only one of them is relevant to construction of the dataset itself. This is /do/CA_TM_csv_cleanup.do, which converts the .csv versions of the data files to .dta format, and uses Stata’s labeling functionality to reduce the size of the resulting files while preserving information. The other do-files generate the analyses and graphics presented in the paper describing the dataset (Jeremy N. Sheff, The Canada Trademarks Dataset, 18 J. Empirical Leg. Studies (forthcoming 2021), available at https://papers.ssrn.com/abstract=3782655). These do-files are also licensed for reuse subject to the terms of the CC-BY-4.0 license, and users are invited to adapt the scripts to their needs.

    The python and Stata scripts included in this repository are separately maintained and updated on Github at https://github.com/jnsheff/CanadaTM.

    This repository also includes a copy of the current version of CIPO's data dictionary for its historical XML trademarks archive as of the date of construction of this dataset.

  19. m

    Replication Data for: Upcoming issues, new methods: using Interactive...

    • data.mendeley.com
    Updated Oct 18, 2021
    Gustavo Behling (2021). Replication Data for: Upcoming issues, new methods: using Interactive Qualitative Analysis (IQA) in Management Research [Dataset]. http://doi.org/10.17632/kb76h5jtvw.1
    Dataset updated
    Oct 18, 2021
    Authors
    Gustavo Behling
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These data refer to the paper “Upcoming issues, new methods: using Interactive Qualitative Analysis (IQA) in Management Research”. This article is a guide to the application of the IQA method in management research, and the available files are:

    1. 1-Affinities, definitions, and cards produced by focus group.docx: all cards, affinities and definitions created by the focus group session
    2. 2-Step-by-step - Analysis procedures.docx: detailed data analysis procedures
    3. 3-Axial Coding Tables – Individual Interviews.docx: detailed axial coding procedures
    4. 4-Theoretical Coding Table – Individual Interviews.docx: detailed theoretical coding procedures

  20. OYO hotel dataset

    • kaggle.com
    zip
    Updated Feb 4, 2025
    JIS College of Engineering (2025). OYO hotel dataset [Dataset]. https://www.kaggle.com/datasets/jiscecseaiml/oyo-hotel-dataset
    Available download formats: zip (75756 bytes)
    Dataset updated
    Feb 4, 2025
    Authors
    JIS College of Engineering
    License

    Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Overview

    The OYO Hotel Rooms Dataset provides comprehensive data on hotel room listings from OYO, covering various attributes related to pricing, amenities, and customer ratings. This dataset is valuable for researchers, data scientists, and machine learning practitioners interested in hospitality analytics, price prediction, customer satisfaction analysis, and clustering-based insights.

    Data Source

    The dataset has been collected from publicly available OYO hotel listings and includes structured information for analysis.

    Features

    The dataset consists of multiple attributes that define each hotel room, including:

    • Hotel Name: The name of the hotel property.
    • City: The location where the hotel is situated.
    • Room Type: Category of the room (e.g., Standard, Deluxe, Suite).
    • Price (INR): The cost per night in Indian Rupees.
    • Discounted Price: The price after applying discounts.
    • Rating: The customer rating for the hotel (out of 5).
    • Reviews: The number of customer reviews.
    • Amenities: A list of available facilities such as WiFi, AC, Breakfast, Parking, etc.
    • Latitude & Longitude: Geolocation details for mapping and spatial analysis.

    Potential Use Cases

    • Price Prediction: Using regression models to predict hotel room pricing.
    • Customer Sentiment Analysis: Analyzing ratings and reviews to understand customer satisfaction.
    • Market Segmentation: Clustering hotels based on price, rating, and location.
    • Recommendation Systems: Building personalized hotel recommendations.

    File Format

    OYO_HOTEL_ROOMS.xlsx (Excel format) – Contains structured tabular data.

    Acknowledgment

    This dataset is intended for academic and research purposes. The data is sourced from publicly available hotel listings and does not contain any personally identifiable information.
