100+ datasets found
  1. Data from: Variable definition.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Mar 17, 2023
    Cite
    Min, Liangyu; Huang, Xiaohong; Zhang, Xiaorong; Zhang, Jun; Zeng, Qianqian; Liu, Jiangwei (2023). Variable definition. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000998892
    Dataset updated
    Mar 17, 2023
    Authors
    Min, Liangyu; Huang, Xiaohong; Zhang, Xiaorong; Zhang, Jun; Zeng, Qianqian; Liu, Jiangwei
    Description

    The impact of a chief executive officer’s (CEO’s) functional experience on firm performance has gained the attention of many scholars. However, the measurement of functional experience is rarely disclosed in the public database. Few studies have been conducted on the comprehensive functional experience of CEOs. This paper used the upper echelons theory and obtained deep-level curricula vitae (CVs) data through the named entity recognition technique. First, we mined 15 consecutive years of CEOs’ CVs from 2006 to 2020 from Chinese listed companies. Second, we extracted information throughout their careers and automatically classified their functional hierarchy. Finally, we constructed breadth (functional breadth: functional experience richness) and depth (functional depth: average tenure and the hierarchy of function) for empirical analysis. We found that a CEO’s breadth is significantly negatively related to firm performance, and the quadratic term is significantly positive. A CEO’s depth is significantly positively related to firm performance, and the quadratic term is significantly negative. The research results indicate a u-shaped relationship between a CEO’s breadth and firm performance and an inverted u-shaped relationship between their depth and firm performance. The study’s findings extend the literature on factors influencing firm performance and CEOs’ functional experience. The study expands from the horizontal macro to the vertical micro level, providing new evidence to support the recruitment and selection of high-level corporate talent.
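
    The sign pattern reported above (a negative linear term with a positive quadratic term, or the reverse) determines the shape of the curve, and the turning point of y = b0 + b1*x + b2*x^2 sits at x* = -b1 / (2*b2). The following is a minimal sketch of that arithmetic. The coefficient values are hypothetical and are not estimates from the paper.

```python
def turning_point(b1: float, b2: float) -> float:
    """Return x* where y = b0 + b1*x + b2*x**2 has zero slope."""
    if b2 == 0:
        raise ValueError("no turning point: the quadratic term is zero")
    return -b1 / (2.0 * b2)


def curve_shape(b2: float) -> str:
    """Classify the curve by the sign of the quadratic coefficient."""
    if b2 > 0:
        return "U-shaped"
    if b2 < 0:
        return "inverted U-shaped"
    return "linear"


# Hypothetical coefficients chosen only to match the reported signs.
breadth = {"b1": -0.4, "b2": 0.1}   # negative linear, positive quadratic
depth = {"b1": 0.6, "b2": -0.15}    # positive linear, negative quadratic

for name, coef in (("breadth", breadth), ("depth", depth)):
    print(name, curve_shape(coef["b2"]), turning_point(coef["b1"], coef["b2"]))
```

    Whether the turning point falls inside the observed range of the variable is a separate check that this sketch does not perform.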

  2. Data from: Variable definition.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Oct 24, 2024
    Cite
    Ma, Jiao (2024). Variable definition. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001361512
    Dataset updated
    Oct 24, 2024
    Authors
    Ma, Jiao
    Description

    Most previous studies of environmental innovation focus on the impact of environmental innovation on carbon emissions; few examine the internal causes and mechanisms of influence of low-carbon innovation. This study focuses on the effect of carbon emissions on low-carbon innovation in firms. Using a panel data set of Chinese A-share firms, this study finds that an increase in carbon emissions promotes low-carbon innovation. This promoting effect comes from high carbon emissions increasing the pressure on firms to reduce carbon emissions and prompting firms to increase R&D investment, and the effect is more pronounced in firms with lower equity concentration and in high-tech firms. It is also found that indirect carbon emissions do not promote low-carbon innovation, while other types of carbon emissions do. This study expands the research on the internal causes of low-carbon innovation in firms, examines the logic influencing low-carbon innovation in firms from the perspective of emission reduction motives and methods, reveals that global warming contains opportunities for the development of low-carbon innovation in firms, and provides a reference for optimizing the carbon emissions calculation system.

  3. Standard terms and definitions applicable to the quality assurance of...

    • data.aeronomie.be
    pdf
    Updated Jan 30, 2025
    Cite
    Royal Belgian Institute for Space Aeronomy (2025). Standard terms and definitions applicable to the quality assurance of Essential Climate Variable data records [Dataset]. https://data.aeronomie.be/dataset/standard-terms-and-definitions-applicable-to-the-quality-assurance-of-essential-climate-variable-da
    Available download formats: pdf, pdf(863196)
    Dataset updated
    Jan 30, 2025
    Dataset authored and provided by
    Royal Belgian Institute for Space Aeronomy
    License

    http://publications.europa.eu/resource/authority/licence/CC_BY_4_0

    Description

    This document contains a selection of standard terms and definitions relevant to the quality assurance of Essential Climate Variable (ECV) data records. It reproduces appropriate terms and definitions published by normalization bodies, mainly by BIPM/JCGM/ISO in their International Vocabulary of Metrology (VIM) and Guide to the Expression of Uncertainties (GUM). It also reproduces selected terms and definitions related to the quality assurance and validation of Earth Observation (EO) data, available publicly on the ISO website and on the Cal/Val portal of the Committee on Earth Observation Satellites (CEOS).

    Several of those terms have been recommended by CEOS in the GEO-CEOS Quality Assurance framework for Earth Observation (QA4EO) and, as such, are applicable to virtually all Copernicus data sets of EO origin. Terms and definitions are expected to evolve as normalization organisations regularly update their standards.

  4. Variable data dictionary.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jan 16, 2024
    Cite
    Kumar, Ashwani; Balakrishnan, Vijayakumar; Guérin, Philippe J.; Walker, Martin; Halder, Julia B.; Raja, Jeyapal Dinesh; Uddin, Azhar; Brack, Matthew; Srividya, Adinarayanan; Freitas, Luzia T.; Rahi, Manju; Singh-Phulgenda, Sauman; Basáñez, Maria-Gloria; Khan, Mashroor Ahmad; Harriss, Eli (2024). Variable data dictionary. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001377771
    Dataset updated
    Jan 16, 2024
    Authors
    Kumar, Ashwani; Balakrishnan, Vijayakumar; Guérin, Philippe J.; Walker, Martin; Halder, Julia B.; Raja, Jeyapal Dinesh; Uddin, Azhar; Brack, Matthew; Srividya, Adinarayanan; Freitas, Luzia T.; Rahi, Manju; Singh-Phulgenda, Sauman; Basáñez, Maria-Gloria; Khan, Mashroor Ahmad; Harriss, Eli
    Description

    Background:

    Lymphatic filariasis (LF) is a neglected tropical disease (NTD) targeted by the World Health Organization for elimination as a public health problem (EPHP). Since 2000, more than 9 billion treatments of antifilarial medicines have been distributed through mass drug administration (MDA) programmes in 72 endemic countries and 17 countries have reached EPHP. Yet in 2021, nearly 900 million people still required MDA with combinations of albendazole, diethylcarbamazine and/or ivermectin. Despite the reliance on these drugs, there remain gaps in understanding of variation in responses to treatment. As demonstrated for other infectious diseases, some urgent questions could be addressed by conducting individual participant data (IPD) meta-analyses. Here, we present the results of a systematic literature review to estimate the abundance of IPD on pre- and post-intervention indicators of infection and/or morbidity and assess the feasibility of building a global data repository.

    Methodology:

    We searched literature published between 1st January 2000 and 5th May 2023 in 15 databases to identify prospective studies assessing LF treatment and/or morbidity management and disease prevention (MMDP) approaches. We considered only studies where individual participants were diagnosed with LF infection or disease and were followed up on at least one occasion after receiving an intervention/treatment.

    Principal findings:

    We identified 138 eligible studies from 23 countries, having followed up an estimated 29,842 participants after intervention. We estimate 14,800 (49.6%) IPD on pre- and post-intervention infection indicators including microfilaraemia, circulating filarial antigen and/or ultrasound indicators measured before and after intervention using 8 drugs administered in various combinations. We identified 33 studies on MMDP, estimating 6,102 (20.4%) IPD on pre- and post-intervention clinical morbidity indicators only. A further 8,940 IPD cover a mixture of infection and morbidity outcomes measured with other diagnostics, from participants followed for adverse event outcomes only or recruited after initial intervention.

    Conclusions:

    The LF treatment study landscape is heterogeneous, but the abundance of studies and related IPD suggest that establishing a global data repository to facilitate IPD meta-analyses would be feasible and useful to address unresolved questions on variation in treatment outcomes across geographies, demographics and in underrepresented groups. New studies using more standardized approaches should be initiated to address the scarcity and inconsistency of data on morbidity management.

  5. nzqa_exam_questions_contextual_population_parameter_definitions - updated

    • auckland.figshare.com
    csv
    Updated Nov 11, 2024
    Cite
    Anna Fergusson; Haozhong Wei (2024). nzqa_exam_questions_contextual_population_parameter_definitions - updated [Dataset]. http://doi.org/10.17608/k6.auckland.27644403.v1
    Available download formats: csv
    Dataset updated
    Nov 11, 2024
    Dataset provided by
    The University of Auckland
    Authors
    Anna Fergusson; Haozhong Wei
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data set represents contextualised population parameter definitions extracted and developed from past NZQA Level 3 Statistics exam questions and assessment schedules, namely those used for the achievement standards AS90642 and AS91584.

    The data set was developed by Haozhong Wei as part of his MSc dissertation project, under the supervision of Dr Anna Fergusson and Dr Anne Patel (University of Auckland | Waipapa Taumata Rau).

    An overview of the variables used in the dataset:

    1. Year: The year of the exam.
    2. Paper: The identifier of the paper, e.g., AS90642, indicating the specific exam to which the question belongs.
    3. Type: The type of entry, usually identifying whether it is a question or an answer.
    4. Question part: The specific part number of the question, e.g., 1a, 1b, 2, etc.
    5. Text: The full text of the question.
    6. Population parameter: A description of the parameter of the entire text.
    7. Parameter type: Further detail on the type of parameter, such as ‘single mean’, ‘single proportion’, or ‘difference between two means’.

  6. Library Services Contributing to Institutional Success: Data Dictionary

    • dataverse.harvard.edu
    Updated Oct 24, 2024
    Cite
    Elizabeth Szkirpan (2024). Library Services Contributing to Institutional Success: Data Dictionary [Dataset]. http://doi.org/10.7910/DVN/QCM6NA
    Available formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 24, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Elizabeth Szkirpan
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Data dictionary for the Library Services Contributing to Institutional Success Version 5 dataset. For each variable, this data dictionary gives its definition, its purpose in the dataset, its source, and its collection date.

  7. Variable definitions for Webb & Mindel Marine Extinctions data table

    • orda.shef.ac.uk
    txt
    Updated May 30, 2023
    Cite
    Tom Webb (2023). Variable definitions for Webb & Mindel Marine Extinctions data table [Dataset]. http://doi.org/10.6084/m9.figshare.1258983.v1
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    The University of Sheffield
    Authors
    Tom Webb
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Definitions of all variables in the full data table for Webb & Mindel, Global Patterns of Extinction Risk in Marine and Non-marine Systems, Current Biology. Links to full data table and R code to generate figures and analyses.

  8. Definitions of explanatory variables.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 4, 2023
    Cite
    Megan A. Carter; Lise Dubois; Mark S. Tremblay; Monica Taljaard; Bobby L. Jones (2023). Definitions of explanatory variables. [Dataset]. http://doi.org/10.1371/journal.pone.0047065.t001
    Available download formats: xls
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Megan A. Carter; Lise Dubois; Mark S. Tremblay; Monica Taljaard; Bobby L. Jones
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    (a) Time-dependent indicates that these variables were available at 4, 6, 7, 8, and 10 years of age and thus were treated as time-dependent explanatory variables in the analysis. If a variable was only measured once and occurred at or before baseline (4 y) it was treated as a ‘risk factor’ (time-stable).
    (b) Missing at 4 y of age for all children; value at age 3.5 y was carried forward to age 4.
    (c) For more information on how this variable was calculated and interpreted, please see reference 32.
    (d) Measured every other data collection cycle for all children (value at age 6 was carried forward for age 7).
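
    The carry-forward rule in the footnotes above (a missing value takes the most recent earlier measurement) can be sketched as follows. This is an illustration only: the function name and the sample values are hypothetical and do not come from the study.

```python
from typing import List, Optional, Sequence


def carry_forward(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing value (None) with the last observed value before it.

    Leading gaps stay None because there is nothing earlier to carry forward.
    """
    filled: List[Optional[float]] = []
    last_seen: Optional[float] = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical measurements at the ages listed in the footnotes.
ages = [4, 6, 7, 8, 10]
measured = [12.0, 15.0, None, 18.0, 20.0]  # not measured at age 7

for age, value in zip(ages, carry_forward(measured)):
    print(age, value)
```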

  9. Definition of independent variables and dependent variable.

    • plos.figshare.com
    xls
    Updated Jun 9, 2023
    Cite
    Xuan Wang; Yuqing Tang; Xiaopeng Zhang; Xi Yin; Xin Du; Xinping Zhang (2023). Definition of independent variables and dependent variable. [Dataset]. http://doi.org/10.1371/journal.pone.0109594.t002
    Available download formats: xls
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Xuan Wang; Yuqing Tang; Xiaopeng Zhang; Xi Yin; Xin Du; Xinping Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    * For example, if the patient is six months old, the age should be converted to 0.5.

    Definition of independent variables and dependent variable.
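
    The age conversion in the footnote above amounts to expressing age in years as a decimal. A minimal sketch, assuming ages are recorded as whole years plus months; the function name is hypothetical.

```python
def age_in_years(years: int = 0, months: int = 0) -> float:
    """Convert an age given in years and months to decimal years."""
    if years < 0 or months < 0:
        raise ValueError("age components must be non-negative")
    return years + months / 12.0


print(age_in_years(months=6))           # six months old
print(age_in_years(years=2, months=3))  # two years and three months old
```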

  10. Synthesis methods Stata code: Cumpston_et_al_2023_other_synthesis_methods.do...

    • bridges.monash.edu
    • researchdata.edu.au
    txt
    Updated Jan 27, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Miranda Cumpston; Sue Brennan; Rebecca Ryan; Joanne McKenzie (2023). Synthesis methods Stata code: Cumpston_et_al_2023_other_synthesis_methods.do [Dataset]. http://doi.org/10.26180/20786251.v4
    Available download formats: txt
    Dataset updated
    Jan 27, 2023
    Dataset provided by
    Monash University
    Authors
    Miranda Cumpston; Sue Brennan; Rebecca Ryan; Joanne McKenzie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This Stata .do file provides the code used to analyse the data extracted and coded from systematic reviews included in the paper: Cumpston MS, Brennan SE, Ryan R, McKenzie JE. 2023. Statistical synthesis methods other than meta-analysis are commonly used, but are seldom specified: a survey of systematic reviews of interventions.

    • Input file: Synthesis methods data file: Cumpston_et_al_2023_other_synthesis_methods.xlsx (https://doi.org/10.26180/20785396)
    • Associated file: Synthesis methods data dictionary (https://doi.org/10.26180/20785948)
    • Study protocol: Cumpston MS, McKenzie JE, Thomas J and Brennan SE. The use of ‘PICO for synthesis’ and methods for synthesis without meta-analysis: protocol for a survey of current practice in systematic reviews of health interventions. F1000Research 2021, 9:678. (https://doi.org/10.12688/f1000research.24469.2)

    Note: Naming convention of the variables. The naming convention for the variables links to the data dictionary. The character prefix identifies the section of the data dictionary (e.g. variable names with the prefix 'Chars' are from the 'CHARACTERISTICS' section). The number of the variable reflects the item number in the data dictionary, except that the first digit is removed because this is captured by the character prefix. For example, Chars_2 is item number 1.2 under the 'CHARACTERISTICS' section of the data dictionary.
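
    The naming convention in the note above can be sketched as a lookup from variable name to data dictionary item number. This is an illustration in Python rather than Stata, and it rests on assumptions: only the pairing of 'Chars' with section 1 ('CHARACTERISTICS') is confirmed by the example in the note, the function and table names are hypothetical, and variable names are assumed to have the form prefix, underscore, digits.

```python
import re

# Assumption: only 'Chars' -> section 1 is confirmed by the note's example.
# Other prefixes would need to be filled in from the data dictionary itself.
SECTION_BY_PREFIX = {"Chars": (1, "CHARACTERISTICS")}


def item_number(variable: str) -> str:
    """Map a variable name such as 'Chars_2' to its item number ('1.2')."""
    match = re.fullmatch(r"([A-Za-z]+)_(\d+)", variable)
    if match is None:
        raise ValueError(f"unexpected variable name: {variable!r}")
    prefix, remainder = match.groups()
    if prefix not in SECTION_BY_PREFIX:
        raise KeyError(f"no section recorded for prefix {prefix!r}")
    first_digit, _section_name = SECTION_BY_PREFIX[prefix]
    # The prefix stands in for the first digit of the item number.
    return f"{first_digit}.{remainder}"


print(item_number("Chars_2"))
```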

  11. ERA5 monthly averaged data on single levels from 1940 to present

    • cds.climate.copernicus.eu
    grib
    Updated Aug 6, 2025
    Cite
    ECMWF (2025). ERA5 monthly averaged data on single levels from 1940 to present [Dataset]. http://doi.org/10.24381/cds.f17050d7
    Available download formats: grib
    Dataset updated
    Aug 6, 2025
    Dataset provided by
    European Centre for Medium-Range Weather Forecasts (http://ecmwf.int/)
    Authors
    ECMWF
    License

    https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/cc-by/cc-by_f24dc630aa52ab8c52a0ac85c03bc35e0abc850b4d7453bdc083535b41d5a5c3.pdf

    Time period covered
    Jan 1, 1940 - Jul 1, 2025
    Description

    ERA5 is the fifth generation ECMWF reanalysis for the global climate and weather for the past 8 decades. Data is available from 1940 onwards. ERA5 replaces the ERA-Interim reanalysis. Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centres, where every so many hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution to allow for the provision of a dataset spanning back several decades. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations, and when going further back in time, to allow for the ingestion of improved versions of the original observations, which all benefit the quality of the reanalysis product. ERA5 provides hourly estimates for a large number of atmospheric, ocean-wave and land-surface quantities. An uncertainty estimate is sampled by an underlying 10-member ensemble at three-hourly intervals. Ensemble mean and spread have been pre-computed for convenience. Such uncertainty estimates are closely related to the information content of the available observing system which has evolved considerably over time. They also indicate flow-dependent sensitive areas. To facilitate many climate applications, monthly-mean averages have been pre-calculated too, though monthly means are not available for the ensemble mean and spread. ERA5 is updated daily with a latency of about 5 days (monthly means are available around the 6th of each month). 
In case that serious flaws are detected in this early release (called ERA5T), this data could be different from the final release 2 to 3 months later. In case that this occurs users are notified. The data set presented here is a regridded subset of the full ERA5 data set on native resolution. It is online on spinning disk, which should ensure fast and easy access. It should satisfy the requirements for most common applications. An overview of all ERA5 datasets can be found in this article. Information on access to ERA5 data on native resolution is provided in these guidelines. Data has been regridded to a regular lat-lon grid of 0.25 degrees for the reanalysis and 0.5 degrees for the uncertainty estimate (0.5 and 1 degree respectively for ocean waves). There are four main sub sets: hourly and monthly products, both on pressure levels (upper air fields) and single levels (atmospheric, ocean-wave and land surface quantities). The present entry is "ERA5 monthly mean data on single levels from 1940 to present".
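
    To give a sense of scale for the grid resolutions quoted above, the sketch below counts the points on a regular global lat-lon grid. It assumes that latitudes run from pole to pole with both poles included and that longitudes cover the full circle without repeating the first meridian. The description does not state this layout, so treat it as an assumption. The function name is hypothetical.

```python
from typing import Tuple


def regular_grid_shape(resolution_deg: float) -> Tuple[int, int]:
    """Return (n_lat, n_lon) for a regular global grid at the given spacing."""
    if resolution_deg <= 0:
        raise ValueError("resolution must be positive")
    n_lat = round(180 / resolution_deg) + 1  # both poles included (assumption)
    n_lon = round(360 / resolution_deg)      # 0 and 360 degrees coincide
    return n_lat, n_lon


# Resolutions quoted in the description: reanalysis, uncertainty estimate,
# and the coarser ocean-wave uncertainty grid.
for resolution in (0.25, 0.5, 1.0):
    n_lat, n_lon = regular_grid_shape(resolution)
    print(resolution, n_lat, n_lon, n_lat * n_lon)
```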

  12. Survey of Income and Program Participation (SIPP)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Cite
    Anthony Damico (2013). Survey of Income and Program Participation (SIPP) [Dataset]. http://doi.org/10.7910/DVN/I0FFJV
    Available formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    analyze the survey of income and program participation (sipp) with r

    if the census bureau's budget was gutted and only one complex sample survey survived, pray it's the survey of income and program participation (sipp). it's giant. it's rich with variables. it's monthly. it follows households over three, four, now five year panels. the congressional budget office uses it for their health insurance simulation. analysts read that sipp has person-month files, get scurred, and retreat to inferior options. the american community survey may be the mount everest of survey data, but sipp is most certainly the amazon. questions swing wild and free through the jungle canopy i mean core data dictionary. legend has it that there are still species of topical module variables that scientists like you have yet to analyze. ponce de león would've loved it here. ponce. what a name. what a guy.

    the sipp 2008 panel data started from a sample of 105,663 individuals in 42,030 households. once the sample gets drawn, the census bureau surveys one-fourth of the respondents every four months, over four or five years (panel durations vary). you absolutely must read and understand pdf pages 3, 4, and 5 of this document before starting any analysis (start at the header 'waves and rotation groups'). if you don't comprehend what's going on, try their survey design tutorial.

    since sipp collects information from respondents regarding every month over the duration of the panel, you'll need to be hyper-aware of whether you want your results to be point-in-time, annualized, or specific to some other period. the analysis scripts below provide examples of each. at every four-month interview point, every respondent answers every core question for the previous four months. after that, wave-specific addenda (called topical modules) get asked, but generally only regarding a single prior month. to repeat: core wave files contain four records per person, topical modules contain one.

    if you stacked every core wave, you would have one record per person per month for the duration of the panel. mmmassive. ~100,000 respondents x 12 months x ~4 years. have an analysis plan before you start writing code so you extract exactly what you need, nothing more. better yet, modify something of mine. cool?

    this new github repository contains eight, you read me, eight scripts:

    1996 panel - download and create database.R
    2001 panel - download and create database.R
    2004 panel - download and create database.R
    2008 panel - download and create database.R

    • since some variables are character strings in one file and integers in another, initiate an r function to harmonize variable class inconsistencies in the sas importation scripts
    • properly handle the parentheses seen in a few of the sas importation scripts, because the SAScii package currently does not
    • create an rsqlite database, initiate a variant of the read.SAScii function that imports ascii data directly into a sql database (.db)
    • download each microdata file - weights, topical modules, everything - then read 'em into sql

    2008 panel - full year analysis examples.R

    • define which waves and specific variables to pull into ram, based on the year chosen
    • loop through each of twelve months, constructing a single-year temporary table inside the database
    • read that twelve-month file into working memory, then save it for faster loading later if you like
    • read the main and replicate weights columns into working memory too, merge everything
    • construct a few annualized and demographic columns using all twelve months' worth of information
    • construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half, again save it for faster loading later, only if you're so inclined
    • reproduce census-published statistics, not precisely (due to topcoding described here on pdf page 19)

    2008 panel - point-in-time analysis examples.R

    • define which wave(s) and specific variables to pull into ram, based on the calendar month chosen
    • read that interview point (srefmon)- or calendar month (rhcalmn)-based file into working memory
    • read the topical module and replicate weights files into working memory too, merge it like you mean it
    • construct a few new, exciting variables using both core and topical module questions
    • construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half
    • reproduce census-published statistics, not exactly cuz the authors of this brief used the generalized variance formula (gvf) to calculate the margin of error - see pdf page 4 for more detail - the friendly statisticians at census recommend using the replicate weights whenever possible. oh hayy, now it is.

    2008 panel - median value of household assets.R

    • define which wave(s) and specific variables to pull into ram, based on the topical module chosen
    • read the topical module and replicate weights files into working memory too, merge once again
    • construct a replicate-weighted complex sample design with a...
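
    The replicate-weighted design with a fay's adjustment factor of one-half mentioned above comes down to one formula: with R replicate estimates and fay factor k, the variance estimate is the sum of squared deviations from the full-sample estimate divided by R * (1 - k)^2. The scripts described here are written in r, so the python sketch below only illustrates the arithmetic. The function name and the numbers are hypothetical.

```python
from typing import Sequence


def fay_variance(
    full_estimate: float,
    replicate_estimates: Sequence[float],
    fay_factor: float = 0.5,
) -> float:
    """Variance of an estimate from replicate estimates under Fay's method."""
    if not 0 <= fay_factor < 1:
        raise ValueError("fay_factor must be in [0, 1)")
    n_replicates = len(replicate_estimates)
    if n_replicates == 0:
        raise ValueError("at least one replicate estimate is required")
    squared_deviations = sum(
        (estimate - full_estimate) ** 2 for estimate in replicate_estimates
    )
    # With fay_factor = 0.5 the scaling below reduces to 4 / n_replicates.
    return squared_deviations / (n_replicates * (1.0 - fay_factor) ** 2)


# Hypothetical full-sample estimate and four replicate estimates.
variance = fay_variance(10.0, [9.0, 11.0, 10.0, 10.0])
print(variance, variance ** 0.5)  # variance and standard error
```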

  13. Data from: Bike Sharing Dataset

    • kaggle.com
    Updated Sep 10, 2024
    Cite
    Ram Vishnu R (2024). Bike Sharing Dataset [Dataset]. https://www.kaggle.com/datasets/ramvishnur/bike-sharing-dataset
    Available formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 10, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ram Vishnu R
    Description

    Problem Statement:

    A bike-sharing system is a service in which bikes are made available for shared use to individuals on a short term basis for a price or free. Many bike share systems allow people to borrow a bike from a "dock" which is usually computer-controlled wherein the user enters the payment information, and the system unlocks it. This bike can then be returned to another dock belonging to the same system.

    A US bike-sharing provider, BoomBikes, has recently suffered a considerable dip in revenue due to the Corona pandemic. The company is finding it very difficult to sustain itself in the current market scenario. So, it has decided to come up with a mindful business plan to be able to accelerate its revenue.

    In such an attempt, BoomBikes aspires to understand the demand for shared bikes among the people. They have planned this to prepare themselves to cater to the people's needs once the situation gets better all around and stand out from other service providers and make huge profits.

    They have contracted a consulting company to understand the factors on which the demand for these shared bikes depends. Specifically, they want to understand the factors affecting the demand for these shared bikes in the American market. The company wants to know:

    • Which variables are significant in predicting the demand for shared bikes.
    • How well those variables describe the bike demands

    Based on various meteorological surveys and people's styles, the service provider firm has gathered a large dataset on daily bike demands across the American market based on some factors.

    Business Goal:

    You are required to model the demand for shared bikes with the available independent variables. It will be used by the management to understand how exactly the demands vary with different features. They can accordingly manipulate the business strategy to meet the demand levels and meet the customer's expectations. Further, the model will be a good way for management to understand the demand dynamics of a new market.

    Data Preparation:

    1. You can observe in the dataset that some of the variables like 'weathersit' and 'season' have values 1, 2, 3, 4, which have specific labels associated with them (as can be seen in the data dictionary). These numeric values may suggest that there is some order to the labels, which is actually not the case (check the data dictionary and think why). So, it is advisable to convert such feature values into categorical string values before proceeding with model building. Please refer to the data dictionary to get a better understanding of all the independent variables.
    2. You might notice the column 'yr' with two values 0 and 1 indicating the years 2018 and 2019 respectively. At the first instinct, you might think it is a good idea to drop this column as it only has two values so it might not be a value-add to the model. But in reality, since these bike-sharing systems are slowly gaining popularity, the demand for these bikes is increasing every year proving that the column 'yr' might be a good variable for prediction. So think twice before dropping it.
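
    The first data preparation step above (replacing numeric codes with string labels so that no order is implied) can be sketched as follows. The label mapping is an assumption for illustration; the actual labels are defined in the dataset's data dictionary. The helper names are hypothetical, and plain Python is used so the sketch runs without extra libraries.

```python
from typing import Dict, List, Sequence

# Assumption: illustrative labels only; check the data dictionary for the
# labels that actually belong to each code.
SEASON_LABELS: Dict[int, str] = {1: "spring", 2: "summer", 3: "fall", 4: "winter"}


def to_labels(codes: Sequence[int], labels: Dict[int, str]) -> List[str]:
    """Replace numeric codes with their string labels."""
    return [labels[code] for code in codes]


def one_hot(values: Sequence[str], drop_first: bool = True) -> List[Dict[str, int]]:
    """Encode string categories as 0/1 indicator columns, one dict per row.

    Dropping the first category (in sorted order) avoids a redundant column
    when the model also has an intercept.
    """
    categories = sorted(set(values))
    kept = categories[1:] if drop_first else categories
    return [{f"is_{cat}": int(value == cat) for cat in kept} for value in values]


season_codes = [1, 3, 1, 4]  # hypothetical rows
season_names = to_labels(season_codes, SEASON_LABELS)
print(season_names)
print(one_hot(season_names))
```

    With indicator columns, a linear model estimates a separate effect for each season instead of assuming that season 4 is "four times" season 1.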

    Model Building:

    In the dataset provided, you will notice that there are three columns named 'casual', 'registered', and 'cnt'. The variable 'casual' indicates the number of casual users who have made a rental. The variable 'registered', on the other hand, shows the total number of registered users who have made a booking on a given day. Finally, the 'cnt' variable indicates the total number of bike rentals, including both casual and registered. The model should be built taking this 'cnt' as the target variable.

    Model Evaluation:

    When you have finished model building and residual analysis and have made predictions on the test set, use the following two lines of code to calculate the R-squared score on the test set:

        from sklearn.metrics import r2_score
        r2_score(y_test, y_pred)

    - Here y_test is the test data set for the target variable, and y_pred is the variable containing the predicted values of the target variable on the test set.
    - Please perform this step, as the R-squared score on the test set serves as a benchmark for your model.
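
    For readers who want to see what the score measures, the sketch below computes the usual coefficient of determination, 1 - SS_res / SS_tot, in plain Python. The function name r2 and the sample values are illustrative; for ordinary inputs this should agree with r2_score, but it is not a replacement for it and it does not handle a constant target (where SS_tot is zero).

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot


# Illustrative values, not from the bike-sharing data.
y_test = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
score = r2(y_test, y_pred)
```

    A score of 1.0 means the predictions match the test targets exactly; a score near 0 means the model does no better than predicting the test-set mean.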

  14. f

    Independent variables that characterise the performed task derived from a...

    • figshare.com
    xls
    Updated Jun 10, 2025
    Danuta Roman-Liu; Joanna Kamińska; Tomasz Tokarski (2025). Independent variables that characterise the performed task derived from a study qualified for further analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0324924.t002
    Available download formats: xls
    Dataset updated
    Jun 10, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Danuta Roman-Liu; Joanna Kamińska; Tomasz Tokarski
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Independent variables that characterise the performed task derived from a study qualified for further analysis.

  15. O

    ARCHIVED - Live Well San Diego Data Dictionary

    • data.sandiegocounty.gov
    application/rdfxml +5
    Updated Sep 23, 2021
    County of San Diego (2021). ARCHIVED - Live Well San Diego Data Dictionary [Dataset]. https://data.sandiegocounty.gov/Live-Well-San-Diego/ARCHIVED-Live-Well-San-Diego-Data-Dictionary/remr-mk73
    Available download formats: json, xml, application/rdfxml, csv, tsv, application/rssxml
    Dataset updated
    Sep 23, 2021
    Dataset authored and provided by
    County of San Diego
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Area covered
    San Diego
    Description

    For current version, see: https://data.sandiegocounty.gov/Live-Well-San-Diego/Live-Well-San-Diego-Data-Dictionary/37vr-nftn/about_data

    This is the Data Dictionary for the Live Well San Diego Database. Each variable is defined, given pertinent notes, and sourced.

    Prepared by: County of San Diego, Health & Human Services Agency, Public Health Services Division, Community Health Statistics Unit.

  16. A

    NLDAS Forcing Data L4 Monthly 0.125 x 0.125 degree V001 (NLDAS_FOR0125_M) at...

    • data.amerigeoss.org
    html, pdf, png
    Updated Jul 28, 2019
    + more versions
    United States[old] (2019). NLDAS Forcing Data L4 Monthly 0.125 x 0.125 degree V001 (NLDAS_FOR0125_M) at GES DISC [Dataset]. https://data.amerigeoss.org/es/dataset/nldas-forcing-data-l4-monthly-0-125-x-0-125-degree-v001-nldas-for0125-m-at-ges-disc
    Available download formats: html, pdf, png
    Dataset updated
    Jul 28, 2019
    Dataset provided by
    United States[old]
    Description

    This data set contains the forcing data for Phase 1 of the North American Land Data Assimilation System (NLDAS-1). The data are on a 1/8th degree grid and range from Aug. 1996 to Dec. 2007. The temporal resolution is monthly. The file format is WMO GRIB-1. The NLDAS-1 monthly forcing data, containing 17 variables, are generated from the NLDAS-1 hourly forcing data. A brief description of the NLDAS-1 hourly forcing data can be found in the GCMD DIF for NLDAS_FOR0125_H_001.

    The data set applies a user-defined parameter table to indicate the contents and parameter number. The GRIBTAB file shows a list of parameters for this data set, along with their Product Definition Section (PDS) IDs and units.

    The variables, DLWRFsfc, DSWRFsfc, PRESsfc, SPFH2m, TMP2m, UGRD10m, and VGRD10m, are the monthly average from 00Z01 of month to 23:59Zlastdayofmonth.

    The variables BRTMPsfc and CAPEsfc are the monthly average from 00Z01 of month to 23:59Zlastdayofmonth, except that any hour with an undefined value of -9999 is excluded from the monthly average.

    The variables PARsfc and RGOESsfc are the monthly average from 00Z01 of month to 23:59Zlastdayofmonth, except that any hour with an undefined value of -9999 is reassigned to zero and included in the monthly average.

    The variables, ACPCPsfc, APCPsfc, PEDASsfc, and PRDARsfc, are the monthly accumulation from 00Z01 of month to 23:59Zlastdayofmonth. However, the ACPCPsfc is actually the sum of the (ACPCPsfc/PEDASsfc)*APCPsfc from each hour, where the ratio of (ACPCPsfc/PEDASsfc) is the fraction of convective precipitation from EDAS, and then multiplied by the APCPsfc to get the convective precipitation. For PRDARsfc accumulation, if hourly PRDARsfc is undefined or negative, fill the hour with a zero value.

    The last variable, RSWRFsfc, is also a monthly average from 00Z01 of month to 23:59Zlastdayofmonth, but it represents the monthly average of the hourly "blend" of the DSWRFsfc from EDAS and the RGOESsfc from GOES. The blend algorithm is as follows: for each hour, the RGOESsfc from GOES is used at all grid points where it is available, and the DSWRFsfc from EDAS is used where it is not. Because the spatial extent and availability of GOES vary from hour to hour, the blend is done on the hourly data first, and the monthly average is then applied to the hourly blended data. This last variable thus best represents the downward shortwave radiation flux at the surface that is used in the NLDAS-1 LSMs. More about this blending/supplementation can be found on the NLDAS Project Web Site.
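
    The averaging rules above can be sketched for a single grid point as follows. This is an illustrative reading of the description, not the NLDAS processing code: the function names are hypothetical, the inputs are plain lists of hourly values, and returning the undefined marker when every hour is undefined is an assumption the text does not cover.

```python
UNDEFINED = -9999  # undefined-value marker named in the description


def monthly_mean_skip(hourly):
    """BRTMPsfc/CAPEsfc rule: leave undefined hours out of the average."""
    valid = [v for v in hourly if v != UNDEFINED]
    # Assumption: if no hour is defined, report the month as undefined.
    return sum(valid) / len(valid) if valid else UNDEFINED


def monthly_mean_zero_fill(hourly):
    """PARsfc/RGOESsfc rule: count undefined hours as zero in the average."""
    filled = [0.0 if v == UNDEFINED else v for v in hourly]
    return sum(filled) / len(filled)


def monthly_blended_mean(rgoes_hours, dswrf_hours):
    """RSWRFsfc rule: blend each hour first, then average the blended hours."""
    blended = [
        dswrf if rgoes == UNDEFINED else rgoes  # fall back to EDAS where the satellite value is missing
        for rgoes, dswrf in zip(rgoes_hours, dswrf_hours)
    ]
    return sum(blended) / len(blended)


# Tiny made-up series (three "hours") to show how the two rules differ.
skip_mean = monthly_mean_skip([10.0, UNDEFINED, 20.0])
zero_mean = monthly_mean_zero_fill([10.0, UNDEFINED, 20.0])
blend_mean = monthly_blended_mean([100.0, UNDEFINED], [80.0, 60.0])
```

    The same hourly series gives different monthly values under the two rules (15.0 when undefined hours are skipped, 10.0 when they are counted as zero), which is why the description distinguishes the variable groups.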

  17. O

    Live Well San Diego Data Dictionary

    • data.sandiegocounty.gov
    • splitgraph.com
    application/rdfxml +5
    Updated Oct 10, 2024
    + more versions
    County of San Diego (2024). Live Well San Diego Data Dictionary [Dataset]. https://data.sandiegocounty.gov/w/37vr-nftn/by4r-nr9x?cur=DYgNX4bQTWU&from=CMHqym2hUhd
    Available download formats: json, application/rssxml, csv, tsv, application/rdfxml, xml
    Dataset updated
    Oct 10, 2024
    Dataset authored and provided by
    County of San Diego
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Area covered
    San Diego
    Description

    This is the Data Dictionary for the Live Well San Diego Database. Each variable is defined, given pertinent notes, and sourced.

    Prepared by: County of San Diego, Health & Human Services Agency, Public Health Services Division, Community Health Statistics Unit.

  18. HIRENASD Experimental Data, Static Cp Plots and Data files

    • data.nasa.gov
    • gimi9.com
    • +2more
    Updated Mar 31, 2025
    + more versions
    nasa.gov (2025). HIRENASD Experimental Data, Static Cp Plots and Data files [Dataset]. https://data.nasa.gov/dataset/hirenasd-experimental-data-static-cp-plots-and-data-files
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Tecplot (ASCII) and MATLAB files are posted here for the static pressure coefficient data sets. To download all of the data in either Tecplot or MATLAB format, go to https://c3.nasa.gov/dashlink/resources/485/. Please consult the documentation found on this page under Support/Documentation for information regarding variable definition, data processing, etc.

  19. f

    The detailed definition, data source, and years of data extracted for each...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated May 31, 2023
    Shi, Fanghui; Sun, Xiaowen; Olatosi, Bankole; Li, Zhenlong; Li, Xiaoming; Zhang, Jiajia; Weissman, Sharon; Yang, Xueying; Zeng, Chengbo (2023). The detailed definition, data source, and years of data extracted for each county-level variable. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001035364
    Dataset updated
    May 31, 2023
    Authors
    Shi, Fanghui; Sun, Xiaowen; Olatosi, Bankole; Li, Zhenlong; Li, Xiaoming; Zhang, Jiajia; Weissman, Sharon; Yang, Xueying; Zeng, Chengbo
    Description

    The detailed definition, data source, and years of data extracted for each county-level variable.

  20. R

    Data from: The relationship between learning orientation, firm performance...

    • repod.icm.edu.pl
    ods, odt
    Updated Feb 3, 2023
    Karpacz, Jarosław; Wójcik-Karpacz, Anna (2023). The relationship between learning orientation, firm performance and market dynamism in MSMEs operating in technology parks in Poland: an empirical analysis [Dataset]. http://doi.org/10.18150/IOUHRH
    Available download formats: ods(8051), odt(7204), ods(7973)
    Dataset updated
    Feb 3, 2023
    Dataset provided by
    RepOD
    Authors
    Karpacz, Jarosław; Wójcik-Karpacz, Anna
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Poland
    Description

    In this study, we investigate the (in)direct relationship between learning orientation (LO) and firm performance (FP). The study is guided by the DCs framework. We collected data from 182 MSMEs operating in TPs in Poland. We used two methods (PAPI, CAWI) in our quantitative empirical research.

    For the analysis of empirical data, we used descriptive statistics and statistical inference. The Cronbach’s alpha values showed very good reliability of the questionnaire; we assumed that a tool is reliable if the coefficient is at least 0.70. The results of the Kolmogorov-Smirnov tests indicate grounds for assuming that the variables are not normally distributed. We present the results of the Kolmogorov-Smirnov tests and Cronbach’s alpha coefficients in Table 1.

    In the next step, we analysed the correlations between the variables using Spearman's rho coefficient. We present the correlations among the analysed variables in Table 2. The data in Table 2 indicate weak or very weak correlations among the variables in individual configurations. LO positively correlates with FP (rs = 0.197; p < 0.01). This means that an increase in LO is accompanied, on average, by a small increase in FP. There is also a positive, although very weak (rs = 0.151), correlation between market dynamism (MD) and LO, which was statistically significant (p < 0.05). This means that an increase in MD is accompanied, on average, by a slight increase in LO. At the same time, the correlation analysis indicated a weak but positive correlation between one of the dimensions of MD, i.e. speed of change in technology and competition, and LO (rs = 0.236; p < 0.01). Relationships between the two remaining dimensions of MD and LO were not statistically significant.

    In addition, the aforementioned MD dimension also positively correlates with FP. The correlation between the MD dimension called speed of change in technology and competition and FP is positive, weak and statistically significant (rs = 0.181; p < 0.05). This means that an increase in the speed of change in technology and competition is accompanied, on average, by a slight increase in FP. Relationships between the two remaining dimensions of MD and FP were not statistically significant. The correlation analysis encourages deeper recognition and understanding of the LO-FP relationship in the context of MD.

    We used linear regression models to verify the hypotheses, which allowed for a global assessment of the relationships among all analysed variables. The coefficients obtained for the fixed effects in these models indicate how much the expected value of the response variable changes with a unit increase in a given predictor. An explanatory variable (predictor) is a variable in a statistical model (as well as in an econometric model) on the basis of which the response variable is calculated. In Model 1 there is one explanatory variable (LO), while in Model 2 there are two explanatory variables (LO, MD). The response variable is FP. The statistical significance of these coefficients was verified by a test based on the t statistic. For all the mentioned tests, p < 0.05 indicated statistical significance of the analysed relationships.

    The assessment of the impact of LO on FP serves to verify hypothesis H.1, while the assessment of the role of the dynamism of the market in which enterprises operate in explaining the impact of LO on FP serves to verify hypothesis H.2.

    H.1: Learning orientation is positively related to firm performance.

    H.2: Market dynamism moderates the learning orientation-firm performance relationship; the positive effect of learning orientation on firm performance is likely to be stronger under high market dynamism than under low market dynamism.

    The results of testing hypotheses H.1 and H.2 are presented in Table 3. We assessed Models 1 and 2 in Table 3 using the Akaike Information Criterion (AIC). The AIC for both models was similar, i.e. 568.28 for the first model and 571.12 for the second one. The AIC levels for both models indicated an acceptable fit. The lower the AIC value, the better the predictive value of the model. The model coefficient is a parameter determined by its most likely value. The confidence interval of the model coefficient indicates the range in which its less probable but possible values may lie. It also has diagnostic value: if the confidence interval of the regression coefficient contains 0, the coefficient has no substantive value for the model. Model 1 explained 13.5% of the data variation (R2 = 0.135), while Model 2 explained 14.0% of the data variation (R2 = 0.140), which is slightly more than Model 1.

    The analysis of the models presented in Table 3 leads to several findings. First, in Model 1 only LO was positively related to FP, and it only slightly explained the variability of the dependent variable. It has a small but statistically significant impact on FP (coefficient: 0.38; p = 0.00). The linear regression model (Model 1) confirms the thesis about the positive impact of LO on FP. It may be assumed that an increase in the assessment of LO by one point, with no change in the other parameters of the model, would result in an increase in average FP of 0.38. Secondly, the linear regression model (Model 2) did not confirm the thesis about the moderating role of MD in the LO-FP relationship. None of the predictors showed statistical significance (p < 0.05) in Model 2. What is more, taking the MD variable into account affects the quality of the model, and MD itself takes negative prediction indicators, which means that better FP in responding to changes in the level of MD deteriorates the overall FP. However, the research has not confirmed whether MD - a higher-order construct built of three first-order constructs, i.e. the speed of changes in technology and competition, unpredictability of changes in technology and competition, uncertainty of customer behaviour - increases the importance of LO for increasing FP, and thus for achieving a competitive advantage. Thirdly, the control variables were insignificant in both models. This means that the control variables in the form of enterprise size do not have a statistically significant effect on the dependent variable. Therefore, the introduction of two control variables and a moderating variable reduced the impact of LO on FP to a statistically insignificant level.

    The results of the study show that firm performance benefits from LO-related behaviours. Learning orientation is an important stimulant of firm performance, while market dynamism has not been classified as a moderator of the learning orientation-firm performance relationship.
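
    The rank correlation used in the description can be illustrated with a short sketch. It computes Spearman's rho as the Pearson correlation of average ranks, which is the standard definition when ties are present. The function names and the sample scores are hypothetical and are not the study's data or code.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up questionnaire scores, for illustration only.
lo_scores = [3.0, 4.5, 2.0, 5.0, 4.0]
fp_scores = [2.5, 4.0, 3.0, 4.5, 3.5]
rho = spearman_rho(lo_scores, fp_scores)
```

    Because it works on ranks rather than raw values, the coefficient does not require normally distributed variables, which fits the Kolmogorov-Smirnov results reported in the description.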

