Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present a dataset created from merged secondary sources of ExecuComp and CompuStat and then augmented with manual data collection through searches of news stories related to CEO turnover.
We start dataset construction with the ExecuComp executive-level data for the period from 1992 through 2020. These data are merged with the CompuStat dataset of financial variables. As the dataset is intended for research on CEO turnover, we exclude observations in which the CEO at the start of the fiscal year is not well-defined; these are cases when there were co-CEOs and cases when the CEO was shared across different firms. The data set also excludes firm/year combinations that involve a restructuring of the firm – spinoff, buyout, merger, or bankruptcy.
We identify the CEO at the start of each year for each firm. This also helps identify the last year an individual served as CEO. In order to identify CEO turnover based on changes in the CEO from year to year, we require firm observations to extend over at least six contiguous years for the firm to remain in the sample. Cases involving the last year the firm is in the sample are excluded. We also exclude from the dataset cases when there was an interim CEO who stayed in the position for less than 2 years. This results in a sample of 3,100 firms reflecting 41,773 firm/year combinations.
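The year-to-year comparison described above can be sketched as follows. This is only an illustrative reading of the procedure, not the authors' code: the function names, the `ceo_by_year` mapping, and the example executive labels are all hypothetical, and the sketch leaves out the restructuring, co-CEO, and interim-CEO filters.

```python
from itertools import groupby


def contiguous_runs(years):
    """Split a collection of years into sorted runs of consecutive years."""
    runs = []
    # Within a run of consecutive years, (year - position) stays constant.
    for _, grp in groupby(enumerate(sorted(years)), key=lambda p: p[1] - p[0]):
        runs.append([year for _, year in grp])
    return runs


def flag_departures(ceo_by_year, min_run=6):
    """Return {year: 0 or 1} for one firm.

    ceo_by_year maps fiscal year -> identifier of the CEO at the start of
    that year (hypothetical input format). A year is flagged 1 when the
    CEO at the start of the following year is a different person.
    """
    flags = {}
    for run in contiguous_runs(ceo_by_year):
        if len(run) < min_run:
            continue  # assumed rule: keep only runs of at least six years
        # The last year of each run has no successor year and is dropped.
        for year, next_year in zip(run, run[1:]):
            flags[year] = int(ceo_by_year[year] != ceo_by_year[next_year])
    return flags


# Hypothetical firm: seven contiguous years, then a gap and a two-year run.
example = {2001: "A", 2002: "A", 2003: "A", 2004: "B", 2005: "B",
           2006: "B", 2007: "B", 2010: "C", 2011: "C"}
flags = flag_departures(example)
```

In this toy input, only 2003 is flagged, because the CEO at the start of 2004 differs from the CEO at the start of 2003; the short 2010-2011 run is discarded entirely.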
For this sample, we examine news articles related to CEO turnover to confirm the reasons for each CEO departure case. We use the ProQuest full-text news database and search for the company name, the executive name, and the departure year. We identify news articles mentioning the turnover case and then classify the explanation of each CEO departure case into one of five categories of turnover. These categories represent CEOs who resigned, were fired, retired, left due to illness or death, and those who left the position but stayed with the firm in a change of duties, respectively.
The published data file does not include proprietary data from ExecuComp and CompuStat such as executive names and firm financial data. These data fields may be merged with the current data file using the provided ExecuComp and CompuStat identifiers.
The dataset consists of a single table containing the following fields:
• gvkey – unique identifier for the firms retrieved from CompuStat database
• firmid – unique firm identifier to distinguish distinct contiguous time periods created by breaks in a firm’s presence in the dataset
• coname – company name as listed in the CompuStat database
• execid – unique identifier for the executives retrieved from ExecuComp database
• year – fiscal year
• reason – reason for the eventual departure of the CEO from the firm; this field is blank for executives who did not leave the firm during the sample period
• ceo_departure – dummy variable that equals 1 if the executive left the firm in the fiscal year, and 0 otherwise
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
// Replication Code for Dissanayake, Wu and Zhang, "The Burden of the National Debt: Evidence from Mergers and Acquisitions"
// The main data come from the Compustat, CRSP, and Thomson Reuters databases, all of which are accessible via a subscription.
// Pseudo-dataset "madebt_pseudo.dta" demonstrates format of the original dataset for all firm-years, with variable names.
// The file "ivfirststage_pseudo.dta" contains pseudo data for all the variables used in the first stage of the IV model as in Equation (3), except that macro variables with publicly available data contain real values.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
We start by identifying U.S.-based software organizations in the computer programming and data processing industry (SIC 737), as a knowledge-intensive high-growth setting. We integrate two main data sources. First, to collect the knowledge-based measures, we use publicly available data provided by the U.S. Patent and Trademark Office (USPTO). Using the General Architecture for Text Engineering (GATE) software, we design queries that retrieve the complete class and subclass information for each patent, as well as citations, inventors, and total patents granted between 1998 and 2011 inclusive. We aggregate the data by organization-year observation at the class and subclass levels and use these aggregated measures to compute the knowledge-based predictors and covariates. To compute moving averages for some variables, we collect five years of additional USPTO data which makes our knowledge dataset span between 1993 and 2011. Second, we use Compustat to collect organization-level control variables such as assets, number of employees, market valuation, R&D expenditures, intangibles, solvency, and slack. The integration of the two datasets yields a final sample panel of 100 organizations with 3.2 years of observations on average per organization from 1998 to 2011.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a Stata .dta file containing the variables for my study "Not all Tones are Equal: Forecast Tone in Earnings Conference Calls". The variables are generated from original data in CRSP, Compustat, Refinitiv I/B/E/S, Refinitiv Business Ownership, and Refinitiv StreetEvents.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set consists of a balanced panel of 432 manufacturing companies listed in the MSCI KLD 400 Social Index over the years 2015–2018. We use the MSCI KLD 400 Social Index dataset for circular performance information. We use the Wharton Research Data Services’ COMPUSTAT dataset for corporate financial information. For blockchain adoption we use an indicator variable equal to 1 if the annual number of blockchain-related news items is nonzero, and 0 otherwise. The number of blockchain-related news items is extracted from the LexisNexis database.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is used in "Prediction of Financial Restatement by Managerial Tone: Using Machine Learning Approach". A total of 1,893 financial restatement observations were identified from the Audit Analytics database and matched with firm-level data from Compustat for the period 2010 to 2022. The earnings conference call transcripts are obtained from StreetEvents. Consistent with Zhang et al. (2025), only the most recent restatement year for each firm was retained, resulting in a sample of 1,025 distinct restating firms. To examine the association between managerial tone and the likelihood of restatements, a matched sample of non-restating firms from the same time period was randomly selected. This process yielded a balanced panel comprising 2,050 unique firms. All continuous variables were subsequently winsorized at the 1st and 99th percentiles to mitigate the influence of extreme values.
Variable | Description
RESTATE | Indicator variable equal to 1 if firm has restated earnings, and 0 otherwise.
Tone_LM | The optimistic words less pessimistic words of managers scaled by total words in the conference call, based on the Loughran and McDonald (2011) word list.
ROA | Operating Income Before Depreciation as a fraction of average Total Assets based on most recent two periods.
SIZE | Natural logarithm of Total Assets.
PROFIT | Gross Profitability as a fraction of Total Assets.
LEV | Total Debt as a fraction of Total Assets.
IAC | Inventory as a fraction of Current Assets.
CAPEI | Price divided by the average of ten years of earnings, adjusted for inflation.
IT | Inventory Turnover.
BS | Book Value Per Share.
BM | Book Value of Equity as a fraction of Market Value of Equity.
CR | Long-term debt divided by the sum of long-term debt and shareholders' equity.
AT | Revenue divided by average total assets.
PTB | Price-to-Book ratio.
PEG | Trailing Price-to-Earnings to Growth ratio.
PS | Market capitalization divided by annual revenue.
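The Tone_LM definition in the table above (optimistic words less pessimistic words, scaled by total words) can be illustrated with a minimal sketch. The function name, the tokenization by whitespace, and the tiny word sets are assumptions made for illustration; the sets are toy stand-ins and not the actual Loughran and McDonald (2011) lists.

```python
def tone_lm(words, positive, negative):
    """Net tone: (positive count - negative count) / total word count."""
    total = len(words)
    if total == 0:
        return 0.0  # assumed convention for an empty transcript
    pos = sum(w.lower() in positive for w in words)
    neg = sum(w.lower() in negative for w in words)
    return (pos - neg) / total


# Toy word sets, hypothetical and far smaller than a real dictionary.
POSITIVE = {"achieved", "strong", "growth"}
NEGATIVE = {"weak", "losses"}

transcript = "We achieved strong growth despite a weak quarter and some losses"
tone = tone_lm(transcript.split(), POSITIVE, NEGATIVE)
```

Here three positive and two negative matches over eleven words give a tone of 1/11. A real pipeline would also need to handle punctuation and restrict the text to managers' remarks, which this sketch does not attempt.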
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This table reports the summary statistics of the key variables in the quasi-natural experiment based on mergers between financial institutional blockholders during 1995–2012. The sample comes from multiple sources. Firm-level financial data come from the COMPUSTAT database. Corporate social responsibility data come from the MSCI ESG KLD database. Institutional investor holdings data come from the Thomson Reuters Institutional (13F) Holdings database. Analyst coverage data come from the Institutional Brokers Estimate System (I/B/E/S). We require observations to satisfy the following criteria: (1) book equity is positive; (2) each firm has at least two consecutive years of observations; (3) all variables are available for every observation; (4) firms are not in the financial (SIC codes 6000–6999) or utility (SIC codes 4900–4999) industries. Our sample includes 3,778 firm-years that meet these criteria during 1995–2012, when the Thomson Reuters Institutional (13F) Holdings and KLD data are available and firms can be matched to blockholder mergers. All continuous variables are winsorized at the 1st and 99th percentiles to alleviate the potential disturbance from outliers.
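Winsorizing at the 1st and 99th percentiles, as mentioned above, means clipping each value to those two cutoffs instead of dropping it. The sketch below is one possible implementation under stated assumptions: the helper names are hypothetical, and the linear-interpolation percentile rule is a choice made here, since the description does not say which percentile definition was used.

```python
def percentile(sorted_vals, q):
    """Percentile q (0 to 100) of a sorted list, by linear interpolation."""
    if not sorted_vals:
        raise ValueError("percentile of an empty list is undefined")
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def winsorize(values, lower=1, upper=99):
    """Clip every value to the [lower, upper] percentile cutoffs."""
    ordered = sorted(values)
    lo_cut = percentile(ordered, lower)
    hi_cut = percentile(ordered, upper)
    # Original order is preserved; only the extremes are replaced.
    return [min(max(v, lo_cut), hi_cut) for v in values]


# Toy series of 101 values with one extreme value at each end.
raw = [-1000] + list(range(1, 100)) + [1000]
clipped = winsorize(raw)
```

In this toy series the two extreme values are pulled in to the cutoffs 1 and 99, while every other value is left unchanged.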