Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data on CSR equilibrium can be downloaded, subject to conditions, from Hexun (https://data.hexun.com). (XLSX)
The initial sample of this study covers A-share companies listed on the Shanghai and Shenzhen stock exchanges during the period 2008-2020. We then screened and processed the initial sample as follows. (a) We retained companies with both a RepRisk ESG rating and a Bloomberg ESG rating. Specifically, the selection is based on samples with the same ISIN code and the same English company name in the Bloomberg and RepRisk Index (RRI) databases. The ISIN code is a securities coding standard developed by the International Organization for Standardization (ISO) and uniquely identifies securities in each country or region around the world. We exclude samples that do not provide ISIN codes or have inconsistent English names. (b) We exclude observations with missing values for the main variables. (c) We exclude samples with ST, *ST, or PT trading status during the observation period. Our final sample contains 1352 firm-year observations.
The ESG disclosure score data and ESG performance score data required to construct the ESG-washing measure are obtained from the Bloomberg database and the RepRisk Index (RRI) database of Wharton Research Data Services (WRDS), respectively. Positive media coverage data is sourced from the China Research Data Services Platform (CNRDS), while the instrumental variable (IV_population) is obtained from the EPS database and Juhe Data (https://www.gotohui.com/). Unless otherwise stated, all other data in this study are from the China Stock Market and Accounting Research (CSMAR) database.
Data on executives' company changes were collected manually and independently by the authors (back-to-back). We then compared and reconciled the data collected by each author and, where there were discrepancies, collected and calibrated the data again to maximize reliability. We first obtained executive biographies from the CSMAR database and retrieved missing values from Sina Finance (https://finance.sina.com.cn/). Because the resume data are unstructured, we manually processed more than 30,000 executive resumes to obtain the data on executives' company changes, from which we calculated the per capita number of job hops of all executives in each company. The number of part-time jobs held by executives also reflects their pursuit of career changes and development, so in the robustness test the per capita mean number of part-time jobs held by executives is used as a proxy variable for careerist orientation. These data can be obtained directly from the CSMAR database.
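The screening by ISIN code and English name, and the per capita job-hop measure described above, might be sketched as follows. This is only an illustration under stated assumptions: the record layout (`isin`, `name`, and an ordered list of employers per executive), the name normalization, and the rule that a hop is any change between consecutive employers are all hypothetical and not taken from the study.

```python
from collections import defaultdict


def match_by_isin_and_name(bloomberg_rows, reprisk_rows):
    """Keep Bloomberg rows whose ISIN and English name also appear in RepRisk.

    Assumption: names are compared case-insensitively after collapsing
    whitespace; the study does not state its exact comparison rule.
    """
    def key(row):
        isin = (row.get("isin") or "").strip().upper()
        name = " ".join((row.get("name") or "").lower().split())
        return isin, name

    reprisk_keys = {key(r) for r in reprisk_rows if key(r)[0]}
    # Rows without an ISIN, or with a name that differs, are dropped.
    return [r for r in bloomberg_rows if key(r)[0] and key(r) in reprisk_keys]


def per_capita_job_hops(career_records):
    """Average number of employer changes per executive, by firm.

    career_records: iterable of (firm_id, executive_id, employers), where
    employers is the executive's employer history in chronological order
    (a hypothetical structure for the manually coded resumes).
    """
    hops_by_firm = defaultdict(list)
    for firm_id, _executive_id, employers in career_records:
        # Count one hop each time two consecutive employers differ.
        hops = sum(1 for a, b in zip(employers, employers[1:]) if a != b)
        hops_by_firm[firm_id].append(hops)
    return {firm: sum(h) / len(h) for firm, h in hops_by_firm.items()}
```

Keeping the matching key as a pair (ISIN, normalized name) mirrors the requirement that both fields agree; a looser rule, such as fuzzy name matching, would be a different design.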
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This paper uses panel data on Shanghai and Shenzhen A-share listed companies for 2008-2019 as the research sample and employs multiple regression to test the relationship between executive compensation incentives and R&D investment of listed companies in China. It further investigates the path of this relationship and the influence of government subsidies on it. The selected samples are screened according to the following criteria: ① Companies with incomplete data on financial indicators and corporate governance indicators are excluded. ② Companies with an asset-liability ratio that is negative or greater than 1 are excluded. ③ Companies in the financial and insurance industry are excluded. ④ Companies listed for less than 1 year are excluded. ⑤ Companies marked S, ST, or *ST are excluded. ⑥ Companies with extreme sample data are excluded. The risk-taking data used in this paper come from the WIND database. Other data come from the CSMAR database.
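The exclusion criteria above can be read as a row-level filter. The sketch below is a hypothetical illustration: the field names (`leverage`, `industry`, `years_listed`, `status`) and the industry labels are assumptions, and criterion ⑥ (extreme sample data) is left out because the text does not define it.

```python
def passes_screening(obs):
    """Return True if a firm-year observation survives criteria 1 to 5.

    obs is a dict with hypothetical keys; None marks a missing value.
    """
    required = ("leverage", "industry", "years_listed", "status")
    # Criterion 1: drop observations with incomplete data.
    if any(obs.get(field) is None for field in required):
        return False
    # Criterion 2: asset-liability ratio must lie in [0, 1].
    if obs["leverage"] < 0 or obs["leverage"] > 1:
        return False
    # Criterion 3: drop the financial and insurance industry.
    if obs["industry"] in {"finance", "insurance"}:
        return False
    # Criterion 4: drop firms listed for less than one year.
    if obs["years_listed"] < 1:
        return False
    # Criterion 5: drop firms flagged S, ST, or *ST.
    if obs["status"] in {"S", "ST", "*ST"}:
        return False
    return True


def screen_sample(observations):
    """Apply the filter to an iterable of observations."""
    return [obs for obs in observations if passes_screening(obs)]
```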
The house price data are collected from the official website of China's National Bureau of Statistics. We acquired the month-on-month growth data of house prices since January 2006, then compiled the house price index with January 2006 set to 100. The Shanghai Stock Exchange Index (SSEI) data, which are treated as stock market prices, are derived from the CSMAR database. We then calculate the monthly house price and stock price returns, where prices are proxied by the monthly house price index and the SSEI. 157 observations from January 2006 to March 2019 are obtained.
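Chaining month-on-month growth into an index with a base of 100 and then computing returns could look like the sketch below. Two points are assumptions rather than facts from the description: growth rates are taken to be expressed in percent, and returns are computed as log differences, which is a common convention but not confirmed by the text.

```python
import math


def build_index(mom_growth_pct, base=100.0):
    """Chain month-on-month growth rates (in percent) into a price index.

    The first element of the result is the base month (100 by default);
    each later value compounds the growth of the following month.
    """
    index = [base]
    for growth in mom_growth_pct:
        index.append(index[-1] * (1.0 + growth / 100.0))
    return index


def log_returns(prices):
    """Log return between consecutive observations (assumed definition)."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]
```

A series of n index values yields n - 1 returns, so the number of return observations is one less than the number of monthly prices.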
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This article presents a comprehensive dataset extracted from annual reports, the China Stock Market and Accounting Research Database (CSMAR) and the Wind database, focusing on digital transformation and strategic risk taking.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The research data in this article come from Chinese A-share listed companies from 2000 to 2023. The annual reports of the relevant companies are obtained from the official websites of the Shenzhen and Shanghai Stock Exchanges, and the other listed-company data come from the CSMAR database of Guotai An. In addition, a 1% truncation is applied to non-ratio continuous variables to reduce the impact of outliers.
(1) Because the Chinese government does not mandate disclosure of carbon emission data, micro-level data on corporate carbon emissions are currently lacking. This study adopted the method of Chapple et al. (2013) to indirectly measure the carbon dioxide emissions of enterprises. Due to the lack of 23 years of carbon emission data, this study borrowed the ARIMA-BP prediction method from Hu Jianbo (2013) and Zhao SL et al. (2024) to fill in the predictions.
(2) The degree of digital transformation of listed companies (Digital). Measurement methods for corporate digital transformation are relatively mature, and this study uses text analysis. It first constructs a keyword table for digital transformation; it then uses Python to match the keywords against the annual report text of each listed company, using the Jieba module to count the frequency of the relevant keywords in the annual reports; finally, it adds 1 to the word frequency and takes the logarithm to obtain the indicator of enterprise digital transformation. See Wu Fei (2021, Management World) for details.
(3) Control variables. This study includes the following enterprise-level indicators as control variables: property rights nature (SOE), a 0/1 dummy distinguishing state-owned from private enterprises; board size (logarithm of the number of board members); the logarithm of the age of the enterprise (age); asset-liability ratio (lev); return on equity (roe); operating cash flow (CF); sales growth rate (growth); net profit growth rate (gprofit); proportion of tangible assets (tangibi); proportion of independent directors (indep); shareholding proportion of the largest shareholder (top1); and the dual role of chairman and general manager.
This study was supported by the Key Support Project for College Students' Innovation and Entrepreneurship in Hunan Province - Research on the Factors and Mechanisms of Digital Transformation of Construction Enterprises in the Digital Economy (S202411532001).
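The keyword-frequency indicator in item (2) reduces to counting keyword occurrences in a report and taking ln(1 + count). The sketch below is a dependency-free approximation: it swaps in plain substring counting instead of the Jieba segmentation the description mentions, and the keyword list and sample text are hypothetical.

```python
import math


def digital_transformation_index(report_text, keywords):
    """Return ln(1 + total keyword frequency) for one annual report.

    Assumption: occurrences are counted by non-overlapping substring
    matching per keyword. The described method uses Jieba word
    segmentation, which can give different counts, for example when one
    keyword is contained in another.
    """
    total = sum(report_text.count(keyword) for keyword in keywords)
    return math.log(1 + total)


# Hypothetical keyword table and report excerpt, for illustration only.
sample_keywords = ["大数据", "人工智能", "区块链"]
sample_text = "公司加快大数据平台建设，并将人工智能与大数据用于生产管理。"
sample_score = digital_transformation_index(sample_text, sample_keywords)
```

Adding 1 before taking the logarithm keeps the indicator defined (and equal to 0) for reports that contain none of the keywords.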
FDCompCN is a new fraud detection dataset for detecting financial statement fraud of companies in China. We construct a multi-relation graph based on the supplier, customer, shareholder, and financial information disclosed in the financial statements of Chinese companies. These data are obtained from the China Stock Market and Accounting Research (CSMAR) database. We select samples between 2020 and 2023, including 5,317 publicly listed Chinese companies traded on the Shanghai, Shenzhen, and Beijing Stock Exchanges.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The raw data include the input data for the proposed model in three financial markets. The data were downloaded from the WIND database and the CSMAR database.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The raw data include the input data for evaluating the proposed model. The data were downloaded from the WIND database, the CSMAR database, and Investing.com.
A daily emerging stock market dataset (Chinese CSI 300 dataset) including 300 stocks and 5,088 time steps from the CSMAR database. We construct our stock dataset using a pool of stocks from the CSI 300 index for the last 21 years, from 01/02/2000 to 12/31/2020. Instead of all stocks in the market, we select the stocks that used to belong to the major market index CSI 300, and filter out stocks that have missing price data over the period.
For each trading day, we use the fundamental price features as the features of stocks, including open price, close price, and volume. Additionally, we normalize price features such as open price and close price with logarithm.
The stocks are randomly split into five non-overlapping sub-datasets. For each subset, the first 90% of trading days are used as training data, the following 5% as validation data, and the remaining 5% as test data.
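The preprocessing and splitting scheme above might be sketched as follows. The rounding rule at the split boundaries, the random seed, and the way stocks are dealt into subsets are assumptions; the dataset description does not specify them.

```python
import math
import random


def log_normalize(prices):
    """Apply a natural-log transform to positive price values."""
    return [math.log(p) for p in prices]


def split_stocks(tickers, n_subsets=5, seed=0):
    """Randomly partition tickers into non-overlapping subsets.

    Assumption: subsets are formed by shuffling and dealing tickers
    round-robin, so their sizes differ by at most one.
    """
    shuffled = list(tickers)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def chronological_split(days, train_frac=0.90, val_frac=0.05):
    """Split an ordered sequence of trading days into train/val/test.

    Boundaries are rounded down (an assumption); the test set takes
    whatever remains after the training and validation slices.
    """
    n = len(days)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return days[:train_end], days[train_end:val_end], days[val_end:]
```

Splitting by time rather than at random keeps every validation and test day after all training days, which avoids leaking future prices into training.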
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Corporate social responsibility (CSR) has been widely discussed. However, the existing literature does not delve into the theoretical mechanism to show how companies adjust their CSR in the face of minimum wage increases. This may be due to the lack of a theoretical framework that clarifies the relationship between minimum wage increases and CSR adjustments. The objective of this study is to fill this gap by investigating the impact of minimum wage increases on CSR, employing both cost stickiness and optimal distinctiveness theories. We use data from the CSMAR database, the Human Resources and Social Security Administration, and the Hexun rating system. The subject of this study is China's A-share listed companies during 2010–2020. This study employs fixed-effects models for panel data. The findings reveal that minimum wage increases are significantly associated with a reduction in both strategic CSR and responsive CSR. Notably, the decrease in responsive CSR outweighs that of strategic CSR. Furthermore, our results indicate that customer concentration and CSR sensitivity each significantly moderate this relationship. In particular, firms with higher customer concentration are less responsive to minimum wage increases in their CSR activities, and firms with higher CSR sensitivity are more likely to mitigate the reducing effect of the minimum wage on CSR. By revealing how minimum wage increases affect CSR and its economic consequences, our study provides recommendations for policymakers to measure the impact of minimum wage policies at the firm level.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The data is sourced from the CSMAR database, covering violation records of companies listed on the Shanghai and Shenzhen stock exchanges from 2015 to 2020, focusing on five types of financial fraud: fictitious profits, inflated assets, false records, material omissions, and inaccurate disclosures. After excluding financial firms, the fraud sample set includes 2,652 violation records from 1,226 companies. Additionally, 2,652 high-quality companies without fraud were selected from the CNRDS ESG rating database to form the non-fraud sample set. The dataset consists of two parts: 1) Structured data: The file "financial fraud dataset (structured data).xlsx" contains 5,304 records covering 43 fields, such as basic company information, financial indicators, structural indicators, and linguistic features of annual report texts. Field names are listed in Table 1. 2) Annual report text data: The folder named "Annual report text data" includes 2,652 fraud samples (file names formatted as Symbol-Year.txt) and 2,652 non-fraud samples (same format). The files contain the MD&A sections of listed companies' annual reports.
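Because the annual report text files follow the Symbol-Year.txt naming pattern, each file name can be mapped back to a company symbol and fiscal year before joining with the structured table. The helper below is a hypothetical sketch; the example file name is invented, and the assumption that the year is the part after the last hyphen is not confirmed by the dataset description.

```python
from pathlib import Path


def parse_report_name(filename):
    """Split a 'Symbol-Year.txt' file name into (symbol, year).

    The symbol is kept as a string so that leading zeros in stock
    codes are preserved.
    """
    stem = Path(filename).stem
    symbol, _, year = stem.rpartition("-")
    if not symbol or not year.isdigit():
        raise ValueError(f"unexpected file name: {filename!r}")
    return symbol, int(year)
```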
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This study considered non-financial A-share listed corporations between 2007 and 2021, with enterprise data retrieved from the CSMAR database, the Wind database, and various provincial and municipal statistical yearbooks. The collected data was processed as follows: (1) Given the adjustments to industry codes by the China Securities Regulatory Commission in 2012, this study standardized industry codes from before 2012 to the dimension of codes after 2012. (2) The sample industry was restricted to manufacturing, and all ST, *ST, and PT company samples were excluded. (3) Samples of corporations with severely missing fundamental data were removed. (4) Samples of corporations with a debt–asset ratio exceeding 1 were removed. (5) A 1% trimming was applied to all dummy variables. The final sample comprised 21,273 observations, with 6,868 samples designated as the experimental group and 14,405 samples designated as the control group.
Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
This database contains datasets on the innovation efficiency of specialized and innovative enterprises, the knowledge breadth of patents, and the evaluation system of digitalization level. Enterprise innovation efficiency is calculated with DEAP and Front41, the knowledge breadth of patents is calculated manually by the author, and the digitalization level is calculated with Stata 18.0. The innovation efficiency data come from the CSMAR database and the Flush iFinD database, the patent data come from the China National Intellectual Property Administration, and the digital intelligence level data come from the CSMAR database. Missing data and outliers are handled as follows: first, the annual reports of companies with missing data are checked for verification; second, if the value is still missing after verification, the outlier is treated as a missing value and filled in by interpolation.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Based on the above, this study selects A-share-listed manufacturing firms from 2014 to 2023 as the research sample. Data on big data applications are extracted from firm annual reports published on the official websites of the Shenzhen Stock Exchange and the Shanghai Stock Exchange. To ensure the validity and robustness of the constructed indicators, the measurement of disruptive innovation draws on patent data from the China National Intellectual Property Administration (CNIPA), covering the period from 2000 to 2023. The specific measurement methodology is detailed in Section 3.2. Additional firm-level data are primarily obtained from the China Stock Market & Accounting Research (CSMAR) database and Wind Information Co., Ltd. (Wind). The data were processed as follows: (1) firms designated as Special Treatment (ST, firms that have exhibited financial distress for two consecutive years), particularly Special Treatment (*ST, firms that have reported consecutive losses for three years or face the risk of trading suspension), or Particular Transfer (PT) were excluded; (2) financial institutions were removed; (3) firms with substantial missing values for key variables were excluded; (4) to mitigate the influence of extreme values on the empirical results, selected variables (such as market-oriented disruptive innovation, technology-oriented disruptive innovation, managerial myopia, and government intervention) were winsorized at the 1st and 99th percentiles. After applying the above criteria, a total of 21,203 valid firm-year observations were retained for analysis.
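Winsorizing at the 1st and 99th percentiles, as in step (4), replaces values beyond those cutoffs with the cutoffs themselves rather than dropping them. The sketch below is a minimal pure-Python version; the linear-interpolation percentile rule is an assumption, since statistical packages differ in how they define percentiles.

```python
def percentile(sorted_values, q):
    """Percentile q (0 to 100) of pre-sorted data, by linear interpolation."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clamp values to the [lower_pct, upper_pct] percentile range.

    The number of observations is unchanged, which is what distinguishes
    winsorizing from trimming.
    """
    if not values:
        return []
    ordered = sorted(values)
    low_cut = percentile(ordered, lower_pct)
    high_cut = percentile(ordered, upper_pct)
    return [min(max(v, low_cut), high_cut) for v in values]
```

For example, applied to the integers 1 through 101, only the smallest and largest values change: 1 is raised to the 1st-percentile cutoff and 101 is lowered to the 99th-percentile cutoff.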
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
*** READ ME ***
This note describes code and data for the paper "Does innovative disruption impact credit markets: Evidence from China".
This dataset combines U.S. bond market data (1970-2019) from Mergent FISD, Compustat and WRDS with VC/IPO data from Becker and Ivashina (2023), alongside Chinese market data from iFind (bonds), CSMAR (IPOs) and Zero2IPO (VC). All data sources are described in detail in Section 2.1 of the paper.
All the code is in Stata do files, which run on StataMP 18.
* The Data
There are several data files, whose names end in .dta.
For the Chinese market:
Bond rating (at issue).dta: Contains bond issuer credit ratings at issuance for Chinese firms.
Bonds 2000-2024.dta: Provides bond characteristics from the iFind database (2000–2024).
Default Table Variable.dta: Includes bond default records.
IPO share.dta: Reports industry-level IPO activity from CSMAR.
TVPI.dta: Contains industry-level Total Value to Paid-In (TVPI) metrics.
VC.dta: Captures industry-level venture capital flows from Zero2IPO.
For the U.S. market:
Bond rating (at issue).dta: Records bond issuer credit ratings at issuance for U.S. firms.
Burgiss.dta: Provides Burgiss-sourced VC data by industry (from Becker and Ivashina 2023).
compustat panel data.dta: Includes firm-level fundamentals from Compustat.
Default Table Variable.dta: Lists bond default events.
ff30 encode.dta: Maps Fama-French 30 industry classifications.
FF30 industry.dta: Converts SIC codes to Fama-French 30 industries.
ipo count by ff30 year CSTAT.dta: Tracks IPO activity by Fama-French 30 industry.
Mergent Bonds 1950-2020.dta: Contains bond characteristics from Mergent FISD (1950–2020).
ratings panel.dta: Reports Standard & Poor’s issuer credit ratings.
VC by ff30 year.dta/VC by sector year.dta: Detail VC investments by Fama-French 30/sector-year.
* The Code
Two Stata do files, "code China.do" and "code US.do", contain the code for the paper.
https://www.archivemarketresearch.com/privacy-policy
The global financial data services market is projected to exhibit robust growth over the coming years, driven by the increasing adoption of data-driven decision-making, regulatory requirements, and the proliferation of digital technologies. The market is anticipated to reach a valuation of USD 108.39 million by 2033, expanding at a CAGR of 5.9% from 2025 to 2033. Key market drivers include the growing need for real-time and accurate financial data, the increasing use of artificial intelligence (AI) and machine learning (ML) for data analysis, and the rising demand for financial data by non-financial institutions. Key trends shaping the industry include the growing adoption of cloud-based data services, the emergence of real-time data analytics platforms, the increasing focus on data privacy and security, and the proliferation of mobile financial data services. The market is segmented into application (financial companies, non-financial companies, colleges & academies, non-profit institutions, individual investors), type (macroeconomic data, industrial data), and region (North America, South America, Europe, Middle East & Africa, Asia Pacific). Key market players include Wind, Choice, CSMAR, Bloomberg, Hexun, Resset, iFinD, Investing.com, Sinofin, and others. Regional markets are expected to exhibit varying growth trajectories, with Asia Pacific projected to witness significant growth driven by the burgeoning financial sectors in China and India. North America and Europe are expected to maintain their dominance in terms of market share.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In calculating the carbon emissions of listed companies, we use fossil energy consumption, electricity consumption, and heat consumption manually collected from the corporate social responsibility reports (CSR reports), sustainable development reports, and environmental reports disclosed annually by 3,352 listed Chinese companies from 2003 to 2021. We manually compiled the list and timing of SEZ upgrades in 272 cities. The information on SEZ upgrades comes from the China Development Zone Audit Announcement (2006 edition) released by the NDRC and other departments. This dataset records the details of state- and provincial-level SEZs that were officially recognized after a large-scale rectification in 2003, including the names, types, and approval authorities of SEZs. The financial information of the listed companies used in the control variables all comes from the China Stock Market & Accounting Research Database (CSMAR Database).
https://api.github.com/licenses/unlicense
The tone data used in this paper are mainly from the China Research Data Service (CNRDS). To keep the sample data as complete as possible, we crawl and supplement the earnings communication conference texts that are missing from CNRDS for all listed companies it covers. The main websites used for this supplement are as follows: https://rs.p5w.net is a large centralized website on which listed companies hold earnings communication conferences, and it includes the earnings communication conference information of most A-share listed companies; SSE e Interactive (http://sns.sseinfo.com) is the official website of the SSE and collects a considerable share of earnings communication conference information; the Securities Times website (http://www.stcn.com) contains a small amount of information about earnings communication conferences. We first identified the earnings communication conferences missing from CNRDS and then carried out three rounds of manual screening, Python crawling, and text acquisition on the above three websites in turn. This yields the final set of earnings communication conference texts used in this study. Other company data are downloaded from the Wind database and the CSMAR database.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The replication materials contain this README file, two do-files
("ChinaConnections_Alonso_etal_dofilePREP.do" and "ChinaConnections_Alonso_etal_dofileRESULTS.do"),
two Excel datasets ("CPI" and "DataPrep_POLITICIANS"), and the Appendix to the paper.
To run the code, the user will need additional proprietary data from CSMAR that we cannot share.
Alonso, M., Palma, N. and Simon-Yarza, B. (2022). The Value of Political Connections: Evidence from China's Anti-Corruption Campaign. Journal of Institutional Economics, forthcoming