8 datasets found
  1. Fama–French Factors and Portfolios

    • kaggle.com
    zip
    Updated Oct 30, 2025
    Cite
    Nikita Manaenkov (2025). Fama–French Factors and Portfolios [Dataset]. https://www.kaggle.com/datasets/nikitamanaenkov/famafrench-factors-and-portfolios
    Available download formats: zip (177539895 bytes)
    Dataset updated
    Oct 30, 2025
    Authors
    Nikita Manaenkov
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset provides foundational factor and portfolio return data used in empirical finance and asset pricing research. It contains:

    - Fama–French 3-Factor and 5-Factor models
    - Size (ME), Book-to-Market (B/M), Operating Profitability (OP), and Investment (Inv) portfolios
    - Bivariate portfolios (e.g., 2x3 Size-B/M sorts)
    - Industry portfolio returns

    All data originate from the Kenneth R. French Data Library and are based on CRSP and Compustat databases. Data are value-weighted and expressed in percentages.

    Some files in this dataset contain header comments describing data sources and methodology (as shown below):

    This file was created using the 202508 CRSP database.
    The 1-month TBill rate data until 202405 are from Ibbotson Associates. 
    Starting from 202406, the 1-month TBill rate is from ICE BofA US 1-Month Treasury Bill Index.
    

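    The header lines above carry no comment prefix, so the pandas comment parameter, which ignores lines that start with a given character, cannot remove them. Skip them by position instead, either by detecting the header row (Example 1) or by skipping a known number of lines (Example 2); use comment only when the comment lines share a prefix (Example 3):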

    Example 1 — Automatically detect header rows:

    import pandas as pd
    
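    file_path = "F-F_Research_Data_5_Factors_2x3.csv"
    
    # Find the index of the header line (the one holding the column names)
    skip_rows = None
    with open(file_path) as f:
        for i, line in enumerate(f):
            if "Mkt-RF" in line:
                skip_rows = i
                break
    
    if skip_rows is None:
        raise ValueError(f"No header line containing 'Mkt-RF' in {file_path}")
    
    # Assumes comma-separated values, as the .csv extension suggests;
    # switch to sep=r"\s+" if the file turns out to be whitespace-delimited
    df = pd.read_csv(file_path, skiprows=skip_rows, sep=",", skipinitialspace=True)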
    print(df.head())
    

    Example 2 — Skip a known number of comment lines manually:

    # sep="," assumes a comma-separated file; use sep=r"\s+" for whitespace-delimited data
    df = pd.read_csv("F-F_Research_Data_5_Factors_2x3.csv", skiprows=3, sep=",")
    

    Example 3 — If comments are prefixed (e.g., with #):

    df = pd.read_csv("F-F_Research_Data_5_Factors_2x3.csv", comment="#", sep=",")
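
    A file of this kind may hold more than one table (for example a monthly block followed by an annual block), in which case reading everything after the header mixes the two. The sketch below uses only the standard library and shows one way to isolate the first table: find the header line, then stop at the first blank line. The sample text, its numbers, the function name first_table, and the assumption that tables are separated by a blank line are all illustrative and not taken from the dataset files.

```python
import csv
import io

# Made-up sample in the assumed layout: comment lines, a blank line,
# a comma-separated table, a blank line, then a second table.
SAMPLE = """\
This file was created using the 202508 CRSP database.
The 1-month TBill rate data until 202405 are from Ibbotson Associates.

,Mkt-RF,SMB,HML,RMW,CMA,RF
196307,   -0.39,   -0.48,   -0.81,    0.64,   -1.15,    0.27
196308,    5.08,   -0.75,    1.65,    0.40,   -0.38,    0.25

 Annual Factors: January-December
,Mkt-RF,SMB,HML,RMW,CMA,RF
1964,   12.50,    0.10,    9.80,   -2.50,    7.00,    3.50
"""


def first_table(text, marker="Mkt-RF"):
    """Return the rows of the first table as a list of dicts (values stay strings)."""
    lines = text.splitlines()
    # Locate the column-name line; fail loudly if the marker is missing.
    start = next((i for i, ln in enumerate(lines) if marker in ln), None)
    if start is None:
        raise ValueError(f"no header line containing {marker!r}")
    # Assumption: the table ends at the first blank line after the header.
    end = start + 1
    while end < len(lines) and lines[end].strip():
        end += 1
    block = "\n".join(lines[start:end])
    reader = csv.reader(io.StringIO(block), skipinitialspace=True)
    header = next(reader)
    header[0] = "date"  # the first column is unnamed in the sample
    return [dict(zip(header, row)) for row in reader]


rows = first_table(SAMPLE)
print(rows[0]["date"], rows[0]["Mkt-RF"])
```

    The same start and end indices could instead be handed to pd.read_csv through skiprows and nrows if a DataFrame is preferred.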
    

    File Structure Example

    Column    Description
    Mkt-RF    Market excess return
    SMB       Small minus Big (size factor)
    HML       High minus Low (book-to-market factor)
    RMW       Robust minus Weak (profitability factor)
    CMA       Conservative minus Aggressive (investment factor)
    RF        Risk-free rate (1-month Treasury Bill)
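
    Because the values are expressed in percentages and Mkt-RF is an excess return, the raw market return can be recovered by adding RF back and dividing by 100. The snippet below is a minimal sketch of that arithmetic; the row values and the function name market_return are made up for illustration.

```python
def market_return(row):
    """Raw market return as a decimal fraction, from one row of percent values."""
    # Mkt-RF is the market return in excess of RF, so add RF back,
    # then convert from percent to a decimal fraction.
    return (row["Mkt-RF"] + row["RF"]) / 100.0


# Hypothetical monthly observation, in percent.
example_row = {"Mkt-RF": 1.50, "SMB": 0.20, "HML": -0.10, "RF": 0.25}
r_mkt = market_return(example_row)
print(round(r_mkt, 6))
```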
  2. Portfolio performance based on industry categorization for the 49 Fama and...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Liyun Wu; Muneeb Ahmad; Salman Ali Qureshi; Kashif Raza; Yousaf Ali Khan (2023). Portfolio performance based on industry categorization for the 49 Fama and French portfolios. [Dataset]. http://doi.org/10.1371/journal.pone.0272521.t007
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Liyun Wu; Muneeb Ahmad; Salman Ali Qureshi; Kashif Raza; Yousaf Ali Khan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Portfolio performance based on industry categorization for the 49 Fama and French portfolios.

  3. Data for: Can the seasonal pattern of consumption growth reproduce habits in...

    • data.mendeley.com
    • narcis.nl
    Updated Oct 13, 2020
    Cite
    Javier Rojo-Suárez (2020). Data for: Can the seasonal pattern of consumption growth reproduce habits in the cross-section of stock returns? Evidence from the European equity market [Dataset]. http://doi.org/10.17632/frpm7rywcn.2
    Dataset updated
    Oct 13, 2020
    Authors
    Javier Rojo-Suárez
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Area covered
    Europe
    Description

    We compile all return and macroeconomic data from Kenneth French's website and the OECD statistical data warehouse, respectively, for the period from January 1990 to December 2018. All return and macroeconomic data include the following countries: Austria, Belgium, Denmark, Finland, France, Germany, Greece, Ireland, Italy, Netherlands, Norway, Portugal, Spain, Sweden, Switzerland and United Kingdom. The dataset comprises the following series:

    1. Fama-French factors, 3-factor model, as provided by Kenneth French (Europe_3_Factors.txt).
    2. Fama-French factors, 5-factor model, as provided by Kenneth French (Europe_5_Factors.txt).
    3. Returns for 25 size-BE/ME portfolios, as provided by Kenneth French (Europe_25_Portfolios_ME_BE-ME.txt).
    4. Returns for 25 size-momentum portfolios, as provided by Kenneth French (Europe_25_Portfolios_ME_Prior_12_2.txt).
    5. Weighted average per capita consumption growth. We first collect quarterly chained volume estimates for consumption in nondurables and services, non-seasonally adjusted, in national currency, for the 16 countries under consideration (‘Non-durable goods’ and ‘Services’ series, LNBQR measure). Second, we use the population series provided by the OECD to determine per capita consumption growth series for each country. Finally, we estimate the average consumption growth for the economies under consideration, weighting by population (Europe_Consumption_Q.txt).
    6. Weighted average consumer confidence index (CCI). We collect monthly CCI data as provided by the OECD (‘OECD Standardised CCI, Amplitude adjusted, sa’ series, dataset ‘Composite Leading Indicators’, MEI). We determine the average CCI for the economies under consideration, weighting by population (Europe_Indicators_Q.txt).
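
    The population-weighted consumption growth in item 5 can be sketched as follows: divide each country's consumption by its population, take period-over-period growth, then average across countries with population weights. The description does not say which period's population supplies the weights, so the code assumes the population of the period in which growth is observed. Country labels, numbers, and function names are hypothetical.

```python
def per_capita_growth(consumption, population):
    """Period-over-period growth of per capita consumption for one country."""
    per_capita = [c / p for c, p in zip(consumption, population)]
    return [per_capita[t] / per_capita[t - 1] - 1.0 for t in range(1, len(per_capita))]


def weighted_average_growth(data):
    """data maps country -> (consumption series, population series).

    Assumption: weights are the populations of the period in which
    each growth rate is observed.
    """
    growth = {k: per_capita_growth(c, p) for k, (c, p) in data.items()}
    n_periods = len(next(iter(growth.values())))
    result = []
    for t in range(n_periods):
        weights = {k: data[k][1][t + 1] for k in data}
        total = sum(weights.values())
        result.append(sum(weights[k] * growth[k][t] for k in data) / total)
    return result


# Made-up two-country, two-quarter example.
sample = {
    "A": ([100.0, 110.0], [10.0, 10.0]),  # per capita growth of 10%
    "B": ([200.0, 200.0], [30.0, 30.0]),  # per capita growth of 0%
}
avg_growth = weighted_average_growth(sample)
print(avg_growth)
```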
  4. Data from: Liquidity, time-varying betas and anomalies. Is the high trading...

    • data.mendeley.com
    Updated Nov 19, 2019
    Cite
    Paper authors Paper authors (2019). Liquidity, time-varying betas and anomalies. Is the high trading activity enhancing the validity of the CAPM in the UK equity market? [Dataset]. http://doi.org/10.17632/56n2yxgpcf.1
    Dataset updated
    Nov 19, 2019
    Authors
    Paper authors Paper authors
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Area covered
    United Kingdom
    Description

    Using all stocks listed in the London Stock Exchange for the period from January 1989 to December 2018, the dataset comprises the following series:

    1. Annual returns for 20 asset growth portfolios, following Fama and French (1993) methodology.
    2. Annual returns for 25 size-book-to-market equity portfolios, following Fama and French (1993) methodology.
    3. Annual returns for 62 industry portfolios, using two-digit SIC codes.
    4. Fama and French (1993) factors for their three-factor model (RM, SMB and HML).
    5. Fama and French (2015) factors for their five-factor model (RM, SMB, HML, RMW, and CMA).
    6. Variation of the Amihud illiquidity measure for the London Stock Exchange, following Amihud (2002) methodology.
    7. Three-month interest rate of the Treasury Bill for the United Kingdom, as provided by the OECD database.
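
    For orientation on item 6, the baseline Amihud (2002) measure is the average over a period of the absolute return divided by the value traded. The sketch below implements only that baseline; the dataset describes a variation whose details are not given here, so this is not the authors' exact series. Building value traded from volume times price, and skipping zero-volume observations, are assumptions.

```python
def amihud_illiquidity(returns, value_traded):
    """Average of |return| / value traded over the supplied observations.

    Observations with non-positive value traded are skipped (an assumption;
    other treatments are possible).
    """
    pairs = [(r, v) for r, v in zip(returns, value_traded) if v > 0]
    if not pairs:
        raise ValueError("no observations with positive value traded")
    return sum(abs(r) / v for r, v in pairs) / len(pairs)


# Made-up example: two days of returns and traded value.
illiq = amihud_illiquidity([0.01, -0.02], [1000.0, 2000.0])
print(illiq)
```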

    We have produced these series using the following data from Thomson Reuters Datastream: (i) total return index (RI series), (ii) market value (MV series), (iii) market-to-book equity (PTBV series), (iv) total assets (WC02999 series), (v) return on equity (WC08301 series), (vi) tax rate (WC08346 series), (vii) primary SIC codes, (viii) turnover by volume (VO series), and (ix) the market price (P series). Following Griffin et al. (2010), we use the generic rules provided by the authors for excluding non-common equity securities from Datastream data.

    REFERENCES:

    Amihud, Y. (2002). Illiquidity and stock returns: Cross-section and time-series effects. Journal of Financial Markets, 5, 31–56.
    Fama, E. F. and French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33, 3–56.
    Fama, E. F. and French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116, 1–22.
    Griffin, J. M., Kelly, P., and Nardari, F. (2010). Do market efficiency measures yield correct inferences? A comparison of developed and emerging markets. Review of Financial Studies, 23, 3225–3277.

  5. DataSheet1_Network Structures for Asset Return Co-Movement: Evidence From...

    • frontiersin.figshare.com
    pdf
    Updated Jun 5, 2023
    Cite
    Huai-Long Shi; Huayi Chen (2023). DataSheet1_Network Structures for Asset Return Co-Movement: Evidence From the Chinese Stock Market.pdf [Dataset]. http://doi.org/10.3389/fphy.2022.593493.s001
    Available download formats: pdf
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    Frontiers
    Authors
    Huai-Long Shi; Huayi Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This article focuses on the detailed network structure of the co-movement for asset returns. Based on the Chinese sector indices and Fama-French five factors, we conducted return decomposition and constructed a minimum spanning tree (MST) in terms of the rank correlation among raw return, idiosyncratic return, and factor premium. With the adoption of a rolling window analysis, we examined the static and time-varying characteristics associated with the MST(s). We obtained the following findings: 1) A star-like structure is presented for the whole sample period, in which market factor MKT acts as the hub node; 2) the star-like structure changes during the periods for major market cycles. The idiosyncratic returns for some sector indices would be disjointed from MKT and connected with their counterparts and other pricing factors; and 3) the effectiveness of pricing factors is time-varying, and investment factor CMA seems redundant in the Chinese market. Our work provides a new perspective for the research of asset co-movement, and the test of the effectiveness of empirical pricing factors.
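
    A minimum spanning tree built from rank correlations can be sketched in a few steps: compute Spearman correlations between series, turn them into distances, and grow the tree with Prim's algorithm. The distance transform d = sqrt(2(1 - rho)) is a common choice in the correlation-network literature and is an assumption here, not something the description states; the series, names, and the no-ties simplification are likewise illustrative.

```python
import math


def ranks(values):
    """Ranks 1..n; assumes no tied values (ties would need average ranks)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = float(rank)
    return out


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def mst_edges(series):
    """Prim's algorithm on distances d = sqrt(2 * (1 - rho)) (assumed transform)."""
    names = list(series)

    def dist(u, v):
        rho = spearman(series[u], series[v])
        return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))

    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge joining a tree node u to a node v outside the tree.
        u, v = min(
            ((u, v) for u in in_tree for v in names if v not in in_tree),
            key=lambda e: dist(*e),
        )
        edges.append((u, v))
        in_tree.add(v)
    return edges


# Made-up return series in which both sectors track MKT more closely than each other.
toy = {
    "MKT": [1.0, 2.0, 3.0, 4.0, 5.0],
    "SECTOR_A": [1.0, 2.0, 3.0, 5.0, 4.0],
    "SECTOR_B": [2.0, 1.0, 3.0, 4.0, 5.0],
}
tree = mst_edges(toy)
print(tree)
```

    In this toy input both sector series attach to MKT, which gives the star-like shape the description mentions; a rolling-window analysis would repeat the construction on successive sub-samples.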

  6. Data for: Nuclear hazard and asset prices: Implications of nuclear disasters...

    • data.mendeley.com
    Updated Nov 3, 2020
    Cite
    Ana Belén Alonso-Conde (2020). Data for: Nuclear hazard and asset prices: Implications of nuclear disasters in the cross-sectional behavior of stock returns [Dataset]. http://doi.org/10.17632/wv94fj59t4.2
    Dataset updated
    Nov 3, 2020
    Authors
    Ana Belén Alonso-Conde
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    Using all stocks listed in the Tokyo Stock Exchange and macroeconomic data for Japan, the dataset comprises the following series:

    1. Japan_25_Portfolios_MV_PTBV: Monthly returns for 25 size-book-to-market equity portfolios, following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    2. Japan_25_Portfolios_MV_PE: Monthly returns for 25 size-PE portfolios, following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    3. Japan_50_Portfolios_SECTOR: Monthly returns for 50 industry portfolios. (Raw data source: Datastream database)
    4. Japan_3 Factors: Fama and French three-factors (RM, SMB and HML), following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    5. Japan_5 Factors: Fama and French five-factors (RM, SMB, HML, RMW, and CMA), following the Fama and French (2015) methodology. (Raw data source: Datastream database)
    6. Japan_NUCLEAR_Y: Instrument in years with a value of 1 when a nuclear disaster has occurred somewhere in the world and 0 otherwise. (Raw data source: Bloomberg and BBC News)
    7. Japan_NUCLEAR_M: Instrument in months with a value of 1 when a nuclear disaster has occurred somewhere in the world and 0 otherwise. (Raw data source: Bloomberg and BBC News)
    8. Japan_RF_M: Three-month interest rate of the Treasury Bill for Japan. (Raw data source: OECD)
    9. Company data: Names and general data of the companies that constitute the sample. (Raw data source: Datastream database)
    10. Number of stocks in portfolios: Number of stocks included each year in Japan_25_Portfolios_MV_PTBV, Japan_25_Portfolios_MV_PE and Japan_50_Portfolios_SECTOR. (Raw data source: Datastream database)

    We have produced all return series using the following data from Datastream: (i) total return index (RI series), (ii) market value (MV series), (iii) market-to-book equity (PTBV series), (iv) total assets (WC02999 series), (v) return on equity (WC08301 series), (vi) price-to-earnings ratio (PE series), and (vii) industry (SECTOR series). We have used the generic rules suggested by Griffin, Kelly, & Nardari (2010) for excluding non-common equity securities from Datastream data. We also exclude stocks with fewer than twelve observations. Accordingly, our sample comprises a total of 5,212 stocks.

    REFERENCES:

    Fama, E. F. and French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33, 3–56.
    Fama, E. F. and French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116, 1–22.
    Griffin, J. M., Kelly, P., and Nardari, F. (2010). Do market efficiency measures yield correct inferences? A comparison of developed and emerging markets. Review of Financial Studies, 23, 3225–3277.

  7. innovative disruption and bond default

    • zenodo.org
    Updated Oct 21, 2025
    Cite
    Zenodo (2025). innovative disruption and bond default [Dataset]. http://doi.org/10.5281/zenodo.15511564
    Dataset updated
    Oct 21, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description


    *** READ ME ***

    This note describes code and data for the paper "Does innovative disruption impact credit markets: Evidence from China".

    This dataset combines U.S. bond market data (1970-2019) from Mergent FISD, Compustat and WRDS with VC/IPO data from Becker and Ivashina (2023), alongside Chinese market data from iFind (bonds), CSMAR (IPOs) and Zero2IPO (VC). All data sources are described in detail in Section 2.1 of the paper.

    All the code is in a single do file which runs on StataMP 18.

    * The Data
    There are several data files, whose names end in .dta.

    For the Chinese market:

    Bond rating (at issue).dta: Contains bond issuer credit ratings at issuance for Chinese firms.

    Bonds 2000-2024.dta: Provides bond characteristics from the iFind database (2000–2024).

    Default Table Variable.dta: Includes bond default records.

    IPO share.dta: Reports industry-level IPO activity from CSMAR.

    TVPI.dta: Contains industry-level Total Value to Paid-In (TVPI) metrics.

    VC.dta: Captures industry-level venture capital flows from Zero2IPO.

    For the U.S. market:

    Bond rating (at issue).dta: Records bond issuer credit ratings at issuance for U.S. firms.

    Burgiss.dta: Provides Burgiss-sourced VC data by industry (from Becker and Ivashina 2023).

    compustat panel data.dta: Includes firm-level fundamentals from Compustat.

    Default Table Variable.dta: Lists bond default events.

    ff30 encode.dta: Maps Fama-French 30 industry classifications.

    FF30 industry.dta: Converts SIC codes to Fama-French 30 industries.

    ipo count by ff30 year CSTAT.dta: Tracks IPO activity by Fama-French 30 industry.

    Mergent Bonds 1950-2020.dta: Contains bond characteristics from Mergent FISD (1950–2020).

    ratings panel.dta: Reports Standard & Poor’s issuer credit ratings.

    VC by ff30 year.dta/VC by sector year.dta: Detail VC investments by Fama-French 30/sector-year.
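
    Converting SIC codes to an industry classification, as "FF30 industry.dta" is described as doing, is in essence a range lookup. The paper's code is in Stata; the Python sketch below only illustrates the idea. The ranges and industry labels are placeholders invented for this example and are NOT the actual Fama-French 30 definitions, which should be taken from the dataset file itself.

```python
import bisect

# HYPOTHETICAL ranges for illustration only; not the real Fama-French 30 mapping.
# Must be sorted by start code and non-overlapping.
SIC_RANGES = [
    (100, 999, "IND_A"),
    (2000, 2099, "IND_B"),
    (6000, 6999, "IND_C"),
]
_STARTS = [start for start, _, _ in SIC_RANGES]


def sic_to_industry(sic, default="Other"):
    """Return the label of the range containing sic, or default if none does."""
    # Index of the last range whose start is <= sic.
    i = bisect.bisect_right(_STARTS, sic) - 1
    if i >= 0:
        start, end, label = SIC_RANGES[i]
        if start <= sic <= end:
            return label
    return default


print(sic_to_industry(2050), sic_to_industry(1500))
```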

    * The Code
    Two Stata do-files, "code China.do" and "code US.do", contain the code for the paper.

  8. Sudden CEO Deaths in Other Industries – Univariate CAR Analysis. This table...

    • plos.figshare.com
    xls
    Updated Oct 14, 2025
    Cite
    Chen Li; Moumita Dutta; Jing Duo; Shantanu Dutta (2025). Sudden CEO Deaths in Other Industries – Univariate CAR Analysis. This table reports the cumulative abnormal returns (CARs) for industry peer firms surrounding the sudden deaths of CEOs in two different industries. Panel A presents CARs for transportation firms around the fatal shooting of Philip Trenary, the former CEO of Pinnacle Airlines, on September 27, 2018. Panel B presents CARs for electronic equipment firms around the death of Micron Technology’s active CEO, Steve Appleton, who died in a plane crash on February 3, 2012. Peer firms are defined by the same Fama-French 48 industry classification as the firm experiencing the CEO death. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% levels, respectively. [Dataset]. http://doi.org/10.1371/journal.pone.0334399.t010
    Available download formats: xls
    Dataset updated
    Oct 14, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Chen Li; Moumita Dutta; Jing Duo; Shantanu Dutta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sudden CEO Deaths in Other Industries – Univariate CAR Analysis. This table reports the cumulative abnormal returns (CARs) for industry peer firms surrounding the sudden deaths of CEOs in two different industries. Panel A presents CARs for transportation firms around the fatal shooting of Philip Trenary, the former CEO of Pinnacle Airlines, on September 27, 2018. Panel B presents CARs for electronic equipment firms around the death of Micron Technology’s active CEO, Steve Appleton, who died in a plane crash on February 3, 2012. Peer firms are defined by the same Fama-French 48 industry classification as the firm experiencing the CEO death. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% levels, respectively.
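
    A cumulative abnormal return sums, over the event window, the gap between a firm's realized return and its expected return. The description does not state which expected-return model the table uses, so the sketch below assumes a simple benchmark-adjusted version (firm return minus a benchmark return); the star mapping follows the 1%, 5%, and 10% levels given above. All numbers and function names are made up.

```python
def cumulative_abnormal_return(firm_returns, benchmark_returns):
    """Sum of (firm return - benchmark return) over the event window.

    Assumption: the benchmark return stands in for the expected return.
    """
    return sum(r - b for r, b in zip(firm_returns, benchmark_returns))


def significance_stars(p_value):
    """Map a p-value to ***, **, * at the 1%, 5%, and 10% levels."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


# Hypothetical three-day event window.
car = cumulative_abnormal_return([0.01, -0.02, 0.03], [0.005, 0.0, 0.01])
print(round(car, 6), significance_stars(0.03))
```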

Fama–French Factors and Portfolios

Empirical Finance Dataset for Quantitative Analysis and Factor Models

Explore at:
13 scholarly articles cite this dataset (View in Google Scholar)