36 datasets found
  1. Fama–French Factors and Portfolios

    • kaggle.com
    zip
    Updated Oct 30, 2025
    Cite
    Nikita Manaenkov (2025). Fama–French Factors and Portfolios [Dataset]. https://www.kaggle.com/datasets/nikitamanaenkov/famafrench-factors-and-portfolios
    Explore at:
    Available download formats: zip (177539895 bytes)
    Dataset updated
    Oct 30, 2025
    Authors
    Nikita Manaenkov
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset provides foundational factor and portfolio return data used in empirical finance and asset pricing research. It contains:

    • Fama–French 3-Factor and 5-Factor models
    • Size (ME), Book-to-Market (B/M), Operating Profitability (OP), and Investment (Inv) portfolios
    • Bivariate portfolios (e.g., 2x3 Size-B/M sorts)
    • Industry portfolio returns

    All data originate from the Kenneth R. French Data Library and are based on CRSP and Compustat databases. Data are value-weighted and expressed in percentages.

    Some files in this dataset contain header comments describing data sources and methodology (as shown below):

    This file was created using the 202508 CRSP database.
    The 1-month TBill rate data until 202405 are from Ibbotson Associates. 
    Starting from 202406, the 1-month TBill rate is from ICE BofA US 1-Month Treasury Bill Index.
    

    The header comments in these files have no prefix character, so the pandas comment parameter (which drops everything from a given character to the end of the line) cannot skip them. To read such files correctly in Python (pandas), skip those lines by position instead:

    Example 1: automatically detect the header row

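    import pandas as pd

    file_path = "F-F_Research_Data_5_Factors_2x3.csv"

    with open(file_path) as f:
        lines = f.readlines()

    # Find the line holding the column names (the first one mentioning "Mkt-RF")
    skip_rows = None
    for i, line in enumerate(lines):
        if "Mkt-RF" in line:
            skip_rows = i
            break
    if skip_rows is None:
        raise ValueError("no header line containing 'Mkt-RF' found")

    # Skip the comment lines above the header. sep="," assumes a comma-delimited
    # .csv; use sep=r"\s+" if the file is whitespace-delimited instead.
    df = pd.read_csv(file_path, skiprows=skip_rows, sep=",")
    print(df.head())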
    

    Example 2: skip a known number of comment lines manually

    # sep="," assumes a comma-delimited .csv; use sep=r"\s+" for whitespace-delimited files
    df = pd.read_csv("F-F_Research_Data_5_Factors_2x3.csv", skiprows=3, sep=",")
    

    Example 3: if comments are prefixed (e.g., with #)

    df = pd.read_csv("F-F_Research_Data_5_Factors_2x3.csv", comment="#", sep=",")
    

    File Structure Example

    Column    Description
    Mkt-RF    Market excess return
    SMB       Small minus Big (size factor)
    HML       High minus Low (book-to-market factor)
    RMW       Robust minus Weak (profitability factor)
    CMA       Conservative minus Aggressive (investment factor)
    RF        Risk-free rate (1-month Treasury Bill)
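
Once a file is loaded, two common preparation steps are parsing the date labels and rescaling the percentages. The sketch below is a minimal illustration, not part of the dataset: the values are made up, and the YYYYMM row labels are an assumption based on the month codes (such as 202508) that appear in the header comments.

```python
import pandas as pd

# Illustrative values only; column names follow the columns listed above.
raw = pd.DataFrame(
    {"Mkt-RF": [-0.39, 5.07], "SMB": [-0.41, -0.80], "RF": [0.27, 0.25]},
    index=[196307, 196308],  # assumed YYYYMM row labels
)

# Parse the YYYYMM integers into a monthly PeriodIndex.
dates = pd.to_datetime(raw.index.astype(str), format="%Y%m")
factors = raw.copy()
factors.index = dates.to_period("M")

# Convert percentages to decimal returns (1.00 percent becomes 0.01).
factors = factors / 100.0

# Total market return = market excess return + risk-free rate.
factors["Mkt"] = factors["Mkt-RF"] + factors["RF"]
print(factors)
```

Dividing by 100 once, right after loading, avoids mixing percent and decimal units in later calculations.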
  2. Fama-French factors and Benchmark portfolios for the UK

    • datacatalogue.ukdataservice.ac.uk
    Updated Apr 6, 2018
    Cite
    Tharyan, R, University of Exeter (2018). Fama-French factors and Benchmark portfolios for the UK [Dataset]. http://doi.org/10.5255/UKDA-SN-852704
    Explore at:
    Dataset updated
    Apr 6, 2018
    Authors
    Tharyan, R, University of Exeter
    Time period covered
    Oct 3, 1988 - Jun 30, 2016
    Area covered
    United Kingdom
    Description

    Datasets containing the daily, monthly and annual SMB, HML and momentum (UMD) factors for the UK market from October 1980 to June 2015 (daily data from October 1988 to June 2015), together with datasets containing the Fama-French and momentum portfolios used to create the SMB, HML and UMD factors, and other benchmark portfolios. For the benchmark portfolios, equal- and value-weighted return files are available, along with a file giving the number of portfolios per year and the cutoff points used to create the portfolios.
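
As a rough illustration of how SMB and HML are built from sorted portfolios, the sketch below applies the standard Fama and French (1993) 2x3 construction to made-up returns. The column codes (SL, SM, SH, BL, BM, BH) and the numbers are hypothetical, and the UK files may differ in detail from this textbook version.

```python
import pandas as pd

# Hypothetical monthly returns (in percent) for six size/book-to-market portfolios.
# S/B = small/big; L/M/H = low/medium/high book-to-market.
six = pd.DataFrame(
    {"SL": [1.0, -2.0], "SM": [1.5, -1.0], "SH": [2.0, 0.0],
     "BL": [0.5, -1.5], "BM": [1.0, -0.5], "BH": [1.5, 0.5]}
)

# SMB: average small-cap return minus average big-cap return.
smb = six[["SL", "SM", "SH"]].mean(axis=1) - six[["BL", "BM", "BH"]].mean(axis=1)

# HML: average high-B/M return minus average low-B/M return.
hml = six[["SH", "BH"]].mean(axis=1) - six[["SL", "BL"]].mean(axis=1)

print(pd.DataFrame({"SMB": smb, "HML": hml}))
```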

    The twin aims of this research project are first to provide a more satisfactory model of the cost of capital and asset pricing in the UK, and second to facilitate the creation and maintenance of a high-quality, survivorship-bias-free, standardised and regularly updated set of specific financial data for free use by academics, researchers and potentially also regulatory bodies such as the Competition Commission, the Office of Fair Trading (OFT), the Water Services Regulation Authority (OFWAT), the communications regulator (OFCOM) and other regulators. This will build on the Fama-French and momentum portfolios and factors for the UK market established by Gregory, Tharyan and Huang (2009). Our first objective is to expand this dataset to include the full range of portfolios available for the US. Second, we will expand the available factor and portfolio data to encompass ongoing developments in the literature relating to returns and asset pricing. Third, we will undertake a comprehensive range of asset pricing model tests to develop a more convincing model of the cross-section of UK stock returns. Lastly, we will develop the UK literature on the implied cost of capital (ICC): we will both test alternative models of the implied cost of capital and examine the role of implied, rather than realised, returns in asset pricing tests.

  3. Fama French 25 Portfolios

    • kaggle.com
    zip
    Updated Nov 9, 2025
    Cite
    Irwindeep (2025). Fama French 25 Portfolios [Dataset]. https://www.kaggle.com/datasets/irwindeep/fama-french-25-portfolios
    Explore at:
    Available download formats: zip (108223 bytes)
    Dataset updated
    Nov 9, 2025
    Authors
    Irwindeep
    Description

    Dataset

    This dataset was created by Irwindeep


  4. Fama-French 5 Factor Model, South Korea

    • kaggle.com
    zip
    Updated Apr 29, 2023
    Cite
    June (2023). Fama-French 5 Factor Model, South Korea [Dataset]. https://www.kaggle.com/datasets/hyeonjunkim/fama-french-5-factor-model-south-korea
    Explore at:
    Available download formats: zip (6550 bytes)
    Dataset updated
    Apr 29, 2023
    Authors
    June
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    South Korea
    Description

    This dataset is an intermediate result of research with Professor Jay M. Chung, Soongsil Univ. It consists of 6 columns: Mrt-Rf, HML, SMB, CMA, RMW, and RF.

    RF is the investment return on the 91-day monetary stabilization bond of South Korea, which serves as the risk-free rate in the South Korean stock market.

    To calculate the other factors I use the same method as Fama & French (2015), or a similar one. I will leave the link for the original paper. Link

    This data can be used to evaluate the performance or style of South Korean stock portfolios using a linear regression model with HML, SMB, CMA, RMW, and Mrt-RF as independent variables and monthly portfolio return minus RF as the dependent variable. I will make an introductory notebook in both English and Korean.

    The dataset's caveats are as follows.

    • The historical RF series is quite short, as the 91-day monetary stabilization bond is a fairly recent risk-free asset.
    • I did not exclude financial firms or other non-standard firms while constructing the portfolios.
    • Currently the factors contradict some previous studies.
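
The regression described above can be sketched as follows. This is only an illustration on simulated numbers, not the author's notebook: the factor values, loadings and noise level are all made up, and plain least squares via NumPy stands in for whatever estimation tool the author uses.

```python
import numpy as np

rng = np.random.default_rng(0)
n_months = 240
factor_names = ["Mrt-RF", "SMB", "HML", "RMW", "CMA"]

# Simulated monthly factor returns (stand-ins for the dataset's columns).
factors = rng.normal(0.0, 1.0, size=(n_months, len(factor_names)))

# Simulated portfolio excess return (portfolio return minus RF) with known loadings.
true_alpha = 0.1
true_betas = np.array([1.0, 0.3, -0.2, 0.1, 0.05])
noise = rng.normal(0.0, 0.01, size=n_months)
excess_ret = true_alpha + factors @ true_betas + noise

# OLS: prepend a column of ones so the first coefficient is the intercept (alpha).
X = np.column_stack([np.ones(n_months), factors])
coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
alpha, betas = coef[0], coef[1:]

for name, b in zip(factor_names, betas):
    print(f"{name}: {b:.3f}")
print(f"alpha: {alpha:.3f}")
```

The estimated loadings indicate the portfolio's style (for example, a positive SMB loading suggests a small-cap tilt), while alpha is the average return the factors leave unexplained.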
  5. Sign realized jump risk and the cross-section of stock returns: Evidence...

    • plos.figshare.com
    docx
    Updated May 30, 2023
    Cite
    Youcong Chao; Xiaoqun Liu; Shijun Guo (2023). Sign realized jump risk and the cross-section of stock returns: Evidence from China's stock market [Dataset]. http://doi.org/10.1371/journal.pone.0181990
    Explore at:
    Available download formats: docx
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Youcong Chao; Xiaoqun Liu; Shijun Guo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    China
    Description

    Using 5-minute high-frequency data from the Chinese stock market, we employ a non-parametric method to estimate Fama-French portfolio realized jumps and investigate whether the estimated positive, negative and sign realized jumps can forecast or explain cross-sectional stock returns. The Fama-MacBeth regression results show not only that the realized jump components and the continuous volatility are compensated with a risk premium, but also that the negative jump risk, the positive jump risk and the sign jump risk can, to some extent, explain the returns of the stock portfolios. Therefore, we should pay close attention to both the downside and the upside tail risk.
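
For readers unfamiliar with the Fama-MacBeth procedure mentioned above, here is a generic two-pass sketch on simulated data. It is not the paper's specification: the regressors are two made-up risk factors rather than jump components, and all sizes and parameters are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, K = 120, 25, 2  # periods, portfolios, risk factors (all simulated)

factors = rng.normal(0.5, 2.0, size=(T, K))
true_betas = rng.uniform(0.5, 1.5, size=(N, K))
returns = factors @ true_betas.T + rng.normal(0.0, 1.0, size=(T, N))

# Pass 1: time-series regression of each portfolio on the factors gives betas.
X = np.column_stack([np.ones(T), factors])
ts_coef, *_ = np.linalg.lstsq(X, returns, rcond=None)  # shape (K + 1, N)
betas = ts_coef[1:].T                                   # shape (N, K)

# Pass 2: one cross-sectional regression per period of returns on the betas.
Z = np.column_stack([np.ones(N), betas])
cs_coef, *_ = np.linalg.lstsq(Z, returns.T, rcond=None)  # shape (K + 1, T)
lambdas = cs_coef.T                                      # shape (T, K + 1)

# Risk-premium estimates are time-series means; t-stats use their standard errors.
premia = lambdas.mean(axis=0)
t_stats = premia / (lambdas.std(axis=0, ddof=1) / np.sqrt(T))
print(premia, t_stats)
```

A factor is said to be priced when its average cross-sectional slope is statistically different from zero.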

  6. S&P 500 and 20-(Fama and French)-portfolios performance depending on size...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Sep 26, 2022
    Cite
    Wu, Liyun; Raza, Kashif; Khan, Yousaf Ali; Ahmad, Muneeb; Qureshi, Salman Ali (2022). S&P 500 and 20-(Fama and French)-portfolios performance depending on size and book-to-market. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000229497
    Explore at:
    Dataset updated
    Sep 26, 2022
    Authors
    Wu, Liyun; Raza, Kashif; Khan, Yousaf Ali; Ahmad, Muneeb; Qureshi, Salman Ali
    Description

    S&P 500 and 20-(Fama and French)-portfolios performance depending on size and book-to-market.

  7. Descriptive statistic for portfolio return.

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Youcong Chao; Xiaoqun Liu; Shijun Guo (2023). Descriptive statistic for portfolio return. [Dataset]. http://doi.org/10.1371/journal.pone.0181990.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Youcong Chao; Xiaoqun Liu; Shijun Guo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Descriptive statistic for portfolio return.

  8. Descriptive statistics of the jump components for s1b1-s1b5.

    • plos.figshare.com
    xls
    Updated Jun 18, 2023
    Cite
    Youcong Chao; Xiaoqun Liu; Shijun Guo (2023). Descriptive statistics of the jump components for s1b1-s1b5. [Dataset]. http://doi.org/10.1371/journal.pone.0181990.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 18, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Youcong Chao; Xiaoqun Liu; Shijun Guo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Descriptive statistics of the jump components for s1b1-s1b5.

  9. Significant number of months with different regression equations.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Youcong Chao; Xiaoqun Liu; Shijun Guo (2023). Significant number of months with different regression equations. [Dataset]. http://doi.org/10.1371/journal.pone.0181990.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Youcong Chao; Xiaoqun Liu; Shijun Guo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Significant number of months with different regression equations.

  10. Identifying outliers in asset pricing data with a new weighted forward...

    • datasetcatalog.nlm.nih.gov
    • scielo.figshare.com
    Updated Feb 5, 2020
    Cite
    Aronne, Alexandre; Bressan, Aureliano Angel; Grossi, Luigi (2020). Identifying outliers in asset pricing data with a new weighted forward search estimator [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000459853
    Explore at:
    Dataset updated
    Feb 5, 2020
    Authors
    Aronne, Alexandre; Bressan, Aureliano Angel; Grossi, Luigi
    Description

    ABSTRACT The purpose of this work is to present the Weighted Forward Search (FSW) method for the detection of outliers in asset pricing data. This new estimator, which is based on an algorithm that downweights the most anomalous observations of the dataset, is tested using both simulated and empirical asset pricing data. The impact of outliers on the estimation of asset pricing models is assessed under different scenarios, and the results are evaluated with associated statistical tests based on this new approach. Our proposal generates an alternative procedure for robust estimation of portfolio betas, allowing for the comparison between concurrent asset pricing models. The algorithm, which is both efficient and robust to outliers, is used to provide robust estimates of the models’ parameters in a comparison with traditional econometric estimation methods usually used in the literature. In particular, the precision of the alphas is highly increased when the Forward Search (FS) method is used. We use Monte Carlo simulations, and also the well-known dataset of equity factor returns provided by Prof. Kenneth French, consisting of the 25 Fama-French portfolios on the United States of America equity market using single and three-factor models, on monthly and annual basis. Our results indicate that the marginal rejection of the Fama-French three-factor model is influenced by the presence of outliers in the portfolios, when using monthly returns. In annual data, the use of robust methods increases the rejection level of null alphas in the Capital Asset Pricing Model (CAPM) and the Fama-French three-factor model, with more efficient estimates in the absence of outliers and consistent alphas when outliers are present.

  11. Data for: Can the seasonal pattern of consumption growth reproduce habits in...

    • data.mendeley.com
    • narcis.nl
    Updated Oct 13, 2020
    Cite
    Javier Rojo-Suárez (2020). Data for: Can the seasonal pattern of consumption growth reproduce habits in the cross-section of stock returns? Evidence from the European equity market [Dataset]. http://doi.org/10.17632/frpm7rywcn.2
    Explore at:
    Dataset updated
    Oct 13, 2020
    Authors
    Javier Rojo-Suárez
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Area covered
    Europe
    Description

    We compile all return and macroeconomic data from Kenneth French's website and the OECD statistical data warehouse, respectively, for the period from January 1990 to December 2018. All return and macroeconomic data cover the following countries: Austria, Belgium, Denmark, Finland, France, Germany, Greece, Ireland, Italy, Netherlands, Norway, Portugal, Spain, Sweden, Switzerland and United Kingdom. The dataset comprises the following series:

    1. Fama-French factors, 3-factor model, as provided by Kenneth French (Europe_3_Factors.txt).
    2. Fama-French factors, 5-factor model, as provided by Kenneth French (Europe_5_Factors.txt).
    3. Returns for 25 size-BE/ME portfolios, as provided by Kenneth French (Europe_25_Portfolios_ME_BE-ME.txt).
    4. Returns for 25 size-momentum portfolios, as provided by Kenneth French (Europe_25_Portfolios_ME_Prior_12_2.txt).
    5. Weighted average per capita consumption growth. We first collect quarterly chained volume estimates for consumption in nondurables and services, non-seasonally adjusted, in national currency, for the 16 countries under consideration (‘Non-durable goods’ and ‘Services’ series, LNBQR measure). Second, we use the population series provided by the OECD to determine per capita consumption growth series for each country. Finally, we estimate the average consumption growth for the economies under consideration, weighting by population (Europe_Consumption_Q.txt).
    6. Weighted average consumer confidence index (CCI). We collect monthly CCI data as provided by the OECD (‘OECD Standardised CCI, Amplitude adjusted, sa’ series, dataset ‘Composite Leading Indicators’, MEI). We determine the average CCI for the economies under consideration, weighting by population (Europe_Indicators_Q.txt).
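
The three steps behind the weighted average per capita consumption growth series (item 5) can be sketched as follows. The numbers are made up for three hypothetical countries, and weighting by end-of-period population is an assumption; the description does not say which period's population the authors use.

```python
import numpy as np

# Hypothetical quarterly data for three countries (rows = quarters).
consumption = np.array([[100.0, 200.0, 50.0],
                        [102.0, 202.0, 51.0],
                        [103.0, 206.0, 51.5]])
population = np.array([[10.0, 20.0, 5.0],
                       [10.0, 20.0, 5.0],
                       [10.1, 20.0, 5.0]])

# Step 1: per capita consumption for each country and quarter.
per_capita = consumption / population

# Step 2: quarter-on-quarter growth of per capita consumption.
growth = per_capita[1:] / per_capita[:-1] - 1.0

# Step 3: average across countries, weighting by population (assumed end-of-period).
weights = population[1:] / population[1:].sum(axis=1, keepdims=True)
avg_growth = (weights * growth).sum(axis=1)
print(avg_growth)
```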
  12. Cross-sectional regression results.

    • figshare.com
    xls
    Updated Jun 6, 2023
    + more versions
    Cite
    Youcong Chao; Xiaoqun Liu; Shijun Guo (2023). Cross-sectional regression results. [Dataset]. http://doi.org/10.1371/journal.pone.0181990.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 6, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Youcong Chao; Xiaoqun Liu; Shijun Guo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cross-sectional regression results.

  13. Data from: Liquidity, time-varying betas and anomalies. Is the high trading...

    • data.mendeley.com
    Updated Nov 19, 2019
    Cite
    Paper authors Paper authors (2019). Liquidity, time-varying betas and anomalies. Is the high trading activity enhancing the validity of the CAPM in the UK equity market? [Dataset]. http://doi.org/10.17632/56n2yxgpcf.1
    Explore at:
    Dataset updated
    Nov 19, 2019
    Authors
    Paper authors Paper authors
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Area covered
    United Kingdom
    Description

    Using all stocks listed in the London Stock Exchange for the period from January 1989 to December 2018, the dataset comprises the following series:

    1. Annual returns for 20 asset growth portfolios, following Fama and French (1993) methodology.
    2. Annual returns for 25 size-book-to-market equity portfolios, following Fama and French (1993) methodology.
    3. Annual returns for 62 industry portfolios, using two-digit SIC codes.
    4. Fama and French (1993) factors for their three-factor model (RM, SMB and HML).
    5. Fama and French (2015) factors for their five-factor model (RM, SMB, HML, RMW, and CMA).
    6. Variation of the Amihud illiquidity measure for the London Stock Exchange, following Amihud (2002) methodology.
    7. Three-month interest rate of the Treasury Bill for the United Kingdom, as provided by the OECD database.

    We have produced these series using the following data from Thomson Reuters Datastream: (i) total return index (RI series), (ii) market value (MV series), (iii) market-to-book equity (PTBV series), (iv) total assets (WC02999 series), (v) return on equity (WC08301 series), (vi) tax rate (WC08346 series), (vii) primary SIC codes, (viii) turnover by volume (VO series), and (ix) the market price (P series). Following Griffin et al. (2010), we use the generic rules provided by the authors for excluding non-common equity securities from Datastream data.
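
For orientation, the baseline Amihud (2002) measure is the average ratio of absolute daily return to the value traded that day. The sketch below computes it for one hypothetical stock from a price series and a share-volume series, loosely mirroring the P and VO inputs named above; it is not the variation used by the dataset's authors, and the numbers are made up.

```python
import numpy as np

# Hypothetical daily data for one stock.
price = np.array([10.0, 10.5, 10.29, 10.29])
volume = np.array([1000.0, 2000.0, 1500.0, 500.0])  # shares traded

# Daily simple returns from prices (the first day has no return).
ret = price[1:] / price[:-1] - 1.0

# Value traded in currency units = price * shares traded.
value_traded = price[1:] * volume[1:]

# Amihud (2002): mean of |return| / value traded over days with trading.
traded = value_traded > 0
illiq = np.mean(np.abs(ret[traded]) / value_traded[traded])
print(illiq)
```

A higher value means that a given amount of trading moves the price more, i.e. the stock is less liquid.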

    REFERENCES:

    Amihud, Y. (2002). Illiquidity and stock returns: Cross-section and time-series effects. Journal of Financial Markets, 5, 31–56.
    Fama, E. F. and French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33, 3–56.
    Fama, E. F. and French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116, 1–22.
    Griffin, J. M., Kelly, P., and Nardari, F. (2010). Do market efficiency measures yield correct inferences? A comparison of developed and emerging markets. Review of Financial Studies, 23, 3225–3277.

  14. Data for: Impact of consumer confidence on the expected returns of the Tokyo...

    • data.mendeley.com
    Updated Sep 22, 2020
    + more versions
    Cite
    Javier Rojo Suárez (2020). Data for: Impact of consumer confidence on the expected returns of the Tokyo Stock Exchange: A comparative analysis of consumption and production-based asset pricing models [Dataset]. http://doi.org/10.17632/vyxt842rzg.2
    Explore at:
    Dataset updated
    Sep 22, 2020
    Authors
    Javier Rojo Suárez
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    Using all stocks listed in the Tokyo Stock Exchange and macroeconomic data for Japan, the dataset comprises the following series:

    1. Monthly returns for 25 size-book-to-market equity portfolios, following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    2. Monthly returns for 20 momentum portfolios, following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    3. Monthly returns for 25 price-to-cash flow-dividend yield portfolios, following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    4. Fama and French three-factors (RM, SMB and HML), following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    5. Fama and French five-factors (RM, SMB, HML, RMW, and CMA), following the Fama and French (2015) methodology for all factors, except for RMW, which is determined using the return on assets as sorting variable, as in Hou, Xue and Zhang (2014). (Raw data source: Datastream database)
    6. Private final consumption expenditure, in national currency and constant prices, non-seasonally adjusted, for Japan. (Raw data source: OECD)
    7. Consumer Confidence Index (CCI) for Japan. (Raw data source: OECD)
    8. Three-month interest rate of the Treasury Bill for Japan. (Raw data source: OECD)
    9. Gross Domestic Product (GDP) for Japan. (Raw data source: OECD)
    10. Consumer Price Index (CPI) growth rate for Japan. (Raw data source: OECD)

    We have produced all return series using the following data from Datastream: (i) total return index (RI series), (ii) market value (MV series), (iii) market-to-book equity (PTBV series), (iv) total assets (WC02999 series), (v) return on equity (WC08301 series), (vi) price-to-cash flow ratio (PC series), and (vii) dividend yield (DY series). We have used the generic rules suggested by Griffin, Kelly, & Nardari (2010) for excluding non-common equity securities from Datastream data. We also exclude stocks with fewer than twelve observations in the period from July 1992 to June 2018. Accordingly, our sample comprises a total of 5,312 stocks.
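
To make the portfolio construction more concrete, the sketch below forms 25 size/book-to-market portfolios with a 5x5 independent sort and computes value-weighted returns. It runs on simulated stocks and uses whole-sample quintile breakpoints, which is an assumption; the authors' exact breakpoints and rebalancing dates are not given here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 500  # simulated stocks at one formation date
stocks = pd.DataFrame({
    "mv": rng.lognormal(5.0, 1.0, n),    # market value (size)
    "btm": rng.lognormal(0.0, 0.5, n),   # book-to-market
    "ret": rng.normal(0.01, 0.05, n),    # next-period return
})

# Independent quintile sorts on size and on book-to-market (1 = lowest).
stocks["size_q"] = pd.qcut(stocks["mv"], 5, labels=False) + 1
stocks["btm_q"] = pd.qcut(stocks["btm"], 5, labels=False) + 1

# Value-weighted return of each of the 25 portfolios.
stocks["mv_ret"] = stocks["mv"] * stocks["ret"]
sums = stocks.groupby(["size_q", "btm_q"])[["mv_ret", "mv"]].sum()
vw = (sums["mv_ret"] / sums["mv"]).unstack("btm_q")
print(vw.round(4))
```

Rows of the resulting table are size quintiles and columns are book-to-market quintiles.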

    REFERENCES:

    Fama, E. F. and French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33, 3–56.
    Fama, E. F. and French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116, 1–22.
    Griffin, J. M., Kelly, P., and Nardari, F. (2010). Do market efficiency measures yield correct inferences? A comparison of developed and emerging markets. Review of Financial Studies, 23, 3225–3277.
    Hou, K., Xue, C., and Zhang, L. (2014). Digesting anomalies: An investment approach. Review of Financial Studies, 28, 650–705.

  15. Regression results under different linear combination of the jump...

    • figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Youcong Chao; Xiaoqun Liu; Shijun Guo (2023). Regression results under different linear combination of the jump components. [Dataset]. http://doi.org/10.1371/journal.pone.0181990.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Youcong Chao; Xiaoqun Liu; Shijun Guo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Regression results under different linear combination of the jump components.

  16. Data for: Regulatory changes in corporate taxation and the cost of equity of...

    • narcis.nl
    • data.mendeley.com
    Updated Oct 18, 2021
    Cite
    Rojo Suárez, J (via Mendeley Data) (2021). Data for: Regulatory changes in corporate taxation and the cost of equity of traded firms [Dataset]. http://doi.org/10.17632/tp4bx8c28y.1
    Explore at:
    Dataset updated
    Oct 18, 2021
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Rojo Suárez, J (via Mendeley Data)
    Description

    We compile raw data from the Datastream database for all stocks traded on the Spanish equity market. Particularly, we compile the following data series: (i) total return index (RI series), (ii) market value (MV series), (iii) market-to-book equity (PTBV series), (iv) total assets (WC02999 series), (v) return on equity (WC08301 series), (vi) dividend yield (DY series), (vii) price-to-earnings ratio (PE series), and (viii) effective tax rate (WC08346 series). We use the filters suggested by Griffin, Kelly, and Nardari (2010) for the Datastream database to exclude assets other than ordinary shares from our sample. Hence, our sample comprises 443 companies, including all firms that started trading within the time interval under study, as well as those that were delisted. As a proxy for the risk-free rate, we use the three-month Treasury Bill rate for Spain, as provided by the OECD. Accordingly, the dataset comprises the following series:

    1. Spain_9_Portfolios_SIZE_BEME: Monthly returns for 9 size-book-to-market equity portfolios, following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    2. Spain_9_Portfolios_DY_PE: Monthly returns for 9 dividend yield-price-to-earnings ratio portfolios, following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    3. Spain_9_Portfolios_SIZE_TR: Monthly returns for 9 size-effective tax rate portfolios, following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    4. Spain_FF_3_Factors: Monthly returns for the constituents of the three classic factors of Fama and French, following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    5. Spain_FF_5_Factors: Monthly returns for the constituents of the five factors of Fama and French, following the Fama and French (2015) methodology. (Raw data source: Datastream database)
    6. Spain_RF: Three-month Treasury Bill rate for Spain. (Raw data source: OECD)
    7. Spain_Avg_Tax_Rate: Value-weighted effective tax rate paid by companies traded in Spain. (Raw data source: Datastream database)

    REFERENCES:

    Fama, E. F. and French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33, 3–56.
    Fama, E. F. and French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116, 1–22.
    Griffin, J. M., Kelly, P., and Nardari, F. (2010). Do market efficiency measures yield correct inferences? A comparison of developed and emerging markets. Review of Financial Studies, 23, 3225–3277.

  17. Data for: Nuclear hazard and asset prices: Implications of nuclear disasters...

    • data.mendeley.com
    Updated Nov 3, 2020
    + more versions
    Cite
    Ana Belén Alonso-Conde (2020). Data for: Nuclear hazard and asset prices: Implications of nuclear disasters in the cross-sectional behavior of stock returns [Dataset]. http://doi.org/10.17632/wv94fj59t4.2
    Explore at:
    Dataset updated
    Nov 3, 2020
    Authors
    Ana Belén Alonso-Conde
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    Using all stocks listed in the Tokyo Stock Exchange and macroeconomic data for Japan, the dataset comprises the following series:

    1. Japan_25_Portfolios_MV_PTBV: Monthly returns for 25 size-book-to-market equity portfolios, following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    2. Japan_25_Portfolios_MV_PE: Monthly returns for 25 size-PE portfolios, following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    3. Japan_50_Portfolios_SECTOR: Monthly returns for 50 industry portfolios. (Raw data source: Datastream database)
    4. Japan_3 Factors: Fama and French three-factors (RM, SMB and HML), following the Fama and French (1993) methodology. (Raw data source: Datastream database)
    5. Japan_5 Factors: Fama and French five-factors (RM, SMB, HML, RMW, and CMA), following the Fama and French (2015) methodology. (Raw data source: Datastream database)
    6. Japan_NUCLEAR_Y: Instrument in years with a value of 1 when a nuclear disaster has occurred somewhere in the world and 0 otherwise. (Raw data source: Bloomberg and BBC News)
    7. Japan_NUCLEAR_M: Instrument in months with a value of 1 when a nuclear disaster has occurred somewhere in the world and 0 otherwise. (Raw data source: Bloomberg and BBC News)
    8. Japan_RF_M: Three-month interest rate of the Treasury Bill for Japan. (Raw data source: OECD)
    9. Company data: Names and general data of the companies that constitute the sample. (Raw data source: Datastream database)
    10. Number of stocks in portfolios: Number of stocks included each year in Japan_25_Portfolios_MV_PTBV, Japan_25_Portfolios_MV_PE and Japan_50_Portfolios_SECTOR. (Raw data source: Datastream database)

    We have produced all return series using the following data from Datastream: (i) total return index (RI series), (ii) market value (MV series), (iii) market-to-book equity (PTBV series), (iv) total assets (WC02999 series), (v) return on equity (WC08301 series), (vi) price-to-earnings ratio (PE series), and (vii) industry (SECTOR series). We have used the generic rules suggested by Griffin, Kelly, & Nardari (2010) for excluding non-common equity securities from Datastream data. We also exclude stocks with fewer than twelve observations. Accordingly, our sample comprises a total of 5,212 stocks.

    REFERENCES:

    Fama, E. F. and French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33, 3–56.
    Fama, E. F. and French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116, 1–22.
    Griffin, J. M., Kelly, P., and Nardari, F. (2010). Do market efficiency measures yield correct inferences? A comparison of developed and emerging markets. Review of Financial Studies, 23, 3225–3277.

  18. Regressions of megatrend portfolios on pure factor portfolios via OLS.

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Helena Naffa; Máté Fain (2023). Regressions of megatrend portfolios on pure factor portfolios via OLS. [Dataset]. http://doi.org/10.1371/journal.pone.0244225.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Helena Naffa; Máté Fain
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Regressions of megatrend portfolios on pure factor portfolios via OLS.

  19. Master Dissertation: Environmental, Social and Corporate Governance...

    • data.mendeley.com
    Updated May 16, 2022
    + more versions
    Cite
    Maria Mirzoyan (2022). Master Dissertation: Environmental, Social and Corporate Governance portfolio management strategies in the Russian market [Dataset]. http://doi.org/10.17632/52mnvtpxgh.2
    Explore at:
    Dataset updated
    May 16, 2022
    Authors
    Maria Mirzoyan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data contains information from ESG Corporate Ranking by RAEX from December 2020 to February 2022, its dynamics, RSPP indices "Vector of sustainable development" and "Responsibility and openness", data for the Fama-French model, market capitalization and assets of companies and daily stock closing prices.

  20. Regressions of megatrend portfolios on pure factor portfolios via GMM-IVd.

    • plos.figshare.com
    xls
    Updated Jun 5, 2023
    Cite
    Helena Naffa; Máté Fain (2023). Regressions of megatrend portfolios on pure factor portfolios via GMM-IVd. [Dataset]. http://doi.org/10.1371/journal.pone.0244225.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Helena Naffa; Máté Fain
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Regressions of megatrend portfolios on pure factor portfolios via GMM-IVd.

Cite
Nikita Manaenkov (2025). Fama–French Factors and Portfolios [Dataset]. https://www.kaggle.com/datasets/nikitamanaenkov/famafrench-factors-and-portfolios

Fama–French Factors and Portfolios

Empirical Finance Dataset for Quantitative Analysis and Factor Models

Explore at:
13 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (177539895 bytes)
Dataset updated
Oct 30, 2025
Authors
Nikita Manaenkov
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset provides foundational factor and portfolio return data used in empirical finance and asset pricing research. It contains:

- Fama–French 3-Factor and 5-Factor models
- Size (ME), Book-to-Market (B/M), Operating Profitability (OP), and Investment (Inv) portfolios
- Bivariate portfolios (e.g., 2x3 Size-B/M sorts)
- Industry portfolio returns

All data originate from the Kenneth R. French Data Library and are based on CRSP and Compustat databases. Data are value-weighted and expressed in percentages.

Some files in this dataset contain header comments describing data sources and methodology (as shown below):

This file was created using the 202508 CRSP database.
The 1-month TBill rate data until 202405 are from Ibbotson Associates. 
Starting from 202406, the 1-month TBill rate is from ICE BofA US 1-Month Treasury Bill Index.

To read such files correctly in Python (pandas), skip the header lines before parsing. The comment parameter of read_csv ignores every line that starts with a given character, but the header lines here carry no prefix character, so they have to be skipped by row count instead:

Example 1 — Automatically detect header rows:

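import pandas as pd

file_path = "F-F_Research_Data_5_Factors_2x3.csv"

# Find the column-name row: the first line that mentions "Mkt-RF"
skip_rows = None
with open(file_path) as f:
    for i, line in enumerate(f):
        if "Mkt-RF" in line:
            skip_rows = i
            break

# Fail clearly instead of hitting an undefined variable below
if skip_rows is None:
    raise ValueError(f"No 'Mkt-RF' header row found in {file_path}")

# sep=r"\s+" assumes whitespace-delimited columns; use sep="," if the file is comma-delimited
df = pd.read_csv(file_path, skiprows=skip_rows, sep=r"\s+")
print(df.head())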

Example 2 — Skip a known number of comment lines manually:

df = pd.read_csv("F-F_Research_Data_5_Factors_2x3.csv", skiprows=3, sep=r"\s+")

Example 3 — If comments are prefixed (e.g., with #):

df = pd.read_csv("F-F_Research_Data_5_Factors_2x3.csv", comment="#", sep=",")
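
The approaches above can be tried without downloading anything. The following is a minimal sketch on an in-memory sample: the three header lines are the ones quoted earlier, while the "Date" column name, the data values, and the comma delimiter are assumptions made up for illustration and may not match the actual files.

```python
import io
import pandas as pd

# Hypothetical sample: three unprefixed header lines, a blank line, then
# comma-delimited data. Column layout and numeric values are invented.
sample = (
    "This file was created using the 202508 CRSP database.\n"
    "The 1-month TBill rate data until 202405 are from Ibbotson Associates.\n"
    "Starting from 202406, the 1-month TBill rate is from ICE BofA US 1-Month Treasury Bill Index.\n"
    "\n"
    "Date,Mkt-RF,SMB,HML,RF\n"
    "196307,-0.39,-0.41,-0.97,0.27\n"
    "196308,5.07,-0.80,1.80,0.25\n"
)

# Example 1 style: locate the column-name row, then skip every line before it.
lines = sample.splitlines()
header_row = next(i for i, line in enumerate(lines) if "Mkt-RF" in line)
detected = pd.read_csv(io.StringIO(sample), skiprows=header_row)

# Example 3 style: if the same header lines were prefixed with "#",
# comment="#" would drop them without any row counting.
parts = sample.splitlines(keepends=True)
prefixed = "".join("# " + p for p in parts[:3]) + "".join(parts[3:])
commented = pd.read_csv(io.StringIO(prefixed), comment="#")

print(detected)
print(detected.equals(commented))
```

Both calls should yield the same two-row frame, which is a quick way to check that a chosen skiprows value really lands on the column-name row.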

File Structure Example

Column    Description
Mkt-RF    Market excess return
SMB       Small minus Big (size factor)
HML       High minus Low (book-to-market factor)
RMW       Robust minus Weak (profitability factor)
CMA       Conservative minus Aggressive (investment factor)
RF        Risk-free rate (1-month Treasury Bill)
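
As a rough illustration of how these columns fit together, the sketch below converts percentage values to decimal returns and recovers the total market return as Mkt-RF plus RF. The two rows of numbers, the YYYYMM index format, and the derived "Mkt" column name are assumptions for the example, not values taken from the dataset.

```python
import pandas as pd

# Hypothetical two-month sample, expressed in percent; values are invented.
df = pd.DataFrame(
    {
        "Mkt-RF": [-0.39, 5.07],
        "SMB": [-0.41, -0.80],
        "HML": [-0.97, 1.80],
        "RF": [0.27, 0.25],
    },
    index=pd.to_datetime(["196307", "196308"], format="%Y%m"),
)

# Convert percentages to decimal returns (e.g. 5.07 -> 0.0507).
returns = df / 100.0

# Total market return = market excess return + risk-free rate.
# "Mkt" is a name chosen here for illustration.
returns["Mkt"] = returns["Mkt-RF"] + returns["RF"]

print(returns)
```

Dividing by 100 once, right after loading, keeps later calculations such as compounding or regressions from mixing percent and decimal units.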