The Penn Machine Learning Benchmarks (PMLB) is a large, curated set of benchmark datasets used to evaluate and compare supervised machine learning algorithms. These datasets cover a broad range of applications, and include binary/multi-class classification problems and regression problems, as well as combinations of categorical, ordinal, and continuous features.
A Python wrapper for the Penn Machine Learning Benchmarks data repository, a large, curated repository of benchmark datasets for evaluating supervised machine learning algorithms. Available on PyPI: https://pypi.org/
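As a rough illustration of how the Python wrapper might be used, the sketch below loads one dataset as feature and label arrays. It assumes the `pmlb` package is installed from PyPI and relies on its `fetch_data` function; `load_benchmark` and `describe_task` are hypothetical helper names introduced here for illustration, and `"mushroom"` is just one example dataset name.

```python
def load_benchmark(name, cache_dir=None):
    """Fetch one PMLB dataset as (features, labels) NumPy arrays.

    Assumes the `pmlb` package is installed. The import is done lazily so
    this module can still be imported when pmlb is not available.
    """
    from pmlb import fetch_data

    # return_X_y=True splits the table into features and the target column;
    # local_cache_dir (optional) stores the downloaded file for reuse.
    return fetch_data(name, return_X_y=True, local_cache_dir=cache_dir)


def describe_task(labels):
    """Hypothetical helper: label a classification task by its class count."""
    n_classes = len(set(labels))
    return "binary" if n_classes == 2 else "multi-class"


# Example usage (requires pmlb and network access on first download):
#   X, y = load_benchmark("mushroom")
#   print(X.shape, describe_task(y.tolist()))
```

Keeping the download behind a small function makes it easy to loop over several dataset names when comparing algorithms, and passing a cache directory avoids fetching the same file repeatedly.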