Meta-datasets generated by the paper "Exploring One Million Machine Learning Pipelines: A Benchmarking Study"
dataset
OA: green
CC0
Abstract
README
Saved runs of machine learning pipelines.
Column explanations:
- seed_i: seed used for the experiments
- config_id: ID of the configuration
- fold: fold of the dataset
- config_hash: hash value of the configuration
- duration: duration of the run
- start_time: start time of the run
- end_time: end time of the run
- dataset: name of the dataset
- status: status of the run (whether it succeeded or not)
- [metric_name]_[set split]: performance of the trained model on the given split (train, val, test). For example, "f1_weighted_test" is the F1 score of the trained pipeline on the test set.
The remaining columns contain the configuration space, similar to that of Auto-sklearn.
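The [metric_name]_[set split] naming convention described above can be parsed mechanically. The following is a minimal sketch, assuming the meta-dataset is available as CSV text (the README does not state the file format). The helper names `split_metric_column` and `metric_columns` are hypothetical, and the sample rows, including the `ok` status value, are placeholders rather than real values from the dataset.

```python
import csv
import io

SPLITS = ("train", "val", "test")  # split names listed in the README


def split_metric_column(column):
    """Return (metric_name, split) for a '[metric_name]_[set split]' column, else None."""
    metric, sep, split = column.rpartition("_")
    if sep and metric and split in SPLITS:
        return metric, split
    return None


def metric_columns(header):
    """Map each metric name to {split: column_name} for the columns that match."""
    found = {}
    for column in header:
        parsed = split_metric_column(column)
        if parsed is not None:
            metric, split = parsed
            found.setdefault(metric, {})[split] = column
    return found


# Placeholder rows for illustration only; these are not values from the dataset.
SAMPLE = """seed_i,config_id,fold,dataset,status,f1_weighted_val,f1_weighted_test
0,1,0,toy,ok,0.5,0.4
0,2,0,toy,ok,0.7,0.6
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
columns = metric_columns(reader.fieldnames)
rows = list(reader)

# Pick the configuration with the highest validation score for one metric.
best = max(rows, key=lambda r: float(r[columns["f1_weighted"]["val"]]))
print(columns)
print(best["config_id"])
```

Selecting on the validation column and reporting the test column separately keeps the test score out of the selection step. Columns such as `start_time` or `config_id` are skipped because their suffix is not one of the three split names.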
Citation neighborhood (no data yet)
We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2025) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.
Source provenance
- openalex
- last seen: 2026-05-13T20:12:55.465734+00:00
License: CC0
· commercial use OK