Meta-datasets generated by the paper "Exploring One Million Machine Learning Pipelines: A Benchmarking Study"

Alcobaça, Edesio; Carvalho, André Carlos Ponce de Leon Ferreira de

doi:10.6084/m9.figshare.28696262

2025 · doi:10.6084/m9.figshare.28696262 · W6958726191

dataset OA: green CC0

Abstract

README

Saved runs of machine learning pipelines.

Columns explanation:

seed_i: seed used for the experiments
config_id: id of the configuration
fold: fold of the dataset
config_hash: hash value of the configuration
duration: duration of the run
start_time: start time
end_time: end time
dataset: the name of the dataset
status: status of the run (whether it succeeded or not)
[metric_name]_[set split]: performance of the trained model on the given set (train, val, test). For example, "f1_weighted_test" is the F1 score of the trained pipeline on the test set.

The remaining columns describe the configuration space, which is similar to that of auto-sklearn.
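The column layout described above can be explored with a short script. The following is a minimal sketch, not the authors' tooling. It assumes the meta-dataset is distributed as CSV, which the README does not state. The toy rows and the status labels "ok" and "failed" are hypothetical. The README only says that status records whether the run succeeded, so check the real files for the actual values. The helper names `metric_columns` and `mean_score_per_config` are illustrative.

```python
import csv
import io
import re
from collections import defaultdict
from statistics import mean

# Matches the documented pattern [metric_name]_[set split],
# where the split is one of train, val, test.
METRIC_COLUMN = re.compile(r"^(.+)_(train|val|test)$")


def metric_columns(fieldnames):
    """Return the column names that look like performance columns."""
    return [name for name in fieldnames if METRIC_COLUMN.match(name)]


def mean_score_per_config(rows, metric, success_value):
    """Average one metric over seeds and folds for each (dataset, config_id).

    Rows whose status differs from success_value, or whose metric
    cell is empty, are skipped.
    """
    scores = defaultdict(list)
    for row in rows:
        if row["status"] == success_value and row[metric] != "":
            key = (row["dataset"], row["config_id"])
            scores[key].append(float(row[metric]))
    return {key: mean(values) for key, values in scores.items()}


# Hypothetical toy data using a subset of the documented columns.
# The status labels "ok" and "failed" are assumptions for illustration.
TOY_CSV = """seed_i,config_id,fold,dataset,status,f1_weighted_val,f1_weighted_test
0,1,0,toy,ok,0.70,0.65
0,1,1,toy,ok,0.80,0.75
0,2,0,toy,ok,0.60,0.55
0,2,1,toy,ok,0.60,0.65
0,3,0,toy,failed,,
"""

reader = csv.DictReader(io.StringIO(TOY_CSV))
rows = list(reader)
cols = metric_columns(reader.fieldnames)
scores = mean_score_per_config(rows, "f1_weighted_test", success_value="ok")

print(cols)
for (dataset, config_id), score in sorted(scores.items()):
    print(dataset, config_id, round(score, 3))
```

To use this on the real data, replace `io.StringIO(TOY_CSV)` with an open file handle and pass the status label found in that file. A configuration-space column whose name happens to end in `_train`, `_val`, or `_test` would also match the pattern, so the result of `metric_columns` is worth checking by eye.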

License: CC0 · commercial use OK