Pre-trained Vision Transformers for Seizure Prediction: A Reproducible Baseline with Event-Based Evaluation and Statistical Validation

preprint OA: closed

Abstract

Background

Scalp electroencephalography (EEG)-based seizure prediction can substantially improve quality of life for patients with drug-resistant epilepsy by enabling real-time warnings and timely interventions. Despite its clinical significance and decades of research, the field still lacks an open benchmark with reproducible baselines and deployment-oriented, event-level evaluation. Most prior work relies on the small and outdated Children’s Hospital Boston (CHB-MIT) dataset and reports window-level metrics only, leaving the false-alarm burden of a real warning system underspecified. In seizure prediction, a false alarm is costly, since patients may receive painful electrical stimulation intended to suppress a seizure. Hence, false alarms per hour (FA/h) and partial AUC (pAUC) are the most deployment-relevant metrics: they reflect alarm burden and discriminability in the low-false-alarm operating region that a usable warning system can realistically tolerate. However, few studies have systematically reported these metrics. In addition, the event-level performance of vision transformers under deployable FA/h constraints remains underexplored, and newer backbones such as MambaVision have yet to be evaluated in this setting.
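As an illustration of what a partial AUC measures, the following minimal sketch integrates the ROC curve only over the low-false-positive region. The cutoff `max_fpr`, the normalisation by `max_fpr`, and the function names are assumptions made for this example; the abstract does not state which pAUC definition the authors use.

```python
def roc_points(scores, labels):
    """Return ROC points (fpr, tpr) from high to low threshold, grouping tied scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Consume every sample that shares the current score (one threshold step).
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def partial_auc(scores, labels, max_fpr=0.1):
    """Trapezoidal ROC area for fpr in [0, max_fpr], divided by max_fpr.

    With this (assumed) normalisation a perfect classifier scores 1.0 and a
    chance-level classifier scores max_fpr / 2.
    """
    points = roc_points(scores, labels)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Linearly interpolate the curve at the cutoff.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / max_fpr


if __name__ == "__main__":
    # Toy window-level scores: 1 = preictal, 0 = interictal (made-up values).
    print(partial_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0], max_fpr=0.5))
```

Restricting the integral to small false-positive rates is what makes the metric sensitive to the operating region a warning system would actually use; a model can have a high full AUC and still perform poorly there.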

Methods

In this work, we introduce a reproducible 5-fold benchmark derived from the Temple University Hospital EEG Seizure Corpus (TUSZ) and evaluate models using a pseudo-real-time event pipeline, reporting event-level sensitivity, FA/h, and pAUC. All models are compared against random predictors for statistical validation. We benchmark pre-trained vision transformers (SegFormer and MambaVision) under three EEG-to-image encoding methods, including a Temporal-Patchify encoding that we propose for SegFormer.
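The event-level metrics named above can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the function name, the default seizure prediction horizon (`sph`) and seizure occurrence period (`sop`), and the choice to divide false alarms by total recording hours are all assumptions for this example.

```python
def event_metrics(alarm_times, seizure_onsets, record_hours, sph=60.0, sop=1800.0):
    """Compute event-level sensitivity and false alarms per hour (FA/h).

    An alarm raised at time t (seconds) counts as correct if some seizure
    onset falls inside [t + sph, t + sph + sop]; otherwise it is a false alarm.
    Sensitivity is the fraction of seizures covered by at least one correct alarm.
    """
    predicted = set()
    false_alarms = 0
    for t in alarm_times:
        hits = [
            i for i, onset in enumerate(seizure_onsets)
            if t + sph <= onset <= t + sph + sop
        ]
        if hits:
            predicted.update(hits)
        else:
            false_alarms += 1
    if seizure_onsets:
        sensitivity = len(predicted) / len(seizure_onsets)
    else:
        sensitivity = float("nan")  # undefined when the recording has no seizures
    return sensitivity, false_alarms / record_hours


if __name__ == "__main__":
    # Made-up 10-hour recording with two seizures and three alarms.
    sens, fa_per_hour = event_metrics(
        alarm_times=[100.0, 5000.0, 9000.0],
        seizure_onsets=[1000.0, 20000.0],
        record_hours=10.0,
    )
    print(sens, fa_per_hour)
```

In this toy case only the first alarm precedes a seizure within the allowed window, so one of two seizures is predicted and two alarms are false, which is the kind of alarm burden that window-level accuracy alone does not reveal.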

Results

Our proposed Temporal-Patchify encoding achieves state-of-the-art performance, with a pAUC of 0.61, which is 16.2% higher than that of the baseline Temporal-Tile SegFormer of Parani et al. Its false-alarm burden (0.40±0.28 FA/h) is 44.4% lower than that of the Temporal-Tile SegFormer baseline while maintaining clinically usable sensitivity (60.7%±5.0%). Statistical validation against a matched Poisson random predictor confirms that performance exceeds chance. Finally, we report end-to-end inference throughput of up to 920 windows/s, with MambaVision the fastest backbone, exceeding SegFormer by over 20%.
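One common way to compare against a Poisson random predictor is sketched below; the abstract does not describe the authors' exact procedure, so this should be read as an assumption. A random predictor that raises alarms as a Poisson process at the same rate as the model covers a given seizure with probability 1 - exp(-rate × SOP), and a binomial tail then gives the probability of predicting at least the observed number of seizures by chance. The 0.5-hour SOP and the seizure counts in the usage lines are hypothetical.

```python
import math


def chance_sensitivity(fa_per_hour, sop_hours):
    """Probability that a Poisson alarm process with the given rate raises
    at least one alarm inside a window of sop_hours."""
    return 1.0 - math.exp(-fa_per_hour * sop_hours)


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of predicting k or more of n seizures."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Alarm rate matched to the reported 0.40 FA/h; the SOP value is an assumption.
    p_chance = chance_sensitivity(fa_per_hour=0.40, sop_hours=0.5)
    # Hypothetical test set: 50 seizures, 30 of them predicted by the model.
    p_value = binomial_tail(n=50, k=30, p=p_chance)
    print(p_chance, p_value)
```

Matching the random predictor's alarm rate to the model's is the key design choice: it ensures that a model cannot appear better than chance simply by raising more alarms.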

Conclusions

This work bridges the gap between seizure prediction algorithms and clinically usable seizure prediction systems in real-world settings. Our findings indicate that pre-trained vision transformers, when coupled with appropriate EEG encoding methods, can achieve robust performance in low-false-alarm operating regimes, which is critical for real-world deployment. This benchmark and evaluation framework may facilitate more clinically meaningful and reproducible seizure prediction research.

Competing Interest Statement

The authors have declared no competing interest.

List of abbreviations

- AUC - Area Under the Curve
- CHB-MIT - Children’s Hospital Boston–MIT EEG Dataset
- CNN - Convolutional Neural Network
- EDF - European Data Format
- EEG - Electroencephalography
- EMA - Exponential Moving Average
- FA/h - False Alarms per Hour
- iEEG - Intracranial Electroencephalography
- LSTM - Long Short-Term Memory
- NEDC - Neural Engineering Data Consortium
- pAUC - Partial Area Under the ROC Curve (low false-alarm region)
- ROC - Receiver Operating Characteristic
- ROC-AUC - Area Under the Receiver Operating Characteristic Curve
- sEEG - Scalp Electroencephalography
- SOP - Seizure Occurrence Period
- SPH - Seizure Prediction Horizon
- SOTA - State of the Art
- STFT - Short-Time Fourier Transform
- TIW - Time in Warning
- TUSZ - Temple University Hospital Seizure Corpus
- ViT - Vision Transformer
