EndoExtract: Co-Designing Structured Text Extraction from Endometriosis Ultrasound Reports
preprint
OA: green
CC0
Abstract
Endometriosis ultrasound reports are often unstructured free-text documents that require manual abstraction for downstream tasks such as analytics, machine learning model training, and clinical auditing. We present \textbf{EndoExtract}, an on-premise LLM-powered system that extracts structured data from these reports and surfaces interpretive fields for human review. Through contextual inquiry with research assistants, we identified key workflow pain points: asymmetric trust between numerical and interpretive fields, repetitive manual highlighting, fatigue from sustained comparison, and terminology inconsistency across radiologists. These findings informed an interface that surfaces only interpretive fields for mandatory review, automatically highlights source evidence within PDFs, and separates batch extraction from human-paced verification. A formative workshop revealed that \textbf{EndoExtract} supports a shift from field-by-field data entry to supervisory validation, though participants noted risks of over-skimming and challenges in managing missing data.
My notes (saved in your browser only)
Condition tags
Citation neighborhood (no data yet)
We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2026) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.
Source provenance
- openalex
- last seen: 2026-06-04T00:00:01.174412+00:00
License: CC0
· commercial use OK