BabelFSH—A Toolkit for an Effective HL7 FHIR-based Terminology Provision

Joshua Wiedekopf, Tessa Ohlsen, Ann-Kristin Kock-Schoppenhauer, and 1 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-6992162/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 29 Nov, 2025 in Journal of Biomedical Semantics. Version 1 posted 9

Abstract

Background: HL7 FHIR terminological services (TS) are a valuable tool towards better healthcare interoperability, but require representations of terminologies using FHIR resources. As most terminologies are not natively distributed using FHIR resources, converters are needed.
Large-scale FHIR projects, especially those with a national or even an international scope, define enormous numbers of value sets and reference many complex code systems, which must be regularly updated in TS and other systems. This necessitates a flexible, scalable and efficient provision of these artifacts. This work aims to develop a comprehensive, extensible and accessible toolkit for FHIR terminology conversion.

Implementation: Based on the prevalent HL7 FHIR Shorthand (FSH) specification, a converter toolkit called BabelFSH was created that utilizes an adaptable plugin architecture to separate the definition of content from that of the needed declarative metadata. The development process was guided by formalized design goals.

Results: All eight design goals were addressed by BabelFSH. Validation of the system's performance and completeness was demonstrated by way of example using Alpha-ID-SE, an important terminology used for diagnosis coding especially of rare diseases within Germany. The tool is now used extensively within the content delivery pipeline for a central FHIR TS with a national scope within the German Medical Informatics Initiative and Network University Medicine.

Discussion: The first development focus was geared towards the requirements of the central research FHIR TS for the federated FHIR infrastructure in Germany, and has proven to be very useful towards that goal. Opportunities for further improvement were identified especially in the validation process, as the validation messages are currently imprecise at times. The design of the application lends itself to the implementation of further use cases, such as direct connectivity to legacy systems for catalog conversion to FHIR.

Conclusions: The developed BabelFSH tool is a novel, powerful and open-source approach to making heterogeneous sources of terminological knowledge accessible as FHIR resources, thus aiding semantic interoperability in healthcare in general.
Keywords: HL7 FHIR, Terminology as Topic, Terminology Servers, Knowledge Bases

BACKGROUND

HL7 FHIR for national and international standardization and harmonization

Both national and international demands on healthcare data interchange have made it ever more important for healthcare providers to cooperate and to provide their primary-care data in machine-readable formats. Legislation in many healthcare systems, such as the US 21st Century Cures Act passed in 2016 [1, 2], or the emerging European Health Data Space (EHDS) [3] in the European Union, has catalyzed this requirement. Concurrently, national initiatives in the research domain, such as the German Medical Informatics Initiative (MII), have accelerated this need for interchange and alignment to common standards even further [4]. Such large-scale integrations between disparate systems clearly require harmonization and standardization, especially towards common data structures, encodings and processes. The use of interoperability standards presents itself as a suitable approach in this regard, and the Health Level 7 Fast Healthcare Interoperability Resources Standard (HL7 FHIR standard) [5] has been recognized to be a suitable means to this end in many jurisdictions [6, 7]. HL7 FHIR is inherently designed as an internationally applicable standard to support the broadly different requirements of different jurisdictions. As part of this design requirement, extensibility and adaptability were dominant considerations in the standard development processes. Consequently, national actors as well as use-case driven consortia are expected to define and utilize FHIR profiles, derived from the core standard, to tailor the definitions of the standard to their needs.

Profiling FHIR and ValueSet bindings

Apart from constraining element cardinality and adding extensions where applicable, the profiling process also generally changes numerous terminology bindings for coded data elements.
As a rule, all coded data elements in FHIR must be bound to a ValueSet , which in turn is defined as subset of codes from one or more code systems. All coded data elements in the core FHIR standard have bindings already applied, but profiling often overrides these to better fit the reality of the respective jurisdiction; this mechanism is core to the FHIR philosophy. From this large need to define bindings in profiles arises a need to provide ValueSet resources to technical consumers. ValueSet (VS) resources in FHIR pick codes from one or more code systems. The standard also defines the resource type CodeSystem (CS) for the representation of these definitions within FHIR resources. In view of the enormous heterogeneity and complexity of classifications (e.g. ICD-10, ATC with their respective national adaptions) and terminologies (e.g. SNOMED CT, LOINC), their conversion into FHIR code systems can be at times quite challenging and is the main subject of this paper. To facilitate mappings between code systems, e.g. from non-standard local towards standard international terminology, ConceptMap (CM) resources can also define unidirectional mappings scoped to a certain use case. To implement system interactions with these resources, FHIR terminological services (TS) have emerged from a subsection of the specification [5 sect. 4.0] that provide both read-write access to FHIR resources and allow operations-based interactions with these resources. FHIR representation of terminological concepts In this paper, we must carefully differentiate between the intellectual concept of code systems , value sets , and concept maps/mappings , as they have been defined in the HL7 Version 3 Core Principles [8], and their manifestation in FHIR through the CodeSystem (CS), ValueSet (VS) and ConceptMap (CM) resources. The definitions from the HL7 Version 3 standard have been carried forward into the design of the FHIR standard. 
It is stated there that “ Code System s are often described as collections of uniquely identifiable concepts with associated representations, designations, associations, and meanings” [8 sect. 5.1.2]. Value sets are then defined as “[representing] a uniquely identifiable set of valid concept identifiers, where any concept identifier in a coded element can be tested to determine whether it is a member of the Value Set at a specific point in time.” [8 sect. 5.1.3]. However, the HL7 Version 3 standard, and early FHIR versions including the DSTU2 release, did not provide a representation (via a resource type in FHIR) for representing the content of a code system itself, as it was understood that complex, standard terminology will be managed through independent means. From the DSTU2 release of FHIR, it is stated that: Value sets that contain inline code systems are intended for small, simple code systems that are found throughout the implementation context (e.g., lists of words, status codes, enumerations). The inline code system definition is not intended to represent large publically defined terminologies such as LOINC, etc. - these terminologies have their own distribution formats . [9 sect. 6.21.1] As such, small code systems were manifested through their use within ValueSet resources, providing definitions for small, constrained applications such as communicating the status of an Observation resource. For standard terminologies, it was understood that they are maintained independently, but consuming and producing systems that communicate using these codes would have access to the required terminology to then expose it to their users and internal processes. In FHIR STU3, it was realized that many terminological use cases could be addressed by representing code systems in a FHIR resource of their own. 
While the in-line definition in the VS resource worked well for constrained use cases, where the value set is used just a few times within the standard, this approach does not scale with the representation of standard terminology, that is intended to be used in many use cases and value sets. While the then-created CodeSystem resource was not intended to be used for the maintenance process of most terminology, which was understood to be better served by the established processes [5 sect. 4.8.1], most coding systems can thus be expressed using native FHIR resources. This allows the definition of FHIR-based terminology services for code systems that are not considered internal to the HL7 specification and derived artefacts, but also large code systems brought into the server through their FHIR representation. To differentiate between these abstract concepts and their manifestations in FHIR, we will only use the long forms code system , value set , and concept mapping to reference the abstract concepts and resources that are not represented in FHIR resources, while the forms CodeSystem / ValueSet / ConceptMap and the abbreviations CS / VS / CM are only used to reference the FHIR manifestation. As is common in the HL7 FHIR standard’s design, the CS resource can only capture a subset of use cases directly. Based on the 80/20 rule, FHIR resource development should be “focus[ed] on the 20 % of requirements that satisfy 80 % of the interoperability needs” [5 sect. 2.1.19.2], leading to limitations in the expressivity of some resources for specialized use cases. For example, formal ontologies that are often expressed in the Web Ontology Language (OWL) often build on top of each other. 
The OBO Foundry, a project that aims to develop a number of interoperable biomedical ontologies, even states in its guiding principles that the “scope” of any ontology it oversees “should be fairly narrow” and that “[r]equired terms that are out of scope should be imported from the appropriate ontology unless no such ontology exists” [10]. For example, a narrowly-scoped ontology in the domain of human ontologies can build on the concepts defined in an ontology describing cross-species anatomy [11]. The narrowly scoped concepts (classes in the convention of formal ontologies) are then defined as sub-classes or children of those imported classes. The subsumption relationship in the domain of formal ontologies follows that of description logic, and has a very rigid definition [12]. Every instance of a subclass is also an instance of the parent class, leading to the strict, transitive is-a relationship. In this way, connected ontologies span a large semantic network through their is-a relationships. While this relationship can be expressed in the CS resource, using custom properties that link concepts across CS boundaries, the subsumption relationship in FHIR TS is only defined within the scope of a single CS. Regardless, a conversion of the code system defined in formal ontologies to the FHIR CS resource can be beneficial, as highlighted by Metke-Jimenez et al. [11].

Another shortcoming of the CS resource in particular is the impossibility of representing compositional code systems, such as the Unified Code for Units of Measure (UCUM) [13]. Units in UCUM can be combined arbitrarily, to express measurements in any kind of physical dimension. This does, however, allow for an infinite number of combinations of the defined unit symbols, especially in conjunction with the annotation facilities that can be used to express countable concepts.
For example, the expression {FHIR CodeSystem resources}/[nmi_i].s2 (countable things per nautical-mile per second-squared) is nonsensical for any kind of real measurement, but perfectly valid in the language defined by the UCUM grammar. While a finite fragment of available and common UCUM codes could be created and distributed, the FHIR CS resource lacks the means to express the grammar-based UCUM standard. A FHIR TS could, however, provide some interactions defined in the terminology module of the standard through special casing, such that validation of UCUM codes can be implemented adequately. Terminological services within national projects Terminological services are an important factor towards healthcare data interoperability and, ultimately, data harmonization. The German MII (since 2015) and Network University Medicine (NUM, since 2020), funded by the German Federal Ministry of Education and Research (now German Federal Ministry of Research, Technology and Space) have recognized this need. In this funding scheme, the sub-project Service Unit Terminological Services (SU-TermServ) [14] has been tasked with providing terminological services through a FHIR terminology server, beginning in 2023. The required content is, in particular, defined and specified by the modular Core Data Set (CDS) of the MII [15]. From the Germany-wide drive to implement and exchange CDS-compliant FHIR resources, the need for a harmonized tooling to provide FHIR-based representations of referenced code systems originated. The core of the SU-TermServ project revolves around the provisioning of a central instance of Ontoserver [16] with all needed resources. Many parts of both the CDS and of implementation guides the CDS references use large, externally defined code systems, and those are needed in the HL7 FHIR format so they can be uploaded to a FHIR terminology server. 
For the professional support of a TS with such a national scope, transparent and effective processes must be established to provide the needed resources at the correct time for the TS to then generate a benefit for the surrounding health technology landscape. This is at odds with the current reality: most well-recognized code systems are currently maintained using specific tools and platforms, and are often provided using proprietary formats or using very different standards. For example, the ICD-10-WHO resource (translated for use in Germany) is maintained by the German Federal Institute for Drugs and Medical Devices (BfArM) using internal processes, taking change requests from the medical community into account. At the time of writing, the document-based distribution in PDF format is the authoritative version, but a number of machine-readable formats are made available: one version uses the ISO ClaML standard (Classification Markup Language, ISO 13120:2013) [17], while one format uses tab-separated files for import into relational databases [1]. Contrast this with OncoTree, which is a classification for tumor types [18]. It is distributed mainly through an unstandardized web-based application programming interface (API). Considering also the fact that internal catalogs are often only available directly in the systems they are defined within, a clear need emerges for a simple and extensible toolkit for the conversion of terminological artifacts into the HL7 FHIR resources CodeSystem, ValueSet and ConceptMap. In Table 1, a brief survey of some terminologies referenced within the CDS of the MII further illustrates this need.

Table 1: Exemplary terminologies/code systems referenced in the German Medical Informatics Initiative’s Core Data Set with distribution formats. Available FHIR CS resources are highlighted in boldface.
Name | Use Case | Distribution Format

Diagnosis coding
ICD-10-GM | German adaptation of the ICD-10 for morbidity coding | ClaML, some versions as FHIR CS
ICD-10-WHO | Mortality coding | ClaML, some versions as FHIR CS
ICD-O-3 | Cancer classification | ClaML
Alpha-ID-SE | Coding for rare diseases aligned to ICD-10-GM | Columnar text file
ORPHAcodes | Coding for rare diseases | XML
OncoTree | Cancer classification | Web API

Procedure coding
OPS | Procedure coding | ClaML

Medication coding
ATC (German & International variant) | Pharmaceutical agents | Spreadsheet (Office Open XML XLSX format)
ASK | Pharmaceutical agents (on the substance level) | XML
CAS | Chemical substances | Partially available in the ASK XML file
EDQM Standard Terms | Pharmaceutical dose forms, routes of administration, … | Web API
UNII | Pharmaceutical agents (on the substance level) | Columnar text file
PZN | Commercially available pharmaceutical products in Germany | Columnar text file

Medical Devices & Imaging
ISO/IEEE 11073-10101 | Medical device communication standard terminology | FHIR CS
DICOM DCM | Medical imaging standard | FHIR CS plus FHIR VS
RadLex | Radiological lexicon for reporting | OWL

OMICS data
HGNC | Gene symbols/names | Columnar text file or JSON
GENO | Gene functions | OWL
HPO | Phenotypes | OWL
SO | Sequence annotation | OWL

As outlined above, the FHIR standard has always anticipated this fact, but many FHIR-based terminology services require FHIR-based representations of code systems to provide the resources through the APIs defined in the terminology module. The concept of a FHIR TS, which is specified in a functional manner through its API [5 sect. 4.7], does not actually require the terminology to be represented in FHIR resources. However, the alternative approach, whereby the terminology is maintained by the system developer in proprietary formats, is often not suitable for large-scale projects such as the MII and NUM.
As those projects inherently move fast, the resulting dependency on the TS developer to provide new terminologies in an agile fashion can be a limiting factor in the projects’ development. Without the terminology artefacts required by the respective domain at the correct time, even best-of-breed terminology services cannot generate a benefit to the surrounding landscape and can thus be considered of no use. In this work, we thus propose a novel solution for making terminologies available through FHIR resources, building on established standards and allowing for rapid implementation for highly specific conversion processes. Prior Work The research community has so far not addressed the problem of terminology generation for HL7 FHIR extensively. The developers of Ontoserver , a FHIR-based terminology server in widespread use and the basis for the service provided by the SU-TermServ, have touched upon the generation of resources in their 2018 paper discussing the software [16]. This includes support for ClaML [17] generation using the ClaML-FHIR tool [19]. From the same working group, a transformation of Web Ontology Language (OWL) ontologies into FHIR was presented in 2019 [11]. Moreover, a literature review reveals several transformation scripts for specific resources. We have ourselves for example provided scripts for OncoTree and the EDQM Standard Terms database [18, 20–22], and have discussed further challenges in CS and VS transformation from the ART-DECOR platform, already touching upon the problems outlined above [23]. Especially considering the open-source code of the identified prior work, a key insight towards simplifying this process is the fact that most tools follow a similar structure. First, the metadata of the resources, e.g. the canonical URI, the version, description, author, copyright, and related fields are generated. 
Sometimes, this is hard-coded within the tool, while in other instances, these data are defined using configuration files or provided as command line arguments. Once the metadata of the resource is defined, the tools then take in the authoritative sources of the artifact and generate the needed content: for a CS, a list of concept elements; for a VS, include and exclude elements; and for a CM, a list of groups, which in turn contain elements. The content generator functionality in the tools can be highly specific (e.g. for OncoTree) or could also be generalizable to many different resources using the same formats (e.g. for ClaML). Finally, the resources are written to disk. While the implementation differs broadly, this pattern of defining metadata first and generating content afterwards holds for all related tools that we have so far identified.

Another insight from the literature concerns the degrees of freedom associated with providing FHIR resources from other sources. For CS, consider the representation of hierarchical relationships: concept elements can define concept elements in-line, representing a strictly monohierarchical relationship. The representation through a property, generally using the parent and/or child relationship code, is however anecdotally preferred by TS implementers; and the FHIR standard states that they should not be used concurrently [5 sect. 4.8.13]. However, one CS might only use parent, another only child, and another one might link concepts in both directions with both properties. In rare cases, resources might not use those standard properties, instead assigning a different kind of relationship between concepts. The rare disease nomenclature ORPHAcodes [24], defined by the Orphanet database [25], for example, uses the concept of aggregation levels to link disorders and subtypes of disorders. Depending on the reporting use case, more specific or more general concepts might be required.
This property can be understood as a special case of parent-child relationships, yet there is not a single parent for each concept as in many other classifications. Instead, the terminology defines a “forest” of small “trees” through the aggregation level. In some use cases, such as feasibility queries [26], this relationship can be considered to be equivalent to the parent/child relationships, as researchers selecting a single concept from the user interface also expect the subtypes of that disorder to be selected. This needs to be achieved either by the search interface providing special consideration for the code system, by the TS interpreting this relationship as equivalent to the parent/child relationship, or by the FHIR resource serializing this property as a parent/child relationship.

Objective

The aim of this work is to provide the broader FHIR community with a simple-to-use and simple-to-extend toolkit for providing all manner of terminological resources, derived from their native distribution formats and building on established mechanisms. The key architectural principle derived from the literature is the clear separation between the definition of resource metadata and the generation of content. Metadata generation refers both to metadata that describes the resources in their entirety, such as name, ID, canonical URI etc., and to metadata that can be applied to CS concepts, i.e. properties. The metadata generation should follow a uniform approach across all resources, thus allowing for harmonization and convention across resources. In contrast, content generation must support both reusable, generic methods and highly specific transformations, such that all requirements of the terminologies in question can be addressed appropriately. Through this tool, the representation of terminology resources can be harmonized through the re-use of rules, conventions and transformation logic, thus aiding terminology service adoption.
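The separation of metadata from content named in the objective can be illustrated with a minimal Python sketch. It is not BabelFSH code: the function names (`load_metadata`, `generate_concepts`, `build_code_system`), the example canonical URI and the sample rows are hypothetical. Only the element names (`url`, `version`, `property`, `concept`, `valueCode`) follow the FHIR R4 CodeSystem resource.

```python
import json


def load_metadata():
    """Resource-level metadata, defined once and independently of the content."""
    return {
        "resourceType": "CodeSystem",
        "url": "http://example.org/fhir/CodeSystem/demo",  # hypothetical canonical URI
        "version": "2025",
        "name": "DemoCodeSystem",
        "status": "active",
        "content": "complete",
        # Concept-level metadata: the properties that concepts may carry.
        "property": [
            {
                "code": "parent",
                "uri": "http://hl7.org/fhir/concept-properties#parent",
                "type": "code",
            }
        ],
    }


def generate_concepts(rows):
    """Content generation: turn (code, display, parent) rows into concepts."""
    for code, display, parent in rows:
        concept = {"code": code, "display": display}
        if parent is not None:
            # Hierarchy is serialized through the parent property, not by nesting.
            concept["property"] = [{"code": "parent", "valueCode": parent}]
        yield concept


def build_code_system(metadata, concepts):
    """Combine the two independently produced parts into one resource."""
    resource = dict(metadata)
    resource["concept"] = list(concepts)
    resource["count"] = len(resource["concept"])
    return resource


# Hypothetical source rows standing in for a native distribution format.
ROWS = [("A", "Group A", None), ("A.1", "Member of A", "A")]
code_system = build_code_system(load_metadata(), generate_concepts(ROWS))
print(json.dumps(code_system, indent=2))
```

Because the metadata and the concept generator only meet in `build_code_system`, either side can be exchanged without touching the other, which is the property the objective asks for.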
IMPLEMENTATION

FHIR Shorthand

The FHIR Shorthand (FSH) language specification [27] has quickly become a cornerstone of the FHIR community, and is now considered the dominant means to define FHIR resources as source code using a domain-specific language (DSL). By compiling the source code files with the reference compiler, SUSHI [28], the definition of profiles and other StructureDefinition resources has been simplified and streamlined. FSH’s language elements are geared towards the concise definition of FHIR resources, especially profiles and extensions, but also resource instances of all other resource types, such as example resources within an ImplementationGuide (IG). Crucially, the FSH language and the SUSHI compiler also support the definition of CodeSystem and ValueSet resources directly, and ConceptMap via the arbitrary-resource facilities. However, especially the CS support is geared towards the definition of small, internal CS with only a few codes. All concepts must be listed in-line within the FSH source file (see an example in Figure 3). Generating FSH sources from original content and then compiling them with SUSHI is certainly possible in our experiments, but computationally quite expensive. However, as SUSHI is the reference implementation of an open-source formal language specification, and only a subset of this specification is relevant to the terminology generation process, the implementation of a use-case specific compiler tool was initiated, which we call BabelFSH.

Implementation

Prior to the implementation of the system, we defined eight design goals that the tool must adequately address:

DG1—Compatibility with SUSHI: The BabelFSH source files shall remain completely compatible with the SUSHI reference implementation (only) for CodeSystem, ValueSet and ConceptMap; no language elements shall be added to the FSH DSL that result in errors when using SUSHI to compile BabelFSH source files.
Thus, changes in the FSH specification going forward can be incorporated into BabelFSH without delay. DG2—Metadata in FSH: The FSH DSL shall be used to define the metadata of the resulting terminology resources. The core FSH language elements shall not be used to define the concepts of CodeSystem resources. This requirement enforces the beneficial separation of metadata and content that was identified from the literature. DG3—Extensibility: It shall be easy to generate content from diverse sources. Core functionality shall be provided to implementers to address common requirements. If a code system requires highly specialized logic, this must be supported. Otherwise, the expressivity of the system would be so limited that many resources would not be convertible, dramatically limiting the usefulness of the system. DG4—Simple API and Self-Documentation: Implementers shall be provided a simple API to hook into the generation process, such that needed logic can be implemented quickly, allowing for fast turnaround and easy maintenance. DG5—Validation: Generated resources shall be validated for their correctness against the FHIR specification. Especially syntactic problems shall not be accepted, but semantic validation should also be performed if applicable. A strong validation process helps to ensure consistent resources, aiding implementers in creating correct and comprehensive metadata for their resources. DG6—Performance: BabelFSH shall be performant, such that the generation of multiple versions of large code systems can be carried out without issue and should offer substantial performance benefits over native FSH compilation. Were the performance of BabelFSH similar to SUSHI, the benefit of a separate parser/compiler would be offset by the complexity of this tool. DG7—FHIR Version support: The tool shall support multiple versions of the FHIR core specification. 
At the time of writing, this especially requires support for versions R4/R4B and R5, which calls for special consideration in the CM generation aspect, since CM has undergone considerable conceptual changes in FHIR R5. The user must be explicit about which FHIR version they use; the concurrent generation of R4B and R5 resources in the same application invocation shall not be supported. This ensures relevance of BabelFSH going forward, such that later versions of FHIR can also be integrated.

DG8—Open Source: BabelFSH must be licensed with an open-source license, so that the FHIR community can make use of the tool without any restrictions for commercial use, and so that enhancements can also be fed back into the application. Due to the open nature of the FHIR community, the provision of an open-source tool can foster collaboration and improve the adoption of FHIR terminology services and FHIR in general.

With respect to DG2, we define metadata as those data elements available in terminological resources to describe the resource in its entirety (i.e., the canonical URI, version, name, title, technical ID, publisher, etc., as well as extensions for the entire resource), and also those data elements to describe the concepts defined within a CS (i.e., properties). The first kind of metadata is defined by the FHIR core standard, and FSH rules will be required to fill the respective slots. Properties follow a code-value paradigm, whereby the declaration assigns each property a code and data type. The FHIR specification implicitly defines a selection of properties that need not be declared in CS, such as the parent, child, or inactive properties that are commonly used in FHIR CS resources. All other properties must be declared in the FSH source and will then be used in the plugin implementation when concepts are generated.
General converter implementations (DG3) then need to provide configuration options to map properties in the terminology sources to the properties generated by the plugin. Concerning DG5 and DG7, supporting either R4B or R4 is adequate, as no changes between these versions affect the terminology module [5 sect. 2.1.11], and validation using either of these versions will result in the same output messages.

Programming Language

Our implementation of BabelFSH is written using the Kotlin programming language [29], utilizing the Java Virtual Machine. Kotlin is a modern, statically-typed and inherently null-safe language that is entirely compatible with the Java language that many eHealth researchers are already familiar with. Hence, established libraries in the rich Java ecosystem can also be utilized. Moreover, Java source files can be included in Kotlin projects, so that Java plugins can also be implemented by users not familiar with Kotlin. The tool is built using the Gradle build automation toolkit [30].

Parsing FSH code

To parse the FSH source files needed for the BabelFSH system, we use the free and open-source parser generator ANTLR v4 [31]. ANTLR uses formal grammars to generate recursive-descent lexers and parsers, with associated infrastructure to integrate the generated code into applications. Lexers generate tokens from the input code, which are then assigned meaning in the parser grammar. Using ANTLR-provided hooks, applications can receive the parsed structure as the parser processes the input stream and populate their internal data model from this token stream. We base our parser and lexer grammar on the implementation for the FSH reference implementation, SUSHI [28], which also utilizes ANTLR v4. By modifying select sections of the SUSHI grammar and adding other grammars as needed, we can add parseable language elements to the generated FSH parsers. To remain compatible with standard FSH code (DG1), we decided on a strategy further outlined in Table 2.
We use block comments, which are an existing language element of FSH, to define command line arguments to the plugins. Block comments follow the C-style syntax, opening with a slash and asterisk and closing with an asterisk and slash. For these comments to be considered by BabelFSH, a special recognition token (leading: /*^babelfsh, trailing: ^babelfsh*/) was introduced, so that normal comments are not hijacked by our approach. The FSH source code is checked for syntactic and semantic correctness and then dissected into a parse tree. Errors are thus caught early in the processing and are surfaced to the user in meaningful error messages, pointing them towards the line where a declaration does not follow the FSH language specification.

Table 2: Grammar algorithm of the BabelFSH implementation.

Step # | Component | Changes/Strategy
1 | FSH Lexer grammar | Comment tokens are not skipped, but sent to a different channel. A new recognition token for the start and end of a multi-line comment was introduced (/*^babelfsh and ^babelfsh*/), so that the content of these (structured) comments can be bubbled up into the parser.
2 | FSH Parser grammar | The parser rules for instances, CS, VS and reusable RuleSets were amended to add an optional terminologyPluginComment? rule.
3 | ANTLR Parser Listener | Early validation of the content: resource types other than those supported by ANTLR are ignored. Context of declarations is processed.
4 | Command Line Parser | A secondary grammar enforces a syntactic structure in the command line arguments. The semantics of these declarations are defined by the plugins.
5 | FSH Rule Parser | Another secondary grammar is used to create a parse tree from the FSH rules, so that soft indexing is supported. FSH rules can address the current (=) and next (+) elements: * code[+].coding[=].display = "Display". This mechanism facilitates the re-use of rules within RuleSet FSH items.
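To illustrate how the recognition tokens separate plugin comments from ordinary block comments, the following minimal Python sketch extracts the command line from a structured comment. It is not the ANTLR-based mechanism of BabelFSH (which routes comment tokens to a separate lexer channel); a regular expression is swapped in for brevity. The plugin name `csv`, its arguments and the example FSH item are hypothetical.

```python
import re
import shlex

# The two recognition tokens come from the text; everything else is illustrative.
PLUGIN_COMMENT = re.compile(r"/\*\^babelfsh(.*?)\^babelfsh\*/", re.DOTALL)


def extract_plugin_commands(fsh_source: str) -> list[list[str]]:
    """Return the argument vector of every BabelFSH plugin comment.

    Ordinary block comments (/* ... */) do not match the pattern and are ignored.
    """
    commands = []
    for match in PLUGIN_COMMENT.finditer(fsh_source):
        # shlex splits the comment body like a shell command line and
        # honours quotes around values that contain spaces.
        commands.append(shlex.split(match.group(1)))
    return commands


# Hypothetical FSH source with one ordinary and one structured comment.
EXAMPLE = '''
CodeSystem: ExampleCS
Id: example-cs
/* an ordinary comment that must be left alone */
/*^babelfsh
csv --file "./data/example codes.txt" --delimiter "|"
^babelfsh*/
'''

commands = extract_plugin_commands(EXAMPLE)
print(commands)
```

Because the ordinary comment lacks the recognition token, only the structured comment yields a command line, which mirrors the intent that normal comments are not hijacked.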
Application Programming Interface

To allow for easy extensibility (DG3) of the core logic of the application, we have decided on a plugin-based approach. Plugins must be provided and registered at compile-time. They are identified using a unique plugin ID, and need to define the arguments that they expect, such as input file paths or URIs, columns to map, or properties to render. To make the declaration of these arguments in the FSH code simple and understandable (DG4), we utilize an established command line parsing library for Kotlin [32]. Plugins then declare short and long forms of their needed arguments, such as ‑‑file/‑f for the input file, and associated help texts for each argument. By declaring optional and required arguments in the source code, validations for the provided argument values, and the facility to validate the interaction between e.g. mutually exclusive arguments, the plugins receive a type-safe set of arguments they can use in concept generation. As plugin developers also need to provide help texts for the arguments, BabelFSH can automatically generate an interactive help, and can give the user detailed feedback on incorrect argument use without effort to the plugin developer (DG4). Based on these help texts, online documentation can be pre-generated, such that additional information can be provided to users. The plugins are intended only for content generation, not the definition of metadata, in line with DG2. Abstract classes define the entry points for this content generation: for CS resources, the generation of concept entries; for VS, the definition in terms of inclusions and exclusions; and for CM, the definition of groups. Each plugin must implement only two methods, one for parsing the command-line arguments in a type-safe fashion and one for generating the content.
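The two-method plugin contract described above can be sketched as follows. This is a Python analogue under stated assumptions, not the Kotlin API of BabelFSH: the class names `CodeSystemPlugin`, `DelimitedFilePlugin`, `Concept`, the plugin ID `delimited` and the `--delimiter` option are hypothetical, and Python's `argparse` stands in for the Kotlin command line parsing library. Only the `--file`/`-f` argument is taken from the text.

```python
import argparse
import csv
import os
import shlex
import tempfile
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Concept:
    code: str
    display: str


class CodeSystemPlugin(ABC):
    """Hypothetical entry point: one method for arguments, one for content."""

    plugin_id: str

    @abstractmethod
    def parse_arguments(self, argv: list[str]) -> argparse.Namespace: ...

    @abstractmethod
    def generate_concepts(self, args: argparse.Namespace) -> list[Concept]: ...


class DelimitedFilePlugin(CodeSystemPlugin):
    plugin_id = "delimited"

    def parse_arguments(self, argv):
        # Help texts double as user documentation for the plugin.
        parser = argparse.ArgumentParser(prog=self.plugin_id)
        parser.add_argument("--file", "-f", required=True, help="input file path")
        parser.add_argument("--delimiter", "-d", default="|", help="column separator")
        return parser.parse_args(argv)

    def generate_concepts(self, args):
        with open(args.file, newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle, delimiter=args.delimiter)
            return [Concept(code=row[0], display=row[1]) for row in reader if row]


# Stand-in for compile-time registration: a fixed mapping of plugin IDs.
REGISTRY = {plugin.plugin_id: plugin for plugin in (DelimitedFilePlugin(),)}


def run_plugin(command_line: str) -> list[Concept]:
    plugin_id, *argv = shlex.split(command_line)
    plugin = REGISTRY[plugin_id]
    return plugin.generate_concepts(plugin.parse_arguments(argv))


# Usage with a temporary two-row input file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "codes.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("A1|First concept\nB2|Second concept\n")
    concepts = run_plugin(shlex.join(["delimited", "-f", path, "-d", "|"]))

print(concepts)
```

The plugin only produces concepts; metadata such as the canonical URL would remain in the surrounding FSH declarations, in keeping with the separation of concerns stated for DG2.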
The generator API can support complex iteration and inter-dependencies for the resources, such that the expressivity of the generator routines is not artificially limited compared to purpose-built tools. To support multiple FHIR versions in the same application (DG7), the plugins do not natively generate FHIR resources. Instead, they use version-agnostic proxy classes, which are serialized as FHIR data structures only when the resources are written out to disk. Release R5 of FHIR changed terminological resources in two domains. First, all resources have additional metadata fields, such as editor or reviewer. As those are within the responsibility area of the standard FSH code (DG2), the use of R5-specific metadata declarations requires the user to switch to the R5 processing mode. Otherwise, their use would result in validation errors, as they are unknown to the R4B validator. Second, in the area of resource content, minor changes are present for CS and VS, which are handled by the proxy-class approach. However, CM has undergone a fundamental redesign which makes a version-agnostic implementation extremely challenging. Most importantly, the R5 CM requires a data element relationship, which corresponds to the equivalence attribute in R4B. This attribute needs to be provided for each mapping entry and states the correspondence between the source and target concepts within the specified code system. The coding of these attributes has been changed substantially, such that mapping these codes is a problematic undertaking that BabelFSH will not attempt automatically. As such, for CM generation, each plugin must address the representation of concept correspondence itself; alternatively, the API allows a plugin to support only a single FHIR version.

Resource Validation

To ensure that the generated resources are compliant with the specification, we have implemented a comprehensive validation strategy (DG5).
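Returning briefly to the version-agnostic proxy classes described above, the following Python sketch shows one way a proxy for a ConceptMap mapping entry could defer the choice between the R4B `equivalence` and the R5 `relationship` element until serialization. The names `MappingProxy` and `serialize_element` are hypothetical and do not reflect the actual BabelFSH classes. In line with the text, no automatic translation between the two code sets is attempted; the plugin has to supply the value for each version it supports.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MappingProxy:
    """Version-agnostic stand-in for one source-to-target mapping (hypothetical)."""

    source_code: str
    target_code: str
    equivalence_r4b: Optional[str] = None  # value for ConceptMap target.equivalence
    relationship_r5: Optional[str] = None  # value for ConceptMap target.relationship


def serialize_element(mapping: MappingProxy, fhir_version: str) -> dict:
    """Render one ConceptMap.group.element entry for the requested release."""
    target = {"code": mapping.target_code}
    if fhir_version == "R4B":
        if mapping.equivalence_r4b is None:
            raise ValueError("plugin did not supply an R4B equivalence code")
        target["equivalence"] = mapping.equivalence_r4b
    elif fhir_version == "R5":
        if mapping.relationship_r5 is None:
            raise ValueError("plugin did not supply an R5 relationship code")
        target["relationship"] = mapping.relationship_r5
    else:
        raise ValueError(f"unsupported FHIR version: {fhir_version}")
    return {"code": mapping.source_code, "target": [target]}


both = MappingProxy("A", "B", equivalence_r4b="equivalent", relationship_r5="equivalent")
print(serialize_element(both, "R4B"))
print(serialize_element(both, "R5"))
```

A plugin that leaves one of the two fields unset effectively supports a single FHIR version, and the serializer fails loudly rather than guessing a correspondence.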
The HAPI FHIR [33] library is used to prevent the generation of invalid output resources by our grammar-based FSH rule interpreter, which operates independently of the FHIR resource definition and thus cannot catch such errors on its own. To correlate the declaration of a rule (complicated by the possibility of rule re-use through RuleSet FSH items) with the error messages from HAPI FHIR, we have implemented an iterative approach. First, we generate a common resource skeleton. In parsing the rules, the data model stores the precise location (filename, FSH item and line number) where the rule was originally defined. Rules that belong together, such as the system URI, the display and the code within a Coding datatype, are grouped using a parse tree. By iterating over the first level of rules in this tree and executing the rules on this skeleton, HAPI error messages can be correlated with the declaration context, which points the user towards the error. Moreover, users are provided an additional layer of validation through the integration of the FHIR validation engine that is maintained as a core infrastructural component by HL7 International in conjunction with the HAPI FHIR developers [34]. This additional validation step is opt-in, and errors in this validation will not result in application termination, as many messages from this validation pipeline will be false positives. However, the validator will catch incorrect usage of profiles, missing elements etc. that might not be caught by the HAPI FHIR pipeline, so that the developer can rapidly iterate over the resource definition in their BabelFSH source files.

Concept Validation Strategy

Both the correctness and the performance (DG6) of the proposed tool need to be assessed. As a point of comparison, the existing SUSHI reference compiler can be used: by generating standard FSH code for a suitable code system and compiling that with SUSHI, both dimensions can be evaluated.
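The iterative correlation of rules and validator messages described under Resource Validation can be sketched in a few lines of Python. This is a toy model under stated assumptions: `toy_validate` is a stand-in for HAPI FHIR that only knows a handful of CodeSystem elements, and the names `Rule` and `apply_rule_groups`, as well as the file and item names, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    path: str      # element the rule assigns, e.g. "status"
    value: object  # value to assign
    filename: str  # file in which the rule was declared
    item: str      # FSH item (possibly a RuleSet) containing the declaration
    line: int


KNOWN_ELEMENTS = {"resourceType", "url", "name", "status"}
STATUS_CODES = {"draft", "active", "retired", "unknown"}


def toy_validate(resource: dict) -> list[str]:
    """Stand-in for a real FHIR validator; it only knows a few elements."""
    errors = [f"unknown element '{key}'" for key in resource if key not in KNOWN_ELEMENTS]
    if "status" in resource and resource["status"] not in STATUS_CODES:
        errors.append(f"invalid status '{resource['status']}'")
    return errors


def apply_rule_groups(skeleton: dict, groups: list[list[Rule]]):
    """Apply one first-level rule group at a time and validate after each step."""
    resource = dict(skeleton)
    report = []
    for group in groups:
        candidate = dict(resource)
        for rule in group:
            candidate[rule.path] = rule.value
        errors = toy_validate(candidate)
        if errors:
            # Only the group is known to be at fault, so every rule in it is
            # reported; the faulty group is not merged into the resource.
            locations = [f"{r.filename}:{r.line} ({r.item})" for r in group]
            report.append({"errors": errors, "declared_at": locations})
        else:
            resource = candidate
    return resource, report


groups = [
    [Rule("url", "http://example.org/cs", "example.babel.fsh", "CodeSystem: ExampleCS", 3)],
    [Rule("statsu", "active", "meta.babel.fsh", "RuleSet: CommonMeta", 12)],  # typo
    [Rule("status", "active", "example.babel.fsh", "CodeSystem: ExampleCS", 5)],
]
resource, report = apply_rule_groups({"resourceType": "CodeSystem"}, groups)
for entry in report:
    print(entry["errors"], "declared at", entry["declared_at"])
```

Because each group is validated as a unit, a group with several rules can only be reported with all of its declaration sites, which is the same granularity limit that applies when grouped elements are validated together.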
We utilize the German Alpha-ID-SE code system [35] for comparison. This artefact is an alphabetical index to the morbidity classification ICD-10-GM [36], with Alpha-ID-SE containing many more entries (2025 version: 90399) than ICD-10-GM (2025 version: 17089). For the more differentiated text entries below the ICD-10 code level, ORPHAcodes for rare diseases are added where possible [24]. The resource is distributed as a pipe-separated plain-text file (source file), making the conversion to FSH straightforward. Each row of the file defines a single concept and several additional properties for this concept. An excerpt of this file is shown in Figure 1. Using this file, the code system can be rendered as a standard FSH file using simple scripts. At the header of the output file, some metadata declarations are required, and every line in the source file will result in several lines of FSH code that declare not only the concept, but also the properties defined by Alpha-ID-SE. An equivalent BabelFSH file can be generated from the same template, where the metadata is declared identically (consistent with DG1, all FSH declarations should be correctly implemented by BabelFSH). By adding a BabelFSH plugin comment to that declaration and providing the needed plugin arguments in the comment, the content is then pulled out of the FSH declarations into the plugin architecture. From these two files, two JSON representations of the Alpha-ID-SE code system can be generated using SUSHI and BabelFSH, while the runtime of each approach is timed using command line tools. Assessing the completeness of the generation is thus straightforward: if the number of concepts generated from the FSH and BabelFSH files is identical to the number of rows in the source file, the generation is complete. However, validating the completeness of the entire method is difficult to generalize, as that greatly depends on the plugin.
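The script-based rendering and the row-count completeness check described above can be sketched as follows. This is a minimal illustration, not the script used in the evaluation: the four-column layout (code, display, ICD code, ORPHAcode), the property names `icd` and `orpha` and the sample rows are assumptions and do not reproduce the actual Alpha-ID-SE file format. The metadata header of the FSH file is omitted.

```python
import re

# Hypothetical pipe-separated rows: code|display|icd|orpha (the last may be empty).
SOURCE = "X0001|Example disease A|A00.0|123\nX0002|Example disease B|B00.1|\n"
PROPERTY_NAMES = ["icd", "orpha"]  # assumed property codes, one per extra column


def escape(text: str) -> str:
    """Escape backslashes and double quotes for use inside an FSH string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def render_concepts(rows: list[list[str]]) -> list[str]:
    """Render one concept rule plus two caret rules per non-empty property."""
    lines = []
    for code, display, *values in rows:
        lines.append(f'* #{code} "{escape(display)}"')
        present = [(n, v) for n, v in zip(PROPERTY_NAMES, values) if v]
        for index, (name, value) in enumerate(present):
            lines.append(f"* #{code} ^property[{index}].code = #{name}")
            lines.append(f'* #{code} ^property[{index}].valueString = "{escape(value)}"')
    return lines


def count_concepts(fsh_lines: list[str]) -> int:
    """Count concept declarations, i.e. rules of the form: * #code "display"."""
    return sum(1 for line in fsh_lines if re.match(r'^\* #\S+ "', line))


rows = [line.split("|") for line in SOURCE.splitlines() if line]
fsh_lines = render_concepts(rows)
print("\n".join(fsh_lines))

# Completeness check: one concept declaration per source row.
print(count_concepts(fsh_lines) == len(rows))
```

Even in this toy layout, two source rows expand to eight FSH lines, which hints at why the generated file grows so much faster than the source.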
Validating the completeness of each plugin thus needs to be done by the plugin authors on a case-by-case basis. As the source code for each plugin will generally be quite compact, the plugins will be easily testable and debuggable.

RESULTS

The resulting system is shown in Fig. 2. First, BabelFSH identifies the input files using filename matchers, with all files needing the extensions “.babel.fsh” or “.babelfsh.fsh” to be considered by the tool. This is to ensure a clear delineation between aspects better served by SUSHI and those to be solved by BabelFSH. The BabelFSH approach is implemented through these four steps:

1. The input files are parsed against the FSH grammar into a set of FSH items and rules.
2. Through iterative application of the rules, the resources are generated and the rules are validated for compliance with the FHIR standard.
3. The plugin command lines are parsed, and the identified plugin is called with the command line arguments to then add the respective content to the output.
4. Lastly, the output is written to an output file in the specified output folder.

Evaluation

To compare the performance of our approach with the state-of-the-art SUSHI compiler, we generated FSH code from the Alpha-ID-SE sources in version 2025. The FSH conversion was accomplished through a naïve Python implementation that reads the file line by line and uses string templates to generate FSH statements for each concept. The approximately 90 400 input lines (4825 KiB) thus balloon to more than 500 000 lines (+ 453%) of FSH code (21 492 KiB, + 345%) to not only define the concepts and displays, but also the metadata for each concept available in the resource. Converting the sources on the test computer, an Apple MacBook Pro with an M4 Pro processor and 48 GB of RAM, took less than one second. However, it was apparent that SUSHI struggles with the generation of content from this extremely large file.
Using the gnomon tool [37], we measured that SUSHI version 3.14.0 took approximately 16 seconds just to read in the file, and generating the single FHIR resource from this file took an additional 812 seconds. All in all, the command terminated after 833 seconds or 13.9 minutes wall clock time, with the laptop running in high-power mode and being otherwise idle. A short excerpt of the generated FSH file is available in Fig. 3, and the resulting FHIR JSON representation in Fig. 4. Comparing this with the BabelFSH approach, a single version of Alpha-ID-SE was defined in less than 60 lines of FSH code, with the metadata of the resource being identical to the standard FSH version. Converting this resource to FHIR JSON took less than 3 seconds in total. The full BabelFSH source file is available in Fig. 5. As a further point of comparison, defining all available versions of Alpha-ID-SE starting in 2015, when the resource first became available, required less than 160 lines of code (including whitespace and comments) and compiled in 16.1 seconds (with the supplemental validation enabled), demonstrating the power of FSH in terms of definition re-use. Using parametrized rulesets, the declaration of the FSH item for each version only requires five lines of self-explanatory code. For the example of Alpha-ID-SE, we have verified that the generated resources are indeed semantically identical: all metadata is present, all concepts from the sources are defined, and all properties are mapped to FHIR properties. By formatting the resources using JSON tooling and comparing them using the standard GNU diff tool, this equivalence could be rigorously asserted. All resources for this validation are available in the BabelFSH source code repository.

System in Use

As has been motivated in the background section, BabelFSH was conceived in the context of the requirements for the provision of a research FHIR terminology server with a national scope within the MII.
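The equivalence check from the evaluation, formatting both JSON outputs and diffing them, can be approximated in Python as shown below. The evaluation itself used JSON tooling and GNU diff; here `json.dumps` with sorted keys and `difflib.unified_diff` from the standard library are swapped in, and the two small resources are made-up stand-ins for the SUSHI and BabelFSH outputs.

```python
import difflib
import json


def canonical_lines(resource: dict) -> list[str]:
    """Serialize with sorted keys and fixed indentation so key order is irrelevant."""
    text = json.dumps(resource, sort_keys=True, indent=2, ensure_ascii=False)
    return text.splitlines()


def diff_resources(left: dict, right: dict) -> list[str]:
    """Return a unified diff of the canonical forms; an empty list means no difference."""
    return list(
        difflib.unified_diff(
            canonical_lines(left),
            canonical_lines(right),
            fromfile="sushi.json",
            tofile="babelfsh.json",
            lineterm="",
        )
    )


# Made-up outputs that differ only in the order of their keys.
from_sushi = {
    "resourceType": "CodeSystem",
    "concept": [{"code": "X0001", "display": "Example disease A"}],
}
from_babelfsh = {
    "concept": [{"display": "Example disease A", "code": "X0001"}],
    "resourceType": "CodeSystem",
}

print(diff_resources(from_sushi, from_babelfsh))

# A changed display text shows up as removed and added lines.
changed = {"resourceType": "CodeSystem", "concept": [{"code": "X0001", "display": "Other"}]}
print("\n".join(diff_resources(from_sushi, changed)))
```

Note that sorting keys does not reorder arrays, so this comparison treats the order of concepts as significant; two outputs listing the same concepts in a different order would be reported as different.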
Using the BabelFSH tool, we have streamlined both the initial efforts and the continued maintenance of diverse resources accessible to our services. As of writing, we distribute resources defined by the German BfArM (ICD-10-GM, OPS, ICD-O-3, ASK), the German adaptation of ATC, several important OWL ontologies including the Gene Ontology and the Human Phenotype Ontology, the EDQM Standard Terms database, the HGNC gene names database and others using BabelFSH (cf. Table 1 for explanations for each terminology thus generated). Moreover, besides the direct sharing of resources, the system could be used to provide access to FHIR representations of resources that must be licensed for machine-readable distribution. In particular, the Medical Dictionary for Regulatory Activities (MedDRA) terminology [38] can only be distributed to licensees, which the TS can currently not enforce. By distributing the BabelFSH source files, however, all parties that have a MedDRA subscription and license can generate identical representations of this input, making the distribution of the FHIR resource to those parties mostly redundant.

DISCUSSION

We consider all design goals previously laid out as suitably addressed. A complete assessment and a summary of the strategy for achieving each goal is stated in Table 3. While the development guided by the design goals did not follow any strict software engineering methodology, their definition ahead of the implementation was extremely helpful in guiding the development process. The design, implementation and performance of the system were positively discussed within the FHIR community, both during in-person meetings of the German FHIR standardization community and on-line, as well as with international members of the community during the 2025 edition of the developer conference FHIR DevDays.
Table 3: Design goals and implementation in the BabelFSH application

Design Goal | Implementation
DG1: Compatibility with SUSHI | BabelFSH supports a strict subset of the FSH specification, focusing on terminology resources. All valid BabelFSH source files are thus also valid FSH files.
DG2: Metadata in FSH | The tool supports a clear separation of concerns through the plugin architecture.
DG3: Extensibility | The provision of new plugins requires only the implementation of a very simple API.
DG4: Simple API and Self-Documentation | Based on the provided API, all plugins must be documented by the plugin arguments and further help texts, such that automatic help can be generated.
DG5: Validation | All FSH rules are validated for compliance with the FHIR specification through established validation tooling integrated into the software. Additional validation rules are enforced in the content generation pipeline.
DG6: Performance | BabelFSH has proven to be performant and scalable.
DG7: FHIR Version support | FHIR versions R4B and R5 are supported through a mode switch, and further versions could be added to the API as they are released.
DG8: Open Source | BabelFSH is freely available under the Open Source Initiative-approved Apache 2.0 license.

The most pressing limitation of our system currently is the number and selection of available plugins. As our system originates in the context of a well-defined project, our own efforts in plugin development in particular center around the needs for this project. Due to the open-source nature and elegant simplicity of the approach, we believe that the broader FHIR community will recognize the potential of this tool, and define their own plugins as needed.
The application design also lends itself to the creation of advanced plugins, such as a database connector that would use standard database access layers to connect to legacy systems that define internal catalogs, which could then be integrated into FHIR resources as needed. This feature idea has been met with an enthusiastic response in the community. In terms of the API, the CS generation API can currently be considered mature, since it has proven itself for several quite different plugins. The API for CM is likely stable, but VS generation has not been considered extensively so far. As there are two ways of defining a VS, this API requires careful consideration of the intended scope of the generation. First, the intensional approach defines a VS based on rules that are evaluated by the terminological services, e.g. all concepts that are children of a specific concept. Second, in extensional definitions, the included concepts are listed directly. Especially for intensionally-defined VS, the SUSHI implementation is already adequate, while extensional VS can, depending on their size, present the same (performance) limitations as SUSHI does for CS resources. A further limitation of the current implementation is our input validation. BabelFSH currently has no concept of the FHIR metadata structures and applies the rules as they are written. If there is a typo or other error in the declarations, this will be caught by the HAPI validator [33] instead, and the error message is reported with an indication of where the error occurred. However, because complex data elements such as CodeableConcept have to be validated in concert, the error messages can often only point the user towards several line numbers in the sources, rather than towards the precise line where the error occurred.
However, as the target group of BabelFSH consists of advanced FHIR implementers, this level of error-checking is deemed adequate, since the messages generated in this way are nevertheless quite descriptive. An interesting consideration for the future could be the integration of this tool into the reference standard SUSHI compiler and, potentially, the integration of new language elements into the FSH language definition. In this way, the functionality provided by BabelFSH could be integrated with the standard build tools that are now established in the FHIR community for IG creation. However, the intended user groups of BabelFSH and SUSHI are not always the same people. SUSHI is geared more towards IG creators, while BabelFSH is aimed at those providing FHIR terminology resources or services. While there is a degree of overlap between those user groups, it is rare for FHIR IGs to be responsible for the maintenance and distribution of larger-scale terminologies. The narrow focus of BabelFSH and the inherent constraints of the FSH implementation can allow for quicker iteration, cleaner code and faster learning for terminology converters compared to a SUSHI-native implementation. Moreover, the command-line interface of the BabelFSH app itself still allows integration into automation processes, including Continuous Integration (CI) pipelines that are now common in the development of IGs. In this regard, the performance of the implemented system is also beneficial, as long runtimes of the conversion processes could be an obstacle to the profilers due to the time and cost penalties incurred by a longer runtime during IG builds on the CI infrastructure. Regarding the integration of this plugin architecture into the FSH language specification via a new language element instead of the block comments that are currently used, the FSH specification has now reached a level of maturity that makes most rules normative, disallowing changes that are not backwards compatible.
While adding new language elements is not a breaking change, there are some constraints in the BabelFSH implementation that may not be compatible with the specification as written, such that integration would require very careful consideration. Lastly, for FHIR terminology services to generate a benefit to implementers and users, the maintenance and content delivery of these systems need to be integrated with the requirements of all stakeholders. Since FHIR TS are generally only deployed within a broader infrastructure, the content delivery of such a service needs to be carried out in lockstep with other changes in the infrastructure. A tool like BabelFSH can aid in that area, both by allowing fast reaction to new or shifted requirements, such as providing new resources, and by easing the maintenance of such resources in the future, but it is only one tool towards professionalized TS maintenance. Within the SU-TermServ project, which provides such a service with a broad scope, BabelFSH has become an important tool, but it only develops its potential through the integration into a comprehensive strategy and tooling infrastructure.

CONCLUSIONS

In this work, we have presented a novel, powerful and open-source approach to making heterogeneous sources of terminological knowledge broadly accessible as FHIR resources. In this way, BabelFSH is an important tool towards the adoption of comprehensive HL7 FHIR terminological services, and thus greater (semantic) interoperability in general. It is furthermore orthogonal to existing approaches of terminology conversion: the core logic for content generation is often cleanly separable from the (often hard-coded) generation of the needed metadata, such that the generation part can be easily integrated into the BabelFSH tool. Usage of the BabelFSH tool and uptake by the community can ensure the availability of required terminological resources through FHIR terminology services, aiding the adoption of the HL7 FHIR standard overall.
The tool has the potential to be adopted by terminology authors to streamline the provision of authoritative FHIR representations of their terminologies in the first place, such that community efforts to generate such artefacts become unnecessary.

Abbreviations

API: Application Programming Interface
BfArM: Bundesinstitut für Arzneimittel und Medizinprodukte (German Federal Institute for Drugs and Medical Devices)
ClaML: Classification Markup Language
CI: Continuous Integration
CM: ConceptMap (a concrete FHIR resource)
CS: CodeSystem (a concrete FHIR resource)
DSL: Domain-Specific Language
DSTU2: Draft Standard for Trial Use Release 2 (a version of the HL7 FHIR standard)
EDQM: European Directorate for the Quality of Medicines & HealthCare
EHDS: European Health Data Space
FSH: FHIR Shorthand
HL7 FHIR: Health Level 7 Fast Healthcare Interoperability Resources
IG: Implementation Guide
MedDRA: Medical Dictionary for Regulatory Activities
MII: Medical Informatics Initiative
NUM: Network University Medicine
OWL: Web Ontology Language
R4/R4B/R5: HL7 FHIR Release 4/Release 4B/Release 5
STU3: Standard for Trial Use Release 3 (a version of the HL7 FHIR standard)
SU-TermServ: Service Unit Terminological Services
SUSHI: SUSHI Unshortens Short Hand Inputs (the reference compiler for FSH)
TS: Terminological services
VS: ValueSet (a concrete FHIR resource)

Declarations

Ethics approval and consent to participate
Not applicable

Consent for publication
Not applicable

Availability of data and materials
Project name: BabelFSH
Project home page: https://gitlab.com/mii-termserv/babelfsh
Archived version: https://doi.org/10.5281/zenodo.15755172
Operating system: Platform independent
Programming language: Kotlin (primarily), Java
Other requirements: Java 18 or higher, Gradle
License: Apache License 2.0
Any restrictions to use by non-academics: none apply

Competing interests
The authors declare that they have no competing interests.
Funding
This work was funded by the German Federal Ministry of Research, Technology and Space (Bundesministerium für Forschung, Technologie und Raumfahrt, BMFTR) as part of the Medical Informatics Initiative Germany under the grant number 01ZZ2312A.

Author Contributions
Conceptualization, Methodology, Investigation: J.W., T.O., A.K., J.I. Software: J.W. Validation: J.W., T.O. Visualization, Writing – original draft: J.W. Writing – review & editing: J.W., T.O., A.K., J.I.

Acknowledgements
The authors would like to thank and acknowledge the members of the worldwide and diverse HL7 FHIR community, who have been involved in shaping the development of this tool through valuable feedback, and we hope that it is useful to all of them. We would further like to acknowledge the authors and developers of the FSH specification and SUSHI reference implementation for their important work in making FHIR profiling simpler and more accessible.

Authors’ Information
As part of the Medical Informatics Initiative, all authors are responsible for the provision and development of FHIR-based terminological services to the MII across Germany, and have substantial experience in working with medical terminologies and FHIR terminological services. They are active in the development of the FHIR standard and national adaptations through HL7 Germany, and in the working groups of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS).

References

1. US Public Law. 21st Century Cures Act. 130 Stat 1033. Stat., 114–255 2016 p. 312.
2. Vijayaraghavan M, Genes N, Darrow BJ, Rucker DW. The 21st Century Cures Act and Emergency Medicine – Part 2: Facilitating Interoperability. Ann Emerg Med. 2022;79:13–7. doi:10.1016/j.annemergmed.2021.08.002
3. European Parliament, Council of the European Union. Regulation (EU) 2025/327 of the European Parliament and of the Council of 11 February 2025 on the European Health Data Space and amending Directive 2011/24/EU and Regulation (EU) 2024/2847 (Text with EEA relevance). Mar 5, 2025.
4. Semler S, Wissing F, Heyder R. German Medical Informatics Initiative: A National Approach to Integrating Health Data from Patient Care and Medical Research. Methods Inf Med. 2018;57:e50–6. doi:10.3414/ME18-03-0003
5. HL7 International. HL7 Fast Healthcare Interoperability Resources, Release 5 (HL7 FHIR R5) [Internet]. 2023 [cited 3 Feb 2025]. Available from: https://hl7.org/fhir/R5
6. Benson T, Grieve G. Principles of Health Interoperability: FHIR, HL7 and SNOMED CT. Fourth edition. Cham: Springer International Publishing; 2021. doi:10.1007/978-3-030-56883-2_2
7. Vorisek CN, Lehne M, Klopfenstein SAI, Mayer PJ, Bartschke A, Haese T, et al. Fast Healthcare Interoperability Resources (FHIR) for Interoperability in Health Research: Systematic Review. JMIR Med Inform. 2022;10:e35724. doi:10.2196/35724
8. Grieve G, Klein WT, Beeler Jr. G, Hamm R, McKenzie L. HL7 International, Modeling and Methodology and Vocabulary Work Groups. HL7 Version 3 Standard: Core Principles and Properties of HL7 Version 3 Models, Release 1 [Internet]. ANSI/HL7 V3 CPPV3MODELS, R1-2012, 2012 [cited 3 Feb 2025]. Available from: https://www.hl7.org/documentcenter/public/standards/V3/core_principles/infrastructure/coreprinciples/v3modelcoreprinciples.html
9. HL7 FHIR. Resource ValueSet - Content, Fast Healthcare Interoperability Resources Release DSTU2 [Internet]. 2015 [cited 3 Feb 2025]. Available from: https://hl7.org/fhir/DSTU2/valueset.html
10. OBO Foundry. OBO Foundry Principles: Scope (Principle 5) [Internet]. 2022 [cited 20 Jun 2025]. Available from: https://obofoundry.org/principles/fp-005-delineated-content.html
11. Metke-Jimenez A, Lawley M, Hansen D. FHIR OWL: Transforming OWL ontologies into FHIR terminology resources. AMIA Annu Symp Proc AMIA Symp. 2019;2019:664–72.
12. Brachman. What IS-A Is and Isn’t: An Analysis of Taxonomic Links in Semantic Networks. Computer [Internet]. 1983 [cited 24 Jun 2025];16:30–6. doi:10.1109/MC.1983.1654194
13. Schadow G, McDonald CJ. The Unified Code of Units and Measures [Internet]. Version 2.2, 2024 [cited 8 Mar 2025]. Available from: https://ucum.org/ucum
14. Bundesministerium für Bildung und Forschung. SU-TermServ – Medizininformatik-Struktur “Service Unit Terminological Services” for MI-I and NUM [Internet]. 2023 [cited 28 Feb 2025]. Available from: https://www.gesundheitsforschung-bmbf.de/de/su-termserv-medizininformatik-struktur-service-unit-terminological-services-for-mi-i-and-16144.php
15. Ammon D, Kurscheidt M, Buckow K, Kirsten T, Löbe M, Meineke F, et al. Arbeitsgruppe Interoperabilität: Kerndatensatz und Informationssysteme für Integration und Austausch von Daten in der Medizininformatik-Initiative. Bundesgesundheitsblatt - Gesundheitsforschung - Gesundheitsschutz. 2024;67:656–67. doi:10.1007/s00103-024-03888-4
16. Metke-Jimenez A, Steel J, Hansen D, Lawley M. Ontoserver: a syndicated terminology server. J Biomed Semant [Internet]. 2018 [cited 7 Mar 2024];9:24. doi:10.1186/s13326-018-0191-z
17. International Standards Organization. ISO 13120:2019 — Health informatics — Syntax to represent the content of healthcare classification systems — Classification Markup Language (ClaML). 2019.
18. Kundra R, Zhang H, Sheridan R, Sirintrapun SJ, Wang A, Ochoa A, et al. OncoTree: A Cancer Classification System for Precision Oncology. JCO Clin Cancer Inform [Internet]. 2021 [cited 8 Feb 2025];221–30. doi:10.1200/CCI.20.00108
19. Steel J, Wiedekopf J, dfleming9, Werner P. ClaML to FHIR Transformer [Internet]. 2024 [cited 8 Feb 2025]. Available from: https://github.com/aehrc/fhir-claml
20. Wiedekopf J. Oncotree to FHIR Converter [Internet]. 2020 [cited 8 Feb 2025]. Available from: https://github.com/skfit-uni-luebeck/oncotree-fhir
21. European Directorate for the Quality of Medicines & HealthCare. Standard Terms Database [Internet]. 2024 [cited 8 Feb 2025]. Available from: https://standardterms.edqm.eu/
22. Wiedekopf J. EDQM2FHIR [Internet]. 2024 [cited 8 Feb 2025]. Available from: https://github.com/skfit-uni-luebeck/EDQM2FHIR
23. Wiedekopf J, Drenkhahn C, Ulrich H, Kock-Schoppenhauer A-K, Ingenerf J. Providing ART-DECOR ValueSets via FHIR Terminology Servers - A Technical Report. Stud Health Technol Inform. 2021;283:127–35. doi:10.3233/SHTI210550
24. Institut national de la santé et de la recherche médicale (INSERM) US14. Orphanet Nomenclature for Coding and Associated Tools [Internet]. 2024 [cited 20 Jun 2025]. Available from: https://www.orphadata.com/orphanet-nomenclature-for-coding/pack-nomenclature
25. Institut national de la santé et de la recherche médicale (INSERM) US14. Orphanet—Knowledge on rare diseases and orphan drugs [Internet]. 2025 [cited 20 Jun 2025]. Available from: https://www.orpha.net/
26. Rosenau L, Majeed RW, Ingenerf J, Kiel A, Kroll B, Köhler T, et al. Generation of a Fast Healthcare Interoperability Resources (FHIR)-based Ontology for Federated Feasibility Queries in the Context of COVID-19: Feasibility Study. JMIR Med Inform. 2022;10:e35789. doi:10.2196/35789
27. HL7 International, FHIR Infrastructure Working Group. FHIR Shorthand, Version 3.0.0 [Internet]. 2024 [cited 3 Feb 2025]. Available from: https://hl7.org/fhir/uv/shorthand/N2/
28. Health Level Seven International. SUSHI [Internet]. 2024 [cited 3 Feb 2025]. Available from: https://github.com/FHIR/sushi
29. Jetbrains, Inc. Kotlin Programming Language [Internet]. 2025 [cited 3 Feb 2025]. Available from: https://kotlinlang.org/
30. Gradle, Inc., Dockter H, Murdoch A, Faber S, Niederwieser P, Daley L, et al. Gradle [Internet]. 2008 [cited 3 Feb 2025]. Available from: https://gradle.org
31. Parr T. The definitive ANTLR 4 reference. Book version: P 2.0. Dallas, Texas & Raleigh, North Carolina: The Pragmatic Bookshelf; 2014.
32. Gonsalves L, Auckland S, Mittelbach B, Hebert C, Filios K, Gockel T. xenomachina/kotlin-argparser [Internet]. 2022 [cited 20 Feb 2025]. Available from: https://github.com/xenomachina/kotlin-argparser
33. University Health Network, James Agnew. HAPI FHIR [Internet]. 2014 [cited 14 Mar 2025]. Available from: https://hapifhir.io
34. Iantorno M, Otasek D, Yang H, Hall D, Passas R, Werner P. hapifhir/org.hl7.fhir.validator-wrapper [Internet]. 2025 [cited 13 Jun 2025]. Available from: https://github.com/hapifhir/org.hl7.fhir.validator-wrapper
35. Bundesinstitut für Arzneimittel und Medizinprodukte. Alpha-ID-SE [Internet]. 2025 [cited 28 Feb 2025]. Available from: https://www.bfarm.de/EN/Code-systems/Terminologies/Alpha-ID-SE/_node.html
36. Bundesinstitut für Arzneimittel und Medizinprodukte. ICD-10-GM—International Statistical Classification of Diseases, German Modification [Internet]. 2025 [cited 28 Feb 2025]. Available from: https://www.bfarm.de/EN/Code-systems/Classifications/ICD/ICD-10-GM/_node.html
37. Zetlen J, PayPal, Inc. gnomon [Internet]. 2016 [cited 16 Apr 2025]. Available from: https://github.com/paypal/gnomon
38. International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. Medical Dictionary for Regulatory Activities (MedDRA) [Internet]. 28.0, 2025 [cited 21 May 2025]. Available from: https://www.meddra.org

Footnotes
A FHIR representation of this resource has now been put forward by the BfArM. At the time BabelFSH was conceptualized, this resource did not yet exist, so that a conversion of the ClaML representation was quite relevant at the time. As of writing, this FHIR representation is not available for all terminologies in all versions maintained by the BfArM.

Additional Declarations
No competing interests reported.
Editorial decision: Revision requested 31 Aug, 2025
Reviews received at journal 30 Jul, 2025
Reviews received at journal 23 Jul, 2025
Reviewers agreed at journal 23 Jul, 2025
Reviewers agreed at journal 17 Jul, 2025
Reviewers invited by journal 17 Jul, 2025
Editor assigned by journal 02 Jul, 2025
Submission checks completed at journal 02 Jul, 2025
First submitted to journal 27 Jun, 2025
{"props":{"pageProps":{"initialData":{"identity":"rs-6992162","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"software","associatedPublications":[],"authors":[{"id":486722952,"identity":"f68de1be-33e5-47b6-9a0c-cfcd6e67bfd5","order_by":0,"name":"Joshua Wiedekopf","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABHklEQVRIie2PMUvDQBiGvyNwLnfJeiGgfyFHIDoUf0vhIC6BDkIRLHoQOJeCawf9D46OKYF0CZ0LHYxLNyGTdCl6IdShXMjqcM/03nc8fO8HYLH8Q5AEyAFDqONnDcDABf0EGLfzIcWJwlbBQ0pHp8SsywOKIwnPm+ko8oIM39P3K8BMlHUzS8BdyZ5iJFwu1knsvxR4SytdjCU3fFGm4FfmNUjSt4KqYhRuJuWWKvaoWBoHVN5BuBn3Kwf1oxWBb7Wit0y+g0OrfNT9ClJ53CpOp6Q4QDLVW8zno8xrlvO1iPQtjv/aKmR3yedlQvzKXIw/KdHsp9f8OchQ86UewDsTu3o/E+fuynw+z46JnfwQcy2Ai790qlgsFovlyC9M01roxxuB3AAAAABJRU5ErkJggg==","orcid":"","institution":"University of Luebeck, University Hospital Schleswig-Holstein","correspondingAuthor":true,"prefix":"","firstName":"Joshua","middleName":"","lastName":"Wiedekopf","suffix":""},{"id":486722953,"identity":"b8014fd7-c915-4a64-b54a-b05728d0b29e","order_by":1,"name":"Tessa Ohlsen","email":"","orcid":"","institution":"University of Luebeck, University Hospital Schleswig-Holstein","correspondingAuthor":false,"prefix":"","firstName":"Tessa","middleName":"","lastName":"Ohlsen","suffix":""},{"id":486722954,"identity":"d5496edf-be28-4b3d-b9ca-d0bacaf49991","order_by":2,"name":"Ann-Kristin Kock-Schoppenhauer","email":"","orcid":"","institution":"University of Luebeck, University Hospital 
Schleswig-Holstein","correspondingAuthor":false,"prefix":"","firstName":"Ann-Kristin","middleName":"","lastName":"Kock-Schoppenhauer","suffix":""},{"id":486722955,"identity":"4ceecd3b-7dbc-41f8-ab3a-12c76100a582","order_by":3,"name":"Josef Ingenerf","email":"","orcid":"","institution":"University of Luebeck, University Hospital Schleswig-Holstein","correspondingAuthor":false,"prefix":"","firstName":"Josef","middleName":"","lastName":"Ingenerf","suffix":""}],"badges":[],"createdAt":"2025-06-27 13:38:33","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6992162/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6992162/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s13326-025-00343-4","type":"published","date":"2025-11-29T15:57:56+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":87373299,"identity":"f692791f-a4f8-44ba-a575-358cee1037e4","added_by":"auto","created_at":"2025-07-23 07:29:09","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":165158,"visible":true,"origin":"","legend":"\u003cp\u003eExcerpt from the Alpha-ID-SE 2025 source file, from [35]. The format of the file is columnar with a pipe (“|”) separator, without a header row. Column meanings: Status (mapped to a property); stable Alpha-ID code (mapped to the code); primary code in ICD-10-GM, asterisk code in ICD-10-GM; dagger code in ICD-10-GM; second primary code in ICD-10-GM; ORPHAcode; associated text (mapped to display). 
The additional codes are mapped to properties and have their basis in the Germany-specific adaption of the ICD-10-GM, where some classes must or may be refined through combination of dagger and asterisk codes to represent aetiology (dagger) and manifestation (asterisk).\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-6992162/v1/f4dd41e21bb3cb0d298fcd95.png"},{"id":87373302,"identity":"4473a245-e091-41f9-9677-02d7d5d24f34","added_by":"auto","created_at":"2025-07-23 07:29:09","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":147791,"visible":true,"origin":"","legend":"\u003cp\u003eHigh-level concept \u0026amp; implementation of the BabelFSH approach. The input file contains standard FSH code (black), and a structured comment bounded by recognition tokens (green) that identifies the called plugin and provides its arguments. This file is read and parsed by BabelFSH, which generates the metadata as a resource skeleton, and then calls the specified plugin. Based on the command line arguments and provided input file, the plugin adds the required content to the output, which is finally written to disk in FHIR JSON format.\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-6992162/v1/a0abe6d8f29133cb8d5a5e21.png"},{"id":87375517,"identity":"643096de-0dd3-4b16-ac27-e02784a84840","added_by":"auto","created_at":"2025-07-23 07:53:09","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":355143,"visible":true,"origin":"","legend":"\u003cp\u003eExcerpt from the generated FSH file for validation using SUSHI. The property declarations are truncated, but identical to the complete declarations in Figure 5. Only three concepts are shown. Some concepts reference many more properties through the code/value pattern than shown here but appear identically otherwise. 
Line breaks were added manually for publication and are used in a fashion that is technically not valid within the FSH language.\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-6992162/v1/cdc850ebf797be87445f6965.png"},{"id":87373953,"identity":"0f216172-a2b5-42b6-be75-c41f44253eea","added_by":"auto","created_at":"2025-07-23 07:37:09","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":180281,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003e\u003cstrong\u003eTruncated CodeSystem generated from the FSH code in Figure 3\u003c/strong\u003e\u003c/em\u003e\u003cem\u003e. The code system was generated through SUSHI by truncating the code system declarations to two concepts, such that the generated CS is correct in terms of the FHIR specification. Property declarations and usage in the concepts are also truncated due to space restrictions.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-6992162/v1/8d748e5d3c524214aa7a3b63.png"},{"id":87373950,"identity":"54fcbf4a-a26d-49a9-bd0e-9508a07b99ac","added_by":"auto","created_at":"2025-07-23 07:37:09","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":326376,"visible":true,"origin":"","legend":"\u003cp\u003eComplete BabelFSH source file for conversion of Alpha-ID-SE 2025. The file is not truncated in any way. Metadata is declared identically to the FSH representation. The BabelFSH semantic comment is colorized as follows: The recognition tokens are orange, the plugin ID cyan, the argument names dark purple and the argument values light purple. The usefulness of command-line parsing libraries is also evident from the intentional variety of styles in which the parameter names are quoted. 
Line breaks were added for publication and are also not compliant with the FSH grammar.\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-6992162/v1/57b75fb9634d5a3fb87fb7bb.png"},{"id":97178769,"identity":"ab6aa237-f3f2-4082-b84a-d3a09bedf429","added_by":"auto","created_at":"2025-12-01 16:13:29","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2210586,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6992162/v1/9ca202bd-d40d-4191-ab3a-5497fbc66cfe.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"BabelFSH—A Toolkit for an Effective HL7 FHIR-based Terminology Provision","fulltext":[{"header":"BACKGROUND","content":"\u003ch2\u003eHL7 FHIR for national and international standardization and harmonization\u003c/h2\u003e\n\u003cp\u003eBoth national and international demands on healthcare data interchange have made it ever-more important for healthcare providers to cooperate and to provide their primary-care data in machine-readable formats. Legislation in many healthcare systems, such as the US \u003cem\u003e21st Century Cures Act\u0026nbsp;\u003c/em\u003epassed in 2016 [1, 2], or the emerging \u003cem\u003eEuropean Health Data Space\u003c/em\u003e (EHDS) [3] in the European Union, have catalyzed this requirement. Concurrently, national initiatives in the research domain, such as the German \u003cem\u003eMedical Informatics Initiative\u003c/em\u003e (MII) have accelerated this need for interchange and alignment to common standards even further [4].\u003c/p\u003e\n\u003cp\u003eSuch large-scale integrations between disparate systems clearly require harmonization and standardization, especially towards common data structures, encodings and processes. 
The use of interoperability standards presents itself as a suitable approach in this regard, and the \u003cem\u003eHealth Level 7 Fast Healthcare Interoperability Resources\u003c/em\u003e Standard (HL7 FHIR standard) [5] has been recognized to be a suitable means to this end in many jurisdictions [6, 7].\u003c/p\u003e\n\u003cp\u003eHL7\u0026nbsp;FHIR is inherently designed as an internationally applicable standard to support the broadly different requirements of different jurisdictions. As part of this design requirement, extensibility and adaptability were dominant considerations in the standard development processes. Consequently, national actors as well as use-case driven consortia are expected to define and utilize FHIR profiles, derived from the core standard, to tailor the definitions of the standard to their needs.\u003c/p\u003e\n\u003ch2\u003eProfiling FHIR and ValueSet bindings\u003c/h2\u003e\n\u003cp\u003eApart from constraining element cardinality and adding extensions where applicable, the profiling process also generally changes numerous terminology bindings for coded data elements. As a rule, all coded data elements in FHIR must be bound to a \u003cem\u003eValueSet\u003c/em\u003e, which in turn is defined as subset of codes from one or more code systems. All coded data elements in the core FHIR standard have bindings already applied, but profiling often overrides these to better fit the reality of the respective jurisdiction; this mechanism is core to the FHIR philosophy.\u003c/p\u003e\n\u003cp\u003eFrom this large need to define bindings in profiles arises a need to provide \u003cem\u003eValueSet\u003c/em\u003e resources to technical consumers. \u003cem\u003eValueSet\u003c/em\u003e (VS) resources in FHIR pick codes from one or more code systems. The standard also defines the resource type \u003cem\u003eCodeSystem\u003c/em\u003e (CS) for the representation of these definitions within FHIR resources. 
In view of the enormous heterogeneity and complexity of classifications (e.g. ICD-10, ATC with their respective national adaptions) and terminologies (e.g. SNOMED CT, LOINC), their conversion into FHIR code systems can be at times quite challenging and is the main subject of this paper. To facilitate mappings between code systems, e.g. from non-standard local towards standard international terminology, \u003cem\u003eConceptMap\u003c/em\u003e (CM) resources can also define unidirectional mappings scoped to a certain use case. To implement system interactions with these resources, FHIR terminological services (TS) have emerged from a subsection of the specification [5 sect. 4.0] that provide both read-write access to FHIR resources and allow operations-based interactions with these resources.\u003c/p\u003e\n\u003ch2\u003eFHIR representation of terminological concepts\u003c/h2\u003e\n\u003cp\u003eIn this paper, we must carefully differentiate between the intellectual concept of \u003cem\u003ecode systems\u003c/em\u003e, \u003cem\u003evalue sets\u003c/em\u003e, and \u003cem\u003econcept maps/mappings\u003c/em\u003e, as they have been defined in the HL7 Version 3 Core Principles [8],\u003cem\u003e\u0026nbsp;\u003c/em\u003eand their manifestation in FHIR through the \u003cem\u003eCodeSystem\u003c/em\u003e (CS), \u003cem\u003eValueSet\u003c/em\u003e (VS) and \u003cem\u003eConceptMap\u003c/em\u003e (CM) resources. The definitions from the HL7 Version 3 standard have been carried forward into the design of the FHIR standard. It is stated there that \u0026ldquo;\u003cem\u003eCode System\u003c/em\u003es are often described as collections of uniquely identifiable concepts with associated representations, designations, associations, and meanings\u0026rdquo; [8 sect. 5.1.2]. 
Value sets are then defined as \u0026ldquo;[representing] a uniquely identifiable set of valid concept identifiers, where any concept identifier in a coded element can be tested to determine whether it is a member of the \u003cem\u003eValue Set\u003c/em\u003e at a specific point in time.\u0026rdquo; [8 sect. 5.1.3].\u003c/p\u003e\n\u003cp\u003eHowever, the HL7 Version 3 standard, and early FHIR versions including the DSTU2 release, did not provide a representation (via a resource type in FHIR) for representing the content of a code system itself, as it was understood that complex, standard terminology will be managed through independent means. From the DSTU2 release of FHIR, it is stated that:\u003c/p\u003e\n\u003cp\u003eValue sets that contain\u003cstrong\u003e\u0026nbsp;inline code systems\u003c/strong\u003e \u003cstrong\u003eare\u003c/strong\u003e \u003cstrong\u003eintended for small, simple code systems\u003c/strong\u003e that are found throughout the implementation context (e.g., lists of words, status codes, enumerations). The inline code system definition is \u003cstrong\u003enot intended to represent large publically defined terminologies\u003c/strong\u003e such as LOINC, etc. - \u003cstrong\u003ethese terminologies have their own distribution formats\u003c/strong\u003e. [9\u0026nbsp;sect.\u0026nbsp;6.21.1]\u003c/p\u003e\n\u003cp\u003eAs such, small code systems were manifested through their use within \u003cem\u003eValueSet\u003c/em\u003e resources, providing definitions for small, constrained applications such as communicating the status of an \u003cem\u003eObservation\u0026nbsp;\u003c/em\u003eresource. 
For standard terminologies, it was understood that they are maintained independently, but consuming and producing systems that communicate using these codes would have access to the required terminology to then expose it to their users and internal processes.\u003c/p\u003e\n\u003cp\u003eIn FHIR STU3, it was realized that many terminological use cases could be addressed by representing code systems in a FHIR resource of their own. While the in-line definition in the VS resource worked well for constrained use cases, where the value set is used just a few times within the standard, this approach does not scale with the representation of standard terminology, that is intended to be used in many use cases and value sets. While the then-created \u003cem\u003eCodeSystem\u003c/em\u003e resource was not intended to be used for the maintenance process of most terminology, which was understood to be better served by the established processes [5 sect. 4.8.1], most coding systems can thus be expressed using native FHIR resources. 
This allows the definition of FHIR-based terminology services for code systems that are not considered internal to the HL7 specification and derived artefacts, but also large code systems brought into the server through their FHIR representation.\u003c/p\u003e\n\u003cp\u003eTo differentiate between these abstract concepts and their manifestations in FHIR, we will only use the long forms \u003cem\u003ecode\u0026nbsp;system\u003c/em\u003e, \u003cem\u003evalue\u0026nbsp;set\u003c/em\u003e, and \u003cem\u003econcept\u0026nbsp;mapping\u003c/em\u003e to reference the abstract concepts and resources that are not represented in FHIR resources, while the forms \u003cem\u003eCodeSystem\u003c/em\u003e/\u003cem\u003eValueSet\u003c/em\u003e/\u003cem\u003eConceptMap\u003c/em\u003e and the abbreviations \u003cem\u003eCS\u003c/em\u003e/\u003cem\u003eVS\u003c/em\u003e/\u003cem\u003eCM\u003c/em\u003e are only used to reference the FHIR manifestation.\u003c/p\u003e\n\u003cp\u003eAs is common in the HL7 FHIR standard\u0026rsquo;s design, the CS resource can only capture a subset of use cases directly. Based on the 80/20 rule, FHIR resource development should be \u0026ldquo;focus[ed] on the 20 % of requirements that satisfy 80 % of the interoperability needs\u0026rdquo; [5\u0026nbsp;sect.\u0026nbsp;2.1.19.2], leading to limitations in the expressivity of some resources for specialized use cases.\u003c/p\u003e\n\u003cp\u003eFor example, formal ontologies that are often expressed in the Web Ontology Language (OWL) often build on top of each other. The OBO foundry, a project that aims to develop a number of interoperable biomedical ontologies, even states in their guiding principles that the \u0026ldquo;scope\u0026rdquo; of any ontology they oversee \u0026ldquo;should be fairly narrow\u0026rdquo; and that \u0026ldquo;[r]equired terms that are out of scope should be imported from the appropriate ontology unless no such ontology exists\u0026rdquo; [10]. 
For example, a narrowly-scoped ontology in the domain of human ontologies can build on the concepts defined in an ontology describing cross-species anatomy [11]. The narrowly scoped concepts (\u003cem\u003eclasses\u003c/em\u003e in the convention of formal ontologies) are then defined as sub-classes or children of those imported classes. The subsumption relationship in the domain of formal ontologies follows that of description logic, and has a very rigid definition [12]. Every subclass of a class is also an instance of the parent class, leading to the strict, transitive, \u003cem\u003eis-a\u003c/em\u003e relationship. In this way, connected ontologies span a large semantic network through their is-a relationships. While this relationship can be expressed in the CS resource, using custom properties that link concepts across CS boundaries, the subsumption relationship in FHIR TS is only defined within the scope of a single CS. Regardless, a conversion of the code system defined in formal ontologies to the FHIR CS resource can be beneficial, as highlighted by Metke-Jimenez et al. [11].\u003c/p\u003e\n\u003cp\u003eAnother shortcoming of the CS resource in particular is the impossibility of representing compositional code systems, such as the Unified Code of Units and Measures (UCUM) [13]. Units in UCUM can be combined arbitrarily, to express measurements in any kind of physical dimension. That fact does allow, however, for an infinite number of combinations of the defined unit symbols, especially in conjunction with the annotation facilities that can be used to express countable concepts. For example, the expression \u003cem\u003e{FHIR CodeSystem resources}/[nmi_i].s2\u003c/em\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003e(countable things per nautical-mile per second-squared) is nonsensical for any kind of real measurement, but perfectly valid in the language defined by the UCUM grammar. 
While a finite fragment of available and common UCUM codes could be created and distributed, the FHIR CS resource lacks the means to express the grammar-based UCUM standard. A FHIR TS could, however, provide some interactions defined in the terminology module of the standard through special casing, such that validation of UCUM codes can be implemented adequately.\u003c/p\u003e\n\u003ch2\u003eTerminological services within national projects\u003c/h2\u003e\n\u003cp\u003eTerminological services are an important factor towards healthcare data interoperability and, ultimately, data harmonization. The German MII (since 2015) and Network University Medicine (NUM, since 2020), funded by the German Federal Ministry of Education and Research (now German Federal Ministry of Research, Technology and Space), have recognized this need. In this funding scheme, the sub-project \u003cem\u003eService Unit Terminological Services\u0026nbsp;\u003c/em\u003e(SU-TermServ) [14] has been tasked with providing terminological services through a FHIR terminology server, beginning in 2023. The required content is, in particular, defined and specified by the modular Core Data Set (CDS) of the MII [15]. The Germany-wide drive to implement and exchange CDS-compliant FHIR resources gave rise to the need for harmonized tooling that provides FHIR-based representations of the referenced code systems. The core of the SU-TermServ project revolves around the provisioning of a central instance of Ontoserver [16] with all needed resources. Many parts of both the CDS and the implementation guides it references use large, externally defined code systems, and those are needed in the HL7 FHIR format so they can be uploaded to a FHIR terminology server. 
For the professional support of a TS with such a national scope, transparent and effective processes must be established to provide the needed resources at the correct time for the TS to then generate a benefit for the surrounding health technology landscape.\u003c/p\u003e\n\u003cp\u003eThis is at odds with the current reality: most well-recognized code systems are currently maintained using specific tools and platforms and are often provided using proprietary formats or using very different standards. For example, the ICD-10-WHO resource (translated for use in Germany) is maintained by the German Federal Institute for Drugs and Medical Devices (\u003cem\u003eBfArM\u003c/em\u003e) using internal processes, taking change requests from the medical community into account. As of writing, the document-based distribution in PDF format is the authoritative version, but a number of machine-readable formats are made available: one version uses the ISO ClaML standard (Classification Markup Language, ISO 13120:2013) [17], while another uses tab-separated files for import into relational databases\u003csup\u003e[1].\u003c/sup\u003e\u003c/p\u003e\n\u003cp\u003eContrast this with OncoTree, which is a classification for tumor types [18]. It is distributed mainly through an unstandardized web-based application programming interface (API). Considering also the fact that internal catalogs are often only available directly in the systems they are defined within, a clear need for a simple and extensible toolkit for conversion of terminological artifacts into the HL7 FHIR resources \u003cem\u003eCodeSystem\u003c/em\u003e, \u003cem\u003eValueSet\u003c/em\u003e and \u003cem\u003eConceptMap\u003c/em\u003e emerges. 
In Table 1, a brief survey of some terminologies referenced within the CDS of the MII further illustrates this need.\u003c/p\u003e\n\u003cp\u003eTable 1: Exemplary terminologies/code systems referenced in the German Medical Informatics Initiative\u0026rsquo;s Core Data Set with distribution formats. Available FHIR CS resources are highlighted in boldface.\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eName\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eUse Case\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDistribution Format\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"3\" valign=\"top\" style=\"width: 627px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDiagnosis coding\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eICD-10-GM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eGerman adaptation of the ICD-10 for morbidity coding\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eClaML,\u003cstrong\u003e\u0026nbsp;some versions as FHIR CS\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eICD-10-WHO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eMortality coding\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eClaML, \u003cstrong\u003esome 
versions as FHIR CS\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eICD-O-3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eCancer classification\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eClaML\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eAlpha-ID-SE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eCoding for rare diseases aligned to ICD-10-GM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eColumnar text file\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eORPHAcodes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eCoding for rare diseases\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eXML\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eOncoTree\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eCancer classification\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eWeb API\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"3\" valign=\"top\" style=\"width: 627px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eProcedure coding\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eOPS\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eProcedure coding\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eClaML\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"3\" valign=\"top\" style=\"width: 627px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMedication coding\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eATC (German \u0026amp; International variant)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003ePharmaceutical agents\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eSpreadsheet (Office Open XML \u003cem\u003eXLSX\u0026nbsp;\u003c/em\u003eformat)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eASK\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003ePharmaceutical agents (on the substance level)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eXML\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eCAS\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eChemical substances\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003ePartially available in the ASK XML file\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eEDQM Standard Terms\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003ePharmaceutical dose forms, routes of 
administration, \u0026hellip;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eWeb API\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eUNII\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003ePharmaceutical agents (on the substance level)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eColumnar text file\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003ePZN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eCommercially available pharmaceutical products in Germany\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eColumnar text file\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"3\" valign=\"top\" style=\"width: 627px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMedical Devices \u0026amp; Imaging\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eISO/IEEE 11073-10101\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eMedical device communication standard terminology\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eFHIR CS\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eDICOM DCM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eMedical imaging standard\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" 
style=\"width: 209px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eFHIR CS plus FHIR VS\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eRadLex\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eRadiological lexicon for reporting\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eOWL\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"3\" valign=\"top\" style=\"width: 627px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eOMICS data\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eHGNC\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eGene symbols/names\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eColumnar text file or JSON\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eGENO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eGene functions\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eOWL\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eHPO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003ePhenotypes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eOWL\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eSO\u003c/p\u003e\n \u003c/td\u003e\n 
\u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eSequence annotation\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 209px;\"\u003e\n \u003cp\u003eOWL\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eAs outlined above, the FHIR standard has always anticipated this fact, but many FHIR-based terminology services require FHIR-based representations of code systems to provide the resources through the APIs defined in the terminology module. The concept of a FHIR TS, which is specified in a functional manner through its API [5\u0026nbsp;sect.\u0026nbsp;4.7], does not actually require the terminology to be represented in FHIR resources. However, the alternative approach, whereby the terminology is maintained by the system developer in proprietary formats, is often not suitable for large-scale projects such as the MII and NUM. As those projects inherently move fast, the resulting dependency on the TS developer to provide new terminologies in an agile fashion can be a limiting factor in the projects\u0026rsquo; development.\u003c/p\u003e\n\u003cp\u003eWithout the terminology artefacts required by the respective domain at the correct time, even best-of-breed terminology services cannot generate a benefit to the surrounding landscape and can thus be considered of no use. In this work, we thus propose a novel solution for making terminologies available through FHIR resources, building on established standards and allowing for rapid implementation for highly specific conversion processes.\u003c/p\u003e\n\u003ch2\u003ePrior Work\u003c/h2\u003e\n\u003cp\u003eThe research community has so far not addressed the problem of terminology generation for HL7 FHIR extensively. 
The developers of \u003cem\u003eOntoserver\u003c/em\u003e, a FHIR-based terminology server in widespread use and the basis for the service provided by the SU-TermServ, have touched upon the generation of resources in their 2018 paper discussing the software [16]. This includes support for ClaML [17] generation using the ClaML-FHIR tool [19]. From the same working group, a transformation of Web Ontology Language (OWL) ontologies into FHIR was presented in 2019 [11].\u003c/p\u003e\n\u003cp\u003eMoreover, a literature review reveals several transformation scripts for specific resources. We have ourselves for example provided scripts for OncoTree and the EDQM Standard Terms database [18,\u0026nbsp;20\u0026ndash;22], and have discussed further challenges in CS and VS transformation from the ART-DECOR platform, already touching upon the problems outlined above [23].\u003c/p\u003e\n\u003cp\u003eEspecially considering the open-source code of the identified prior work, a key insight towards simplifying this process is the fact that most tools follow a similar structure. First, the metadata of the resources, e.g. the canonical URI, the version, description, author, copyright, and related fields are generated. Sometimes, this is hard coded within the tool, while in other instances, these data are defined using configuration files or provided as command line arguments.\u003c/p\u003e\n\u003cp\u003eOnce the metadata of the resource is defined, the tools then take in the authoritative sources of the artifact and generate the needed content: for a CS, a list of \u003cem\u003econcept\u003c/em\u003e elements, for VS, \u003cem\u003einclude\u003c/em\u003e and \u003cem\u003eexclude\u0026nbsp;\u003c/em\u003eelements, and for a CM, a list of \u003cem\u003egroups\u003c/em\u003e, which in turn contain \u003cem\u003eelements\u003c/em\u003e. The content generator functionality in the tools can be highly specific (e.g. 
for OncoTree) or could also be generalizable to many different resources using the same formats (e.g. for ClaML).\u003c/p\u003e\n\u003cp\u003eFinally, the resources are written to disk. While the implementation differs broadly, this pattern of defining metadata first and generating content afterwards holds for all related tools that we have so far identified.\u003c/p\u003e\n\u003cp\u003eAnother insight from the literature concerns the degrees of freedom associated with providing FHIR resources from other sources. For CS, consider the representation of hierarchical relationships: \u003cem\u003econcept\u003c/em\u003e elements can define \u003cem\u003econcept\u003c/em\u003e elements in-line, representing a strictly monohierarchical relationship. The representation through a property, generally using the \u003cem\u003eparent\u0026nbsp;\u003c/em\u003eand/or \u003cem\u003echild\u003c/em\u003e relationship code, is, however, anecdotally preferred by TS implementers, and the FHIR standard states that the two representations should not be used concurrently [5 sect. 4.8.13]. However, one CS might only use \u003cem\u003eparent\u003c/em\u003e, another only \u003cem\u003echild\u003c/em\u003e, and another one might link concepts in both directions with both properties.\u003c/p\u003e\n\u003cp\u003eIn rare cases, resources might not use those standard properties, instead assigning a different kind of relationship between concepts. The rare disease nomenclature ORPHAcodes [24], defined by the Orphanet database [25], for example, uses the concept of \u003cem\u003eaggregation levels\u003c/em\u003e to link disorders and subtypes of disorders. Depending on the reporting use case, more specific or more general concepts might be required. This property can be understood as a special case of parent-child relationships, yet there is not a single parent for each property as in many other classifications. 
Instead, the terminology defines a \u0026ldquo;forest\u0026rdquo; of small \u0026ldquo;trees\u0026rdquo; through the aggregation level. In some use cases, such as feasibility queries [26], this relationship can be considered to be equivalent to the parent/child relationships, as researchers selecting a single concept from the user interface also expect the subtypes of that disorder to be selected. This needs to be achieved either by the search interface providing special consideration for the code system, by the TS interpreting this relationship as equivalent to the parent/child relationship, or by the FHIR resource serializing this property as a parent/child relationship.\u003c/p\u003e\n\u003ch2\u003eObjective\u003c/h2\u003e\n\u003cp\u003eThe aim of this work is to provide to the broader FHIR community a simple-to-use and simple-to-extend toolkit for providing all manner of terminological resources, derived from their native distribution formats, building on established mechanisms.\u003c/p\u003e\n\u003cp\u003eThe key architectural principle derived from the literature is the clear separation between the definition of resource metadata and the generation of content. Metadata generation refers both to metadata that describes the resources in their entirety, such as name, ID, canonical URI etc., and to metadata that can be applied to CS concepts, i.e. properties. The metadata generation should follow a uniform approach across all resources, thus allowing for harmonization and convention across resources. 
In contrast, content generation must support both reusable, generic methods and highly specific transformations, such that all requirements of the terminologies in question can be addressed appropriately.\u003c/p\u003e\n\u003cp\u003eThrough this tool, the representation of terminology resources can be harmonized through the re-use of rules, conventions and transformation logic, thus aiding terminology service adoption.\u003c/p\u003e"},{"header":"IMPLEMENTATION","content":"\u003ch2\u003eFHIR Shorthand\u003c/h2\u003e\n\u003cp\u003eThe FHIR Shorthand (FSH) language specification [27] has quickly become a cornerstone of the FHIR community, and is now considered the dominant means to define FHIR resources as source code using a domain-specific language (DSL). By compiling the source code files with the reference compiler, SUSHI [28], the definition of profiles and other \u003cem\u003eStructureDefinition\u003c/em\u003e resources has been simplified and streamlined.\u003c/p\u003e\n\u003cp\u003eFSH\u0026rsquo;s language elements are geared towards the concise definition of FHIR resources, especially profiles and extensions, but also resource instances of all other resource types, such as example resources within an \u003cem\u003eImplementationGuide\u003c/em\u003e (IG). Crucially, the FSH language and the SUSHI compiler also support the definition of \u003cem\u003eCodeSystem\u003c/em\u003e and \u003cem\u003eValueSet\u003c/em\u003e resources directly, and \u003cem\u003eConceptMap\u003c/em\u003e via the arbitrary-resource facilities. However, the CS support in particular is geared towards the definition of small, internal CS with only a few codes. All concepts must be listed in-line within the FSH source file (see an example in\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003eFigure 3). 
Generating FSH sources from original content and then compiling this with SUSHI is in our experiments certainly possible, but computationally quite expensive.\u003c/p\u003e\n\u003cp\u003eHowever, as SUSHI is the reference implementation of an open-source formal language specification, and only a subset of this specification is relevant to the terminology generation process, the implementation of a use-case specific compiler tool was initiated, which we call \u003cem\u003eBabelFSH\u003c/em\u003e.\u003c/p\u003e\n\u003ch2\u003eImplementation\u003c/h2\u003e\n\u003cp\u003ePrior to the implementation of the system, we have defined eight design goals that the tool must adequately address:\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\u003cstrong\u003eDG1\u0026mdash;Compatibility with SUSHI:\u0026nbsp;\u003c/strong\u003eThe BabelFSH source files shall remain completely compatible with the SUSHI reference implementation (only) for \u003cem\u003eCodeSystem\u003c/em\u003e,\u003cem\u003e\u0026nbsp;ValueSet\u0026nbsp;\u003c/em\u003eand\u003cem\u003e\u0026nbsp;ConceptMap\u003c/em\u003e; no language elements shall be added to FSH DSL that result in errors when using SUSHI to compile BabelFSH source files. Thus, changes in the FSH specification going forward can be incorporated into BabelFSH without delay.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eDG2\u0026mdash;Metadata in FSH:\u0026nbsp;\u003c/strong\u003eThe FSH DSL shall be used to define the metadata of the resulting terminology resources. The core FSH language elements shall not be used to define the concepts of CodeSystem resources. This requirement enforces the beneficial separation of metadata and content that was identified from the literature.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eDG3\u0026mdash;Extensibility:\u0026nbsp;\u003c/strong\u003eIt shall be easy to generate content from diverse sources. Core functionality shall be provided to implementers to address common requirements. 
If a code system requires highly specialized logic, this must be supported. Otherwise, the expressivity of the system would be so limited that many resources would not be convertible, dramatically limiting the usefulness of the system.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eDG4\u0026mdash;Simple API and Self-Documentation:\u0026nbsp;\u003c/strong\u003eImplementers shall be provided a simple API to hook into the generation process, such that needed logic can be implemented quickly, allowing for fast turnaround and easy maintenance.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eDG5\u0026mdash;Validation:\u0026nbsp;\u003c/strong\u003eGenerated resources shall be validated for their correctness against the FHIR specification. Especially syntactic problems shall not be accepted, but semantic validation should also be performed if applicable. A strong validation process helps to ensure consistent resources, aiding implementers in creating correct and comprehensive metadata for their resources.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eDG6\u0026mdash;Performance:\u0026nbsp;\u003c/strong\u003eBabelFSH shall be performant, such that the generation of multiple versions of large code systems can be carried out without issue and should offer substantial performance benefits over native FSH compilation. Were the performance of BabelFSH similar to SUSHI, the benefit of a separate parser/compiler would be offset by the complexity of this tool.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eDG7\u0026mdash;FHIR Version support:\u0026nbsp;\u003c/strong\u003eThe tool shall support multiple versions of the FHIR core specification. At the time of writing, this especially requires support for versions R4/R4B and R5, leading to a need of special consideration especially in the CM generation aspect, since CM has undergone considerable conceptual changes in FHIR R5. 
The user must state explicitly which FHIR version they use; the concurrent generation of R4B and R5 resources in the same application invocation shall not be supported. This ensures relevance of BabelFSH going forward, such that later versions of FHIR can also be integrated.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eDG8\u0026mdash;Open Source:\u0026nbsp;\u003c/strong\u003eBabelFSH must be licensed under an open-source license, so that the FHIR community can make use of the tool without any restrictions on commercial use, and so that enhancements can be fed back into the application. Due to the open nature of the FHIR community, the provision of an open-source tool can foster collaboration and improve the adoption of FHIR terminology services and FHIR in general.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eWith respect to \u003cstrong\u003eDG2\u003c/strong\u003e, we define \u003cem\u003emetadata\u003c/em\u003e as those data elements available in terminological resources to describe the resource in its entirety (i.e., the canonical URI, version, name, title, technical ID, publisher, etc., as well as extensions for the entire resource), and also those data elements to describe the concepts defined within a CS (i.e., properties). The first kind of metadata is defined by the FHIR core standard, and FSH rules will be required to fill the respective slots. Properties follow a code-value paradigm, whereby the declaration assigns each property a code and data type. The FHIR specification implicitly defines a selection of properties that need not be declared in CS, such as the \u003cem\u003eparent\u003c/em\u003e, \u003cem\u003echild\u003c/em\u003e, or \u003cem\u003einactive\u0026nbsp;\u003c/em\u003eproperties that are commonly used in FHIR CS resources. All other properties must be declared in the FSH source and will then be used in the plugin implementation when concepts are generated. 
General converter implementations \u003cstrong\u003e(DG3)\u003c/strong\u003e then need to provide configuration options to map properties in the terminology sources to the properties generated by the plugin.\u003c/p\u003e\n\u003cp\u003eConcerning \u003cstrong\u003eDG5\u0026nbsp;\u003c/strong\u003eand \u003cstrong\u003eDG7\u003c/strong\u003e, supporting either R4B or R4 is adequate, as no changes between these versions affect the terminology module [5\u0026nbsp;sect.\u0026nbsp;2.1.11], and validation using either of these versions will result in the same output messages.\u003c/p\u003e\n\u003ch3\u003eProgramming Language\u003c/h3\u003e\n\u003cp\u003eOur implementation of BabelFSH is written using the Kotlin programming language [29], utilizing the Java Virtual Machine. Kotlin is a modern, statically-typed and inherently null-safe language that is entirely compatible with the Java language that many eHealth researchers are already familiar with. Hence, established libraries in the rich Java ecosystem can also be utilized. Moreover, Java source files can be included in Kotlin projects, so that Java plugins can also be implemented by users not familiar with Kotlin. The tool is built using the Gradle build automation toolkit [30].\u003c/p\u003e\n\u003ch3\u003eParsing FSH code\u003c/h3\u003e\n\u003cp\u003eTo parse the FSH source files needed for the BabelFSH system, we use the free and open-source parser generator ANTLR v4 [31]. ANTLR uses formal grammars to generate recursive-descent lexers and parsers, with associated infrastructure to integrate the generated code into applications. Lexers generate tokens from the input code, which are then assigned meaning in the parser grammar. 
Using ANTLR-provided hooks, applications can receive the parsed structure as the parser processes the input stream and populate their internal data model from this token stream.\u003c/p\u003e\n\u003cp\u003eWe base our parser and lexer grammar on the implementation for the FSH reference implementation, SUSHI [28], which also utilizes ANTLR v4. By modifying select sections of the SUSHI grammar and adding other grammars as needed, we can add parseable language elements to the generated FSH parsers. To remain compatible with standard FSH code (\u003cstrong\u003eDG1\u003c/strong\u003e), we decided on a strategy further outlined in Table 2. We use block comments, which are an existing language element of FSH, to define command line arguments to the plugins. Block comments follow the C-style syntax using a slash and asterisk symbol at the leading and trailing end of the comment. For these comments to be considered by BabelFSH, a special recognition token (leading: /*^babelfsh, trailing ^babelfsh*/) was introduced, so that normal comments are not hijacked by our approach. The FSH source code is checked for syntactic and semantic correctness and then dissected into a parse tree. 
Errors are thus caught early in the processing and are surfaced to the user in meaningful error messages, pointing them towards the line where a declaration does not follow the FSH language specification.\u003c/p\u003e\n\u003cp\u003eTable 2: Grammar algorithm of the BabelFSH implementation.\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 66px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eStep #\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 161px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eComponent\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 413px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eChanges/Strategy\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 66px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e1\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 161px;\"\u003e\n \u003cp\u003eFSH Lexer grammar\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 413px;\"\u003e\n \u003cul\u003e\n \u003cli\u003eComment tokens are not skipped, but sent to a different channel\u003c/li\u003e\n \u003cli\u003eA new recognition token for the start and end of a multi-line comment was introduced: /*^babelfsh and *^babelfsh*/, so that the content of these (structured) comments can be bubbled up into the parser.\u003c/li\u003e\n \u003c/ul\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 66px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e2\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 161px;\"\u003e\n \u003cp\u003eFSH Parser grammar\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 413px;\"\u003e\n \u003cul\u003e\n \u003cli\u003eThe parser 
rules for instances, CS and VS and reusable RuleSets were amended to add an optional terminologyPluginComment? rule\u003c/li\u003e\n \u003c/ul\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 66px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e3\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 161px;\"\u003e\n \u003cp\u003eANTLR Parser Listener\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 413px;\"\u003e\n \u003cul\u003e\n \u003cli\u003eEarly validation of the content: resource types other than those supported by ANTLR are ignored\u003c/li\u003e\n \u003cli\u003eContext of declarations is processed\u003c/li\u003e\n \u003c/ul\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 66px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e3\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 161px;\"\u003e\n \u003cp\u003eCommand Line Parser\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 413px;\"\u003e\n \u003cul\u003e\n \u003cli\u003eA secondary grammar enforces a syntactic structure in the command line arguments.\u003c/li\u003e\n \u003cli\u003eSemantics of these declarations is defined by the plugins.\u003c/li\u003e\n \u003c/ul\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 66px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e4\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 161px;\"\u003e\n \u003cp\u003eFSH Rule Parser\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 413px;\"\u003e\n \u003cul\u003e\n \u003cli\u003eAnother secondary grammar is used to create a parse tree from the FSH rules, so that soft indexing is supported.\u003c/li\u003e\n \u003cli\u003eFSH rules can address the \u003cem\u003ecurrent\u003c/em\u003e (=) 
and\u0026nbsp;\u003cem\u003enext\u003c/em\u003e (+) elements:\u0026nbsp;\u003cbr\u003e* code[+].coding[=].display = \u0026quot;Display\u0026quot;.\u0026nbsp;\u003cbr\u003eThis mechanism facilitates the re-use of rules within RuleSet FSH items.\u003c/li\u003e\n \u003c/ul\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003ch3\u003eApplication Programming Interface\u003c/h3\u003e\n\u003cp\u003eTo allow for easy extensibility (\u003cstrong\u003eDG3\u003c/strong\u003e) of the core logic of the application, we have decided on a plugin-based approach. Plugins must be provided and registered at compile-time. They are identified using a unique plugin ID, and need to define the arguments that they expect, such as input file paths or URIs, columns to map, properties to render. To make the declaration of these arguments in the FSH code simple and understandable (\u003cstrong\u003eDG4\u003c/strong\u003e), we utilize an established command line parsing library for Kotlin [32]. Plugins then declare short and long forms of their needed arguments, such as ‑‑file/‑f\u0026nbsp;for the input file, and associated help texts for each argument. By declaring optional and required arguments in the source code, validations for the provided argument values, and the facility to validate the interaction between e.g. mutually exclusive arguments, the plugins receive a type-safe set of arguments they can use in concept generation. As plugin developers also need to provide help texts for the arguments, BabelFSH can automatically generate an interactive help, and can give the user detailed feedback on incorrect argument use without effort to the plugin developer (\u003cstrong\u003eDG4\u003c/strong\u003e). 
Based on these help texts, online documentation can be pre-generated, such that additional information can be provided to users.\u003c/p\u003e\n\u003cp\u003eThe plugins are intended only for content generation, not the definition of metadata, in line with \u003cstrong\u003eDG2\u003c/strong\u003e. Abstract classes define the entry points for this content generation:\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003eFor CS resources, the generation of concept entries,\u003c/li\u003e\n \u003cli\u003efor VS, the definition in terms of inclusions and exclusions,\u003c/li\u003e\n \u003cli\u003eand for CM, the definition of groups.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eEach plugin must implement only two methods, one for parsing the command-line arguments in a type-safe fashion, one for generating the content. The generator API can support complex iteration and inter-dependencies for the resources, such that the expressivity of the generator routines is not artificially limited compared to purpose-built tools.\u003c/p\u003e\n\u003cp\u003eTo support multiple FHIR versions in the same application \u003cstrong\u003e(DG7)\u003c/strong\u003e, the plugins do not natively generate FHIR resources. Instead, they use version-agnostic proxy classes, which are serialized as FHIR data structures only when the resources are written out to disk. The release R5 of FHIR changed terminological resources in two domains. First, all resources have additional metadata fields, such as \u003cem\u003eeditor\u0026nbsp;\u003c/em\u003eor \u003cem\u003ereviewer\u003c/em\u003e. As those are within the responsibility area of the standard FSH code \u003cstrong\u003e(DG2)\u003c/strong\u003e, the use of R5-specific metadata declarations requires the user to switch to the R5 processing mode. 
Otherwise, their use would result in validation errors, as they are unknown to the R4B validator.\u003c/p\u003e\n\u003cp\u003eIn the area of resource content, minor changes are present for CS and VS, which are handled by the proxy-class approach. However, CM has undergone a fundamental redesign which makes a version-agnostic implementation extremely challenging. Most importantly, the R5 CM requires a data element \u003cem\u003erelationship\u003c/em\u003e which corresponds to the \u003cem\u003eequivalence\u003c/em\u003e attribute in R4B. This attribute needs to be provided for each mapping entry and states the correspondence between the source and target concepts within the specified code system. The coding of these attributes has been changed substantially, such that mapping these codes is a problematic undertaking that BabelFSH will not attempt automatically. As such, for CM generation, each plugin must itself address the representation of concept correspondence; alternatively, the API allows a plugin to support only a single FHIR version.\u003c/p\u003e\n\u003ch3\u003eResource Validation\u003c/h3\u003e\n\u003cp\u003eTo ensure that the generated resources are compliant with the specification, we have implemented a comprehensive validation strategy (\u003cstrong\u003eDG5\u003c/strong\u003e).\u003c/p\u003e\n\u003cp\u003eThe HAPI FHIR [33] library is used to prevent the generation of invalid output resources by our grammar-based FSH rule interpreter, which operates independently of the FHIR resource definition, and thus cannot catch such errors on its own. To correlate the declaration of a rule (complicated by the possibility of rule re-use through \u003cem\u003eRuleSet\u003c/em\u003e FSH items) with the error messages from HAPI FHIR, we have implemented an iterative approach. First, we generate a common resource skeleton. In parsing the rules, the data model stores the precise location (filename, FSH item and line number) where the rule was originally defined. 
Rules that belong together, such as the system URI, the display and the code within a \u003cem\u003eCoding\u003c/em\u003e datatype, are grouped using a parse tree. By iterating over the first level of rules in this tree and executing the rules on this skeleton, HAPI error messages can be correlated with the declaration context, which points the user towards the error.\u003c/p\u003e\n\u003cp\u003eMoreover, users are provided an additional layer of validation through the integration of the FHIR validation engine that is maintained as a core infrastructural component by HL7 International in conjunction with the HAPI FHIR developers [34]. This additional validation step is opt-in and errors in this validation will not result in application termination, as many messages from this validation pipeline will be false positives. However, the validator will catch incorrect usage of profiles, missing elements etc. that might not be caught by the HAPI FHIR pipeline, so that the developer can rapidly iterate over the resource definition in their BabelFSH source files.\u003c/p\u003e\n\u003ch2\u003eConcept Validation Strategy\u003c/h2\u003e\n\u003cp\u003eBoth the correctness and the performance \u003cstrong\u003e(DG6)\u003c/strong\u003e of the proposed tool need to be assessed. As a point of comparison, the existing SUSHI reference compiler can be used: by generating standard FSH code for a suitable code system and compiling that with SUSHI, both dimensions can be evaluated.\u003c/p\u003e\n\u003cp\u003eWe utilize the German Alpha-ID-SE code system [35] for comparison. This artefact is an alphabetical index to the morbidity classification ICD-10-GM [36], with Alpha-ID-SE containing many more entries (2025 version: 90399) than ICD-10-GM (2025 version: 17089). For the more differentiated text entries below the ICD-10 code level, ORPHAcodes for rare diseases are added where possible [24]. 
The resource is distributed as a pipe-separated plain-text file (\u003cem\u003esource file\u003c/em\u003e), making the conversion to SUSHI straightforward. Each row of the file defines a single concept and several additional properties for these concepts. An excerpt of this file is shown in\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003eFigure 1.\u003c/p\u003e\n\u003cp\u003eUsing this file, the code system can be rendered as a standard FSH file using simple scripts. At the header of the output file, some metadata declarations are required, and every line of the source file results in several lines of FSH code that declare not only the concept, but also the properties defined by Alpha-ID-SE. An equivalent BabelFSH file can be generated from the same template, where the metadata is declared identically (consistent with \u003cstrong\u003eDG1\u003c/strong\u003e, all FSH declarations should be correctly implemented by BabelFSH). By adding a BabelFSH plugin comment to that declaration, the content is then pulled out of the FSH declarations into the plugin architecture by providing the needed plugin arguments in the comment. From these two files, two JSON representations of the Alpha-ID-SE code system can be generated using SUSHI and BabelFSH, while the runtime of each approach is timed using command line tools.\u003c/p\u003e\n\u003cp\u003eAssessing the completeness of the generation is thus straightforward: if the number of concepts generated from each of the FSH and BabelFSH files is identical to the number of rows in the source file, the generation is complete. However, validating the completeness of the entire method is difficult to generalize, as that greatly depends on the plugin. 
Validating the completeness of each plugin thus needs to be done by the plugin authors on a case-by-case basis. As the source code for each plugin will generally be quite compact, the plugins will be easily testable and debuggable.\u003c/p\u003e"},{"header":"RESULTS","content":"\u003cp\u003eThe resulting system is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. First, BabelFSH identifies the input files using filename matchers, with all files needing the extension \u0026ldquo;.babel.fsh\u0026rdquo; or \u0026ldquo;.babelfsh.fsh\u0026rdquo; to be considered by the tool. This is to ensure a clear delineation between aspects better served by SUSHI and those to be solved by BabelFSH. The BabelFSH approach is implemented through these four steps:\u003c/p\u003e\u003cp\u003e\u003col\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eThe input files are parsed against the FSH grammar into a set of FSH items and rules.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eThrough iterative application of the rules, the resources are generated and the rules are validated for compliance with the FHIR standard.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eThe plugin command lines are parsed, and the identified plugin is called with the command line arguments to then add the respective content to the output.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eLastly, the output is written to an output file in the specified output folder.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003c/ol\u003e\u003c/p\u003e\u003cp\u003eEvaluation\u003c/p\u003e\u003cp\u003eTo compare the performance of our approach with the state-of-the-art SUSHI compiler, we generated FSH code from the Alpha-ID-SE sources in version 2025. 
The FSH conversion was accomplished through a na\u0026iuml;ve Python implementation that reads the file line by line and uses string templates to generate FSH statements for each concept.\u003c/p\u003e\u003cp\u003eThe approximately 90 400 input lines (4825 KiB) thus balloon to more than 500 000 lines (+\u0026thinsp;453%) of FSH code (21 492 KiB, +\u0026thinsp;345%) to not only define the concepts and displays, but also the metadata for each concept available in the resource. Converting the sources on the test computer, an Apple MacBook Pro with an M4 Pro processor and 48 GB of RAM, took less than one second.\u003c/p\u003e\u003cp\u003eHowever, it was apparent that SUSHI struggles with the generation of content from this extremely large file. Using the \u003cem\u003egnomon\u003c/em\u003e tool [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e], we measured that SUSHI version 3.14.0 took approximately 16 seconds just to read in the file, and generating the single FHIR resource from this file took an additional 812 seconds. All in all, the command terminated after 833 seconds or 13.9 minutes wall clock time, with the laptop running in high-power mode and being otherwise idle. A short excerpt of the generated FSH file is available in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, and the resulting FHIR JSON representation in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e.\u003c/p\u003e\u003cp\u003eBy comparison, with the BabelFSH approach, a single version of Alpha-ID-SE was defined in fewer than 60 lines of FSH code, with the metadata of the resource being identical to the standard FSH version. Converting this resource to FHIR JSON took less than 3 seconds in total. 
The full BabelFSH source file is available in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e.\u003c/p\u003e\u003cp\u003eAs a further point of comparison, defining all available versions of Alpha-ID-SE starting in 2015, when the resource first became available, required fewer than 160 lines of code (including whitespace and comments) and compiled in 16.1 seconds (with the supplemental validation enabled), demonstrating the power of FSH in terms of definition re-use. Using parametrized rulesets, the declaration of the FSH item for each version only requires five lines of self-explanatory code.\u003c/p\u003e\u003cp\u003eFor the example of Alpha-ID-SE, we have verified that the generated resources are indeed semantically identical: all metadata is present, all concepts from the sources are defined, and all properties are mapped to FHIR properties. By formatting the resources using JSON tooling and comparing them using the standard GNU \u003cem\u003ediff\u003c/em\u003e tool, this equivalence could be rigorously asserted.\u003c/p\u003e\u003cp\u003eAll resources for this validation are available in the BabelFSH source code repository.\u003c/p\u003e\u003cp\u003eSystem in Use\u003c/p\u003e\u003cp\u003eAs has been motivated in the background section, BabelFSH was conceived in the context of the requirements for the provision of a research FHIR terminology server within the MII, having a national scope. Using the BabelFSH tool, we have streamlined both the initial efforts and the continued maintenance of diverse resources accessible to our services. At the time of writing, we distribute resources defined by the German BfArM (ICD-10-GM, OPS, ICD-O-3, ASK), the German adaptation of ATC, several important OWL ontologies including the Gene Ontology or the Human Phenotype Ontology, the EDQM Standard Terms database, the HGNC gene names database and others using BabelFSH (cf. 
Table\u0026nbsp;1 for explanations of each terminology thus generated).\u003c/p\u003e\u003cp\u003eMoreover, beyond the direct sharing of resources, the system could be used to provide access to FHIR representations of resources that must be licensed for machine-readable distribution. In particular, the Medical Dictionary for Regulatory Activities (MedDRA) terminology [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e] can only be distributed to licensees, a restriction which the TS cannot currently enforce. By distributing the BabelFSH source files, however, all parties that have a MedDRA subscription and license can generate identical representations of this input, making the distribution of the FHIR resource to those parties mostly redundant.\u003c/p\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eWe consider all design goals previously laid out as suitably addressed. A complete assessment and a summary of the strategy for achieving each goal are given in Table 3. While the development guided by the design goals did not follow any strict software engineering methodology, their definition ahead of the implementation was extremely helpful in guiding the development process. 
The design, implementation and performance of the system were positively discussed within the FHIR community, both during in-person meetings of the German FHIR standardization community and online, as well as during the 2025 edition of the developer conference \u003cem\u003eFHIR DevDays\u003c/em\u003e\u0026nbsp;with international members of the community.\u003c/p\u003e\n\u003cp\u003eTable 3: Design goals and implementation in the BabelFSH application\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"595\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 179px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDesign Goal\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 416px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eImplementation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 179px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDG1\u0026mdash;Compatibility with SUSHI\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 416px;\"\u003e\n \u003cp\u003eBabelFSH supports a strict subset of the FSH specification, focusing on terminology resources. 
All valid BabelFSH source files are thus also valid FSH files.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 179px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDG2\u0026mdash;Metadata in FSH\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 416px;\"\u003e\n \u003cp\u003eThe tool supports a clear separation of concerns through the plugin architecture.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 179px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDG3\u0026mdash;Extensibility\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 416px;\"\u003e\n \u003cp\u003eThe provision of new plugins requires only the implementation of a very simple API.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 179px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDG4\u0026mdash;Simple API and Self-Documentation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 416px;\"\u003e\n \u003cp\u003eBased on the provided API, all plugins must be documented by the plugin arguments and further help texts, such that automatic help can be generated.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 179px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDG5\u0026mdash;Validation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 416px;\"\u003e\n \u003cp\u003eAll FSH rules are validated for compliance with the FHIR specification through established validation tooling integrated into the software. 
Additional validation rules are enforced in the content generation pipeline.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 179px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDG6\u0026mdash;Performance\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 416px;\"\u003e\n \u003cp\u003eBabelFSH has proven to be performant and scalable.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 179px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDG7\u0026mdash;FHIR Version support\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 416px;\"\u003e\n \u003cp\u003eFHIR versions R4B and R5 are supported through a mode switch, and further versions could be added to the API as they are released.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 179px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDG8\u0026mdash;Open Source\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 416px;\"\u003e\n \u003cp\u003eBabelFSH is freely available under the Open Source Initiative-approved Apache 2.0 license.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\u003cp\u003eThe most pressing limitation of our system currently is the number and selection of available plugins. As our system originates in the context of a well-defined project, our own efforts in plugin development in particular center around the needs for this project. Due to the open-source nature and elegant simplicity of the approach, we believe that the broader FHIR community will recognize the potential of this tool, and define their own plugins as needed. 
The application design also lends itself to the creation of advanced plugins, such as the implementation of a database connector that would be able to connect, using standard database access layers, to legacy systems that define internal catalogs, which could then be integrated into FHIR resources as needed; this feature idea has been met with an enthusiastic response in the community.\u003c/p\u003e\u003cp\u003eIn terms of the API, the CS generation API can currently be considered mature, since it has proven itself for several quite different plugins now. That for CM is likely stable, but VS generation has not been considered extensively so far. As there are two ways of defining VS, this API requires careful consideration of the intended scope of the generation. First, the intensional approach defines a VS based on rules that are evaluated by the terminological services, e.g. all concepts that are children of a specific concept. Second, in extensional definitions, the included concepts are listed directly. Especially for intensionally-defined VS, the SUSHI implementation is already adequate, while extensional VS can, depending on their size, run into the same (performance) limitations in SUSHI as CS resources do.\u003c/p\u003e\u003cp\u003eA further limitation of the current implementation is our input validation. BabelFSH currently has no notion of the FHIR metadata structures and applies the rules as they are written. 
If there is a typo or other error in the declarations, this will be caught by the HAPI validator [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e] instead, and the error message is reported with an indication of where the error occurred. However, as complex data elements such as \u003cem\u003eCodeableConcept\u003c/em\u003e have to be validated in concert, the error messages can often only point the user towards several line numbers in the sources, rather than towards the precise line where the error occurred. Nevertheless, as the target group of BabelFSH consists of advanced FHIR implementers, this level of error-checking is deemed adequate, since the messages generated in this way are still quite descriptive.\u003c/p\u003e\u003cp\u003eAn interesting consideration for the future could be the integration of this tool into the reference standard SUSHI compiler and, potentially, the integration of new language elements into the FSH language definition. In this way, the functionality provided by BabelFSH could be integrated with the standard build tools that are now established in the FHIR community for IG creation. However, the intended user groups of BabelFSH and SUSHI are not always the same people. SUSHI is geared more towards IG creators, while BabelFSH is aimed at those providing FHIR terminology resources or services. While there is a degree of overlap between those user groups, it is rare for FHIR IGs to be responsible for the maintenance and distribution of larger-scale terminologies. The narrow focus of BabelFSH and the inherent constraints of the FSH implementation can allow for quicker iteration, cleaner code and faster learning for authors of terminology converters compared to a SUSHI-native implementation. Moreover, the command-line interface of the BabelFSH app itself still allows integration into automation processes including Continuous Integration (CI) pipelines that are now common in the development of IGs. 
In this regard, the performance of the implemented system is also beneficial, as long-running conversion processes could be an obstacle to profilers due to the time and cost penalties they incur during IG builds on the CI infrastructure.\u003c/p\u003e\u003cp\u003eRegarding the integration of this plugin architecture into the FSH language specification via a new language element instead of the block comments that are currently used, the specification has now reached a level of maturity at which most rules are normative, disallowing changes that are not backwards compatible. While adding new language elements is not a breaking change, there are some constraints in the BabelFSH implementation that may not be compatible with the specification as written, such that integration would require very careful consideration.\u003c/p\u003e\u003cp\u003eLastly, for FHIR terminology services to generate a benefit to implementers and users, the maintenance and content delivery of these systems need to be integrated with the requirements of all stakeholders. Since FHIR TS are generally only deployed within a broader infrastructure, the content delivery of such a service needs to be carried out in lockstep with other changes in the infrastructure. A tool like BabelFSH can aid in that area by allowing both a fast reaction to new or shifted requirements, such as providing new resources, and the maintenance of such resources in the future, but it is only one tool towards professionalized TS maintenance. Within the SU-TermServ project, which provides such a service with a broad scope, BabelFSH has become an important tool, but it only realizes its potential through the integration into a comprehensive strategy and tooling infrastructure.\u003c/p\u003e"},{"header":"CONCLUSIONS","content":"\u003cp\u003eIn this work, we have presented a novel, powerful and open-source approach to making heterogeneous sources of terminological knowledge broadly accessible as FHIR resources. 
In this way, BabelFSH is an important tool towards the adoption of comprehensive HL7 FHIR terminological services, and thus greater (semantic) interoperability in general. It is furthermore orthogonal to existing approaches to terminology conversion: the core logic for content generation is often cleanly separable from the (often hard-coded) generation of the needed metadata, such that the generation part can be easily integrated into the BabelFSH tool.\u003c/p\u003e\u003cp\u003eUsage of the BabelFSH tool and uptake by the community can ensure the availability of required terminological resources through FHIR terminology services, aiding the adoption of the HL7 FHIR standard overall. The tool has the potential to be adopted by terminology authors to streamline the provision of authoritative FHIR representations of their terminologies in the first place, such that community efforts to generate such artefacts become unnecessary.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAPI\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eApplication Programming Interface\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eBfArM\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003e\u003cem\u003eBundesinstitut f\u0026uuml;r Arzneimittel und Medizinprodukte\u003c/em\u003e (German Federal Institute for Drugs and Medical Devices)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eClaML\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eClassification Markup 
Language\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eCI\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eContinuous Integration\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eCM\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eConceptMap (a concrete FHIR resource)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eCS\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eCodeSystem (a concrete FHIR resource)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDSL\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eDomain-Specific Language\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDSTU2\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eDraft Standard for Trial Use Release 2 (a version of the HL7 FHIR standard)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eEDQM\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eEuropean Directorate for the Quality of Medicines \u0026amp; HealthCare\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eEHDS\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n 
\u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eEuropean Health Data Space\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eFSH\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eFHIR Shorthand\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eHL7 FHIR\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eHealth Level 7 Fast Healthcare Interoperability Resources\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eIG\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eImplementation Guide\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMedDRA\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eMedical Dictionary for Regulatory Activities\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMII\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eMedical Informatics Initiative\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNUM\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eNetwork University Medicine\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eOWL\u003c/strong\u003e\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eWeb Ontology Language\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eR4/R4B/R5\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eHL7 FHIR Release 4/Release 4B/Release 5\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSTU3\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eStandard for Trial Use Release 3 (a version of the HL7 FHIR standard)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSU-TermServ\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eService Unit Terminological Services\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSUSHI\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eSUSHI Unshortens Short Hand Inputs (the reference compiler for FSH)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTS\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eTerminological services\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 132px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eVS\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 507px;\"\u003e\n \u003cp\u003eValueSet (a concrete FHIR resource)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n 
 \u003c/tbody\u003e\n\u003c/table\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003eProject name: BabelFSH\u003c/li\u003e\n \u003cli\u003eProject home page: https://gitlab.com/mii-termserv/babelfsh\u003c/li\u003e\n \u003cli\u003eArchived version: https://doi.org/10.5281/zenodo.15755172\u003c/li\u003e\n \u003cli\u003eOperating system: Platform independent\u003c/li\u003e\n \u003cli\u003eProgramming language: Kotlin (primarily), Java\u003c/li\u003e\n \u003cli\u003eOther requirements: Java 18 or higher, Gradle\u003c/li\u003e\n \u003cli\u003eLicense: Apache License 2.0\u003c/li\u003e\n \u003cli\u003eAny restrictions to use by non-academics: none apply\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was funded by the German Federal Ministry of Research, Technology and Space (\u003cem\u003eBundesministerium f\u0026uuml;r Forschung, Technologie und Raumfahrt\u003c/em\u003e, BMFTR) as part of the Medical Informatics Initiative Germany under the grant number 01ZZ2312A.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eConceptualization, Methodology, Investigation: J.W., T.O., A.K., J.I.\u003c/p\u003e\n\u003cp\u003eSoftware: J.W.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eValidation: J.W., 
T.O.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eVisualization, Writing \u0026ndash; original draft: J.W.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eWriting \u0026ndash; review \u0026amp; editing: J.W., T.O., A.K., J.I.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors would like to thank and acknowledge the members of the worldwide and diverse HL7 FHIR community, who have been involved in shaping the development of this tool through valuable feedback, and we hope that it is useful to all of them. We would further like to acknowledge the authors and developers of the FSH specification and SUSHI reference implementation for their important work in making FHIR profiling simpler and more accessible.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors\u0026rsquo; Information\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAs part of the Medical Informatics Initiative, all authors are responsible for the provision and development of FHIR-based terminological services to the MII across Germany, and have substantial experience in working with medical terminologies and FHIR terminological services. They are active in the development of the FHIR standard and national adaptations through HL7 Germany, and in the working groups of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS).\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eUS Public Law. 21st Century Cures Act. 130 Stat 1033. Stat., 114\u0026ndash;255 2016 p. 312. \u003c/li\u003e\n\u003cli\u003eVijayaraghavan M, Genes N, Darrow BJ, Rucker DW. The 21st Century Cures Act and Emergency Medicine \u0026ndash; Part 2: Facilitating Interoperability. Ann Emerg Med. 2022;79:13\u0026ndash;7. doi:10.1016/j.annemergmed.2021.08.002\u003c/li\u003e\n\u003cli\u003eEuropean Parliament, Council of the European Union. 
Regulation (EU) 2025/327 of the European Parliament and of the Council of 11 February 2025 on the European Health Data Space and amending Directive 2011/24/EU and Regulation (EU) 2024/2847 (Text with EEA relevance). Mar 5, 2025. \u003c/li\u003e\n\u003cli\u003eSemler S, Wissing F, Heyder R. German Medical Informatics Initiative: A National Approach to Integrating Health Data from Patient Care and Medical Research. Methods Inf Med. 2018;57:e50\u0026ndash;6. doi:10.3414/ME18-03-0003\u003c/li\u003e\n\u003cli\u003eHL7 International. HL7 Fast Healthcare Interoperability Resources, Release 5 (HL7 FHIR R5) [Internet]. 2023 [cited 3 Feb 2025]. Available from: https://hl7.org/fhir/R5\u003c/li\u003e\n\u003cli\u003eBenson T, Grieve G. Principles of Health Interoperability: FHIR, HL7 and SNOMED CT. Fourth edition. Cham: Springer International Publishing; 2021. doi:10.1007/978-3-030-56883-2_2\u003c/li\u003e\n\u003cli\u003eVorisek CN, Lehne M, Klopfenstein SAI, Mayer PJ, Bartschke A, Haese T, et al. Fast Healthcare Interoperability Resources (FHIR) for Interoperability in Health Research: Systematic Review. JMIR Med Inform. 2022;10:e35724. doi:10.2196/35724\u003c/li\u003e\n\u003cli\u003eGrieve G, Klein WT, Beeler Jr. G, Hamm R, McKenzie L. \u003cem\u003eHL7 International, Modeling and Methodology and Vocabulary Work Groups\u003c/em\u003e. HL7 Version 3 Standard: Core Principles and Properties of HL7 Version 3 Models, Release 1 [Internet]. ANSI/HL7 V3 CPPV3MODELS, R1-2012, 2012 [cited 3 Feb 2025]. Available from: https://www.hl7.org/documentcenter/public/standards/V3/core_principles/infrastructure/coreprinciples/v3modelcoreprinciples.html\u003c/li\u003e\n\u003cli\u003eHL7 FHIR. Resource ValueSet - Content, Fast Healthcare Interoperability Resources Release DSTU2 [Internet]. 2015 [cited 3 Feb 2025]. Available from: https://hl7.org/fhir/DSTU2/valueset.html\u003c/li\u003e\n\u003cli\u003eOBO Foundry. OBO Foundry Principles: Scope (Principle 5) [Internet]. 2022 [cited 20 Jun 2025]. 
Available from: https://obofoundry.org/principles/fp-005-delineated-content.html\u003c/li\u003e\n\u003cli\u003eMetke-Jimenez A, Lawley M, Hansen D. FHIR OWL: Transforming OWL ontologies into FHIR terminology resources. AMIA Annu Symp Proc AMIA Symp. 2019;2019:664\u0026ndash;72. \u003c/li\u003e\n\u003cli\u003eBrachman. What IS-A Is and Isn\u0026rsquo;t: An Analysis of Taxonomic Links in Semantic Networks. Computer [Internet]. 1983 [cited 24 Jun 2025];16:30\u0026ndash;6. doi:10.1109/MC.1983.1654194\u003c/li\u003e\n\u003cli\u003eSchadow G, McDonald CJ. The Unified Code of Units and Measures [Internet]. Version 2.2, 2024 [cited 8 Mar 2025]. Available from: https://ucum.org/ucum\u003c/li\u003e\n\u003cli\u003eBundesministerium f\u0026uuml;r Bildung und Forschung. SU-TermServ \u0026ndash; Medizininformatik-Struktur \u0026ldquo;Service Unit Terminological Services\u0026rdquo; for MI-I and NUM [Internet]. 2023 [cited 28 Feb 2025]. Available from: https://www.gesundheitsforschung-bmbf.de/de/su-termserv-medizininformatik-struktur-service-unit-terminological-services-for-mi-i-and-16144.php\u003c/li\u003e\n\u003cli\u003eAmmon D, Kurscheidt M, Buckow K, Kirsten T, L\u0026ouml;be M, Meineke F, et al. Arbeitsgruppe Interoperabilit\u0026auml;t: Kerndatensatz und Informationssysteme f\u0026uuml;r Integration und Austausch von Daten in der Medizininformatik-Initiative. Bundesgesundheitsblatt - Gesundheitsforschung - Gesundheitsschutz. 2024;67:656\u0026ndash;67. doi:10.1007/s00103-024-03888-4\u003c/li\u003e\n\u003cli\u003eMetke-Jimenez A, Steel J, Hansen D, Lawley M. Ontoserver: a syndicated terminology server. J Biomed Semant [Internet]. 2018 [cited 7 Mar 2024];9:24. doi:10.1186/s13326-018-0191-z\u003c/li\u003e\n\u003cli\u003eInternational Standards Organization. ISO 13120:2019 \u0026mdash; Health informatics \u0026mdash; Syntax to represent the content of healthcare classification systems \u0026mdash; Classification Markup Language (ClaML). 2019. 
\u003c/li\u003e\n\u003cli\u003eKundra R, Zhang H, Sheridan R, Sirintrapun SJ, Wang A, Ochoa A, et al. OncoTree: A Cancer Classification System for Precision Oncology. JCO Clin Cancer Inform [Internet]. 2021 [cited 8 Feb 2025];221\u0026ndash;30. doi:10.1200/CCI.20.00108\u003c/li\u003e\n\u003cli\u003eSteel J, Wiedekopf J, dfleming9, Werner P. ClaML to FHIR Transformer [Internet]. 2024 [cited 8 Feb 2025]. Available from: https://github.com/aehrc/fhir-claml\u003c/li\u003e\n\u003cli\u003eWiedekopf J. Oncotree to FHIR Converter [Internet]. 2020 [cited 8 Feb 2025]. Available from: https://github.com/skfit-uni-luebeck/oncotree-fhir\u003c/li\u003e\n\u003cli\u003eEuropean Directorate for the Quality of Medicines \u0026amp; HealthCare. Standard Terms Database [Internet]. 2024 [cited 8 Feb 2025]. Available from: https://standardterms.edqm.eu/\u003c/li\u003e\n\u003cli\u003eWiedekopf J. EDQM2FHIR [Internet]. 2024 [cited 8 Feb 2025]. Available from: https://github.com/skfit-uni-luebeck/EDQM2FHIR\u003c/li\u003e\n\u003cli\u003eWiedekopf J, Drenkhahn C, Ulrich H, Kock-Schoppenhauer A-K, Ingenerf J. Providing ART-DECOR ValueSets via FHIR Terminology Servers - A Technical Report. Stud Health Technol Inform. 2021;283:127\u0026ndash;35. doi:10.3233/SHTI210550\u003c/li\u003e\n\u003cli\u003eInstitut national de la sant\u0026eacute; et de la recherche m\u0026eacute;dicale (INSERM) US14. Orphanet Nomenclature for Coding and Associated Tools [Internet]. 2024 [cited 20 Jun 2025]. Available from: https://www.orphadata.com/orphanet-nomenclature-for-coding/pack-nomenclature\u003c/li\u003e\n\u003cli\u003eInstitut national de la sant\u0026eacute; et de la recherche m\u0026eacute;dicale (INSERM) US14. Orphanet\u0026mdash;Knowledge on rare diseases and orphan drugs [Internet]. 2025 [cited 20 Jun 2025]. Available from: https://www.orpha.net/\u003c/li\u003e\n\u003cli\u003eRosenau L, Majeed RW, Ingenerf J, Kiel A, Kroll B, K\u0026ouml;hler T, et al. 
Generation of a Fast Healthcare Interoperability Resources (FHIR)-based Ontology for Federated Feasibility Queries in the Context of COVID-19: Feasibility Study. JMIR Med Inform. 2022;10:e35789. doi:10.2196/35789\u003c/li\u003e\n\u003cli\u003eHL7 International, FHIR Infrastructure Working Group. FHIR Shorthand, Version 3.0.0 [Internet]. 2024 [cited 3 Feb 2025]. Available from: https://hl7.org/fhir/uv/shorthand/N2/\u003c/li\u003e\n\u003cli\u003eHealth Level Seven International. SUSHI [Internet]. 2024 [cited 3 Feb 2025]. Available from: https://github.com/FHIR/sushi\u003c/li\u003e\n\u003cli\u003eJetbrains, Inc. Kotlin Programming Language [Internet]. 2025 [cited 3 Feb 2025]. Available from: https://kotlinlang.org/\u003c/li\u003e\n\u003cli\u003eGradle, Inc., Dockter H, Murdoch A, Faber S, Niederwieser P, Daley L, et al. Gradle [Internet]. 2008 [cited 3 Feb 2025]. Available from: https://gradle.org\u003c/li\u003e\n\u003cli\u003eParr T. The definitive ANTLR 4 reference. Book version: P 2.0. Dallas, Texas \u0026amp; Raleigh, North Carolina: The Pragmatic Bookshelf; 2014. \u003c/li\u003e\n\u003cli\u003eGonsalves L, Auckland S, Mittelbach B, Hebert C, Filios K, Gockel T. xenomachina/kotlin-argparser [Internet]. 2022 [cited 20 Feb 2025]. Available from: https://github.com/xenomachina/kotlin-argparser\u003c/li\u003e\n\u003cli\u003eUniversity Health Network, James Agnew. HAPI FHIR [Internet]. 2014 [cited 14 Mar 2025]. Available from: https://hapifhir.io\u003c/li\u003e\n\u003cli\u003eIantorno M, Otasek D, Yang H, Hall D, Passas R, Werner P. hapifhir/org.hl7.fhir.validator-wrapper [Internet]. 2025 [cited 13 Jun 2025]. Available from: https://github.com/hapifhir/org.hl7.fhir.validator-wrapper\u003c/li\u003e\n\u003cli\u003eBundesinstitut f\u0026uuml;r Arzneimittel und Medizinprodukte. Alpha-ID-SE [Internet]. 2025 [cited 28 Feb 2025]. 
Available from: https://www.bfarm.de/EN/Code-systems/Terminologies/Alpha-ID-SE/_node.html\u003c/li\u003e\n\u003cli\u003eBundesinstitut f\u0026uuml;r Arzneimittel und Medizinprodukte. ICD-10-GM\u0026mdash;International Statistical Classification of Diseases, German Modification [Internet]. 2025 [cited 28 Feb 2025]. Available from: https://www.bfarm.de/EN/Code-systems/Classifications/ICD/ICD-10-GM/_node.html\u003c/li\u003e\n\u003cli\u003eZetlen J, PayPal, Inc. gnomon [Internet]. 2016 [cited 16 Apr 2025]. Available from: https://github.com/paypal/gnomon\u003c/li\u003e\n\u003cli\u003eInternational Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. Medical Dictionary for Regulatory Activities (MedDRA) [Internet]. 28.0, 2025 [cited 21 May 2025]. Available from: https://www.meddra.org\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"Footnotes","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003e A FHIR representation of this resource has since been published by the BfArM. When BabelFSH was conceptualized, this FHIR representation did not yet exist, so converting the ClaML representation was highly relevant. As of writing, the FHIR representation is not available for all terminologies in all versions maintained by the BfArM.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"journal-of-biomedical-semantics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"jbsm","sideBox":"Learn more about [Journal of Biomedical Semantics](http://jbiomedsem.biomedcentral.com/)","snPcode":"13326","submissionUrl":"https://submission.nature.com/new-submission/13326/3","title":"Journal of Biomedical Semantics","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"HL7 FHIR, Terminology as Topic, Terminology Servers, Knowledge Bases","lastPublishedDoi":"10.21203/rs.3.rs-6992162/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6992162/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground: \u003c/strong\u003eHL7 FHIR terminological services (TS) are a valuable tool towards better healthcare interoperability, but require representations of terminologies using FHIR resources. As most terminologies are not natively distributed using FHIR resources, converters are needed. Large-scale FHIR projects, especially those with a national or even an international scope, define enormous numbers of value sets and reference many complex code systems, which must be regularly updated in TS and other systems. This necessitates a flexible, scalable and efficient provision of these artifacts. This work aims to develop a comprehensive, extensible and accessible toolkit for FHIR terminology conversion.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eImplementation: \u003c/strong\u003eBased on the prevalent HL7 FHIR Shorthand (FSH) specification, a converter toolkit, called \u003cem\u003eBabelFSH\u003c/em\u003e, was created that utilizes an adaptable plugin architecture to separate the definition of content from that of the needed declarative metadata. 
The development process was guided by formalized design goals.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults: \u003c/strong\u003eAll eight design goals were addressed by BabelFSH. The system’s performance and completeness were validated, by way of example, using Alpha-ID-SE, an important terminology used within Germany for diagnosis coding, especially of rare diseases. The tool is now used extensively within the content delivery pipeline for a central FHIR TS with a national scope within the German Medical Informatics Initiative and Network University Medicine.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDiscussion: \u003c/strong\u003eDevelopment initially focused on the requirements of the central research FHIR TS for the federated FHIR infrastructure in Germany, and the tool has proven very useful towards that goal. Opportunities for further improvement were identified especially in the validation process, as the validation messages are currently imprecise at times. 
The design of the application lends itself to the implementation of further use cases, such as direct connectivity to legacy systems for catalog conversion to FHIR.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusions: \u003c/strong\u003eThe developed \u003cem\u003eBabelFSH\u003c/em\u003e tool is a novel, powerful and open-source approach to making heterogenous sources of terminological knowledge accessible as FHIR resources, thus aiding semantic interoperability in healthcare in general.\u003c/p\u003e","manuscriptTitle":"BabelFSH—A Toolkit for an Effective HL7 FHIR-based Terminology Provision","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-07-23 07:29:04","doi":"10.21203/rs.3.rs-6992162/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-08-31T20:02:54+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-07-30T15:29:47+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-07-23T18:11:18+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"333650510666135635370263237672127709347","date":"2025-07-23T08:23:26+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"95804372182059307721813817087127290965","date":"2025-07-17T07:18:26+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-07-17T06:42:41+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-07-02T08:39:05+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-07-02T08:36:43+00:00","index":"","fulltext":""},{"type":"submitted","content":"Journal of Biomedical Semantics","date":"2025-06-27T13:36:58+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"journal-of-biomedical-semantics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"jbsm","sideBox":"Learn more about [Journal of Biomedical Semantics](http://jbiomedsem.biomedcentral.com/)","snPcode":"13326","submissionUrl":"https://submission.nature.com/new-submission/13326/3","title":"Journal of Biomedical Semantics","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"f79efb20-8128-470e-b27c-3348faa10cf7","owner":[],"postedDate":"July 23rd, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2025-12-01T16:06:41+00:00","versionOfRecord":{"articleIdentity":"rs-6992162","link":"https://doi.org/10.1186/s13326-025-00343-4","journal":{"identity":"journal-of-biomedical-semantics","isVorOnly":false,"title":"Journal of Biomedical Semantics"},"publishedOn":"2025-11-29 15:57:56","publishedOnDateReadable":"November 29th, 2025"},"versionCreatedAt":"2025-07-23 07:29:04","video":"","vorDoi":"10.1186/s13326-025-00343-4","vorDoiUrl":"https://doi.org/10.1186/s13326-025-00343-4","workflowStages":[]},"version":"v1","identity":"rs-6992162","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6992162","identity":"rs-6992162","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}