

\chapterquote{attributed to Grace Hopper\cite{
    WikiQuoteGraceHopper,
    QuoteOriginMost2014,
}}{
    The most dangerous phrase in the language is ``We've always done it this way!''.
}

\chaptertitle{The German ePA: A Motivating Counter-Example}

\todo{FIXME: Proper citation here}
\sourceattrib{This part is based on a short paper that I wrote and presented at the HS3 workshop at ESORICS 2025.}
Looking at the landscape of computer security solutions, we are presented with a wide variety of vendors and products
that may give the impression that hardware security is a solved problem. Vendors make claims ranging from
\emph{You don't need hardware security, just do it in the cloud!} to \emph{Buy our HSM and you will be secure!}. In
practice, things are not as easy and even well-intentioned projects still often go awry on the hardware security
dimension. Concluding this chapter, we will now have a look at one such project that was done by capable people with the
best intentions, yet it resulted in a hardware security design that is dangerously inadequate for the purpose.

Beginning in May 2025, after several delays, Germany started the nation-scale rollout of its new electronic medical
record system. The system aims to create a national database accessible to all healthcare providers that holds the
complete electronic medical records of all publicly insured people living in Germany. It is meant to replace
paper-based workflows that are error-prone and often leave healthcare providers with access to only a subset of a
patient's medical records. Data in scope for the system includes medical letters, laboratory results, and medical
imaging files.

Due to Germany's mandatory health insurance laws, the system's user base encompasses the majority of all German
residents. People who have replaced their public health insurance with private insurance are currently not subject to
the system. In Germany, by law private health insurance is only available to people in the top decile of
household income. This means that the system disproportionately affects people who have low income, creating an equity
issue. While it is possible to opt out of the system, the process of opting out is difficult. Additionally,
the government and health insurance providers have publicly depicted the system in a one-sidedly positive way, meaning
that the majority of people subject to the system are unlikely to have the comprehensive understanding of its
benefits and risks that would be necessary for an informed decision.

While there has been loud criticism of the system's security from civil society organizations such as digital rights
nonprofit organization Chaos Computer Club (CCC) \cite{kochMoreMoreExperts2025} and several severe security flaws have
been demonstrated practically, this criticism has largely been ignored by the political structures in charge. We observe
that despite this civil society outrage and the system's large scale, it has received little attention from the academic
cryptography and information security community.

In this chapter, we aim to point out some perplexing cryptographic engineering decisions in the system. In particular,
we point out that the system's core per-user secrets are kept in a rudimentary key escrow system whose security is based
on engineering assumptions, not on cryptographic principles. Furthermore, we observe that by specification, the
individual user keys of the system are derived from a per-user cleartext salt based on a system-wide long-term secret
with only 256 bits of entropy\footnote{
    In previous versions of the standard \cite{
        gematikSpezifikationSchluesselgenerierungsdienstEPA2023,
        gematikUebergreifendeSpezifikationVerwendung2025,
    }, there were two escrow services, with both keys used in layers to reduce the risk of a compromise of either one.
    The current standard only requires one escrow service, and drops the entropy requirement of the root keys from 512
    bits to 256 bits. The apparent reason for the long-term nature of these keys is that they are updated manually.
}. Finally, we note that according to specification, the only physical security requirement for the protection of this
highly sensitive secret is a ``hard, opaque potting material'', with no tamper detection and response required. We
believe that Inertial HSMs provide a path forward for systems like this, enabling physical security in applications that
currently rely on insecure, legacy systems. Even if for regulatory reasons a poorly secured conventional HSM without
active tamper sensing is chosen, it would be conceivable to construct an IHSM enclosure \emph{around} this conventional
HSM, in effect retrofitting the missing active tamper-sensing envelope.

We base our analysis of the ePA on the system's publicly available standards in their latest version as of the writing
of the paper underlying this chapter in April 2025, describing version 3.0 of the healthcare record system \cite{
    gematikSpezifikationAktensystemEPA2025,
    gematikUbergreifendeSpezifikationVerwendung2024,
}. We note that the implementation might well deviate from these standards and be more secure. However, given the
system's history of flaws, we believe this is unlikely to be the case. The reference implementation provided by the
specification authority \cite{GithubRepositoryERPFD} follows the specified minimum requirements closely. As of now,
there is no meaningful way for either the public or for researchers such as us to ascertain the concrete implementation
security of the system.

\section{The Design of ePA}

ePA (short for \emph{elektronische Patientenakte}, ``electronic patient record'') is embedded into Germany's national
public healthcare backend system ``Telematikinfrastruktur'' (TI). TI is a highly complex system, and a detailed
description would exceed the limits of this analysis. Briefly put, TI consists of a shared DMZ that parties like
insurance providers and healthcare providers connect to through a VPN. At the client location, usually an individual
doctor's office or a hospital, this VPN connection is terminated by a specialized VPN appliance named ``Konnektor'' that
simultaneously acts as a trusted component inside the client network hosting some software for purposes such as
authentication. The Konnektor contains several smart cards that store keys used for authentication. Konnektor devices
are offered by several vendors, and healthcare providers like doctor's offices are individually responsible for purchasing
and maintaining a Konnektor.

% FIXME: Is there a threat/trust model of the system that you could summarise in a few sentences?

Every person enrolled in the system as well as every healthcare professional providing services under it is issued an ID
card with an embedded smart card holding the keys used to authenticate to the central infrastructure. The primary
use of these smart cards up to now is that when someone visits a healthcare provider, they will insert their ID card
into a terminal so the healthcare provider can automatically fetch their personal information such as name, birth date,
address and enrollment status from their insurance provider.

ePA is implemented inside the TI system. Its centralized services are accessed by healthcare providers through the TI's
VPN. Patient records are encrypted and decrypted inside TI's backend systems. Smart cards authenticate parties and
hardware devices to each other. Each insurance provider picks one of several implementations of ePA's server-side
infrastructure to run for its clients. Currently, there are two approved implementations of this server-side
infrastructure.

In the current version of the specification, the overall architecture of ePA heavily relies on Trusted Execution
Environments (TEEs). Data processing on the server side is done in plaintext inside TEEs, with some cryptographic key
management delegated to a Hardware Security Module. While attacks on the TEEs are considered in the system, the HSMs are
assumed to be perfectly secure, and the system does not include mitigations for a compromised HSM. The primary
motivation for plaintext processing seems to be to enable large-scale data analysis for research purposes without
requiring consent or cooperation of the people whose records are being processed.

The primary services offered by the server side are authentication services, key escrow, and a database storing the
encrypted records themselves. Records are symmetrically encrypted with keys that are derived from system-wide secrets
inside an HSM. The primary motivation behind the use of a key escrow service seems to be to enable the creation of a
duplicate patient ID smartcard in case a person loses theirs. While the current version of the standard is unclear on
the exact mechanism of key derivation, in previous versions of the standard, the escrow service's root key, a random
salt, and the healthcare ID number of the person owning the record were used as inputs to HKDF with SHA-256. The specification requires
that a new root key is generated once a year, but as far as we can tell, record key rollover is not done automatically
but is only meant to be done when the \emph{user} requests it, and old root keys must be retained forever to ensure old
records can be accessed.
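
To make the derivation step concrete, the following sketch writes out HKDF as defined in RFC 5869, instantiated with
HMAC-SHA-256, for a single 256-bit record key. The assignment of inputs is an assumption made for illustration only,
since the text above does not fix it: we take the root key $k_{\mathrm{root}}$ as input keying material, the per-user
salt $s_u$ as the HKDF salt, and the healthcare ID number $\mathrm{id}_u$ as the context string. The symbols $k_u$,
$s_u$, $\mathrm{id}_u$ and $\mathit{PRK}_u$ are our notation, not the standard's. The first argument of HMAC is the
key, and $\|$ denotes concatenation.
\[
\begin{array}{rcl}
\mathit{PRK}_u & = & \mbox{HMAC-SHA-256}(s_u,\; k_{\mathrm{root}}) \\
k_u & = & \mbox{HMAC-SHA-256}(\mathit{PRK}_u,\; \mathrm{id}_u \,\|\, \mathtt{0x01})
\end{array}
\]
The first line is the extract step and the second line is the expand step, which needs only one HMAC invocation
because the requested output length equals the hash length. Under this reading, neither $s_u$ nor $\mathrm{id}_u$ is
secret, so $k_{\mathrm{root}}$ is the only secret input to every $k_u$.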

\section{Related Work}

The state-owned company specifying the system commissioned several security assessments of the system relating to the
key escrow service. \textcite{fischlinKryptographischeAnalyseSpezifikation2021} focuses on the cryptographic
dimension of the key escrow service used in an older version of the standard, and is now obsolete.
\textcite{slanySicherheitsanalyseZurSicherheit2020} approaches the system at a higher level, and focuses on the
cryptography of the inner protocol layers spoken between the system's components. Industry research organization
Fraunhofer SIT was commissioned to perform a structured, theoretical assessment of attack paths into the system
\cite{fraunhofersitAbschlussberichtSicherheitsanalyseGesamtsystems2024}. We are not currently aware of
independent academic security research on the system.

The design and operation of the system have been independently described in detail by civil society activists, who have
demonstrated several successful attacks on the system. \textcite{tschirsichHackerHinOder0100} demonstrated how they
could trivially acquire each of the smartcards as well as the Konnektor necessary for accessing the system.
\textcite{tschirsichKonnteBisherNoch0100} summarize the history of attacks demonstrated on the system and show multiple
practical attacks on various parts of the system's implementation.

\section{Concerning Cryptographic Engineering Choices}

We wish to highlight some of the design choices in the system that we believe stray from current best practice. This is
by no means an exhaustive list, and is only meant to underscore why we believe the system deserves more scrutiny.

\subsection{Use of Key Escrow}

First, the system's general approach of using a key escrow service instead of securely storing the keys inside the
system's already existing smart card infrastructure is concerning, given that this key escrow service poses a
centralized security risk. The system's designers made this decision since it was deemed important that access to an
encrypted record can be restored quickly after an insurance ID card is lost, without requiring the cooperation of the
healthcare providers holding the primary copies of the person's medical records.

While key escrow services have been a topic of political debate in decades past, in the cryptographic community,
consensus generally is that they are a bad idea since they pose a centralized target for attack, and increase attack
surface \cite{
    abelsonRisksKeyRecovery1997,
    abelsonKeysDoormats2015,
    andersonSecurityEngineeringGuide2020,
}.

\subsection{Cryptographic Design}

The system's overall cryptographic design is intentionally kept simple. The standard explicitly mentions that symmetric
primitives have been preferred over asymmetric primitives in the core key escrow functions due to the risk of an attack
on asymmetric primitives in the long term. Notably, other advanced cryptographic techniques such as secret sharing
schemes, oblivious pseudo-random functions, or multiparty computation that could help with the security and privacy of
the key escrow service by reducing trust placed in any single component of the service are also absent, while the system
relies extensively on the engineering-based security guarantees of TEEs and HSMs. Given that the ePA system trusts its
HSMs as unconditionally secure, it is unclear what purpose the manual yearly root key renewal serves, especially absent
an automatic way to roll over the wrapped record keys.
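
As an illustration of what such techniques offer, consider Shamir's $t$-out-of-$n$ secret sharing applied to a root
key $k$. This is a textbook sketch and not part of the ePA specification. The prime $p$, the threshold $t$ and the
number of share holders $n$ are hypothetical parameters. With a prime $p > 2^{256}$ and coefficients
$a_1, \dots, a_{t-1}$ drawn uniformly at random from $\{0, \dots, p-1\}$, a dealer hands the share $(i, f(i))$ to
holder $i \in \{1, \dots, n\}$, where
\[
f(x) = k + a_1 x + \dots + a_{t-1} x^{t-1} \bmod p .
\]
Any set $S$ of $t$ holders recovers the key by Lagrange interpolation at zero,
\[
k = f(0) = \sum_{i \in S} f(i) \prod_{j \in S,\; j \neq i} \frac{j}{j - i} \bmod p ,
\]
while any $t - 1$ shares are statistically independent of $k$. Plain reconstruction would still reassemble $k$ in one
place, so a deployed design would need to evaluate the key derivation on the shares themselves, for example through
multiparty computation. The sketch only shows that trust in the root key need not rest on a single device.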

A consequence of the system's simple cryptographic design is that the system trusts its components to a large degree.
For instance, the system leaks a person's insurance ID number to the key escrow HSM every time record keys are
requested. Along with the timing and frequency of these requests, this leaks information on the person's condition to
the key escrow service in an identifiable way.

% TODO I feel that this section is a mix-up of critique on the cryptographic design and the approach to privacy
% protection and data minimisation. How are they linked? I'm missing some discussion here.

\subsection{A Realistic Attacker Model}

We observe that the system as a whole does not appear to be designed to defend against well-resourced adversaries. The
series of practical attacks that have been demonstrated on the system confirms this impression.
\textcite{tschirsichKonnteBisherNoch0100} summarize a series of successful attacks. These include social engineering
resulting in access to copies of smartcards that enable access to patient records, using misconfigured Konnektor VPN
appliances with their LAN DMZ and authentication interface exposed on the public internet, circumventing video-based
authentication processes resulting in duplicate file keys being provided, classic SQL injection on a backend service
maintaining an authentication database, accessing all national patient records through brute-force enumeration of weak
identifiers, and several more.

We believe that a system like this must be designed to withstand well-resourced adversaries such as enemy secret
services, since the medical data stored in it, such as information on chronic illness, sexually transmitted diseases or
severe food allergies, has intelligence value. Repeated breaches of national digital infrastructure such as the 2015
breach of the US Office of Personnel Management \cite{barrettUSSuspectsHackers2015} or the 2024 compromise of US
telecommunications wiretapping systems \cite{mennChineseGovernmentHackers2024} demonstrate that such state-sponsored
attacks on national digital infrastructure are a realistic concern. A possible scenario in the ePA system would be an
enemy secret service gaining access to one of the HSMs storing the system's root secrets, extracting the root secret by
an advanced physical attack, then being able to decrypt captured encrypted health records at will. Similarly, a
nation-state adversary might have access to an exploit allowing the compromise of the system's TEEs, which would enable
the extraction of any patient records being processed in plaintext inside these TEEs.
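
The reach of the first scenario can be stated compactly. The notation is ours and abstracts from the concrete key
derivation function, written here as a generic $\mathrm{KDF}$. Let $U$ be the set of people whose records are served
by one escrow HSM, and let each record key be $k_u = \mathrm{KDF}(k_{\mathrm{root}}, s_u, \mathrm{id}_u)$, where the
per-user salt $s_u$ is held in cleartext and the insurance ID number $\mathrm{id}_u$ is assumed to be known to the
adversary. An adversary who has extracted $k_{\mathrm{root}}$ and can read the stored salts then obtains
\[
\{\, k_u : u \in U \,\} = \{\, \mathrm{KDF}(k_{\mathrm{root}}, s_u, \mathrm{id}_u) : u \in U \,\}
\]
without any further interaction with the HSM. Under these assumptions, the confidentiality of all $|U|$ records
reduces to the secrecy of a single long-lived value.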

\subsection{Physical Security}

Physical security has received some consideration in the system's specification. First, smart cards are used extensively
for authentication. Second, Hardware Security Modules are used in key locations of the system to process some
cryptographic secrets. The core of the system's key escrow service is implemented inside an HSM. However, it is notable
that the actual security level required for this HSM is only FIPS 140-2 level
3 \cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2002}. Not only has FIPS 140-2
been superseded by FIPS 140-3 since
2019 \cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2019}, its security level 3
mostly provides logical separation of cryptographic functions from other logic and is not very meaningful in the context
of physical attacks. The only physical requirement of FIPS 140-2 level 3 is that the HSM has a hard, opaque coating.
This coating is specified to be tamper-evident, but notably no active tamper detection or response features are required
by this standard. In contrast to the newer FIPS 140-3 standard and the related ISO/IEC 19790 \cite{ISOIEC19790} as well
as ISO/IEC 24759 \cite{ISOIEC24759} standards, FIPS 140-2 does not make any particular requirements regarding resistance
to side-channel attacks. The lack of tamper response, the unspecified resistance to side-channel attacks and the fact that
the ePA specification only requires the long-lived key escrow root key inside the HSM to have 256 bits of entropy add
up to an unsatisfactory overall picture.

\section{Conclusion}

In conclusion, we observe that in Germany's ePA national medical record database, despite the decade-long
standardization and implementation process, several cryptographic compromises ended up in the system's final deployment.
Even assuming that nation-scale key escrow is a good idea, the implementation of this key escrow system seems to stray
from current best practice. The system uses a secret key with only 256 bits of entropy to derive highly sensitive secret
keys for potentially tens of millions of people sharing an insurance provider. The cryptographic design of this escrow
system is unsophisticated, ignoring the past three decades of cryptographic developments, particularly in multiparty
computation (MPC) and secret sharing techniques, in favor of an engineering approach. In the engineering dimension,
the system's physical security is only held to the basic level 3 of the obsolete FIPS 140-2 standard, which is
considerably less secure than an average credit card payment terminal. The system's root keys are only protected by a
``hard, opaque potting material'' and no tamper detection and response is required. We estimate that the system poses an
attractive and soft target to nation-state adversaries. The system's shortcomings are made more severe by the fact that
the system disproportionately affects the lives of people with low income.

From an academic perspective, it is interesting to examine how the ePA ended up in its current state, and which gaps
left by academic research in practical cryptographic solutions contributed to this outcome. A fundamental truth in cryptographic engineering is
that in the absence of technical checks, political promises are no guarantees of restraint. As such, the degree of trust
the ePA system places on organizational measures leads to a concerning overall picture. In particular, the system's
strong reliance on conventional HSMs built to long obsolete security standards as well as on trusted execution
environment technology that has been broken multiple times highlights the need for new approaches to hardware security
that better accommodate real-world use cases.

We believe that Inertial HSMs can address this use case by cleanly separating the physical security primitive into a
retargetable design that can be applied to entire servers if needed, and that can augment or replace technology like
conventional HSMs or trusted execution environments to provide high-level hardware security.