phd-thesis/chapter-introduction/chapter.tex
2025-10-02 07:09:50 +02:00

\chapterquote{Meredith Whittaker~\cite{greenbergSignalMoreEncrypted2024}}{
It's not for lack of ideas or possibilities. It's that we actually have to start taking seriously the shifts that
are going to be required to do this thing—to build tech that rejects surveillance and centralized control—whose
necessity is now obvious to everyone.
}
\chaptertitle{Introduction}
All Cops Are Bastards, or ACAB, is a slogan popular in far-left and anarchist circles since the mid-twentieth century
that expresses a rejection of state authority~\cite{constantinouAppliedResearchPolicing2021}. While politically this
blanket rejection is a fringe viewpoint with no mainstream acceptance, there exists an interesting parallel between it
and modern cryptographic best practice. In modern cryptography, it is generally seen as best practice to involve as few
keys as possible in any computation, and cryptographers have time and time again strongly rejected
attempts by states and other authorities to insert backdoor access mechanisms into cryptographic systems~\cite{
abelsonRisksKeyRecovery1997,
abelsonKeysDoormats2015,
andersonSecurityEngineeringGuide2020,
}.
The aversion of cryptographers to backdoor access shows up everywhere, from cryptographic protocol standards like
TLS to cryptographic applications like the Signal messenger. Not only is backdoor access excluded from the system
design, its possibility is considered a potential vulnerability, and measures such as forward secrecy and post-compromise
security are taken to mitigate its impact when it is achieved through other means. In computing, this design aspect
makes cryptographic protocols a unique holdout. In other parts of the stack, explicit or implicit backdoor access is
commonplace, and attempts at preventing it are rare. For instance, network providers are generally required to comply
with so-called \emph{Lawful Interception} orders on particular customers or traffic types, and datacenter operators
commonly provide hardware access to state authorities. The design decisions in cryptographic protocols generally hold,
and the gold standard for backdoor access to modern systems is either exploiting a \emph{zero-day} flaw that is not yet
publicly known, or acquiring physical access to the target system.
In this thesis, we wish to extend the level of protection afforded by cryptographic protocol design down the technology
stack. While cryptographic protocols and modern software from the operating system up make it possible to secure the
software side of the stack to a high level, the hardware side remains poorly protected. There are a variety of hardware
security solutions in the wild, but the majority of them either do not target protection against local, physical attacks
-- such as Trusted Platform Modules (TPMs) -- or are not widely available due to market segmentation or cost -- such as
conventional Hardware Security Modules (HSMs).
To extend this protection, we propose the Inertial Hardware Security Module (IHSM), a new type of HSM that extends the
high level of protection offered by the modern cryptographic software stack down to the hardware level, enabling secure
computation in insecure places. We chose to publish all our IHSM work as open source and unencumbered by patents to enable
widespread adoption. IHSMs can be custom built with only basic manufacturing capabilities at small scale and enable the
deployment of secure computation in insecure places even to small organizations such as university research departments,
NGOs and small businesses.
Recent history has shown that state-level adversaries are a mounting threat to civil rights organizations, human rights
lawyers, members of minorities, and many others. While western democracies used to be considered safe havens of human
rights, today human rights are under attack both from within and from the outside in countries across the globe.
By publishing IHSM technology as open source, we hope to provide one building block for new computing systems accessible to
all that are resilient and secure in the face of growing adversity.
Complementing our IHSM concept and prototype, we provide solutions to engineering issues such as wireless power transfer,
adapting them to our use case. Further, we propose improvements to the state of the art in HSM tamper sensors such as
the use of low-cost, embeddable Time-Domain Reflectometry (TDR) that not only improve the security of IHSMs, but that
can even be applied to conventional HSMs. We conclude this thesis with an overview of two concrete use cases IHSMs
unlock that were previously infeasible using conventional HSMs: Datacenter-scale Secure Multiparty Computation (SMPC)
and long-range Quantum Key Distribution (QKD) networks.
\section{Building Inertial HSMs}
In a system with a secure software stack, the role of a HSM is to secure the hardware part of the stack. The basic
approach of a HSM is to combine a secure software stack with a fast self-destruct mechanism and tamper sensors. The
self-destruct mechanism can be hardware or software that quickly, securely destroys all cryptographic secrets, rendering
the device worthless to an attacker. The tamper sensors are tasked with detecting any physical attack an attacker could
mount on the device. Common classes of such sensors include \emph{tamper-sensing meshes}, i.e.\ flexible foils attached
to the HSM's enclosure that detect attempts at penetrating the shell of the device with probes, and environmental
sensors such as temperature or radiation sensors that detect attempts at causing controllable faults in the HSM by
heating, cooling or irradiating it. Out of these sensors, the tamper-sensing meshes are the core line of defense against
most physical attacks. Such meshes are very effective at mitigating almost all physical attacks, but they are difficult
to construct securely as they usually require bespoke manufacturing processes. As a result, they are currently only used
in niche applications, and even there not every realization is equally secure.
Inertial HSMs solve the issue of creating an impenetrable tamper-sensing envelope by replacing the bespoke
tamper-sensing mesh foil with a set of simple, rigid meshes made from commodity Printed Circuit Boards (PCBs) that are
rotating at high speed. In motion, these simple PCB tamper-sensing meshes are as secure as the much more sophisticated
bespoke foils used in conventional HSMs, yet they are simpler and less expensive to manufacture. To verify that the mesh
is rotating correctly, an accelerometer is placed on the rotating mesh, and its centrifugal force reading is used to
validate its path of motion.
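As a rough illustration of this check, consider an accelerometer mounted at a distance $r$ from the axis of a mesh
rotating at constant angular velocity $\omega$. In steady state, the sensor should read a radial acceleration of magnitude
\[
  a = \omega^2 r .
\]
The following numbers are assumptions chosen purely for illustration and are not parameters of any particular device.
For $r = 0.05\,\mathrm{m}$ and a speed of $1000$ revolutions per minute, we obtain
\[
  \omega = \frac{2\pi \cdot 1000}{60\,\mathrm{s}} \approx 104.7\,\mathrm{rad/s},
  \qquad
  a \approx (104.7\,\mathrm{rad/s})^2 \cdot 0.05\,\mathrm{m} \approx 548\,\mathrm{m/s^2} \approx 56\,g .
\]
Under these assumptions, a reading that departs from the expected value by more than a chosen tolerance would indicate
that the mesh has been slowed down, stopped, or moved off its intended path.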
\section{Cryptographic Principles and Physical Reality}
Cryptographers' aversion to backdoor access derives from a combination of two fundamental computing principles:
Kerckhoffs' principle, and the principle of least authority. In cryptography, Kerckhoffs' principle, named after Dutch
military cryptographer Auguste Kerckhoffs, expresses that the security of a cryptographic system should only depend on
the secrecy of its keys, not on the secrecy of its design. In this way, Kerckhoffs' principle states the opposite of the
common industry practice of \emph{Security by Obscurity}, which aims to achieve security by making it sufficiently
annoying to cryptanalyze a system that nobody bothers. Complementary to Kerckhoffs' principle is the principle of least
authority, which describes that in a secure system each component should only have access to the smallest set of
capabilities necessary to fulfill its purpose. Applying both to a cryptographic system means that the system's design
should be transparent and not include any hidden components or opaque parts that cannot be inspected, and that the
system's keys should be scoped to place the least amount of trust possible in each participating party.
Let's take a basic videoconferencing system as an example. In our example system's deployment, users log in to a central
conference server, which receives and distributes the users' video streams. Allowing backdoor access to the video
streams to some third party like a datacenter operator or a state would violate Kerckhoffs' principle since it would
have to be hidden from the system's participants, who would therefore not have a complete view of the system's deployed
architecture. The principle of least authority would also be violated since in almost all cases, such a backdoor access
system would not see legitimate use. As a result, it would possess capabilities that would almost never be essential to
the proper function of the videoconference system.
In their design, almost all modern software -- especially open source -- cleanly applies these principles. However, the
practical reality after deployment almost always deviates from them. While backdoors are vanishingly rare in modern
open-source software, practical deployments usually are vulnerable to physical attacks. Modern hardware generally is
not designed with a local attacker with advanced physical attack capabilities in mind, since no mitigation can fully
prevent such attacks: they can only be detected, or at best slowed down. As a result, commonplace attacks against modern
software often involve taking over the hardware at some point in the chain. Even End-to-End-Encrypted (E2EE)
communication systems can be compromised if one of the encrypted channel's endpoints can be physically compromised.
Corresponding \emph{digital forensics} capabilities are commonplace among state actors, and are available as a turnkey
solution on the market.
\section{Inertial HSM Applications}
Inertial HSMs are the first fully open-source HSMs with advanced tamper sensing features. Across application domains,
Inertial HSMs can be applied to gain resistance to physical attacks in scenarios where conventional HSMs were not used
because of cost, computing power or implementation effort. Where conventional HSMs come as fully integrated devices that
only expose limited APIs to their users, Inertial HSMs at their core are just an enclosure that the user can put
whatever hardware they need into. Since the simpler tamper-sensing mesh construction of IHSMs scales to larger payload
volumes, entire servers can be protected---something that is impossible with conventional HSMs. Since the mesh in an
IHSM is constantly moving, unlike a mesh in a conventional HSM, it does not have to entirely cover the payload. Instead,
it can have gaps that allow for air flow between outside and inside, enabling active cooling of the IHSM's payload. This
cooling capability sharply increases computing power by increasing feasible payload power dissipation by
two orders of magnitude.
\section{A note on terminology}
\section{Hardware Security Modules}
In this thesis, we use the term \emph{Hardware Security Module (HSM)} to refer to a security device that has the
following three properties.
\begin{enumerate}
\item A HSM targets the prevention of any conceivable physical attack. In particular, this includes intrusion attempts
such as careful drilling or cutting into the device from any direction.
\item A HSM includes tamper sensors that when triggered result in an active tamper response, usually deleting all
cryptographic secrets and rendering the device inoperable.
\item A HSM's tamper sensing and response subsystem is continuously powered from a backup power supply, usually a
battery. Loss of power triggers the tamper response.
\end{enumerate}
This use of the term \emph{HSM} aligns with common usage of the term both in the academic literature and in everyday
conversation. Particularly the requirement of active tamper detection and response is crucial to distinguish a HSM from
simpler devices such as TPMs, smart cards or secure enclaves in SoCs. Note that our use of the term HSM is slightly
different from its use in government standards, from its use in the PCI (card payment industry association) standards,
and from its industry use.
In industry, the term HSM is often used for solutions that are only logically segregated and that do not include any
particular defense against hardware attacks. Our conjecture is that this is a consequence of the standardization
landscape, where for applications outside of card payment processing the US FIPS
140-2~\cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2002} standard was central to
the industry. Despite encompassing both devices that include active tamper detection and response and devices that do
not, FIPS 140-2 did not draw a distinction in its terminology between the two classes.
\paragraph{Use in government standards}
Under US national standard FIPS 140 in its 2002 version
2~\cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2002}, a HSM would be called a
\emph{Multiple-Chip Cryptographic Module} that conforms to the standard's \emph{Security Level 4}. Interesting to note
is that only security level 4 requires any active tamper detection and response, so its security levels 3 and below do
not align with our HSM definition. Further of note is that according to the standard, a single-chip solution does not
require any tamper detection and response either to meet the standard's security level 4, which is in misalignment with
our definition. The standard's 2019 updated version FIPS
140-3~\cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2019} defers to the
international standards ISO/IEC 19790 and 24759.
ISO/IEC 19790~\cite{ISOIEC19790} and ISO/IEC 24759~\cite{ISOIEC24759} call what we call a HSM a \emph{Hardware
Cryptographic Module} corresponding to the standards' \emph{Security Level 4}. However, these standards only require
active tamper detection and response when cryptographic secrets are transmitted in plaintext between chips.
\paragraph{Use in card payment processing (PCI SSC) standards}
The Payment Card Industry Security Standards Council (PCI SSC) is an association of credit card network operators that
defines standards for all layers of card payment processing, from card payment terminals in stores through the handling of
payment data in online shop backend systems.
PCI SSC terminology aligns with our use and with common everyday use of the term HSM. In PCI SSC terminology, a HSM is a
cryptographic device that has active tamper detection and response circuitry. PCI SSC terminology differs
from our use of the term HSM in only one nuance: In PCI SSC terminology, a HSM is specifically a datacenter device used for
backend processing of payment data. The general class of ``hardware devices performing some security function with or
without particular physical security requirements'' that ISO/IEC 19790 and other standards call a \emph{Hardware
Cryptographic Module}, in PCI SSC terminology is termed \emph{Secure Cryptographic Device (SCD)} in more recent standard
versions, which was updated from the previous term \emph{Tamper-Resistant Security Module (TRSM)}. Other than HSMs, PCI
SSC includes smartcards and card payment terminals in this category. Card payment terminals, referred to as
\emph{Pin-Entry Device (PED)} in PCI SSC standards, have to include a surprising amount of active tamper detection and
response functionality including partial coverage of areas like the system's main cryptographic processor and smart
card reader by battery-backed tamper-sensing meshes.
\section{Tamper-Sensing Meshes}
In this thesis, we use the terms \emph{Tamper-Sensing Mesh} and \emph{Security Mesh} synonymously. We use both terms to
refer to any electrical circuit whose path is laid out to cover a surface with the intent of detecting attempts at
drilling, cutting or otherwise manipulating this surface. While the term \emph{Security Mesh} is more concise, it is
less clear to people unfamiliar with the matter. It is also polysemous, and depending on context can also refer to woven
or stamped metal meshes used as fences or as screens in front of windows to prevent break-ins. As a result, it is harder
to use in online searches, and when using Large Language Models (LLMs), it frequently leads to amusing hallucinations.
%In the early days of mass-market computing, the expectations towards this new tool were high. Even before people
%realized the potential of computers and the internet for commercial gain, there was widespread optimism about the
%potential of globally networked computing to liberate ideas and better humanity. People imagined a future where any
%information would be available at a mere thought, where cultural and language barriers were eroded by technological
%advances, and where technology served as a universal equalizer, narrowing socioeconomic gaps and enhancing the quality
%of life for everybody.
%
%Needless to say, things did not turn out that way. After initially, home computers and the internet were briefly the
%domain of a particular brand of free-spirited enthusiast, it did not take long until the domain was captured by
%commercial interests. The dotcom bubble inflated and burst, and the introduction of smartphones catalyzed the rise of
%the social web, bringing computing to the masses. While by itself the democratization and the widespread adoption of
%computing is a good thing, the capitalist environment caused it to coincide with an overal drift of the industry away
%from the libertarian principles that were characteristic for its beginning.
%
%Specifically, throughout the past thirty years, computing ecosystems have continued a gradual evolution into walled
%gardens, primarily serving not their users anymore, but the interests of whoever owns the place that hired the place
%that made them. While in the 90ies, owning a computer meant you would be able to run any piece of software on it,
%today's platform business model means that every program requires prior approval by the platform's owners. The publicly
%stated motivation for this gradual creep invariably is security or protection from harm by bad people writing software,
%while the actual motivation is without doubt the tremendous monetary gain an operator can obtain by seeking rent for its
%platform.
%
%The platformization of computing has captured all levels of the industry, from backend systems running on hyperscale
%cloud platforms, through desktop computers running only vendor-approved operating systems through secure boot chains, up
%to low-cost smartphones containing highly secure enclaves tasked with the protection of Digital Restrictions Management
%(DRM) keys aimed at stopping the user from copying media played back on the device. Increasingly, this trend towards
%platform owners having the ultimate authority on users's computers is becoming a practical issue in high-risk settings.
% Cypherpunks
% ACAB is a anti-authoritarian sentiment
% In anarchist discourse, "cops" are not just policemen and -women, but also other means of centralized control.
% Anarchism rejects centralized authority in favor of the freedom of individuals because it recognizes the dangers
% inherent in centralized authority
% While anarchism is one extreme of the spectrum, the dangers of centralized control are well-established.
% The constitutions of all modern democracies recognize these dangers, and contain elaborate provisions such as a
% separation of powers, and extensive protections for civil society and journalism
% While modern democratic policy rejects anarchism, it embraces it's criticism of power in some vital niches.
% Examples: Whistleblower protection, attorney-client privilege, doctor-patient confidentiality and protections on state
% agents such as judges or politicians
% Centralized authority promises efficiency, but it has a tendency to go awry.
% These sanctuaries carved out from the state's authority in democracies are vital to the functioning of the system
% In today's computing environment, we observe some parallels to this limitation of centralized authority
% In classical computing, centralized control was used abundantly to create order
% Like absolute political authority becomes dangerous when subverted, centralized control in computing becomes dangerous
% when systems are compromised through hacking.
% Allocating control can be done using cryptography
% Cryptography provides near-perfect mathematical solutions to almost any control problem
% However, as anyone who has taken an introductory crypto course knows, encrypting things isn't the hard part. The hard
% part is managing keys.
% computing solutions to these problems include: Air-gapping, separation of concerns, extreme case: HSMs and TEEs
% provide security even during compromise
% interesting parallel to state control / anarchy discourse above:
% they are secure even against the state/police if implemented correctly
% observation: competent hackers are about as competent as competent police
% observation: cannot digitally encode ethics or legal stuff, so no "good guys only" backdoors
% other applications of this principle of distrusting systems are (perfect) forward secrecy
% see signal
% however, system such as TEEs and HSMs are largely a niche solution
% while some are widely deployed, e.g. TEEs for DRM and as secure boot root of trust in phones, desktops
% they are not usually democratic. despite wide deployment authority is with their manufacturer.
% To ordinary users, these capabilities are distant
% EU regulation was necessary to force apple to open up some APIs cf. nfc payment
% normal users are shit out of luck
% Thus, we need new tools. Tools that enable normal people / small orgs to assume control of their data/keys/etc.
% we need to open up the power of TEEs to everybody
% right now, open source is often less secure than closed-source
% trusted boot rarely implemented (right) in open source
% no TEE security at all because of lack of access
% we want to create democratic, open source HSMs
% open source HSMs enable many use cases to the public and small orgs that up to now only large corps or states could do
% email encryption
% secure group messaging
% signing key servers
% secure video / audio calls
% private data storage
% things like that twitter/x protocol for pin-based key recovery
% timestamping / attestation services
% base for distributed consensus protocols
% might have applications in cryptocurrencies when operated as heterogenous cluster
% but beyond that, they enable entirely new use cases.
% conventional hsms limited in computing power, crippled for the purpose of market segmentation
% ours are much more powerful, enable much higher computation crypto such as generic smpc
% generic smpc can do things like key management, pin-based security, secret statistics etc.
% furthermore, above we noted parallel between anarchist distrust of authority and core cryptographic principles
% our hsms not only protect against classical attackers, but also against states
% can be used as democratic check and balance
% example: secure comms that cannot be accessed by the state / police
% example: secure, authenticated photo and video capture
% that's especially relevant in the age of ai
%\section{The Trust Perspective}
\section{A Motivating Counter-Example}
% EPA paper from ESORICS HS3 workshop
Looking at the landscape of computer security solutions, we are presented with a wide variety of vendors and products
that may give the impression that hardware security is a solved problem. Vendors sell various claims ranging from
\emph{You don't need hardware security, just do it in the cloud!} to \emph{Buy our HSM and you will be secure!}. In
practice, things are not as easy and even well-intentioned projects still often go awry on the hardware security
dimension. Concluding this chapter, we will now have a look at one such project that was done by capable people with the
best intentions, yet resulted in a hardware security design that is dangerously inadequate for the purpose.
Beginning May 2025, after several delays, Germany has started the nation-scale rollout of its new electronic medical
record system. The system aims to create a national database accessible to all healthcare providers that holds the
complete electronic medical records of all publicly insured people living in Germany. The system aims to replace
paper-based workflows that are error-prone and lead to healthcare providers often only having access to a subset of
a patient's medical records. Data in scope for the system includes medical letters, laboratory results, and medical
imaging files.
Due to Germany's mandatory health insurance laws, the system's user base encompasses the majority of all German
residents. People who have replaced their public health insurance with private insurance as of now are not subject to
the system. In Germany, by law private health insurance is only available to people from the top 10th percentile of
household income. This means that the system disproportionately affects people who have low income, creating an equity
issue. While it is possible to opt out of the use of the system, the process of opting out is difficult. Additionally,
the government and health insurance providers have publicly depicted the system in a one-sidedly positive way, meaning
that it is unlikely the majority of people subject to the system have a comprehensive understanding of the system's
benefits and risks that would be necessary for an informed decision.
While there has been loud criticism of the system's security from civil society organizations such as digital rights
nonprofit organization Chaos Computer Club (CCC) \cite{kochMoreMoreExperts2025} and several severe security flaws have
been demonstrated practically, this criticism has largely been ignored by the political structures in charge. We observe
that despite this civil society outrage and the system's large scale, it has received little attention from the academic
cryptography and information security community.
In this section, we aim to point out some perplexing cryptographic engineering decisions in the system. In particular,
we point out that the system's core per-user secrets are kept in a rudimentary key escrow system whose security is based
on engineering assumptions, not on cryptographic principles. Furthermore, we observe that by specification, the
individual user keys of the system are derived from a per-user cleartext salt based on a system-wide long-term secret
with only 256 bits of entropy\footnote{
In previous versions of the standard \cite{
gematikSpezifikationSchluesselgenerierungsdienstEPA2023,
gematikUebergreifendeSpezifikationVerwendung2025,
}, there were two escrow services, with both keys used in layers to reduce the risk of a compromise of either one.
The current standard only requires one escrow service, and drops the entropy requirement of the root keys from 512
bits to 256 bits. The apparent reason for the long-term nature of these keys is that they are updated manually.
}. Finally, we note that according to specification, the only physical security requirement for the protection of this
highly sensitive secret is a ``hard, opaque potting material'', with no tamper detection and response required.
We base our analysis on the system's publicly available standards in their latest version as of the writing of the paper
underlying this section in April 2025, describing version 3.0 of the healthcare record system \cite{
gematikSpezifikationAktensystemEPA2025,
gematikUbergreifendeSpezifikationVerwendung2024,
}. We note that the implementation might well deviate from these standards and be more secure. However, given the
system's history of flaws, we believe this is unlikely to be the case. The reference implementation provided by the
specification authority \cite{GithubRepositoryERPFD} follows the specified minimum requirements closely. As of now,
there is no meaningful way for either the public or for researchers such as us to ascertain the concrete implementation
security of the system.
\subsection{The Design of ePA}
ePA (short for \emph{elektronische Patientenakte}, ``electronic patient record''), is embedded into Germany's national
public healthcare backend system ``Telematikinfrastruktur'' (TI). TI is a highly complex system, and a detailed
description would exceed the limits of this analysis. Briefly put, TI consists of a shared DMZ that parties like
insurance providers and healthcare providers connect to through a VPN. At the client location, usually an individual
doctor's office or a hospital, this VPN connection is terminated by a specialized VPN appliance named ``Konnektor'' that
simultaneously acts as a trusted component inside the client network hosting some software for purposes such as
authentication. The Konnektor contains several smart cards that store keys used for authentication. Konnektor devices
are offered by several vendors, and healthcare providers like doctors' offices are individually responsible for purchasing
and maintaining a Konnektor.
% FIXME: Is there a threat/trust model of the system that you could summarise in a few sentences?
Every person enrolled in the system as well as every healthcare professional providing services under it is issued an ID
card with an embedded smart card holding the keys used to authenticate towards the central infrastructure. The primary
use of these smart cards up to now is that when someone visits a healthcare provider, they will insert their ID card
into a terminal so the healthcare provider can automatically fetch their personal information such as name, birth date,
address and enrollment status from their insurance provider.
ePA is implemented inside the TI system. Its centralized services are accessed by healthcare providers through the TI's
VPN. Patient records are encrypted and decrypted inside TI's backend systems. Smart cards authenticate parties and
hardware devices to each other. Each insurance provider picks one of several implementations of ePA's server-side
infrastructure to run for its clients. Currently, there are two approved implementations of this server-side
infrastructure.
With the current version of the specification, the overall architecture of ePA heavily relies on Trusted Execution
Environments (TEEs). Data processing on the server side is done in plaintext inside TEEs, with some cryptographic key
management delegated to a Hardware Security Module. While attacks on the TEEs are considered in the system, the HSMs are
assumed to be perfectly secure, and the system does not include mitigations for a compromised HSM. The primary
motivation for plaintext processing seems to be to enable large-scale data analysis for research purposes without
requiring consent or cooperation of the people whose records are being processed.
The primary services offered by the server side are authentication services, key escrow, and a database storing the
encrypted records themselves. Records are symmetrically encrypted with keys that are derived from system-wide secrets
inside an HSM. The primary motivation behind the use of a key escrow service seems to be to enable the creation of a
duplicate patient ID smartcard in case a person loses theirs. While the current version of the standard is unclear on
the exact mechanism of key derivation, in previous versions of the standard, the escrow service's root key, a random
salt, and the healthcare ID number of the person owning the record were used in SHA256-HKDF. The specification requires
that a new root key is generated once a year, but as far as we can tell, record key rollover is not done automatically
but is only meant to be done when the \emph{user} requests it, and old root keys must be retained forever to ensure old
records can be accessed.
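As a sketch of the derivation used in the previous versions of the standard, and assuming the conventional HKDF
parameter roles (the notation and the assignment of inputs to parameters are our own assumption, not taken from the
specification), the record key $k_{\mathrm{rec}}$ can be written as
\[
  k_{\mathrm{rec}} = \mathrm{HKDF}_{\mathrm{SHA256}}\bigl(\mathit{salt},\; k_{\mathrm{root}},\; \mathit{id}\bigr),
\]
where $k_{\mathrm{root}}$ is the escrow service's root key acting as input keying material, $\mathit{salt}$ is the
random salt, and $\mathit{id}$ is the healthcare ID number acting as context information. Under this reading, the root
key is the only secret input that is shared across records, so any party that learns $k_{\mathrm{root}}$ and the salt
can recompute the record key for an arbitrary healthcare ID number.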
\subsection{Related Work}
The state-owned company specifying the system commissioned several security assessments of the system relating to the
key escrow service. \textcite{fischlinKryptographischeAnalyseSpezifikation2021} focuses on the cryptographic
dimension of the key escrow service used in an older version of the standard, and is now obsolete.
\textcite{slanySicherheitsanalyseZurSicherheit2020} approaches the system at a higher level, and focuses on the
cryptography of the inner protocol layers spoken between the system's components. Industry research organization
Fraunhofer SIT was commissioned to conduct a structured, theoretical assessment of attack paths into the system
\cite{fraunhofersitAbschlussberichtSicherheitsanalyseGesamtsystems2024}. We are not currently aware of
independent academic security research on the system.
The design and operation of the system have been independently described in detail by civil society activists, who have
demonstrated several successful attacks on the system. \textcite{tschirsichHackerHinOder0100} demonstrated how they
could trivially acquire each of the smartcards as well as the Konnektor necessary for accessing the system.
\textcite{tschirsichKonnteBisherNoch0100} summarize the history of attacks demonstrated on the system and show multiple
practical attacks on various parts of the system's implementation.
\subsection{Concerning Cryptographic Engineering Choices}
We wish to highlight some of the design choices in the system that we believe stray from current best practice. This is
by no means an exhaustive list, and is only meant to underscore why we believe the system deserves more scrutiny.
\subsubsection{Use of Key Escrow}
First, the system's general approach of using a key escrow service instead of securely storing the keys inside the
system's already existing smart card infrastructure is concerning, given that this key escrow service poses a
centralized security risk. The system's designers made this decision since it was deemed important that access to an
encrypted record can be restored quickly after an insurance ID card is lost, without requiring the cooperation of the
healthcare providers holding the primary copies of the person's medical records.
While key escrow services have been a topic of political debate in decades past, in the cryptographic community,
consensus generally is that they are a bad idea since they pose a centralized target for attack, and increase attack
surface \cite{
abelsonRisksKeyRecovery1997,
abelsonKeysDoormats2015,
andersonSecurityEngineeringGuide2020,
}.
\subsubsection{Cryptographic Design}
The system's overall cryptographic design is intentionally kept simple. The standard explicitly mentions that symmetric
primitives have been preferred over asymmetric primitives in the core key escrow functions due to the risk of an attack
on asymmetric primitives in the long term. Notably, other advanced cryptographic techniques such as secret sharing
schemes, oblivious pseudo-random functions, or multiparty computation that could help with the security and privacy of
the key escrow service by reducing trust placed in any single component of the service are also absent, while the system
relies extensively on the engineering-based security guarantees of TEEs and HSMs. Given that the ePA system trusts its
HSMs as unconditionally secure, it is unclear what purpose the manual yearly root key renewal serves, especially absent
an automatic way to roll over the wrapped record keys.
A consequence of the system's simple cryptographic design is that the system trusts its components to a large degree.
For instance, the system leaks a person's insurance ID number to the key escrow HSM every time record keys are
requested. Along with the timing and frequency of these requests, this leaks information on the person's condition to
the key escrow service in an identifiable way.
% TODO I feel that this section is a mix-up of critique on the cryptographic design and the approach to privacy
% protection and data minimisation. How are they linked? I'm missing some discussion here.
\subsubsection{A Realistic Attacker Model}
We observe that the system as a whole does not appear to be designed to defend against well-resourced adversaries. The
series of practical attacks that have been demonstrated on the system confirms this impression.
\textcite{tschirsichKonnteBisherNoch0100} summarize a series of successful attacks. These include social engineering
that yields copies of smartcards granting access to patient records, exploiting misconfigured Konnektor VPN
appliances with their LAN DMZ and authentication interface exposed on the public internet, circumventing video-based
authentication processes so that duplicate file keys are issued, classic SQL injection on a backend service
maintaining an authentication database, accessing all national patient records through brute-force enumeration of weak
identifiers, and several more.
We believe that a system like this must be designed to withstand well-resourced adversaries such as enemy secret
services, since the medical data stored in it, such as information on chronic illness, sexually transmitted diseases, or
severe food allergies, has intelligence value. Repeated breaches of national digital infrastructure such as the 2015
breach of the US Office of Personnel Management \cite{barrettUSSuspectsHackers2015} or the 2024 compromise of US
telecommunications wiretapping systems \cite{mennChineseGovernmentHackers2024} demonstrate that such state-sponsored
attacks on national digital infrastructure are a realistic concern. A possible scenario in the ePA system would be an
enemy secret service gaining access to one of the HSMs storing the system's root secrets, extracting the root secret by
an advanced physical attack, then being able to decrypt captured encrypted health records at will. Similarly, a
nation-state adversary might have access to an exploit allowing the compromise of the system's TEEs, which would enable
the extraction of any patient records being processed in plaintext inside these TEEs.
\subsubsection{Physical Security}
Physical security has received some consideration in the system's specification. First, smart cards are used extensively
for authentication. Second, Hardware Security Modules are used in key locations of the system to process some
cryptographic secrets. The core of the system's key escrow service is implemented inside an HSM. However, it is notable
that the actual security level required for this HSM is only FIPS 140-2 level
3 \cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2002}. Not only has FIPS 140-2
been superseded by FIPS 140-3 since
2019 \cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2019}, its security level 3
mostly provides logical separation of cryptographic functions from other logic and is not very meaningful in the context
of physical attacks. The only physical requirement of FIPS 140-2 level 3 is that the HSM has a hard, opaque coating.
This coating is specified to be tamper-evident, but notably no active tamper detection or response features are required
by this standard. In contrast to the newer FIPS 140-3 standard and the related ISO/IEC 19790 \cite{ISOIEC19790} as well
as ISO/IEC 24759 \cite{ISOIEC24759} standards, FIPS 140-2 does not impose any particular requirements regarding resistance
to side-channel attacks. The lack of tamper response, unspecified resistance to side-channel attacks and the fact that
the ePA specification only requires the long-lived key escrow root key inside the HSM to have 256 bits of entropy lead
to an unsatisfactory overall constellation.
\subsection{Conclusion}
In conclusion, we observe that in Germany's ePA national medical record database, despite the decade-long
standardization and implementation process, several cryptographic compromises ended up in the system's final deployment.
Even assuming that nation-scale key escrow is a good idea, the implementation of this key escrow system seems to stray
from current best practice. The system uses a secret key with only 256 bits of entropy to derive highly sensitive secret
keys for potentially tens of millions of people sharing an insurance provider. The cryptographic design of this escrow
system is unsophisticated, ignoring the past three decades of cryptographic developments, particularly in multiparty
computation (MPC) and other secret sharing techniques in favor of an engineering approach. In the engineering dimension,
the system's physical security is only held to the basic level 3 of the obsolete FIPS 140-2 standard, which is
considerably less secure than an average credit card payment terminal. The system's root keys are only protected by a
``hard, opaque potting material'' and no tamper detection and response is required. We estimate that the system poses an
attractive and soft target to nation-state adversaries. The system's shortcomings are made more severe by the fact that
the system disproportionately affects the lives of people with low income.
%FIXME work in rogawayMoralCharacterCryptographic?
% FIXME "draw an arc" does that work as an idiom here?
Drawing a wider arc, we observe that despite ample availability of commercial solutions promising easy hardware
security, clearly there is still a lack of solutions that provide the adaptability necessary for some real use cases at
low enough cost. By publishing the tamper-sensing technology we developed during the making of this thesis as open
source hardware designs, we wish to provide this missing building block and enable high-level hardware security in
real-world applications. Our hardware designs can be adapted to devices ranging from Single-Board Computers (SBCs) to
servers; they are compatible with non-computing applications like Quantum Key Distribution (QKD), and their design
approaches can even be integrated into existing HSM designs to provide better security at little additional cost.
% FIXME FIXME FIXME chapter overview