\chapterquote{Meredith Whittaker~\cite{greenbergSignalMoreEncrypted2024}}{
It’s not for lack of ideas or possibilities. It’s that we actually have to start taking seriously the shifts that
are going to be required to do this thing—to build tech that rejects surveillance and centralized control—whose
necessity is now obvious to everyone.
}
\chaptertitle{Introduction}

All Cops Are Bastards, or ACAB, is a slogan popular in far-left and anarchist circles since the mid-twentieth century
that expresses a rejection of state authority~\cite{constantinouAppliedResearchPolicing2021}. While politically, this
blanket rejection is a fringe viewpoint with no mainstream acceptance, there exists an interesting parallel between it
and modern cryptographic best practice. In modern cryptography, it is generally considered best practice to involve as
few keys as possible in any computation, and cryptographers have time and time again strongly rejected
attempts by states and other authorities to insert backdoor access mechanisms into cryptographic systems~\cite{
abelsonRisksKeyRecovery1997,
abelsonKeysDoormats2015,
andersonSecurityEngineeringGuide2020,
}.

The aversion of cryptographers to backdoor access shows up everywhere. From cryptographic protocol standards like
TLS to cryptographic applications like the Signal messenger, not only is backdoor access excluded from the system
design, but its possibility is considered a potential vulnerability, and measures such as forward secrecy and
post-compromise security are taken to mitigate its impact when it is achieved through other means. In computing, this
design aspect makes cryptographic protocols a unique holdout. In other parts of the stack, explicit or implicit backdoor
access is commonplace, and attempts at preventing it are rare. For instance, network providers are generally required to
comply with so-called \emph{Lawful Interception} orders on particular customers or traffic types, and datacenter
operators commonly provide hardware access to state authorities. The design decisions in cryptographic protocols
generally hold, and as a result the gold standard for backdoor access to modern systems is either exploiting a
\emph{zero-day} flaw that is not yet publicly known, or acquiring physical access to the target system.

In this thesis, we wish to extend the level of protection afforded by cryptographic protocol design down the technology
stack. While cryptographic protocols and modern software from the operating system up make it possible to secure the
software side of the stack to a high level, the hardware side remains poorly protected. There are a variety of hardware
security solutions in the wild, but the majority of them either do not target protection against local, physical attacks
-- such as Trusted Platform Modules (TPMs) -- or are not widely available due to market segmentation or cost -- such as
conventional Hardware Security Modules (HSMs).

To this end, we propose the Inertial Hardware Security Module (IHSM), a new type of HSM that carries the
high level of protection offered by the modern cryptographic software stack down to the hardware level, enabling secure
computation in insecure places. We chose to publish all of our IHSM work as open source and unencumbered by patents to
enable widespread adoption. IHSMs can be custom built at small scale with only basic manufacturing capabilities, and
they make secure computation in insecure places accessible even to small organizations such as university research
departments, NGOs and small businesses.

Recent history has shown that state-level adversaries are a mounting threat to civil rights organizations, human rights
lawyers, members of minorities, and many others. While Western democracies used to be considered safe havens of human
rights, today human rights are under attack both from within and from the outside in countries across the globe.
By publishing IHSM technology as open source, we hope to provide one building block for new computing systems that are
accessible to all and that are resilient and secure in the face of growing adversity.

Complementing our IHSM concept and prototype, we provide solutions to engineering issues such as wireless power
transfer, adapted to our use case. Further, we propose improvements to the state of the art in HSM tamper sensors, such
as the use of low-cost, embeddable Time-Domain Reflectometry (TDR), that not only improve the security of IHSMs but
can even be applied to conventional HSMs. We conclude this thesis with an overview of two concrete use cases that IHSMs
unlock and that were previously infeasible using conventional HSMs: datacenter-scale Secure Multiparty Computation
(SMPC) and long-range Quantum Key Distribution (QKD) networks.
\section{Building Inertial HSMs}

In a system with a secure software stack, the role of a HSM is to secure the hardware part of the stack. The basic
approach of a HSM is to combine a secure software stack with a fast self-destruct mechanism and tamper sensors. The
self-destruct mechanism can be hardware or software that quickly and securely destroys all cryptographic secrets,
rendering the device worthless to an attacker. The tamper sensors are tasked with detecting any physical attack an
attacker could mount on the device. Common classes of such sensors include \emph{tamper-sensing meshes}, i.e.\ flexible
foils attached to the HSM's enclosure that detect attempts at penetrating the shell of the device with probes, and
environmental sensors such as temperature or radiation sensors that detect attempts at causing controllable faults in
the HSM by heating, cooling or irradiating it. Out of these sensors, the tamper-sensing meshes are the core line of
defense against most physical attacks. Such meshes are very effective at mitigating almost all physical attacks, but
they are difficult to construct securely as they usually require bespoke manufacturing processes. As a result, they are
currently only used in niche applications, and even there not every realization is equally secure.

Inertial HSMs solve the issue of creating an impenetrable tamper-sensing envelope by replacing the bespoke
tamper-sensing mesh foil with a set of simple, rigid meshes made from commodity Printed Circuit Boards (PCBs) that
rotate at high speed. In motion, these simple PCB tamper-sensing meshes are as secure as the much more sophisticated
bespoke foils used in conventional HSMs, yet they are simpler and less expensive to manufacture. To verify that the mesh
is rotating correctly, an accelerometer is placed on the rotating mesh, and its reading of the centrifugal acceleration
is used to validate the mesh's path of motion.
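
As a rough illustration of this check, consider an accelerometer mounted at radius $r$ from the axis of a mesh rotating
at frequency $f$. The symbols and the numbers used here are illustrative assumptions and are not taken from the
prototype. The acceleration expected at the sensor is
\[
    a = \omega^2 r = (2 \pi f)^2 r .
\]
For example, at $f = 1000\,\mathrm{rpm} \approx 16.7\,\mathrm{Hz}$ and $r = 50\,\mathrm{mm}$, we obtain
$\omega \approx 104.7\,\mathrm{rad/s}$ and $a \approx 548\,\mathrm{m/s^2}$, or roughly $56\,g$. A sustained deviation of
the measured value from this expectation would indicate that the mesh has been slowed down, stopped or otherwise
disturbed.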
\section{Cryptographic Principles and Physical Reality}

Cryptographers' aversion to backdoor access derives from a combination of two fundamental computing principles:
Kerckhoffs' principle, and the principle of least authority. In cryptography, Kerckhoffs' principle, named after Dutch
military cryptographer Auguste Kerckhoffs, expresses that the security of a cryptographic system should only depend on
the secrecy of its keys, not on the secrecy of its design. In this way, Kerckhoffs' principle states the opposite of the
common industry practice of \emph{Security by Obscurity}, which aims to achieve security by making it sufficiently
annoying to cryptanalyze a system that nobody bothers. Complementary to Kerckhoffs' principle is the principle of least
authority, which states that in a secure system each component should only have access to the smallest set of
capabilities necessary to fulfill its purpose. Applying both to a cryptographic system means that the system's design
should be transparent and not include any hidden components or opaque parts that cannot be inspected, and that the
system's keys should be scoped to place the least amount of trust possible in each participating party.

Let's take a basic videoconferencing system as an example. In our example system's deployment, users log in to a central
conference server, which receives and distributes the users' video streams. Allowing backdoor access to the video
streams to some third party like a datacenter operator or a state would violate Kerckhoffs' principle since it would
have to be hidden from the system's participants, who would therefore not have a complete view of the system's deployed
architecture. The principle of least authority would also be violated since in almost all cases, such a backdoor access
system would not see legitimate use. As a result, it would possess capabilities that would almost never be essential to
the proper function of the videoconference system.

In its design, almost all modern software -- especially open source -- cleanly applies these principles. However, the
practical reality after deployment almost always deviates from them. While backdoors are vanishingly rare in modern
open-source software, practical deployments usually are vulnerable to physical attacks. Modern hardware generally is
not designed with a local attacker with advanced physical attack capabilities in mind, since no mitigation can fully
prevent such attacks; they can only be detected or, at best, slowed down. As a result, commonplace attacks against
modern software often involve taking over the hardware at some point in the chain. Even End-to-End-Encrypted (E2EE)
communication systems can be compromised if one of the encrypted channel's endpoints can be physically compromised.
Corresponding \emph{digital forensics} capabilities are commonplace among state actors, and are available as a turnkey
solution on the market.
\section{Inertial HSM Applications}

Inertial HSMs are the first fully open-source HSMs with advanced tamper sensing features. Across application domains,
Inertial HSMs can be applied to gain resistance to physical attacks in scenarios where conventional HSMs were not used
because of cost, computing power or implementation effort. Where conventional HSMs come as fully integrated devices that
only expose limited APIs to their users, Inertial HSMs at their core are just an enclosure that the user can put
whatever hardware they need into. Since the simpler tamper-sensing mesh construction of IHSMs scales to larger payload
volumes, entire servers can be protected, something that is impossible with conventional HSMs. Since the mesh in an
IHSM is constantly moving, unlike a mesh in a conventional HSM, it does not have to entirely cover the payload. Instead,
it can have gaps that allow for air flow between outside and inside, enabling active cooling of the IHSM's payload. This
cooling capability sharply increases the available computing power by raising the feasible payload power dissipation by
two orders of magnitude.
\section{A Note on Terminology}

\section{Hardware Security Modules}

In this thesis, we use the term \emph{Hardware Security Module (HSM)} to refer to a security device that has the
following three properties.
\begin{enumerate}
    \item A HSM targets the prevention of any conceivable physical attack. In particular, this includes intrusion
        attempts such as careful drilling or cutting into the device from any direction.
    \item A HSM includes tamper sensors that when triggered result in an active tamper response, usually deleting all
        cryptographic secrets and rendering the device inoperable.
    \item A HSM's tamper sensing and response subsystem is continuously powered from a backup power supply, usually a
        battery. Loss of power triggers the tamper response.
\end{enumerate}

This use of the term \emph{HSM} aligns with common usage of the term both in the academic literature and in everyday
conversation. In particular, the requirement of active tamper detection and response is crucial to distinguish a HSM
from simpler devices such as TPMs, smart cards or secure enclaves in SoCs. Note that our use of the term HSM is slightly
different from its use in government standards, from its use in the PCI (card payment industry association) standards,
and from its industry use.

In industry, the term HSM is often used for solutions that are only logically segregated and that do not include any
particular defense against hardware attacks. Our conjecture is that this is a consequence of the standardization
landscape, where for applications outside of card payment processing the US FIPS
140-2~\cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2002} standard was central to
the industry. Despite encompassing both devices that include active tamper detection and response and devices that do
not, FIPS 140-2 did not draw a distinction in its terminology between the two classes.
\paragraph{Use in government standards}

Under the US national standard FIPS 140 in its 2002 version, FIPS
140-2~\cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2002}, a HSM would be called a
\emph{Multiple-Chip Cryptographic Module} that conforms to the standard's \emph{Security Level 4}. It is interesting to
note that only Security Level 4 requires any active tamper detection and response, so the standard's Security Levels 3
and below do not align with our HSM definition. Further of note is that according to the standard, a single-chip
solution does not require any tamper detection and response to meet Security Level 4 either, which is at odds with
our definition. The standard's updated 2019 version, FIPS
140-3~\cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2019}, defers to the
international standards ISO/IEC 19790 and 24759.

ISO/IEC 19790~\cite{ISOIEC19790} and ISO/IEC 24759~\cite{ISOIEC24759} refer to what we call a HSM as a \emph{Hardware
Cryptographic Module} conforming to the standards' \emph{Security Level 4}. However, these standards only require
active tamper detection and response when cryptographic secrets are transmitted in plaintext between chips.
\paragraph{Use in card payment processing (PCI SSC) standards}

The Payment Card Industry Security Standards Council (PCI SSC) is an association of credit card network operators that
defines standards for all layers of card payment processing, from card payment terminals in stores through the handling
of payment data in online shop backend systems.

PCI SSC terminology aligns with our use and with common everyday use of the term HSM. In PCI SSC terminology, a HSM is a
cryptographic device that has active tamper detection and response circuitry. PCI SSC terminology differs
from our use of the term HSM in only one nuance: In PCI SSC terminology, a HSM is specifically a datacenter device used
for backend processing of payment data. The general class of ``hardware devices performing some security function with
or without particular physical security requirements'' that ISO/IEC 19790 and other standards call a \emph{Hardware
Cryptographic Module} is termed \emph{Secure Cryptographic Device (SCD)} in more recent versions of the PCI SSC
standards, replacing the earlier term \emph{Tamper-Resistant Security Module (TRSM)}. Besides HSMs, PCI
SSC includes smartcards and card payment terminals in this category. Card payment terminals, referred to as
\emph{PIN Entry Device (PED)} in PCI SSC standards, have to include a surprising amount of active tamper detection and
response functionality, including partial coverage of areas like the system's main cryptographic processor and smart
card reader by battery-backed tamper-sensing meshes.
\section{Tamper-Sensing Meshes}

In this thesis, we use the terms \emph{Tamper-Sensing Mesh} and \emph{Security Mesh} synonymously. We use both terms to
refer to any electrical circuit whose path is laid out to cover a surface with the intent of detecting attempts at
drilling, cutting or otherwise manipulating this surface. While the term \emph{Security Mesh} is more concise, it is
less clear to people unfamiliar with the matter. It is also polysemous, and depending on context can also refer to woven
or stamped metal meshes used as fences or as screens in front of windows to prevent break-ins. As a result, it is harder
to use in online searches, and when using Large Language Models (LLMs), it frequently leads to amusing hallucinations.
%In the early days of mass-market computing, the expectations towards this new tool were high. Even before people
%realized the potential of computers and the internet for commercial gain, there was widespread optimism about the
%potential of globally networked computing to liberate ideas and better humanity. People imagined a future where any
%information would be available at a mere thought, where cultural and language barriers were eroded by technological
%advances, and where technology served as a universal equalizer, narrowing socioeconomic gaps and enhancing the quality
%of life for everybody.
%
%Needless to say, things did not turn out that way. After home computers and the internet were briefly the
%domain of a particular brand of free-spirited enthusiast, it did not take long until the domain was captured by
%commercial interests. The dotcom bubble inflated and burst, and the introduction of smartphones catalyzed the rise of
%the social web, bringing computing to the masses. While by itself the democratization and the widespread adoption of
%computing is a good thing, the capitalist environment caused it to coincide with an overall drift of the industry away
%from the libertarian principles that were characteristic of its beginning.
%
%Specifically, throughout the past thirty years, computing ecosystems have continued a gradual evolution into walled
%gardens, primarily serving not their users anymore, but the interests of whoever owns the place that hired the place
%that made them. While in the 1990s, owning a computer meant you would be able to run any piece of software on it,
%today's platform business model means that every program requires prior approval by the platform's owners. The publicly
%stated motivation for this gradual creep invariably is security or protection from harm by bad people writing software,
%while the actual motivation is without doubt the tremendous monetary gain an operator can obtain by seeking rent for its
%platform.
%
%The platformization of computing has captured all levels of the industry, from backend systems running on hyperscale
%cloud platforms, through desktop computers running only vendor-approved operating systems through secure boot chains, up
%to low-cost smartphones containing highly secure enclaves tasked with the protection of Digital Restrictions Management
%(DRM) keys aimed at stopping the user from copying media played back on the device. Increasingly, this trend towards
%platform owners having the ultimate authority over users' computers is becoming a practical issue in high-risk settings.
% Cypherpunks
% ACAB is an anti-authoritarian sentiment
% In anarchist discourse, "cops" are not just policemen and -women, but also other means of centralized control.
% Anarchism rejects centralized authority in favor of the freedom of individuals because it recognizes the dangers
% inherent in centralized authority

% While anarchism is one extreme of the spectrum, the dangers of centralized control are well-established.
% The constitutions of all modern democracies recognize these dangers, and contain elaborate provisions such as a
% separation of powers, and extensive protections for civil society and journalism
% While modern democratic policy rejects anarchism, it embraces its criticism of power in some vital niches.
% Examples: Whistleblower protection, attorney-client privilege, doctor-patient confidentiality and protections on state
% agents such as judges or politicians

% Centralized authority promises efficiency, but it has a tendency to go awry.
% These sanctuaries carved out from the state's authority in democracies are vital to the functioning of the system
% In today's computing environment, we observe some parallels to this limitation of centralized authority
% In classical computing, centralized control was used abundantly to create order
% Like absolute political authority becomes dangerous when subverted, centralized control in computing becomes dangerous
% when systems are compromised through hacking.
% Allocating control can be done using cryptography
% Cryptography provides near-perfect mathematical solutions to almost any control problem
% However, as anyone who has taken an introductory crypto course knows, encrypting things isn't the hard part. The hard
% part is managing keys.

% computing solutions to these problems include: Air-gapping, separation of concerns, extreme case: HSMs and TEEs
% provide security even during compromise
% interesting parallel to state control / anarchy discourse above:
% they are secure even against the state/police if implemented correctly
% observation: competent hackers are about as competent as competent police
% observation: cannot digitally encode ethics or legal stuff, so no "good guys only" backdoors

% other applications of this principle of distrusting systems are (perfect) forward secrecy
% see signal
% however, system such as TEEs and HSMs are largely a niche solution
% while some are widely deployed, e.g. TEEs for DRM and as secure boot root of trust in phones, desktops
% they are not usually democratic. despite wide deployment authority is with their manufacturer.
% To ordinary users, these capabilities are distant
% EU regulation was necessary to force apple to open up some APIs cf. nfc payment
% normal users are shit out of luck

% Thus, we need new tools. Tools that enable normal people / small orgs to assume control of their data/keys/etc.
% we need to open up the power of TEEs to everybody
% right now, open source is often less secure than closed-source
% trusted boot rarely implemented (right) in open source
% no TEE security at all because of lack of access
% we want to create democratic, open source HSMs

% open source HSMs enable many use cases to the public and small orgs that up to now only large corps or states could do
% email encryption
% secure group messaging
% signing key servers
% secure video / audio calls
% private data storage
% things like that twitter/x protocol for pin-based key recovery
% timestamping / attestation services
% base for distributed consensus protocols
% might have applications in cryptocurrencies when operated as heterogeneous cluster

% but beyond that, they enable entirely new use cases.
% conventional hsms limited in computing power, crippled for the purpose of market segmentation
% ours are much more powerful, enable much higher computation crypto such as generic smpc
% generic smpc can do things like key management, pin-based security, secret statistics etc.
% furthermore, above we noted parallel between anarchist distrust of authority and core cryptographic principles
% our hsms not only protect against classical attackers, but also against states
% can be used as democratic check and balance
% example: secure comms that cannot be accessed by the state / police
% example: secure, authenticated photo and video capture
% that's especially relevant in the age of ai

%\section{The Trust Perspective}
\section{A Motivating Counter-Example}

% EPA paper from ESORICS HS3 workshop

Looking at the landscape of computer security solutions, we are presented with a wide variety of vendors and products
that may give the impression that hardware security is a solved problem. Vendors make various claims ranging from
\emph{You don't need hardware security, just do it in the cloud!} to \emph{Buy our HSM and you will be secure!}. In
practice, things are not as easy, and even well-intentioned projects still often go awry on the hardware security
dimension. Concluding this chapter, we will now have a look at one such project that was carried out by capable people
with the best intentions, yet resulted in a hardware security design that is dangerously inadequate for the purpose.

In May 2025, after several delays, Germany started the nation-scale rollout of its new electronic medical
record system. The system aims to create a national database accessible to all healthcare providers that holds the
complete electronic medical records of all publicly insured people living in Germany. It is meant to replace
paper-based workflows that are error-prone and lead to healthcare providers often only having access to a subset of
a patient's medical records. Data in scope for the system includes medical letters, laboratory results, and medical
imaging files.

Due to Germany's mandatory health insurance laws, the system's user base encompasses the majority of all German
residents. People who have replaced their public health insurance with private insurance are, as of now, not subject to
the system. In Germany, by law private health insurance is only available to people in the top ten percent of
household income. This means that the system disproportionately affects people who have low income, creating an equity
issue. While it is possible to opt out from the use of the system, the process of opting out is difficult. Additionally,
the government and health insurance providers have publicly depicted the system in a one-sidedly positive way, meaning
that it is unlikely the majority of people subject to the system have the comprehensive understanding of the system's
benefits and risks that would be necessary for an informed decision.

While there has been loud criticism of the system's security from civil society organizations such as digital rights
nonprofit organization Chaos Computer Club (CCC)~\cite{kochMoreMoreExperts2025} and several severe security flaws have
been demonstrated practically, this criticism has largely been ignored by the political structures in charge. We observe
that despite this civil society outrage and the system's large scale, it has received little attention from the academic
cryptography and information security community.

In this section, we aim to highlight some perplexing cryptographic engineering decisions in the system. In particular,
we point out that the system's core per-user secrets are kept in a rudimentary key escrow system whose security is based
on engineering assumptions, not on cryptographic principles. Furthermore, we observe that by specification, the
individual user keys of the system are derived from a system-wide long-term secret with only 256 bits of entropy, using
a per-user salt that is stored in cleartext\footnote{
    In previous versions of the standard~\cite{
    gematikSpezifikationSchluesselgenerierungsdienstEPA2023,
    gematikUebergreifendeSpezifikationVerwendung2025,
    }, there were two escrow services, with both keys used in layers to reduce the risk of a compromise of either one.
    The current standard only requires one escrow service, and drops the entropy requirement of the root keys from 512
    bits to 256 bits. The apparent reason for the long-term nature of these keys is that they are updated manually.
}. Finally, we note that according to specification, the only physical security requirement for the protection of this
highly sensitive secret is a ``hard, opaque potting material'', with no tamper detection and response required.

We base our analysis on the system's publicly available standards in their latest version as of the writing of the paper
underlying this section in April 2025, describing version 3.0 of the healthcare record system~\cite{
gematikSpezifikationAktensystemEPA2025,
gematikUbergreifendeSpezifikationVerwendung2024,
}. We note that the implementation might well deviate from these standards and be more secure. However, given the
system's history of flaws, we believe this is unlikely to be the case. The reference implementation provided by the
specification authority~\cite{GithubRepositoryERPFD} follows the specified minimum requirements closely. As of now,
there is no meaningful way for either the public or for researchers such as us to ascertain the concrete implementation
security of the system.
\subsection{The Design of ePA}

ePA (short for \emph{elektronische Patientenakte}, ``electronic patient record'') is embedded into Germany's national
public healthcare backend system ``Telematikinfrastruktur'' (TI). TI is a highly complex system, and a detailed
description would exceed the limits of this analysis. Briefly put, TI consists of a shared DMZ that parties like
insurance providers and healthcare providers connect to through a VPN. At the client location, usually an individual
doctor's office or a hospital, this VPN connection is terminated by a specialized VPN appliance named ``Konnektor'' that
simultaneously acts as a trusted component inside the client network, hosting some software for purposes such as
authentication. The Konnektor contains several smart cards that store keys used for authentication. Konnektor devices
are offered by several vendors, and healthcare providers like doctor's offices are individually responsible for
purchasing and maintaining a Konnektor.
% FIXME: Is there a threat/trust model of the system that you could summarise in a few sentences?
|
||
|
||
Every person enrolled in the system as well as every healthcare professional providing services under it is issued an ID
|
||
card that contains a smart card that contains keys used to authenticate towards the central infrastructure. The primary
|
||
use of these smart cards up to now is that when someone visits a healthcare provider, they will insert their ID card
|
||
into a terminal so the healthcare provider can automatically fetch their personal information such as name, birth date,
|
||
address and enrollment status from their insurance provider.
|
||
|
||
ePA is implemented inside the TI system. Its centralized services are accessed by healthcare providers through the TI's
|
||
VPN. Patient records are encrypted and decrypted inside TI's backend systems. Smart cards authenticate parties and
|
||
hardware devices to each other. Each insurance provider picks one of several implementations of ePA's server-side
|
||
infrastructure to run for its clients. Currently, there are two approved implementations of this server-side
|
||
infrastructure.
|
||
|
||
With the current version of the specification, the overall architecture of ePA relies heavily on Trusted Execution
Environments (TEEs). Data processing on the server side is done in plaintext inside TEEs, with some cryptographic key
management delegated to a Hardware Security Module (HSM). While attacks on the TEEs are considered in the system, the
HSMs are assumed to be perfectly secure, and the system does not include mitigations for a compromised HSM. The primary
motivation for plaintext processing seems to be to enable large-scale data analysis for research purposes without
requiring consent or cooperation of the people whose records are being processed.

The primary services offered by the server side are authentication services, key escrow, and a database storing the
encrypted records themselves. Records are symmetrically encrypted with keys that are derived from system-wide secrets
inside an HSM. The primary motivation behind the use of a key escrow service seems to be to enable the creation of a
duplicate patient ID smart card in case a person loses theirs. While the current version of the standard is unclear on
the exact mechanism of key derivation, in previous versions of the standard, the escrow service's root key, a random
salt, and the healthcare ID number of the person owning the record were used in SHA256-HKDF. The specification requires
that a new root key be generated once a year, but as far as we can tell, record key rollover is not done automatically
and is only meant to be done when the \emph{user} requests it, and old root keys must be retained forever to ensure
that old records remain accessible.

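As a rough sketch of the derivation described for previous versions of the standard, assume the conventional HKDF
parameter roles of RFC~5869; the specification's actual assignment of inputs may differ. The symbols are our own
notation: $k_{\mathrm{root}}$ is the escrow service's root key, $s$ the random salt, $\mathit{id}$ the healthcare ID
number, and $\mathrm{HMAC}(k, m)$ denotes HMAC-SHA256 of message $m$ under key $k$. A 256-bit record key
$k_{\mathrm{rec}}$ would then be computed as
\[
  \mathit{prk} = \mathrm{HMAC}(s, k_{\mathrm{root}}), \qquad
  k_{\mathrm{rec}} = \mathrm{HMAC}(\mathit{prk}, \mathit{id} \,\|\, \mathtt{0x01}).
\]
Under this reading, every record key is a deterministic function of the root key, so a party that learns
$k_{\mathrm{root}}$ can recompute the record key of any person whose salt and ID number it knows.
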
\subsection{Related Work}

The state-owned company specifying the system commissioned several security assessments of the system relating to the
key escrow service. \textcite{fischlinKryptographischeAnalyseSpezifikation2021} focuses on the cryptographic
dimension of the key escrow service used in an older version of the standard, and is now obsolete.
\textcite{slanySicherheitsanalyseZurSicherheit2020} approaches the system at a higher level, and focuses on the
cryptography of the inner protocol layers spoken between the system's components. The industry research organization
Fraunhofer SIT was commissioned to conduct a structured, theoretical assessment of attack paths into the system
\cite{fraunhofersitAbschlussberichtSicherheitsanalyseGesamtsystems2024}. We are currently not aware of any
independent academic security research on the system.

The design and operation of the system have been independently described in detail by civil society activists, who have
demonstrated several successful attacks on the system. \textcite{tschirsichHackerHinOder0100} demonstrated how they
could trivially acquire each of the smartcards as well as the Konnektor necessary for accessing the system.
\textcite{tschirsichKonnteBisherNoch0100} summarize the history of attacks demonstrated on the system and show multiple
practical attacks on various parts of the system's implementation.

\subsection{Concerning Cryptographic Engineering Choices}

We wish to highlight some of the design choices in the system that we believe stray from current best practice. This is
by no means an exhaustive list, and is only meant to underscore why we believe the system deserves more scrutiny.

\subsubsection{Use of Key Escrow}

First, the system's general approach of using a key escrow service instead of securely storing the keys inside the
system's already existing smart card infrastructure is concerning, given that this key escrow service poses a
centralized security risk. The system's designers made this decision since it was deemed important that access to an
encrypted record can be restored quickly after an insurance ID card is lost, without requiring the cooperation of the
healthcare providers holding the primary copies of the person's medical records.

While key escrow services have been a topic of political debate in decades past, the general consensus in the
cryptographic community is that they are a bad idea, since they present a centralized target for attack and increase
the attack surface \cite{
abelsonRisksKeyRecovery1997,
abelsonKeysDoormats2015,
andersonSecurityEngineeringGuide2020,
}.

\subsubsection{Cryptographic Design}

The system's overall cryptographic design is intentionally kept simple. The standard explicitly mentions that symmetric
primitives have been preferred over asymmetric primitives in the core key escrow functions due to the risk of an attack
on asymmetric primitives in the long term. Notably, other advanced cryptographic techniques such as secret sharing
schemes, oblivious pseudo-random functions, or multiparty computation, which could improve the security and privacy of
the key escrow service by reducing the trust placed in any single component of the service, are also absent, while the
system relies extensively on the engineering-based security guarantees of TEEs and HSMs. Given that the ePA system
trusts its HSMs as unconditionally secure, it is unclear what purpose the manual yearly root key renewal serves,
especially absent an automatic way to roll over the wrapped record keys.

A consequence of the system's simple cryptographic design is that the system trusts its components to a large degree.
For instance, the system leaks a person's insurance ID number to the key escrow HSM every time record keys are
requested. Along with the timing and frequency of these requests, this leaks information on the person's condition to
the key escrow service in an identifiable way.

% TODO I feel that this section is a mix-up of critique on the cryptographic design and the approach to privacy
% protection and data minimisation. How are they linked? I'm missing some discussion here.

\subsubsection{A Realistic Attacker Model}

We observe that the system as a whole does not appear to be designed to defend against well-resourced adversaries. The
series of practical attacks that have been demonstrated on the system confirms this impression.
\textcite{tschirsichKonnteBisherNoch0100} summarize a series of successful attacks. These include social engineering to
obtain copies of smart cards that enable access to patient records, exploiting misconfigured Konnektor VPN appliances
whose LAN DMZ and authentication interface are exposed on the public internet, circumventing video-based authentication
processes so that duplicate file keys are issued, classic SQL injection on a backend service maintaining an
authentication database, accessing all national patient records through brute-force enumeration of weak identifiers, and
several more.

We believe that a system like this must be designed to withstand well-resourced adversaries such as hostile
intelligence services, since the medical data stored in it, such as information on chronic illness, sexually
transmitted diseases, or severe food allergies, has intelligence value. Repeated breaches of national digital
infrastructure such as the 2015 breach of the US Office of Personnel Management \cite{barrettUSSuspectsHackers2015} or
the 2024 compromise of US telecommunications wiretapping systems \cite{mennChineseGovernmentHackers2024} demonstrate
that such state-sponsored attacks on national digital infrastructure are a realistic concern. A possible scenario in the
ePA system would be a hostile intelligence service gaining access to one of the HSMs storing the system's root secrets,
extracting the root secret through an advanced physical attack, and then decrypting captured encrypted health records
at will. Similarly, a nation-state adversary might have access to an exploit allowing the compromise of the system's
TEEs, which would enable the extraction of any patient records being processed in plaintext inside these TEEs.

\subsubsection{Physical Security}

Physical security has received some consideration in the system's specification. First, smart cards are used extensively
for authentication. Second, Hardware Security Modules are used in key locations of the system to process some
cryptographic secrets. The core of the system's key escrow service is implemented inside an HSM. However, it is notable
that the actual security level required for this HSM is only FIPS 140-2
level~3 \cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2002}. Not only has FIPS 140-2
been superseded by FIPS 140-3 since
2019 \cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2019}, its security level 3
also mostly provides logical separation of cryptographic functions from other logic and is not very meaningful in the
context of physical attacks. The only physical requirement of FIPS 140-2 level 3 is that the HSM has a hard, opaque
coating. This coating is specified to be tamper-evident, but notably no active tamper detection or response features are
required by this standard. In contrast to the newer FIPS 140-3 standard and the related ISO/IEC 19790
\cite{ISOIEC19790} and ISO/IEC 24759 \cite{ISOIEC24759} standards, FIPS 140-2 does not impose any particular
requirements regarding resistance to side-channel attacks. The lack of tamper response, the unspecified resistance to
side-channel attacks, and the fact that the ePA specification only requires the long-lived key escrow root key inside
the HSM to have 256 bits of entropy add up to an unsatisfactory overall picture.

\subsection{Conclusion}

In conclusion, we observe that in Germany's ePA national medical record database, despite the decade-long
standardization and implementation process, several cryptographic compromises ended up in the system's final deployment.
Even assuming that nation-scale key escrow is a good idea, the implementation of this key escrow system seems to stray
from current best practice. The system uses a secret key with only 256 bits of entropy to derive highly sensitive secret
keys for potentially tens of millions of people sharing an insurance provider. The cryptographic design of this escrow
system is unsophisticated, ignoring the past three decades of cryptographic developments, particularly in multiparty
computation (MPC) and secret sharing techniques, in favor of an engineering approach. In the engineering dimension,
the system's physical security is only held to the basic level 3 of the obsolete FIPS 140-2 standard, a level
considerably less secure than that of an average credit card payment terminal. The system's root keys are only protected
by a ``hard, opaque potting material'' and no tamper detection and response is required. We estimate that the system
presents an attractive and soft target to nation-state adversaries. The system's shortcomings are made more severe by
the fact that the system disproportionately affects the lives of people with low income.

%FIXME work in rogawayMoralCharacterCryptographic?
Taking a broader view, we observe that despite the ample availability of commercial solutions promising easy hardware
security, there is clearly still a lack of solutions that provide the adaptability necessary for some real use cases at
low enough cost. By publishing the tamper-sensing technology we developed during the making of this thesis as open
source hardware designs, we wish to supply this missing building block for high-level hardware security in
real-world applications. Our hardware designs can be adapted to devices ranging from Single-Board Computers (SBCs) to
servers, they are compatible with non-computing applications like Quantum Key Distribution (QKD), and their design
approaches can even be integrated into existing HSM designs to provide better security at little additional cost.

% FIXME FIXME FIXME chapter overview

\printbibliography[heading=bibintoc]

\end{document}