From 3b64c52c69b0906eb528bb6f0f712d446bdb9480 Mon Sep 17 00:00:00 2001 From: jaseg Date: Wed, 22 Oct 2025 16:05:52 +0200 Subject: [PATCH] Rework intro chapter --- chapter-conclusion/chapter.tex | 7 + chapter-epa/Makefile | 1 + chapter-epa/chapter.tex | 232 +++++++++++++++++ chapter-introduction/chapter.tex | 429 +++++-------------------------- common-defs.tex | 10 +- common-packages.tex | 1 + thesis.tex | 1 + 7 files changed, 311 insertions(+), 370 deletions(-) create mode 120000 chapter-epa/Makefile create mode 100644 chapter-epa/chapter.tex diff --git a/chapter-conclusion/chapter.tex b/chapter-conclusion/chapter.tex index 2289844..be4a14d 100644 --- a/chapter-conclusion/chapter.tex +++ b/chapter-conclusion/chapter.tex @@ -18,3 +18,10 @@ by IHSM technology due to its ability to protect large payloads that have high p We believe that with the research presented in this thesis, we substantially advanced the physical security field. In particular, we belive that by publishing our research including its artifacts under open-source licenses, we provide the basis for future research in tamper-sensing technology, a field that remains under-served in today's academic landscape. + +Recent history has shown that state-level adversaries are a mounting threat to civil rights organizations, human rights +lawyers, members of minorities, and many others. While western democracies used to be considered safe havens of human +rights, today human rights are under attack both from within and from the outside in countries across the globe. +Publishing IHSM technology as open source, we hope to provide one building block for new computing systems accessible to +all that are resilient and secure in the face of growing adversity. 
+ diff --git a/chapter-epa/Makefile b/chapter-epa/Makefile new file mode 120000 index 0000000..3322dc4 --- /dev/null +++ b/chapter-epa/Makefile @@ -0,0 +1 @@ +../Chapter_Makefile \ No newline at end of file diff --git a/chapter-epa/chapter.tex b/chapter-epa/chapter.tex new file mode 100644 index 0000000..31601b2 --- /dev/null +++ b/chapter-epa/chapter.tex @@ -0,0 +1,232 @@ + +\chapterquote{attributed to Grace Hopper\cite{ + WikiQuoteGraceHopper, + QuoteOriginMost2014 +}}{ + The most dangerous phrase in the language is ``We've always done it this way!''. +} + +\chaptertitle{The German ePA: A Motivating Counter-Example} + +\todo{FIXME: Proper citation here} +\sourceattrib{This part is based on a short paper presented at the HS3 workshop at ESORICS 2025.} +Looking at the landscape of computer security solutions, we are presented with a wide variety of vendors and products +that may give the impression that hardware security is a solved problem. Vendors make various claims ranging from +\emph{You don't need hardware security, just do it in the cloud!} to \emph{Buy our HSM and you will be secure!}. In +practice, things are not as easy and even well-intentioned projects still often go awry on the hardware security +dimension. Concluding this chapter, we will now have a look at one such project that was done by capable people with the +best intentions, yet it resulted in a hardware security design that is dangerously inadequate for the purpose. + +In May 2025, after several delays, Germany started the nation-scale rollout of its new electronic medical +record system. The system aims to create a national database accessible to all healthcare providers that holds the +complete electronic medical records of all publicly insured people living in Germany. The system aims to replace +paper-based workflows that are error-prone and lead to healthcare providers often only having access to a subset of a +patient's medical records. 
Data in scope for the system includes medical letters, laboratory results, and medical +imaging files. + +Due to Germany's mandatory health insurance laws, the system's user base encompasses the majority of all German +residents. People who have replaced their public health insurance with private insurance are currently not subject to +the system. In Germany, by law private health insurance is only available to people from the top 10th percentile of +household income. This means that the system disproportionately affects people who have low income, creating an equity +issue. While it is possible to opt out of the use of the system, the process of opting out is difficult. Additionally, +the government and health insurance providers have publicly depicted the system in a one-sidedly positive way, meaning +that it is unlikely the majority of people subject to the system have a comprehensive understanding of the system's +benefits and risks that would be necessary for an informed decision. + +While there has been loud criticism of the system's security from civil society organizations such as digital rights +nonprofit organization Chaos Computer Club (CCC) \cite{kochMoreMoreExperts2025} and several severe security flaws have +been demonstrated practically, this criticism has largely been ignored by the political structures in charge. We observe +that despite this civil society outrage and the system's large scale, it has received little attention from the academic +cryptography and information security community. + +In this section, we aim to point out some perplexing cryptographic engineering decisions in the system. In particular, +we point out that the system's core per-user secrets are kept in a rudimentary key escrow system whose security is based +on engineering assumptions, not on cryptographic principles. 
Furthermore, we observe that by specification, the +individual user keys of the system are derived from a per-user cleartext salt and a system-wide long-term secret +with only 256 bits of entropy\footnote{ + In previous versions of the standard \cite{ + gematikSpezifikationSchluesselgenerierungsdienstEPA2023, + gematikUebergreifendeSpezifikationVerwendung2025, + }, there were two escrow services, with both keys used in layers to reduce the risk of a compromise of either one. + The current standard only requires one escrow service, and drops the entropy requirement of the root keys from 512 + bits to 256 bits. The apparent reason for the long-term nature of these keys is that they are updated manually. +}. Finally, we note that according to specification, the only physical security requirement for the protection of this +highly sensitive secret is a ``hard, opaque potting material'', with no tamper detection and response required. + +We base our analysis on the system's publicly available standards in their latest version as of the writing of the paper +underlying this section in April 2025, describing version 3.0 of the healthcare record system \cite{ + gematikSpezifikationAktensystemEPA2025, + gematikUbergreifendeSpezifikationVerwendung2024, +}. We note that the implementation might well deviate from these standards and be more secure. However, given the +system's history of flaws, we believe this is unlikely to be the case. The reference implementation provided by the +specification authority \cite{GithubRepositoryERPFD} follows the specified minimum requirements closely. As of now, +there is no meaningful way for either the public or researchers such as us to ascertain the concrete implementation +security of the system. + +\subsection{The Design of ePA} + +ePA (short for \emph{elektronische Patientenakte}, ``electronic patient record'') is embedded into Germany's national +public healthcare backend system ``Telematikinfrastruktur'' (TI). 
TI is a highly complex system, and a detailed +description would exceed the limits of this analysis. Briefly put, TI consists of a shared DMZ that parties like +insurance providers and healthcare providers connect to through a VPN. At the client location, usually an individual +doctor's office or a hospital, this VPN connection is terminated by a specialized VPN appliance named ``Konnektor'' that +simultaneously acts as a trusted component inside the client network hosting some software for purposes such as +authentication. The Konnektor contains several smart cards that store keys used for authentication. Konnektor devices +are offered by several vendors, and healthcare providers such as doctors' offices are individually responsible for purchasing +and maintaining a Konnektor. + +% FIXME: Is there a threat/trust model of the system that you could summarise in a few sentences? + +Every person enrolled in the system as well as every healthcare professional providing services under it is issued an ID +card containing a smart card that stores keys used to authenticate to the central infrastructure. Until now, the primary +use of these smart cards has been that when someone visits a healthcare provider, they insert their ID card +into a terminal so the healthcare provider can automatically fetch their personal information such as name, birth date, +address and enrollment status from their insurance provider. + +ePA is implemented inside the TI system. Its centralized services are accessed by healthcare providers through the TI's +VPN. Patient records are encrypted and decrypted inside TI's backend systems. Smart cards authenticate parties and +hardware devices to each other. Each insurance provider picks one of several implementations of ePA's server-side +infrastructure to run for its clients. Currently, there are two approved implementations of this server-side +infrastructure. 
+ +In the current version of the specification, the overall architecture of ePA heavily relies on Trusted Execution +Environments (TEEs). Data processing on the server side is done in plaintext inside TEEs, with some cryptographic key +management delegated to a Hardware Security Module (HSM). While attacks on the TEEs are considered in the system, the HSMs are +assumed to be perfectly secure, and the system does not include mitigations for a compromised HSM. The primary +motivation for plaintext processing seems to be to enable large-scale data analysis for research purposes without +requiring consent or cooperation of the people whose records are being processed. + +The primary services offered by the server side are authentication services, key escrow, and a database storing the +encrypted records themselves. Records are symmetrically encrypted with keys that are derived from system-wide secrets +inside an HSM. The primary motivation behind the use of a key escrow service seems to be to enable the creation of a +duplicate patient ID smartcard in case a person loses theirs. While the current version of the standard is unclear on +the exact mechanism of key derivation, in previous versions of the standard, the escrow service's root key, a random +salt, and the healthcare ID number of the person owning the record were used in SHA256-HKDF. The specification requires +that a new root key be generated once a year, but as far as we can tell, record key rollover is not done automatically +but is only meant to be done when the \emph{user} requests it, and old root keys must be retained forever to ensure old +records can be accessed. + +\subsection{Related Work} + +The state-owned company specifying the system commissioned several security assessments of the system relating to the +key escrow service. 
\textcite{fischlinKryptographischeAnalyseSpezifikation2021} focuses on the cryptographic +dimension of the key escrow service used in an older version of the standard, and is now obsolete. +\textcite{slanySicherheitsanalyseZurSicherheit2020} approaches the system at a higher level, and focuses on the +cryptography of the inner protocol layers spoken between the system's components. Industry research organization +Fraunhofer SIT was commissioned to perform a structured, theoretical assessment of attack paths into the system +\cite{fraunhofersitAbschlussberichtSicherheitsanalyseGesamtsystems2024}. We are not currently aware of any +independent academic security research on the system. + +The design and operation of the system have been independently described in detail by civil society activists, who have +demonstrated several successful attacks on the system. \textcite{tschirsichHackerHinOder0100} demonstrated how they +could trivially acquire each of the smartcards as well as the Konnektor necessary for accessing the system. +\textcite{tschirsichKonnteBisherNoch0100} summarize the history of attacks demonstrated on the system and show multiple +practical attacks on various parts of the system's implementation. + +\subsection{Concerning Cryptographic Engineering Choices} + +We wish to highlight some of the design choices in the system that we believe stray from current best practice. This is +by no means an exhaustive list, and is only meant to underscore why we believe the system deserves more scrutiny. + +\subsubsection{Use of Key Escrow} + +First, the system's general approach of using a key escrow service instead of securely storing the keys inside the +system's already existing smart card infrastructure is concerning, given that this key escrow service poses a +centralized security risk. 
The system's designers made this decision since it was deemed important that access to an +encrypted record can be restored quickly after an insurance ID card is lost, without requiring the cooperation of the +healthcare providers holding the primary copies of the person's medical records. + +While key escrow services have been a topic of political debate in decades past, the consensus in the cryptographic +community is that they are a bad idea since they present a centralized target for attack and increase the attack +surface \cite{ + abelsonRisksKeyRecovery1997, + abelsonKeysDoormats2015, + andersonSecurityEngineeringGuide2020, +}. + +\subsubsection{Cryptographic Design} + +The system's overall cryptographic design is intentionally kept simple. The standard explicitly mentions that symmetric +primitives have been preferred over asymmetric primitives in the core key escrow functions due to the risk of an attack +on asymmetric primitives in the long term. Notably, other advanced cryptographic techniques such as secret sharing +schemes, oblivious pseudo-random functions, or multiparty computation that could help with the security and privacy of +the key escrow service by reducing trust placed in any single component of the service are also absent, while the system +relies extensively on the engineering-based security guarantees of TEEs and HSMs. Given that the ePA system trusts its +HSMs as unconditionally secure, it is unclear what purpose the manual yearly root key renewal serves, especially absent +an automatic way to roll over the wrapped record keys. + +A consequence of the system's simple cryptographic design is that the system trusts its components to a large degree. +For instance, the system leaks a person's insurance ID number to the key escrow HSM every time record keys are +requested. Along with the timing and frequency of these requests, this leaks information on the person's condition to +the key escrow service in an identifiable way. 
+ +% TODO I feel that this section is a mix-up of critique on the cryptographic design and the approach to privacy +% protection and data minimisation. How are they linked? I'm missing some discussion here. + +\subsubsection{A Realistic Attacker Model} + +We observe that the system as a whole does not appear to be designed to defend against well-resourced adversaries. The +series of practical attacks that have been demonstrated on the system confirms this impression. +\textcite{tschirsichKonnteBisherNoch0100} summarize a series of successful attacks. These include social engineering +resulting in access to copies of smartcards that enable access to patient records, using misconfigured Konnektor VPN +appliances with their LAN DMZ and authentication interface exposed on the public internet, circumventing video-based +authentication processes resulting in duplicate file keys being provided, classic SQL injection on a backend service +maintaining an authentication database, accessing all national patient records through brute-force enumeration of weak +identifiers, and several more. + +We believe that a system like this must be designed to withstand well-resourced adversaries such as enemy secret +services, since the medical data stored in it, such as information on chronic illness, sexually transmitted diseases or +severe food allergies, has intelligence value. Repeated breaches of national digital infrastructure such as the 2015 +breach of the US Office of Personnel Management \cite{barrettUSSuspectsHackers2015} or the 2024 compromise of US +telecommunications wiretapping systems \cite{mennChineseGovernmentHackers2024} demonstrate that such state-sponsored +attacks on national digital infrastructure are a realistic concern. 
A possible scenario in the ePA system would be an +enemy secret service gaining access to one of the HSMs storing the system's root secrets, extracting the root secret through +an advanced physical attack, and then decrypting captured encrypted health records at will. Similarly, a +nation-state adversary might have access to an exploit allowing the compromise of the system's TEEs, which would enable +the extraction of any patient records being processed in plaintext inside these TEEs. + +\subsubsection{Physical Security} + +Physical security has received some consideration in the system's specification. First, smart cards are used extensively +for authentication. Second, Hardware Security Modules are used in key locations of the system to process some +cryptographic secrets. The core of the system's key escrow service is implemented inside an HSM. However, it is notable +that the actual security level required for this HSM is only FIPS 140-2 level +3 \cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2002}. Not only has FIPS 140-2 +been superseded by FIPS 140-3 since +2019 \cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2019}, its security level 3 +mostly provides logical separation of cryptographic functions from other logic and is not very meaningful in the context +of physical attacks. The only physical requirement of FIPS 140-2 level 3 is that the HSM has a hard, opaque coating. +This coating is specified to be tamper-evident, but notably no active tamper detection or response features are required +by this standard. In contrast to the newer FIPS 140-3 standard and the related ISO/IEC 19790 \cite{ISOIEC19790} as well +as ISO/IEC 24759 \cite{ISOIEC24759} standards, FIPS 140-2 does not impose any particular requirements regarding resistance +to side-channel attacks. 
The lack of tamper response, unspecified resistance to side-channel attacks and the fact that +the ePA specification only requires the long-lived key escrow root key inside the HSM to have 256 bits of entropy lead +to an unsatisfactory overall situation. + +\subsection{Conclusion} + +In conclusion, we observe that in Germany's ePA national medical record database, despite the decade-long +standardization and implementation process, several cryptographic compromises ended up in the system's final deployment. +Even assuming that nation-scale key escrow is a good idea, the implementation of this key escrow system seems to stray +from current best practice. The system uses a secret key with only 256 bits of entropy to derive highly sensitive secret +keys for potentially tens of millions of people sharing an insurance provider. The cryptographic design of this escrow +system is unsophisticated, ignoring the past three decades of cryptographic developments, particularly in multiparty +computation (MPC) and other secret sharing techniques, in favor of an engineering approach. In the engineering dimension, +the system's physical security is only held to the basic level 3 of the obsolete FIPS 140-2 standard, which is +considerably less secure than an average credit card payment terminal. The system's root keys are only protected by a +``hard, opaque potting material'' and no tamper detection and response is required. We estimate that the system presents an +attractive and soft target for nation-state adversaries. The system's shortcomings are made more severe by the fact that +the system disproportionately affects the lives of people with low income. + +From an academic perspective, it is interesting to see how the ePA ended up in its current state, and which gaps in +the cryptographic solutions offered by academic research contributed to it. A fundamental truth in cryptographic engineering is +that in the absence of technical checks, political promises are no guarantees of restraint. 
As such, the degree of trust +the ePA system places in organizational measures leads to a concerning overall picture. In particular, the system's +strong reliance on conventional HSMs built to long-obsolete security standards as well as on trusted execution +environment technology that has been broken multiple times highlights the need for new approaches to hardware security +that better accommodate real-world use cases. + +We believe that Inertial HSMs can address this use case by cleanly separating the physical security primitive into a +retargetable design that can be applied to entire servers if needed, and augment or replace technology like conventional +HSMs or trusted execution environments to provide high-level hardware security. + diff --git a/chapter-introduction/chapter.tex b/chapter-introduction/chapter.tex index ad6f21d..61bef71 100644 --- a/chapter-introduction/chapter.tex +++ b/chapter-introduction/chapter.tex @@ -29,6 +29,8 @@ commonly provide hardware access to state authorities. The design decisions in c and the gold standard for backdoor access to modern systems is either exploiting a \emph{zero-day} flaw that is not yet publically known, or acquiring physical access to the target system. +\section{Research Questions} + In this thesis, we wish to extend the level of protection afforded by cryptographic protocol design down the technology stack. While cryptographic protocols and modern software from the operating system up make it possible to secure the software side of the stack to a high level, the hardware side remains poorly protected. There are a variety of hardware @@ -36,40 +38,48 @@ security solutions in the wild, but the majority of them either do not target pr -- such as Trusted Platform Modules (TPMs) -- or are not widely available due to market segmentation or cost -- such as conventional Hardware Security Modules (HSMs). 
-To extend this protection, we propose the Inertial Hardware Security Module (IHSM), a new type of HSM that extends the -high level of protection offered by the modern cryptographic software stack down to the hardware level, enabling secure -computation in insecure places. We chose to publish all our IHSM as open source and unencumbered by patents to enable -widespread adoption. IHSMs can be custom built with only basic manufacturing capabilities at small scale and enable the -deployment of secure computation in insecure places even to small organizations such as university research departments, -NGOs and small businesses. +We approach this task by answering three research questions that progress from theory to practical deployment. -Recent history has shown that state-level adversaries are a mounting threat to civil rights organizations, human rights -lawyers, members of minorities, and many others. While western democracies used to be considered safe havens of human -rights, today human rights are under attack both from within and from the outside in countries across the globe. -Publishing IHSM technology as open source, we hope to provide one building block for new computing systems accessible to -all that are resilient and secure in the face of growing adversity. +\begin{enumerate} + \item Can we achieve physical security without relying on conventional tamper-sensing meshes? + \item Can we monitor tamper-sensing meshes in greater detail than the state-of-the-art single scalar + measurement? + \item Can we integrate our findings into a system that provides a useful security guarantee in practice? +\end{enumerate} -Complementing our IHSM concept and prototype, we provide solutions to engineering issues such as wireless power transfer -adapting them to our use case. 
Further, we propose improvements to the state of the art in HSM tamper sensors such as -the use of low-cost, embeddable Time-Domain Reflectometry (TDR) that not only improve the security of IHSMs, but that -can even be applied to conventional HSMs. We conclude this thesis with an overview of two concrete use cases IHSMs -unlock that were previously infeasible using conventional HSMs: Datacenter-scale Secure Multiparty Computation (SMPC) -and long-range Quantum Key Distribution (QKD) networks. +To answer our first research question, we propose the Inertial Hardware Security Module (IHSM), a new type of HSM that +extends the high level of protection offered by the modern cryptographic software stack down to the hardware level, +enabling secure computation in insecure places. + +To answer our second question, we propose improvements to the state of the art in HSM tamper sensors such as the use of +low-cost, embeddable Time-Domain Reflectometry (TDR) that not only improve the security of IHSMs, but that can even be +applied to conventional HSMs. + +Finally, we answer our last research question by showing in two case studies what an end-to-end design of an IHSM-secured +data processing system could look like. Both case studies concern scenarios that IHSMs unlock that were previously +infeasible using conventional HSMs: Datacenter-scale Secure Multiparty Computation (SMPC) and long-range Quantum Key +Distribution (QKD) networks. As part of this effort, we provide a solution adapting and improving upon the state of the +art in wireless power transfer to provide a rotating inertial HSM with clean, stable power. + +We chose to publish all of our research as open source and unencumbered by patents to enable widespread adoption. 
IHSMs +can be custom built with only basic manufacturing capabilities at small scale and enable the deployment of secure +computation in insecure places even to small organizations such as university research departments, NGOs and small +businesses. \section{Cryptographic Principles and Physical Reality} Cryptographers' aversion to backdoor access derives from a combination of two fundamental computing principles: -Kerckhoffs' principle, and the principle of least authority. In cryptography, Kerckhoffs' principle, named after Dutch -military cryptographer Auguste Kerckhoffs, expresses that the security of a cryptographic system should only depend on -the secrecy of its keys, not on the secrecy of its design. In this way, Kerckhoff's principle states the opposite of the -common industry practice of \emph{Security by Obscurity}, which aims to achieve security by making it sufficiently -annoying to cryptoanalyze a system that nobody bothers. Complementary to Kerckhoff's is the principle of least +Kerckhoffs' principle, and the principle of least authority. Kerckhoffs' principle, named after Dutch military +cryptographer Auguste Kerckhoffs, expresses that the security of a cryptographic system should only depend on the +secrecy of its keys, not on the secrecy of its design. In this way, Kerckhoffs' principle states the opposite of the +widespread industry practice of \emph{Security by Obscurity}, which aims to achieve security by making it sufficiently +annoying to cryptanalyze a system that nobody bothers. Complementary to Kerckhoffs' principle is the principle of least authority, which describes that in a secure system each component should only have access to the smallest set of capabilities necessary to fulfill its purpose. 
Applying both to a cryptographic system means that the system's design -should be transparent and not include any hiddent components or opaque parts that cannot be inspected, and that the +should be transparent and not include any hidden components or opaque parts that cannot be inspected, and that the system's keys should be scoped to place the least amount of trust possible in each participating party. -Let's take a basic videoconferencing system as an example. In our example system's deployment, users logen to a central +Let's take a basic videoconferencing system as an example. In our example system's deployment, users log on to a central conference server, which receives and distributes the users' video streams. Allowing backdoor access to the video streams to some third party like a datacenter operator or a state would violate Kerckhoffs' principle since it would have to be hidden from the systems' participants, who would therefore not have a complete view of the systems' deployed @@ -87,22 +97,22 @@ against modern software often involve taking over the hardware at some point in compromised. Corresponding \emph{digital forensics} capabilities are commonplace among state actors, and are available as a turnkey solution on the market. -\section{Building Inertial HSMs} +\section{Inertial HSMs} -Inertial HSMs fill this gap in the protection of systems that are not critical enough to warrant the expensive existing -solutions such as conventional HSMs, while still handling highly sensitive data. In a system with a secure software -stack, the role of a HSM is to secure the hardware part of the stack. The basic approach of a HSM is to combine a secure -software stack with a fast self-destruct mechanism and tamper sensors. The self-destruct mechanism can be hardware or -software that quickly, securely destroys all cryptographic secrets, rendering the device worthless to an attacker. 
The -tamper sensors are tasked with detecting any physical attack an attacker could mount on the device. Common classes of -such sensors include environmental sensors such as temperature or radiation sensors that detect attempts at causing -controllable faults in the HSM by heating, cooling or irradiating it. Building on the basic protection offered by such -sensors, \emph{tamper-sensing meshes} are often employed. These \emph{meshes} are flexible foils containing circuit -traces that are attached to the HSM's enclosure to detect attempts at penetrating the shell of the device with probes. -Tamper-sensing meshes usually are the primary line of defense against most physical attacks. They are very effective at -mitigating a large variety of physical attacks, but they are difficult to construct securely as they usually require -bespoke manufacturing processes. As a result, they are currently only used in niche applications, and even there not -every realization is equally secure. +In this thesis, we propose Inertial HSMs to fill this gap in the protection of systems that are not critical enough to +warrant the expensive existing solutions such as conventional HSMs, while still handling highly sensitive data. In a +system with a secure software stack, the role of a HSM is to secure the hardware part of the stack. The basic approach +of a HSM is to combine a secure software stack with a fast self-destruct mechanism and tamper sensors. The self-destruct +mechanism can be hardware or software that quickly and securely destroys all cryptographic secrets, thereby rendering +the device worthless to an attacker. The tamper sensors are tasked with detecting any physical attack an attacker could +mount on the device. Common classes of such sensors include environmental sensors such as temperature or radiation +sensors that detect attempts at causing controllable faults in the HSM by heating, cooling or irradiating it. 
Building +on the basic protection offered by such sensors, \emph{tamper-sensing meshes} are often employed. These \emph{meshes} +are flexible foils containing circuit traces that are attached to the HSM's enclosure to detect attempts at penetrating +the shell of the device with probes. Tamper-sensing meshes usually are the primary line of defense against most physical +attacks. They are very effective at mitigating a large variety of physical attacks, but they are difficult to construct +securely as they usually require bespoke manufacturing processes. As a result, they are currently only used in niche +applications, and even there not every realization is equally secure. Inertial HSMs are a new design approach that utilizes mechanical motion to create secure tamper-sensing meshes from simple components. IHSMs solve the issue of creating an impenetrable tamper-sensing envelope by replacing the bespoke @@ -116,8 +126,6 @@ IHSMs enable the protection of much larger payloads compared to conventional mes power dissipation. This and their low cost enables the implementation of high-level hardware security in applications that previously would not have been possible to secure. -\section{Inertial HSM Applications} - Inertial HSMs are the first fully open source HSM with advanced tamper sensing features. Across application domains, Inertial HSMs can be applied to gain resistance to physical attacks in scenarios where conventional HSMs were not used because of cost, computing power or implementation effort. Where conventional HSMs come as fully integrated devices that @@ -129,7 +137,19 @@ it can have gaps that allow for air flow between outside and inside, enabling ac cooling capability sharply increases computing power by increasing feasible payload power dissipation by two orders of magnitude. 
-\section{A Note on Hardware Security Module Terminology} +\section{Conclusion} + +Looking at the practice of applied hardware security, we observe that despite ample availability of commercial solutions +promising easy hardware security, clearly there is still a lack of solutions that provide the adaptability necessary for +some real use cases at low enough cost. By publishing the tamper-sensing technology we developed during the making of +this thesis as open source hardware designs, we wish to provide this missing building block for high-level +hardware security in real-world applications. Our hardware designs can be adapted to devices ranging from Single-Board +Computers (SBCs) to servers; they are compatible with non-computing applications like Quantum Key Distribution (QKD), and +their design approaches can even be integrated into existing HSM designs to provide better security at little additional +cost. + +\section*{A Note on Hardware Security Module Terminology} +\addcontentsline{toc}{section}{A Note on Hardware Security Module Terminology} In this thesis, we use the term \emph{Hardware Security Module (HSM)} to refer to a security device that has the following three properties. @@ -190,7 +210,8 @@ SSC includes smartcards and card payment terminals in this category. Card paymen response functionality including partial coverage of areas like they system's main cryptographic processor and smart card reader by battery-backed tamper-sensing meshes. -\section{Tamper-Sensing Meshes} +\subsection*{Tamper-Sensing Meshes} +\addcontentsline{toc}{subsection}{Tamper-Sensing Meshes} In this thesis, we use the terms \emph{Tamper-Sensing Mesh} and \emph{Security Mesh} synonymous. We use both terms to refer to any electrical circuit whose path is laid out to cover a surface with the intent of detecting attempts at @@ -199,326 +220,4 @@ less clear to people unfamiliar with the matter.
It is also polysemous, and depe or stamped metal meshes used as fences or as screens in front of windows to prevent break-ins. As a result, it is harder to use in online searches, and when using Large Language Models (LLMs), it frequently leads to amusing hallucinations. -%In the early days of mass-market computing, the expectations towards this new tool were high. Even before people -%realized the potential of computers and the internet for commercial gain, there was widespread optimism about the -%potential of globally networked computing to liberate ideas and better humanity. People imagined a future where any -%information would be available at a mere thought, where cultural and language barriers were eroded by technological -%advances, and where technology served as a universal equalizer, narrowing socioeconomic gaps and enhancing the quality -%of life for everybody. -% -%Needless to say, things did not turn out that way. After initially, home computers and the internet were briefly the -%domain of a particular brand of free-spirited enthusiast, it did not take long until the domain was captured by -%commercial interests. The dotcom bubble inflated and burst, and the introduction of smartphones catalyzed the rise of -%the social web, bringing computing to the masses. While by itself the democratization and the widespread adoption of -%computing is a good thing, the capitalist environment caused it to coincide with an overal drift of the industry away -%from the libertarian principles that were characteristic for its beginning. -% -%Specifically, throughout the past thirty years, computing ecosystems have continued a gradual evolution into walled -%gardens, primarily serving not their users anymore, but the interests of whoever owns the place that hired the place -%that made them. 
While in the 90ies, owning a computer meant you would be able to run any piece of software on it, -%today's platform business model means that every program requires prior approval by the platform's owners. The publicly -%stated motivation for this gradual creep invariably is security or protection from harm by bad people writing software, -%while the actual motivation is without doubt the tremendous monetary gain an operator can obtain by seeking rent for its -%platform. -% -%The platformization of computing has captured all levels of the industry, from backend systems running on hyperscale -%cloud platforms, through desktop computers running only vendor-approved operating systems through secure boot chains, up -%to low-cost smartphones containing highly secure enclaves tasked with the protection of Digital Restrictions Management -%(DRM) keys aimed at stopping the user from copying media played back on the device. Increasingly, this trend towards -%platform owners having the ultimate authority on users's computers is becoming a practical issue in high-risk settings. - - - -% Cypherpunks -% ACAB is a anti-authoritarian sentiment -% In anarchist discourse, "cops" are not just policemen and -women, but also other means of centralized control. -% Anarchism rejects centralized authority in favor of the freedom of individuals because it recognizes the dangers -% inherent in centralized authority - -% While anarchism is one extreme of the spectrum, the dangers of centralized control are well-established. -% The constitutions of all modern democracies recognize these dangers, and contain elaborate provisions such as a -% separation of powers, and extensive protections for civil society and journalism -% While modern democratic policy rejects anarchism, it embraces it's criticism of power in some vital niches. 
-% Examples: Whistleblower protection, attorney-client privilege, doctor-patient confidentiality and protections on state -% agents such as judges or politicians - -% Centralized authority promises efficiency, but it has a tendency to go awry. -% These sanctuaries carved out from the state's authority in democracies are vital to the functioning of the system -% In today's computing environment, we observe some parallels to this limitation of centralized authority -% In classical computing, centralized control was used abundantly to create order -% Like absolute political authority becomes dangerous when subverted, centralized control in computing becomes dangerous -% when systems are compromised through hacking. -% Allocating control can be done using cryptography -% Cryptography provides near-perfect mathematical solutions to almost any control problem -% However, as anyone who has taken an introductory crypto course knows, encrypting things isn't the hard part. The hard -% part is managing keys. - -% computing solutions to these problems include: Air-gapping, separation of concerns, extreme case: HSMs and TEEs -% provide security even during compromise -% interesting parallel to state control / anarchy discourse above: -% they are secure even against the state/police if implemented correctly -% observation: competent hackers are about as competent as competent police -% observation: cannot digitally encode ethics or legal stuff, so no "good guys only" backdoors - -% other applications of this principle of distrusting systems are (perfect) forward secrecy -% see signal -% however, system such as TEEs and HSMs are largely a niche solution -% while some are widely deployed, e.g. TEEs for DRM and as secure boot root of trust in phones, desktops -% they are not usually democratic. despite wide deployment authority is with their manufacturer. -% To ordinary users, these capabilities are distant -% EU regulation was necessary to force apple to open up some APIs cf. 
nfc payment -% normal users are shit out of luck - -% Thus, we need new tools. Tools that enable normal people / small orgs to assume control of their data/keys/etc. -% we need to open up the power of TEEs to everybody -% right now, open source is often less secure than closed-source -% trusted boot rarely implemented (right) in open source -% no TEE security at all because of lack of access -% we want to create democratic, open source HSMs - -% open source HSMs enable many use cases to the public and small orgs that up to now only large corps or states could do -% email encryption -% secure group messaging -% signing key servers -% secure video / audio calls -% private data storage -% things like that twitter/x protocol for pin-based key recovery -% timestamping / attestation services -% base for distributed consensus protocols -% might have applications in cryptocurrencies when operated as heterogenous cluster - -% but beyond that, they enable entirely new use cases. -% conventional hsms limited in computing power, crippled for the purpose of market segmentation -% ours are much more powerful, enable much higher computation crypto such as generic smpc -% generic smpc can do things like key management, pin-based security, secret statistics etc. 
-% furthermore, above we noted parallel between anarchist distrust of authority and core cryptographic principles -% our hsms not only protect against classical attackers, but also against states -% can be used as democratic check and balance -% example: secure comms that cannot be accessed by the state / police -% example: secure, authenticated photo and video capture -% that's especially relevant in the age of ai - -%\section{The Trust Perspective} - -\section{A Motivating Counter-Example} - -\todo{FIXME: Proper citation here} -\sourceattrib{This part is based on a short paper presented at the HS3 workshop at ESORICS 2025.} -Looking at the landscape of computer security solutions, we are presented with a wide variety of vendors and products -that may give the impression that hardware security is a solved problem. Vendors sell various claims rangning from -\emph{You don't need hardware security, just do it in the cloud!} to \emph{Buy our HSM and you will be secure!}. In -practice, things are not as easy and even well-intentioned projects still often go awry on the hardware security -dimension. Concluding this chapter, we will now have a look at one such project that was done by capable people with the -best intentions, yet it resulted in a hardware security design that is dangerously inadequate for the purpose. - -Beginning May 2025, after several delays, Germany has started the nation-scale rollout of its new electronic medical -record system. The system aims to create a national database accessible to all healthcare providers that holds the -complete electronic medical records of all publically insured people living in Germany. The system aims to replace -paper-based workflows that are error-prone and lead to healthcare providers often only having access to a subset of -patient's medical records. Data in scope for the system includes medical letters, laboratory results, and medical -imaging files. 
- -Due to Germany's mandatory health insurance laws, the system's user base encompasses the majority of all German -residents. People who have replaced their public health insurance with private insurance as of now are not subject to -the system. In Germany, by law private health insurance is only available to people from the top 10th percentile of -household income. This means that the system disproportionally affects people who have low income, creating an equity -issue. While it is possible to opt out from the use of the system, the process of opting out is difficult. Additionally, -the government and health insurance providers have publically depicted the system in a one-sidedly positive way, meaning -that it is unlikely the majority of people subject to the system have a comprehensive understanding of the system's -benefits and risks that would be necessary for an informed decision. - -While there has been loud criticism of the system's security from civil society organizations such as digital rights -nonprofit organization Chaos Computer Club (CCC) \cite{kochMoreMoreExperts2025} and several severe security flaws have -been demonstrated practically, this criticism has largely been ignored by the political structures in charge. We observe -that despite this civil society outrage and the system's large scale, it has received little attention from the academic -cryptography and information security community. - -In this section, we aim to point out some perplexing cryptographic engineering decisions in the system. In particular, -we point out that the system's core per-user secrets are kept in a rudimentary key escrow system whose security is based -on engineering assumptions, not on cryptographic principles. 
Furthermore, we observe that by specification, the -individual user keys of the system are derived from a per-user cleartext salt based on a system-wide long-term secret -with only 256 bits of entropy\footnote{ - In previous versions of the standard \cite{ - gematikSpezifikationSchluesselgenerierungsdienstEPA2023, - gematikUebergreifendeSpezifikationVerwendung2025, - }, there were two escrow services, with both keys used in layers to reduce the risk of a compromise of either one. - The current standard only requires one escrow service, and drops the entropy requirement of the root keys from 512 - bits to 256 bits. The apparent reason for the long-term nature of these keys is that they are updated manually. -}. Finally, we note that according to specification, the only physical security requirement for the protection of this -highly sensitive secret is a ``hard, opaque potting material'', with no tamper detection and response required. - -We base our analysis on the system's publicly available standards in their latest version as of the writing of the paper -underlying this section in April 2025, describing version 3.0 of the healthcare record system \cite{ - gematikSpezifikationAktensystemEPA2025, - gematikUbergreifendeSpezifikationVerwendung2024, -}. We note that the implementation might well deviate from these standards and be more secure--however, with the -system's history of flaws, we believe this is unlikely to be the case. The reference implementation provided by the -specification authority \cite{GithubRepositoryERPFD} follows the specified minimum requirements closely. As of now, -there is no meaningful way for either the public or for researchers such as us to ascertain the concrete implementation -security of the system. - -\subsection{The Design of ePA} - -ePA (short for \emph{elektronische Patientenakte}, ``electronic patient record''), is embedded into Germany's national -public healthcare backend system ``Telematikinfrastruktur'' (TI). 
TI is a highly complex system, and a detailed -description would exceed the limits of this analysis. Briefly put, TI consists of a shared DMZ that parties like -insurance providers and healthcare providers connect to through a VPN. At the client location, usually an individual -doctor's office or a hospital, this VPN connection is terminated by a specialized VPN appliance named ``Konnektor'' that -simultaneously acts as a trusted component inside the client network hosting some software for purposes such as -authentication. The Konnektor contains several smart cards that store keys used for authentication. Konnektor devices -are offered by several vendors and healthcare providers like doctor's offices are indivudally responsible for purchasing -and maintaining a Konnektor. - -% FIXME: Is there a threat/trust model of the system that you could summarise in a few sentences? - -Every person enrolled in the system as well as every healthcare professional providing services under it is issued an ID -card that contains a smart card that contains keys used to authenticate towards the central infrastructure. The primary -use of these smart cards up to now is that when someone visits a healthcare provider, they will insert their ID card -into a terminal so the healthcare provider can automatically fetch their personal information such as name, birth date, -address and enrollment status from their insurance provider. - -ePA is implemented inside the TI system. Its centralized services are accessed by healthcare providers through the TI's -VPN. Patient records are encrypted and decrypted inside TI's backend systems. Smart cards authenticate parties and -hardware devices to each other. Each insurance provider picks one of several implementations of ePA's server-side -infrastructure to run for its clients. Currently, there are two approved implementations of this server-side -infrastructure. 
- -With the current version of the specificatoin, the overall architecture of ePA heavily relies on Trusted Execution -Environments (TEEs). Data processing on the server side is done in plaintext inside TEEs, with some cryptographic key -management delegated to a Hardware Security Module. While attacks on the TEEs are considered in the system, the HSMs are -assumed to be perfectly secure, and the system does not include mitigations for a compromised HSM. The primary -motivation for plaintext processing seems to be to enable large-scale data analysis for research purposes without -requiring consent or cooperation of the people whose records are being processed. - -The primary services offered by the server side are authentication services, key escrow, and a database storing the -encrypted records themselves. Records are symmetrically encrypted with keys that are derived from system-wide secrets -inside an HSM. The primary motivation behind the use of a key escrow service seems to be to enable the creation of a -duplicate patient ID smartcard in case a person looses theirs. While the current version of the standard is unclear on -the exact mechanism of key derivation, in previous versions of the standard, the escrow service's root key, a random -salt, and the healthcare ID number of the person owning the record was used in SHA256-HKDF. The specification requires -that a new root key is generated once a year, but as far as we can tell, record key rollover is not done automatically -but is only meant to be done when the \emph{user} requests it, and old root keys must be retained forever to ensure old -records can be accessed. - -\subsection{Related Work} - -The state-owned company specifying the system commissioned several security assessments of the system relating to the -key escrow service. 
\textcite{fischlinKryptographischeAnalyseSpezifikation2021} focuses on the cryptographic -dimension of the key escrow service used in an older version of the standard, and is now obsolete. -\textcite{slanySicherheitsanalyseZurSicherheit2020} approaches the system at a higher level, and focuses on the -cryptography of the inner protocol layers spoken between the system's components. Industry research organization -Fraunhofer SIT was comissioned for a structured, theoretical assessment of attack paths to the system -\cite{fraunhofersitAbschlussberichtSicherheitsanalyseGesamtsystems2024}. We are not currently aware of -independent academic security research on the system. - -The design and operation of the system have been independently described in detail by civil society activists, who have -demonstrated several successful attacks on the system. \textcite{tschirsichHackerHinOder0100} demonstrated how they -could trivially acquire each of the smartcards as well as the Konnektor necessary for accessing the system. -\textcite{tschirsichKonnteBisherNoch0100} summarize the history of attacks demonstrated on the system and show multiple -practical attacks on various parts of the system's implementation. - -\subsection{Concerning Cryptographic Engineering Choices} - -We wish to highlight some of the design choices in the system that we believe stray from current best practice. This is -by no means an exhaustive list, and is only meant to underscore why we believe the system deserves more scrutiny. - -\subsubsection{Use of Key Escrow} - -First, the system's general approach of using a key escrow service instead of securely storing the keys inside the -system's already existing smart card infrastructure is concerning, given that this key escrow service poses a -centralized security risk. 
The system's designers made this decision since it was deemed important that access to an -encrypted record can be restored quickly after an insurance ID card is lost, without requiring the cooperation of the -healthcare providers holding the primary copies of the person's medical records. - -While key escrow services have been a topic of political debate in decades past, in the cryptographic community, -consensus generally is that they are a bad idea since they pose a centralized target for attack, and increase attack -surface \cite{ - abelsonRisksKeyRecovery1997, - abelsonKeysDoormats2015, - andersonSecurityEngineeringGuide2020, -}. - -\subsubsection{Cryptographic Design} - -The system's overall cryptographic design is intentionally kept simple. The standard explicitly mentions that symmetric -primitives have been preferred over asymmetric primitives in the core key escrow functions due to the risk of an attack -on asymmetric primitives in the long term. Notably, other advanced cryptographic techniques such as secret sharing -schemes, oblivious pseudo-random functions, or multiparty computation that could help with the security and privacy of -the key escrow service by reducing trust placed in any single component of the service are also absent while the system -relies extensively on the engineering-based security guarantees of TEEs and HSMs. Given that the ePA system trusts its -HSMs as unconditionally secure, it is unclear what purpose the manual yearly root key renewal serves, especially absent -an automatic way to roll over the wrapped record keys. - -A consequence of the systems' simple cryptographic design is that the system trusts its components to a large degree. -For instance, the system leaks a person's insurance ID number to the key escrow HSM every time record keys are -requested. Along with the timing and frequency of these requests, this leaks information on the person's condition to -the key escrow service in an identifiable way. 
- -% TODO I feel that this section is a mix-up of critique on the cryptographic design and the approach to privacy -% protection and data minimisation. How are they linked? I'm missing some discussion here. - -\subsubsection{A Realistic Attacker Model} - -We observe that the system as a whole does not appear to be designed to defend against well-resourced adversaries. The -series of practical attacks that have been demonstrated on the system confirm this impression. In -\textcite{tschirsichKonnteBisherNoch0100} summarize a series of successful attacks. Attacks include social engineering -resulting in access to copies of smartcards enabling accessing patient records, using misconfigured Konnektor VPN -appliances with their LAN DMZ and authentication interface exposed on the public internet, circumventing video-based -authentication processes resulting in duplicate file keys being provided, classis SQL injection on a backend service -maintaining an authentication database, accessing all national patient records through brute-force enumeration of weak -identifiers, and several more. - -We believe that a system like this must be designed to withstand well-resourced adversaries such as enemy secret -services, since the medical data stored in such as information on chronic illness, sexually transmittable disease or -severe food allergies has intelligence value. Repeated breaches of national digital infrastructure such as the 2015 -breach of the US Office of Personnel Management \cite{barrettUSSuspectsHackers2015} or the 2024 compromise of US -telecommunications wiretapping systems \cite{mennChineseGovernmentHackers2024} demonstrate that such state-sponsored -attacks on national digital infrastructure are a realistic concern. 
A possible scenario in the ePA system would be an -enemy secret service gaining access to one of the HSMs storing the systems' root secrets, extracting the root secret by -an advanced physical attack, then being able to decrypt captured encrypted health records at will. Similarly, a -nation-state adversary might have access to an exploit allowing the compromise of the system's TEEs, which would enable -the extraction of any patient records being processed in plaintext inside these TEEs. - -\subsubsection{Physical Security} - -Physical security has received some consideration in the system's specification. First, smart cards are used extensively -for authentication. Second, Hardware Security Modules are used in key locations of the system to process some -cryptographic secrets. The core of the system's key escrow service is implemented inside an HSM. However, it is notable -that the actual security level required for this HSM is only FIPS 140-2 level -3 \cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2002}. Not only has FIPS 140-2 -been superseded by FIPS 140-3 since -2019 \cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2019}, its security level 3 -mostly provides logical separation of cryptographic functions from other logic and is not very meaningful in the context -of physical attacks. The only physical requirement of FIPS 140-2 level 3 is that the HSM has a hard, opaque coating. -This coating is specified to be tamper-evident, but notably no active tamper detection or response features are required -by this standard. In contrast to the newer FIPS 140-3 standard and the related ISO/IEC 19790 \cite{ISOIEC19790} as well -as ISO/IEC 24759 \cite{ISOIEC24759} standards, FIPS 140-2 does not make any particular requirements regarding resistance -to side-channel attacks. 
The lack of tamper response, unspecified resistance to side-channel attacks and the fact that -the ePA specification only requires the long-lived key escrow root key inside the HSM to have 256 bits of entropy lead -to an unsatisfactory overall constellation. - -\subsection{Conclusion} - -In conclusion, we observe that in Germany's ePA national medical record database, despite the decade-long -standardization and implementation process, several cryptographic compromises ended up in the system's final deployment. -Even assuming that nation-scale key escrow is a good idea, the implementation of this key escrow system seems to stray -from current best practice. The system uses a secret key with only 256 bits of entropy to derive highly sensitive secret -keys for potentially tens of millions of people sharing an insurance provider. The cryptographic design of this escrow -system is unsophisticated, ignoring the past three decades in cryptographic developments particularly in multiparty -computation (MPC) and other secret sharing techniques in favor of an engineering approach. In the engineering dimension, -the system's physical security is only held to the basic level 3 of the obsolete FIPS 140-2 standard, which is -considerably less secure than an average credit card payment terminal. The system's root keys are only protected by a -``hard, opaque potting material'' and no tamper detection and response is required. We estimate that the system poses an -attractive and soft target to nation-state adversaries. The system's shortcomings are made more severe by the fact that -the system disproportionally affects the lives of people with low income. - -%FIXME work in rogawayMoralCharacterCryptographic? 
-Looking at the practice of applied hardware security, we observe that despite ample availability of commercial solutions -promising easy hardware security, clearly there is still a lack of solutions that provide the adaptability necessary for -some real use cases at low enough cost. By publishing the tamper-sensing technology we developed during the making of -this thesis as open source hardware designs, we wish to provide this missing building block to provide high-level -hardware security in real-world applications. Our hardware designs can be adapted to a devices ranging from Single-Board -Computers (SBCs) to servers, they are compatible with non-computing applications like Quantum Key Distribution (QKD) and -their design approaches can even be integrated into existing HSM designs to provide better security at little additional -cost. diff --git a/common-defs.tex b/common-defs.tex index 1488a7a..c0ce60b 100644 --- a/common-defs.tex +++ b/common-defs.tex @@ -151,14 +151,14 @@ \DeclareFieldFormat{labelprefix}{\textsuperscript{\sffamily#1}} \newcommand{\chapterbibliography}{ - \clearpage % clearpage flushes all figures. force this here so we don't get figures floating in between references. 
+ \FloatBarrier \addcontentsline{toc}{section}{References} \newrefcontext{webref} - \printbibliography[type={online},title={Web sources},heading=none,resetnumbers=false,segment=\therefsegment] - \newrefcontext{defref} - \printbibliography[nottype={online},nottype={patent},heading=none,resetnumbers=false,segment=\therefsegment] + \printbibliography[type={online},title={Web sources},heading=subbibliography,resetnumbers=false,segment=\therefsegment] \newrefcontext{patref} - \printbibliography[type={patent},title={Patent References},heading=none,resetnumbers=false,segment=\therefsegment] + \printbibliography[type={patent},title={Patent References},heading=subbibliography,resetnumbers=false,segment=\therefsegment] + \newrefcontext{defref} + \printbibliography[nottype={online},nottype={patent},heading=subbibliography,resetnumbers=false,segment=\therefsegment] } \newrefcontext{defref} diff --git a/common-packages.tex b/common-packages.tex index a62b036..e729d4f 100644 --- a/common-packages.tex +++ b/common-packages.tex @@ -36,6 +36,7 @@ \usepackage{colortbl} \usepackage{rotating} \usepackage{minitoc} +\usepackage{placeins} \usepackage{minted} % pygmentized source code %\usepackage[pdftex]{graphicx,color} %\usepackage{showframe} % Useful for page layout debugging diff --git a/thesis.tex b/thesis.tex index 4992665..ba1516c 100644 --- a/thesis.tex +++ b/thesis.tex @@ -35,6 +35,7 @@ \listoftables \dochapter{chapter-introduction} % Status: In pretty good shape +\dochapter{chapter-epa} % Status: In pretty good shape \dochapter{chapter-hsms} % Status: TODO \dochapter{chapter-ihsm} % Status: Copy-paste done, build works, integration TODO \dochapter{chapter-sampling-mesh-monitor} % Status: Copy-paste done, build works, integration TODO