
\documentclass[submission]{iacrtrans}

\usepackage[T1]{fontenc}
\usepackage[
    backend=biber,
    style=numeric,
    natbib=true,
    url=false,
    doi=true,
    eprint=false
    ]{biblatex}
\addbibresource{paper.bib}
\usepackage{amssymb,amsmath}
\usepackage{eurosym}
\usepackage{wasysym}
\usepackage[binary-units]{siunitx}
\usepackage{commath}
\usepackage{graphicx,color}
\usepackage{colortbl}
\usepackage{subcaption}
\usepackage{placeins}
\usepackage{array}
\usepackage{censor}
\usepackage{hyperref}
\usepackage{makecell}

\DeclareSIUnit{\baud}{Bd}
\DeclareSIUnit{\year}{a}
\DeclareSIUnit{\rpm}{rpm}
\renewcommand{\floatpagefraction}{.8}
\newcommand{\degree}{\ensuremath{^\circ}}
\newcolumntype{P}[1]{>{\centering\arraybackslash}p{#1}}
\newcommand{\partno}[1]{\textsf{\small#1}}
\newcommand{\price}[2]{#1 #2}
\newcommand{\todo}[1]{\textbf{TODO}\footnote{#1}}
% Set to 1.0 for final two-column export
\newlength{\figurescale}
\setlength{\figurescale}{0.75\textwidth}

\begin{document}

\author{Jan Sebastian Götte\inst{1} \and Björn Scheuermann\inst{2}}
\institute{Technical University of Darmstadt, Darmstadt, Germany, \email{jan.goette@tu-darmstadt.de}\and
    Technical University of Darmstadt, Darmstadt, Germany, \email{bjoern.scheuermann@kom.tu-darmstadt.de}}
\title{High Fidelity Security Mesh Monitoring using Low-Cost, Embedded Time Domain Reflectometry}
\maketitle

% FIXME maybe don't use HSM, maybe use active tamper sensing? envelope protection?

\begin{abstract}
    Security Meshes are patterns of sensing traces covering an area that are used in Hardware Security Modules (HSMs) to
    detect attempts at physical intrusion into the HSM's protective shell. In this paper, we present an optimized,
    embeddable security mesh monitoring circuit that applies the principles behind Time Domain Reflectometry (TDR) to
    create a unique fingerprint of a mesh, and to detect not only DC faults, but also attempts at bridging and removing
    parts of the mesh. We demonstrate a working prototype of our TDR circuit, which improves over previous low-cost TDR
    approaches by utilizing exclusively inexpensive, consumer-grade components with a total Bill of Materials (BoM) cost
    of less than \price{10}{\euro} while achieving a time resolution better than \qty{200}{\pico\second}.
\end{abstract}

\section{Introduction}

Security meshes continue to be the state of the art for tamper sensing in applications where sophisticated physical
attacks such as attempts at drilling or sawing through the device's enclosure to place probes must be prevented. Common
applications for such meshes include Hardware Security Modules (HSMs) used to store and process cryptographic keys while
applying certain security standards such as
FIPS-140-2\cite{usnationalinstituteofstandardsandtechnologySecurityRequirementsCryptographic2002} or ISO/IEC
24759\cite{ISOIEC24759}, as well as card payment terminals where PCI PTS HSM
standards\cite{pcisecuritystandardscouncilPaymentCardIndustry2021} are applicable. Security meshes usually consist of
two or more conductive traces that are laid out in a meandering pattern to cover a surface, and which are monitored
electrically to detect attempts at penetrating this surface. Commercial designs often only monitor for short
circuits or breaks in the mesh traces. Such coarse monitoring cannot detect even unsophisticated attacks that
attempt to circumvent part of the mesh, and thus requires the mesh to be made from a special material that is difficult
to manipulate without breaking it.

To enable the use of less expensive, commodity materials such as Printed Circuit Boards (PCBs), the mesh's integrity
must be monitored with higher fidelity. In this paper, we present a low-cost monitoring circuit for security meshes
based on a Time-Domain Reflectometry (TDR) approach that provides such improved measurement fidelity compared to
commercial systems, and enables the use of less sophisticated meshes made from less expensive materials.

Our circuit generates a very fast pulse with a rise time better than \qty{200}{\pico\second} that is broadcast into the
mesh. While the pulse traverses the mesh, parts of it are reflected on imperfections inside the mesh. Our circuit
receives, amplifies and records these reflections with better than \qty{200}{\pico\second} time resolution.

We demonstrate a working prototype of our design, and present practical measurements of its electrical parameters as
well as its performance under several practical attack scenarios. A photo of our prototype setup including a security
mesh specimen is shown in Figure\ \ref{fig_pic_board}.

Compared to previous academic designs, our approach can be implemented at lower cost since it exclusively uses
inexpensive, commercially available mass-market components. Utilizing a TDR frontend, we improve over previous,
delay-based approaches in monitoring fidelity, achieving sufficient sensitivity for the detection of high-impedance
oscilloscope probes despite such probes being specifically designed to conduct measurements without disturbing the
circuit under test. Unlike previous, capacitance-based approaches, our design is compatible with inexpensive signal
switch ICs, enabling the protection of arbitrarily large meshes at minimal cost without compromising sensitivity.

\begin{figure}
    \centering
    \includegraphics[width=0.6\textwidth]{pic_board_setup_2_small_censored.jpg}
    \caption{Measurement setup. Shown are the test specimen board on the left, and the frontend board with one of the
    four pulse amplifiers in the center. The frontend board is powered through a USB-C connection, and data is sent to a
    computer through a Serial Wire Debug (SWD) interface. The grid in the background has \qty{10}{\milli\meter} pitch.
    Note: Author names and institutional affiliation were removed from this picture for peer review.}
    \label{fig_pic_board}
\end{figure}

Security meshes can be implemented at the macro scale, covering entire Printed Circuit Board Assemblies
(PCBAs) in applications such as Hardware Security Modules (HSMs) or card payment terminals, or they can be implemented
at the micro scale to prevent the readout of secrets from Integrated Circuits (ICs) such as smartcards or Trusted
Platform Modules (TPMs). Commercial implementations of macro-scale security mesh monitoring circuits are largely limited
to simple trace continuity monitoring due to cost constraints. A limited amount of academic work on higher-fidelity
monitoring approaches exists, but comes with the use of expensive, specialty components and has not yet found widespread
adoption.

Micro-scale tamper sensing meshes are usually implemented as passive sensors without a continuous power supply, and are
only checked once during system powerup, while macro-scale meshes are usually implemented as active sensors with a
continuous backup power supply so as to not give the attacker a window of attack when the remaining system is powered
down. There are academic works proposing the use of security meshes as Physically Uncloneable Functions (PUFs) to
provide a high-fidelity tamper sensor that can even detect attempts at patching the mesh to fix traces broken in a
drilling attack\cite{
    immlerBTREPIDBatterylessTamperresistant2018,
    immlerSecurePhysicalEnclosures2018,
    garbTamperSensitiveDesignPUFBased}.

As is often the case with security technologies, in practice a tension exists between the level of security offered by a
particular security mesh implementation, and its implementation cost. The most secure meshes require specialized
manufacturing techniques that aim to produce what is essentially a Flexible Printed Circuit (FPC) whose materials are
specifically chosen to be as fragile as possible such that it breaks even during careful manipulation by an attacker. In
contrast to this, industrially simpler approaches are still commonly used for their ease of implementation. Often,
standard copper/polyimide FPCs are used because of the wide availability of manufacturing services. In some
lower-security applications such as card payment terminals, meshes manufactured from simple PCBs are used.

In this paper, we introduce an approach for the design of improved, higher fidelity security mesh monitoring circuitry
and present a practical prototype demonstrating our design's capabilities. The contributions of our work are as follows:

\begin{itemize}
    \item Our approach provides higher fidelity compared to state-of-the-art security mesh conductivity monitoring and
        improves the sensitivity of meshes, even when they are manufactured using less advanced technologies such as standard
        FPC or PCB processes. Our TDR frontend produces 70 data points for each meter of mesh length, resulting in a
        measurement density per mesh area of \qty{150}{\bit\per\centi\meter^2} when using a mesh manufactured in a
        standard low-cost commercial PCB process.
    \item Our approach consists of an optimized, low-cost differential Time Domain Reflectometry (TDR) frontend built
        around a commodity microcontroller and an amplifier IC originally intended for digital video applications. Our
        design achieves pulse risetimes below \qty{200}{\pico\second}, corresponding to only \qty{3}{\centi\meter} of
        wave propagation inside the mesh at the speed of light in PCB material, a $25\times$ improvement over the
        closest previous work\cite{vasileActiveTamperDetection2017,vasileTemperatureSensitiveActive2017}.
    \item We explain the rationale behind our design. It is based entirely on commercially available,
        inexpensive mass-market components, which means it can be replicated and extended by anyone without
        access to bespoke production equipment or semiconductor manufacturing capabilities. To facilitate
        further research and practical applications, we publish our prototype under an Open Source license.
    \item We present a working prototype along with extensive experimental results, including laboratory measurements of the
        technical performance of our design. Furthermore, we practically demonstrate that our design is able to not only
        detect, but distinguish and even localize faults in several realistic attack scenarios. We demonstrate that our
        design shows sufficient sensitivity to detect and localize an attack using a commercial, high-impedance
        oscilloscope probe.
\end{itemize}

\section{Related Work}

A general introduction into Hardware Security Modules can be found in
\textcite{andersonCryptographicProcessorsASurvey2006} as well as \textcite{tehranipoorHardwareSecurityPrimitives2023}.
While security meshes are widely used in practice, their design is only covered by a sparse research corpus. Research in
the area spans both improvements to security
meshes\cite{immlerBTREPIDBatterylessTamperresistant2018,garbTamperSensitiveDesignPUFBased,vasileProtectingSecretsAdvanced2019}
and monitoring approaches that attempt to entirely replace security meshes using other
primitives\cite{vaiSecureArchitectureEmbedded2015}.

As \textcite{andersonSecurityEngineeringGuide2020} notes, while this area is actively researched commercially,
security-by-obscurity is often considered a good idea there and, with few exceptions, little detail is published on physical
security implementations. The academic work listed below should be understood with this caveat in mind. One of the goals
of this paper is raising the bar in the academic state of the art to a level that likely lies beyond the current state
of the art in the commercial sphere.

Patent literature gives a partial view on commercial developments in this area. Even recent patents such as\cite{
    longMichaelFisherPoughkeepsie,
    nortonTamperDetectingCases2019,
    razaghiTamperDetectionSystem2020,
    wesselhoffTamperResponsiveSensor2020,
    hall72InventorsAlan,
    wesselhoffTamperResponsiveSensor2018,
    dangler54METHODMANUFACTURING,
    wadeTamperProtectionMesh2016,
    wadeMagneticStripeReader2015,
    wernerFabricatingTamperrespondentSensors2024,
    busbyTamperDetectionEnclosuretoboard2020,
    chockPointSaleTerminal2009}
\todo{Individually closely check each of these!} from HSM manufacturers IBM and HP, ATM component manufacturer Cryptera,
payment terminal manufacturer Stripe as well as industry publications\cite{nisargaSystemLevelTamperProtection2016}
continue to cite security mesh monitoring techniques that are no more sophisticated than trace resistance monitoring at
best, suggesting that commercial systems might not be more sophisticated than current academic proposals.

\subsection{Security Mesh Monitoring and Design}

\paragraph{Meshes as capacitive PUFs.}
\textcite{immlerBTREPIDBatterylessTamperresistant2018,obermaierMeasurementSystemCapacitive2018,garbTamperSensitiveDesignPUFBased}
propose one of the most advanced security mesh designs in the current academic state of the art. They use a specialized
security mesh as a Physically Uncloneable Function (PUF), combining tamper sensing with cryptographic key storage. In
their design, the mesh consists of a cross-hatch pattern made from several dozen individually addressable capacitive
electrodes. Their analog frontend measures the precise mutual capacitance of each pair of electrodes using an approach
similar to \textcite{satoToucheEnhancingTouch2012}, and they use the resulting capacitance matrix as the basis of their
PUF.

Advantages of their system include high sensitivity to modifications and the fact that, as a PUF, the system does not
require a continuous power supply. However, there are several significant differences between their proposed system and
our design.

\begin{itemize}
    \item Their system is limited by sensing circuit dynamic range, which they compensate by using a large number (32)
        of electrodes in parallel. Covering larger volumes with such a system would require increasing electrode count
        further, resulting in a linear increase in frontend cost when targeting the same scanning speed. In contrast to
        this, our system can cover larger volumes by the addition of inexpensive signal switches.
    \item Their system requires a mesh manufactured in a specialized manufacturing process. Additionally, precise
        control of this process is critical to maintain the PUF property of the device. In particular, if the
        manufacturing process is \emph{too consistent}, it could result in multiple PUFs exhibiting the same or similar
        responses.
    \item Their system requires a complex frontend circuit. Initial prototypes used a large number (one per channel) of
        not inexpensive operational amplifiers along with a particular Junction Field Effect Transistor (JFET) that has
        since become unavailable due to obsolescence. Later, they developed a custom IC containing the frontend circuit
        for an envelope foil measuring approximately \qty{18}{\centi\meter} by
        \qty{10}{\centi\meter}\cite{obermaierMeasurementSystemCapacitive2018,garbFORTRESSFORtifiedTamperResistant2021}.
        In contrast, our system requires only widely available, low-cost commodity components, for each of which
        alternative substitutes from other manufacturers are available. Furthermore, in our design, a single sensing
        frontend can be shared among multiple meshes covering a large area by daisychaining the meshes or by
        using inexpensive signal switch ICs.
\end{itemize}

\paragraph{Bridge measurement of capacitive interdigital meshes.}
\textcite{dupontMiniaturizedUltraLowPowerTamper2022} introduce a simple analog circuit approach for monitoring meshes
laid out as a set of capacitive interdigital structures not unlike the combs found in Micro-Electromechanical System
(MEMS) accelerometers and gyroscopes. They subdivide the mesh into four equal-sized quadrants, each containing two
equal-size interdigital electrodes. They connect the resulting eight electrodes in a capacitive bridge configuration,
and measure the bridge's balance using a simple analog monitoring circuit.

Advantages of their system include the simple, low power monitoring circuit made from basic, cheap components and the
capability to work with single-layer meshes such as those produced using Laser Direct Structuring (LDS).

\paragraph{Frequency-domain mesh characterization.}
\textcite{vasileProtectingSecretsAdvanced2019} introduce a monitoring method where they feed a variable-frequency signal
into one end of a continuous mesh trace, and measure the power of the signal coming out of the other end. In essence,
their setup measures $S_{12}$ magnitude in a similar way to a network analyzer.

Advantages of their design include the simple implementation, and the potentially robust nature of frequency-domain
measurements. Disadvantages include a nonstandard three-layer mesh stackup, as well as the susceptibility of the system
to attack by emulation given that the log power sensor they are using at the mesh output is designed to be insensitive
to any signal characteristics apart from total signal power.

\paragraph{Time-domain mesh monitoring.}
The prior work in the academic corpus that is probably closest to our proposal is the work of
\textcite{vasileActiveTamperDetection2017,vasileTemperatureSensitiveActive2017}, where they propose monitoring the
time-domain response of a mesh using a circuit made from a pulse generator and a fast Analog-to-Digital Converter (ADC).
To avoid the need for a full high-speed data processing pipeline, their design is centered around a specialized
high-speed ADC that has a small built-in sample memory, allowing them to capture a pulse at high speed before slowly
processing it from sample memory.

Advantages of their design include better sensitivity to changes in total mesh trace length compared to simple
continuity monitoring and the low complexity of their analog frontend. However, their proposed design differs from our
work in several fundamental aspects.

\begin{itemize}
    \item The design from \textcite{vasileActiveTamperDetection2017} hinges on a specialized high-speed ADC
        that has a large internal sample buffer. Not only is this part expensive at \price{15.95}{\euro} at quantity
        1000, to our knowledge it is also the only part of its kind available on the market. Forgoing this part and
        using a comparably fast ADC without this sample buffer would require a fast digital processing frontend,
        resulting in greater system cost. In contrast, our design uses widely available parts, all of which can easily
        be substituted for other, similar parts from different manufacturers.
    \item Their system is limited in time resolution by their choice of ADC. Despite using a high performance ADC, their
        system only achieves a time resolution of \qty{5}{\nano\second}, more than $25\times$ coarser than our design. Because
        the cost of ADCs quickly escalates with sampling speed, sub-nanosecond resolution would be difficult
        to achieve with their approach. For instance, the cheapest ADC available at distributor Digi-Key that would
        enable \qty{1}{\nano\second} resolution--still more than $5\times$ coarser than our design--would already cost more
        than \price{110}{\euro} at quantity 1000 and due to its relevance to electronic warfare and radar applications
        might require specialized clearance for export from countries such as the USA.
    \item Their system only measures the mesh's \emph{transmission} characteristic, corresponding to an $S_{12}$
        S-parameter measurement configuration. This configuration is sensitive to changes in total mesh length, but is
        insensitive to changes in impedance along this length. While the transmitted signal strength will be affected by
        changes in impedance, such changes manifest only in the height of the output pulse, resulting in the whole
        information being mapped to a small number of ADC samples. Using such a measurement, it is not possible to
        localize faults. In contrast, our approach measures the signal's \emph{reflected} component, which is sensitive
        to both length, and to changes in impedance along the length. Additionally, our approach enables the
        localization of faults.
\end{itemize}

\subsection{Equivalent-Time Sampling}

Today, systems that digitize high-speed signals usually use a fast ADC, sometimes preceded by one or several
downconverting mixers. This development was enabled by both the increasing availability of ADCs capable of digitizing
hundreds of megasamples per second at a reasonable resolution, and by the increase in speed and capability of CPUs,
FPGAs and other digital components enabling the processing of the large amounts of data generated by such converters in
real time. However, this is largely a development of this millennium, whereas signals far into the gigahertz range
have been studied since the advent of radar technology in the Second World War\cite{kahrs50YearsRF2003}. Enabled by the
progress from vacuum tubes to semiconductor devices, equivalent-time sampling became the technology of choice for the
latter half of the twentieth century. Around the turn of the millennium, the introduction of high-speed digital
processing and fast ADCs enabled real-time conversion up into higher microwave frequencies, today reaching beyond the
\qty{100}{\giga\hertz} boundary.

\textcite{kahrs50YearsRF2003} traces the style of four-diode balanced bridge sampling gate that we use back to a vacuum
tube implementation presented in \textcite{chanceWaveforms1949}. This style of sampling gate found application in a
number of oscilloscope sampling frontends throughout the twentieth century, such as
HP's 187B\cite{HP187BDualTrace1962}.

While initially equivalent-time sampling was used to circumvent technological limitations, more recently it has also
been used to achieve cost-optimized designs\cite{houtman1GHzSamplingOscilloscope2000}. Going along similar principles,
\textcite{polasekReflektometrCasoveOblasti2020} presents a design for a minimal sampling TDR circuit that uses a CMOS
clock generator IC along with a CML fanout buffer for pulse generation. The circuit improves upon the double sampling
design first presented by \textcite{houtman1GHzSamplingOscilloscope2000} to reconstruct a downsampled copy of the input
signal in the analog domain before digitization.

\subsection{Low-Cost Time Domain Reflectometry}

\textcite{bencivenniTimeDomainReflectometer2013} present an FPGA-based embedded reflectometer design. Since their design
is based on an early FPGA family dating back to 2003 that lacked the speed and the adjustable I/O delay features of more
modern FPGA families, their design uses the FPGA's logic resources to achieve adjustable delays.
\textcite{negreaSequentialSamplingTime2009} show an equivalent-time sampling TDR that uses specialized adjustable delay
line ICs for pulse generation. \textcite{lee16psresolutionRandomEquivalent2003} achieve very high time resolution in an
equivalent-time sampling TDR system by using a vernier approach to pulse generation, such that their system is limited
by analog bandwidth, not time resolution. \textcite{trebbelsMiniaturizedFPGABasedHighResolution2013} show another
FPGA-based TDR. Their system also uses a part from the same early FPGA family as
\textcite{bencivenniTimeDomainReflectometer2013}, and they work around its lack of precise timing primitives by
generating a low-frequency sine wave through DDS, which they filter, and then sample using a comparator, a similar
approach to the timing generation in \textcite{houtman1GHzSamplingOscilloscope2000}. Additionally, they avoid the need
for a discrete ADC by implementing a $\Delta\Sigma$ loop around a fast comparator, trading off slower acquisition time
for lower hardware complexity. They use a \qty{5.5}{\volt\per\nano\second} wideband amplifier IC to generate their
stimulus pulse, achieving a rise time of \qty{2}{\nano\second}. As a result, similar to
\textcite{lee16psresolutionRandomEquivalent2003}, their design is limited by analog bandwidth--here resulting from the
nanosecond-scale stimulus risetime--not by frontend time resolution.

\section{Monitoring a Security Mesh using Time-Domain Reflectometry}

Time-Domain Reflectometry (TDR) is a well-known technique that is used to locate faults along a signal channel such as a
copper cable, or an optical fiber. In TDR, a pulse is sent into the beginning of the channel. While the pulse traverses
the channel, any fault such as a discontinuity in electrical impedance or optical density causes part of the pulse to
travel back in a partial reflection. TDR monitors these reflections by recording the signal at the beginning of the
channel after the pulse has been sent. When the pulse reaches the end of the channel,
depending on termination it can be reflected to travel back to the beginning, which allows measurement of the channel's
length.
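
The localization principle can be written as a short worked relation. As a sketch, assume a uniform propagation
velocity $v$ along the channel and let $\Delta t$ denote the delay between the stimulus and a recorded reflection. These
symbols, as well as the example value of the effective permittivity $\varepsilon_\mathrm{eff}$, are illustrative
assumptions and not measured properties of a particular mesh. Since the pulse travels to the fault and back, the
distance $d$ to the fault is
\begin{equation*}
    d = \frac{v\,\Delta t}{2}, \qquad v = \frac{c}{\sqrt{\varepsilon_\mathrm{eff}}}.
\end{equation*}
For example, taking $\varepsilon_\mathrm{eff} \approx 4$ gives $v \approx \qty{1.5e8}{\meter\per\second}$, so that a
time step of $\Delta t = \qty{200}{\pico\second}$ corresponds to $v\,\Delta t \approx \qty{3}{\centi\meter}$ of one-way
propagation, or about \qty{1.5}{\centi\meter} of distance to the fault.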

\subsection{Attacks on a Security Mesh Viewed Using TDR}

In this paper, we apply TDR to monitor a security mesh for changes caused by an attack. Our prototype setup consists of
a custom circuit board containing a low-cost embedded TDR frontend that can be connected to a security mesh specimen to
measure its response. We construct a security mesh with a ground plane underneath similar to previous
work\cite{immlerBTREPIDBatterylessTamperresistant2018,
obermaierMeasurementSystemCapacitive2018,
garbTamperSensitiveDesignPUFBased}, which when viewed in the microwave domain constitutes what is essentially a delay
line. Security meshes commonly use a pair of traces to detect short circuits between adjacent traces,
which we treat as a differential pair for improved resiliency against electromagnetic interference. We constructed our
frontend such that it excites the two traces differentially, but allows for both single-ended and for differential
measurements.

In an intact mesh, we expect our frontend to record no significant reflections until the stimulus pulse has traversed
the mesh's traces both ways, at which point we expect a large response whose polarity and amplitude depend on the
termination on the far end of the mesh. In our prototype circuit, we made this termination configurable to expand the
range of possible measurement configurations and to enable self-calibration of the circuit.

When an attacker attempts to tamper with the mesh, they will cause an impedance discontinuity. Cuts of one or both
traces, or a short circuit between both traces will result in a total reflection of the incident pulse at the location
of the fault, which our circuit will easily detect as the delay of the response changes. However, beyond these simple
cases, our approach can also detect more subtle changes. For instance, a short circuit between two points along the same
mesh trace will also result in a change in delay along this trace. Furthermore, even just probing a mesh trace with an
oscilloscope probe will add the probe's input capacitance, which is usually on the order of several picofarads, to one
point along the trace, resulting in an impedance step that can be detected by TDR. The TDR approach is thus able to not
only detect, but distinguish and even localize several types of faults or attacks in a mesh.
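
These cases can be related through the reflection coefficient of a transmission line. The following is a textbook
sketch rather than a model of our specific mesh: $Z_0$ denotes the characteristic impedance of the trace, $Z_L$ the
impedance seen at the discontinuity, and all numeric values below are hypothetical.
\begin{equation*}
    \Gamma = \frac{Z_L - Z_0}{Z_L + Z_0}
\end{equation*}
A cut ($Z_L \to \infty$) yields $\Gamma = +1$ and a short circuit ($Z_L = 0$) yields $\Gamma = -1$, i.e.\ total
reflection with opposite polarities. A probe that adds a shunt capacitance $C$ loads the line with $Z_0$ in parallel
with $1/(sC)$, which gives
\begin{equation*}
    \Gamma(s) = -\frac{s\tau}{1 + s\tau}, \qquad \tau = \frac{Z_0 C}{2}.
\end{equation*}
For an incident edge modeled as a linear ramp with rise time $t_r$, the reflection is a negative dip whose peak
magnitude relative to the incident step is
\begin{equation*}
    \left|\Gamma_\mathrm{peak}\right| = \frac{\tau}{t_r}\left(1 - e^{-t_r/\tau}\right).
\end{equation*}
Assuming for illustration $Z_0 = \qty{50}{\ohm}$, $C = \qty{4}{\pico\farad}$ and $t_r = \qty{200}{\pico\second}$, we
obtain $\tau = \qty{100}{\pico\second}$ and $\left|\Gamma_\mathrm{peak}\right| \approx 0.43$. The same capacitance
under a \qty{2}{\nano\second} edge would yield only $\left|\Gamma_\mathrm{peak}\right| \approx 0.05$, which illustrates
why a fast stimulus edge matters for detecting small capacitive loads.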

\section{Circuit Design and Driving Approach}

\begin{figure}
    \centering
    \hspace*{-7mm}
    \includegraphics[height=80mm]{block_diagram.pdf}
    \caption{Block diagram of our prototype sampling TDR security mesh monitoring circuit.}
    \label{fig_block_diagram}
\end{figure}

A TDR can be broken down into three basic components. First, we need a source of fast pulses (or fast edges!) to
stimulate the mesh. Second, we need a coupler that allows us to couple the stimulus pulses into the mesh, and their
reflections out of it. Finally, we need a fast ADC to capture the reflections.

Figure\ \ref{fig_block_diagram} shows a block diagram of our design\footnote{Full schematics are available in this
paper's supplementary material.}. At the core of our design lies an equivalent-time sampling setup, where two
diode bridge sampling gates alternately sample the two traces of the mesh.
Since physical attacks happen on a time scale of minutes or hours, we do not need a fast acquisition rate.
Equivalent-time sampling uses fast sampling gates to sample a high-frequency signal at a low frequency that is suitable for direct
conversion through an ADC. This reduces the requirements of our data acquisition and signal processing frontend from
gigasamples per second to mere megasamples, well within the range of what a commodity microcontroller can handle.
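
The rate reduction can be illustrated with a short calculation. The symbols and numbers here are assumptions chosen for
illustration, not the parameters of our prototype: $\Delta\tau$ is the step by which the sampling pulse is delayed
between repetitions, $f_\mathrm{rep}$ is the stimulus repetition rate, and $T_w$ is the length of the recorded time
window. Taking one sample per stimulus repetition, the equivalent sampling rate, the number of points per sweep, and the
sweep duration are
\begin{equation*}
    f_\mathrm{eq} = \frac{1}{\Delta\tau}, \qquad
    N = \frac{T_w}{\Delta\tau}, \qquad
    T_\mathrm{sweep} = \frac{N}{f_\mathrm{rep}}.
\end{equation*}
With $\Delta\tau = \qty{200}{\pico\second}$, the equivalent sampling rate is $f_\mathrm{eq} = \qty{5}{\giga\hertz}$.
For a hypothetical window of $T_w = \qty{20}{\nano\second}$ and $f_\mathrm{rep} = \qty{1}{\mega\hertz}$, a sweep
consists of $N = 100$ points and takes $T_\mathrm{sweep} = \qty{100}{\micro\second}$, while the ADC only has to perform
one conversion per repetition, i.e.\ one million conversions per second.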

A challenge in equivalent-time sampling is precisely phase-synchronizing the sampling pulse to the fundamental frequency
of the input signal, which is usually implemented by using a high-speed comparator. In a TDR-style frontend like ours,
this expensive component can be avoided because the stimulus signal is generated in the frontend, simplifying the
challenge to generating a synchronized sampling pulse at an adjustable phase to the stimulus pulse.

Since an intact mesh has low insertion loss, the amplitude of the response of an intact mesh is large. Thus, we do not
need a high dynamic range in either the frontend amplifiers or the ADC, enabling the use of commodity operational
amplifiers (opamps) and the built-in ADC of a commodity microcontroller. Further, the strong signal allows us to use a
comparatively lossy \qty{-6}{\deci\bel} resistive tee instead of a directional coupler. A resistive tee does not provide
directionality, but in our case the incident pulse can never interfere with reflections at the sampling output of the
divider because of causality.
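
To put the coupler loss into perspective, the decibel figure can be converted into a voltage ratio. This is a rough
sketch that assumes an idealized, matched divider in which the outgoing stimulus and the returning reflection are each
attenuated by the same factor. The symbols $V_\mathrm{in}$, $V_\mathrm{out}$, $V_\mathrm{stim}$ and
$V_\mathrm{sample}$ are introduced here for illustration only.
\begin{equation*}
    20 \log_{10} \frac{V_\mathrm{out}}{V_\mathrm{in}} = \qty{-6}{\deci\bel}
    \quad\Longrightarrow\quad
    \frac{V_\mathrm{out}}{V_\mathrm{in}} = 10^{-6/20} \approx 0.5
\end{equation*}
Under this assumption, a reflection with reflection coefficient $\Gamma$ arrives at the sampling output with an
amplitude of approximately
\begin{equation*}
    V_\mathrm{sample} \approx 0.5 \cdot \Gamma \cdot 0.5 \cdot V_\mathrm{stim} = \frac{\Gamma\, V_\mathrm{stim}}{4},
\end{equation*}
corresponding to a total loss of about \qty{12}{\deci\bel} relative to the stimulus amplitude.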

To implement our sub-nanosecond sampler, we chose a simple four-diode bridge sampling gate made from contemporary
commodity \partno{BAT17-04W} RF Schottky diodes, which offer turn-on times better than \qty{100}{\pico\second} at
\price{0.13}{\euro} per device at quantity 1000. The four-diode configuration requires only two dual diode packages. In
contrast to \textcite{polasekReflektometrCasoveOblasti2020,houtman1GHzSamplingOscilloscope2000}, in our system, double
sampling is not necessary; instead, we follow the sampling gate directly with an amplifier feeding into the internal
ADC of our microcontroller. We use an internal timer peripheral of the same microcontroller to generate both stimulus
and sample pulses such that we can easily phase-lock the internal ADC to the same timer.

We base our circuit around a \partno{STM32G474RB} microcontroller, a \price{5}{\euro}-class commodity ARM
microcontroller. Besides adequate processing speed for its price class, this microcontroller offers two features that
are critical to our design. First, its internal ADCs are both higher resolution and faster than those of older parts.
Second, it is one of a few parts in its series that include a \emph{high-resolution timer} (\partno{HRTIM}) peripheral
that provides several outputs that can be controlled with better than \qty{200}{\pico\second} resolution through
per-output, self-calibrating delay line circuitry. We use this peripheral to produce both the stimulus pulse and the
phase-adjustable sampling pulse.

While the HRTIM peripheral allows us to finely adjust the phase of its output waveform, the digital output structures of
the \partno{STM32G4} series are still limited to nanosecond-scale rise and fall times with the datasheet quoting
$t_r=t_f=\qty{1.7}{\nano\second}$ into a \qty{10}{\pico\farad} load when using the fastest GPIO output drive strength
setting and a \qty{3.3}{\volt} supply\cite{stmicroelectronicsSTM32G474xBDatasheet2021}. We work around this issue by
applying two circuit techniques. First, we send the timer output through a fast amplifier to square up the edges to a
rise time better than \qty{500}{\pico\second}. This leaves us with pulses that have crisp edges but that, due to
constraints of the HRTIM peripheral, are more than \qty{10}{\nano\second} wide and thus still too wide to be useful.
Second, we therefore apply a clip line\cite{tektronixinc.TektronixS6Sampling1982} pulse forming network at the
output of the amplifier--i.e.\ we connect the amplifier's output to the load in parallel with a short, terminated
transmission line stub. The length of this stub determines the pulse width.
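
As an illustrative estimate, assume that the stub is short-circuited at its far end, as in a classic clip line. The
inverted reflection of the edge then returns to the amplifier output after one round trip through the stub and cancels
the remainder of the pulse. For a stub of length $l_s$ with propagation velocity $v$, the pulse width is then
approximately
\[
    t_w \approx \frac{2 l_s}{v}.
\]
For a hypothetical stub of $l_s=\qty{15}{\milli\meter}$ on a substrate with
$v\approx\frac{c}{2}\approx\qty{1.5d8}{\meter\per\second}$, this yields $t_w\approx\qty{200}{\pico\second}$. Both values
are example figures and not the dimensions of our prototype.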

\subsection{Driver Selection}

There are several types of amplifiers that can be used in our pulse shaping application. Common to all options, we
require differential outputs. In practice, for most parts this means we are looking for a part with Current Mode Logic
(CML) outputs. CML is a differential signaling standard that is widely used in high-speed logic. In CML, a current
source feeds a pair of transistors that steer current between the two outputs of the differential pair. By steering
current between the two outputs, common-mode currents are minimized, which both reduces the effect of power supply
impedance at the transmitter and reduces electromagnetic emissions from the differential pair's PCB traces. In our
experiments, we considered a number of parts and settled on four parts for evaluation in this paper: a
\partno{74LVC2G157} standard logic IC, two display protocol redrivers, \partno{PI3HDX12211} and \partno{TDP0604}, as
well as \partno{MAX3748}, a limiting amplifier for optical networking applications. We implemented four variants of our
prototype using a steady hand under a microscope as shown in Figure\ \ref{fig_pic_amps}.

One notable omission from our tests is the series of CML-output comparators such as the \partno{ADCMP606} made by Analog
Devices. These parts are readily available, easy to interface, and popular in other designs. However, we avoided them
in our design due to cost.

\begin{figure}
    \centering
    \begin{subfigure}{0.23\textwidth}
        \centering
        \includegraphics[width=0.9\textwidth]{pic_74lvc_small.jpg}
        \caption{74LVC2G157}
    \end{subfigure}
    \begin{subfigure}{0.23\textwidth}
        \centering
        \includegraphics[width=0.9\textwidth]{pic_max3748_small.jpg}
        \caption{MAX3748}
    \end{subfigure}
    \begin{subfigure}{0.23\textwidth}
        \centering
        \includegraphics[width=0.9\textwidth]{pic_tdp0604_small.jpg}
        \caption{TDP0604}
    \end{subfigure}
    \begin{subfigure}{0.23\textwidth}
        \centering
        \includegraphics[width=0.9\textwidth]{pic_pi3hdx_small.jpg}
        \caption{PI3HDX12211}
    \end{subfigure}
    \caption{Circuit-board implementation of the four pulse amplifier variants of the design. Amplifiers were mounted
    dead bug style on a piece of copper tape connected to one of the supply rails, and hooked up with
    \qty{120}{\micro\meter} diameter wire according to their respective datasheets. Supply rails were hooked up using
    copper tape where possible to reduce series impedance. Additional \qty{10}{\micro\farad} MLCC power supply
    decoupling capacitors were placed close to the ICs on the copper tape to reduce loop area.}
    \label{fig_pic_amps}
\end{figure}

\paragraph{Standard logic ICs.}
As a baseline, we evaluated the \partno{74LVC2G157} standard logic IC. This IC contains a single multiplexer, but
we are not interested in the multiplexer functionality. What makes this chip interesting is that it is one of
the few \partno{74} series standard logic parts that have complementary outputs. According to manufacturer
specifications, at a comparable \qty{20}{\pico\farad} load, \partno{74LVC} series parts have slightly faster rise and
fall times compared to our \partno{STM32} microcontroller's digital IO
pins\cite{renesaselectronicscorporationApplicationNoteAN2242019}.

\paragraph{Optical Networking Chipsets.}
Another category of CML-output drivers suitable for our application is optical networking chipset ICs. While
today, the construction of optical transmitters has moved to direct bonding of optical components and driver ICs to
minimize parasitics, discrete driver ICs for some chipsets from the mid-2000s era are still available at reasonable
cost. Both the laser driver used to drive the transmitter laser diode, and the limiting amplifier used to amplify the
receiver photodiode's output can be used in our application, with the limiting amplifier part requiring less additional
circuitry in our application due to its lack of output bias control. In our evaluation below, we include the
\partno{MAX3748} limiting amplifier as a representative part from this category that is still commercially available. A
drawback of relying on a part like this is that its future availability is uncertain given the evolution of the
industry.

\paragraph{Bus Redrivers.}
The final category of amplifiers suitable for our pulse shaping needs is redrivers intended for high-speed data
interfaces such as USB 3, PCI Express, HDMI or DisplayPort. All of these interfaces use CML drivers, with differential
voltage levels usually on the order of \qtyrange{600}{1000}{\milli\volt}. \emph{Redriver} ICs are intended to be used to
amplify the sensitive high-speed bus signal at the edge of a PCBA, either before it leaves the board through a connector
to ensure adequate signal levels at the connector, or after it enters through a connector to compensate for loss in the
PCB traces between the connector and the signal's destination. For our application, redrivers intended for HDMI and
DisplayPort applications are most suitable, as they can usually be configured to act as simple amplifiers without
processing any protocol logic on the signals that are amplified. In contrast, both USB 3 and PCIe redrivers usually
implement power saving features that try to parse parts of the actual signal transmitted through them, which are hard to
bypass in our application.

Redrivers can be classified according to their mode of operation. \emph{Retimers} include a full
serialization/deserialization (SerDes) setup and parse the low-level protocol of the bus in order to reconstruct
bit-level timing. Here, we focus only on simpler redrivers that contain only amplifiers and (analog) equalizers.

Amplifying redrivers can be separated into two classes: Limiting and linear redrivers. A limiting redriver is configured
to have a high gain such that a small input signal will be amplified to the full output voltage swing. Limiting
redrivers are well-suited for our application, but they have fallen out of fashion since they interfere with link
training and with power saving features of protocols like USB 3.

Linear redrivers are constructed with a low gain instead. While sufficient to compensate for wiring losses, their gain
is low enough to leave them transparent to bus protocol features such as link training or power saving features. To
compensate for their reduced gain, linear redrivers usually contain configurable equalizers that can be used to apply
targeted enhancements for particular signal defects, such as boosting high-frequency gain or providing a set amount of
overshoot.

In our evaluation below, we include \partno{PI3HDX12211} as a linear redriver intended for DisplayPort and HDMI
applications, as well as \partno{TDP0604} as a ``hybrid'' linear or limiting redriver for HDMI applications, configured
for limiting mode in our experiments. An attractive feature of both of these chips as well as comparable devices is that
they usually include at least four independent channels, so only one chip is needed for both pulse paths. Additionally,
they are consumer mass market parts, resulting in a low price. For instance, \partno{PI3HDX12211} is available at
\price{2.11}{\euro} in single quantity and less than \price{1.30}{\euro} at several hundred quantity at distributor
LCSC, and \partno{TDP0604} is available at \price{4.72}{\euro} and \price{3.44}{\euro}, respectively, at distributor
Mouser.

\subsection{Cost Breakdown}

Table\ \ref{tab_bom} shows a breakdown of the cost of the main components of our prototype, resulting in a total
component cost of less than \price{10}{\euro}. We did not include power supply components in this breakdown as our
circuit is meant to be embedded into a payload circuit that will already have sufficient power supplies.

Due to its \partno{HRTIM} peripheral, the \partno{STM32G4} microcontroller is the component of our design that is
hardest to replace. However, this part can still be replaced with a wide range of FPGAs, which commonly include
digitally configurable delay lines on their IO pins for signal de-skewing. For instance, the \partno{ODELAY} primitive
of Xilinx 7 Series FPGAs provides the same $\frac{1}{32}$ clock cycle resolution that the \partno{STM32G4}
\partno{HRTIM} peripheral provides.

\begin{table}
    \begin{tabular}{c|c|c|l}
        \textbf{Part number}&\textbf{Amount}&\textbf{Unit cost in \euro}&\textbf{Description}\\\hline
        PI3HDX12211&1&1.37&Pulse amplifier\\
        STM32G474RB&1&3.51&Main microcontroller\\
        OPA1656&1&1.25&Sampling post-amplifier\\
        TMUXHS4212&2&0.64&Signal routing switch\\
        SKYA21003&2&0.49&Termination switch\\
        74LVC2G157&2&0.15&Pulse pre-conditioning\\
        BAT17-04W&4&0.12&Sampling gates\\
        &25&0.01&Various MLCC capacitors\\
        &25&0.01&Various resistors\\\hline
        \multicolumn{2}{r}{}&\textbf{9.67}&\textbf{Total}
    \end{tabular}
    \caption{A cost breakdown of the major components of our design. Listed prices are for 1000 pieces order quantity to
    make prices more comparable between distributors. The number of switches necessary for signal routing and
    termination depends on the specific mesh signal routing of the application. Numbers shown here are for our
    prototype, which can measure a mesh from both ends and supports short, open and matched termination.}
    \label{tab_bom}
\end{table}

\subsection{Measurement Principle and Scan Scheduling}
\label{sec_scan_schedule}
\todo{Mention measurement speed!}

The goal of a time-domain reflectometer is to send a pulse into the Device Under Test (DUT)--i.e.\ in our application,
the mesh--and to record all reflections returning from the DUT afterwards. In something like a security mesh whose
traces might only be a few meters long in total, the time span between the pulse being sent and the last reflections
from the very end of the mesh arriving is on the order of several tens of nanoseconds. Directly recording a response at
this timescale would be infeasible using a commodity microcontroller, so we utilize an equivalent-time sampling
approach.

Our analog frontend contains amplifiers that produce the stimulus pulse, a sampling gate with amplifiers, and a coupler
that couples the pulse into the mesh and that couples the reflections back into the sampling gate. The microcontroller
controls this frontend with two primary signals: A stimulus pulse, and a sampling pulse. By adjusting the timing between
these two pulses every time a stimulus pulse is sent, the microcontroller can select a particular point in time after
the stimulus pulse to record using the sampling gate. By slowly sweeping across the whole timespan, the microcontroller
can reconstruct the waveform of the reflected signal at the sampling gate across one period of the stimulus pulse. The
recording rate of this waveform is limited by the repetition rate of the stimulus pulse as well as the time step size.

The attainable repetition rate of our stimulus and sampling circuits is limited by two main components. First, the
sampling post-amplifier's bandwidth limits the maximum sample rate. In our design, we chose an \partno{OPA1656}
\qty{50}{\mega\hertz} Gain-Bandwidth Product (GBP) FET input low noise operational amplifier. We need a FET input part
to avoid loading the sampling gate. The comparatively high GBP and low noise input stage of this device allow us to
amplify small signals that could result from weak reflections at small impedance discontinuities inside the mesh.

The second major factor limiting repetition rate is the microcontroller's ADC speed, as well as the speed of the
software processing the ADC's output. At full \qty{12}{b} resolution, the ADC reaches a sampling rate of
approximately \qty{4}{MSps}.

Combining these factors, we settled on a sampling rate of \qty{1}{MSps} across both channels of the differential pair.
At this sampling rate, it is feasible to control the sample timing on a sample-by-sample basis. For all measurements in
this paper, we use a sequential sampling approach where the microcontroller takes a series of measurements for
oversampling at a particular delay, then increases the delay by one \partno{HRTIM} output clock interval.
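
The duration of one such sequential scan can be estimated as follows. The symbols and example values here are
illustrative assumptions and not measured figures. For a scan window of width $T_w$, a delay step $\Delta t$, $K$-fold
oversampling and a sample rate $f_s$, the scan takes approximately
\[
    T_\text{scan} \approx \frac{T_w}{\Delta t}\cdot\frac{K}{f_s}.
\]
Assuming $T_w=\qty{50}{\nano\second}$, $\Delta t=\qty{200}{\pico\second}$, $K=256$ and $f_s=\qty{1}{MSps}$ available to
a single channel, this results in $250$ delay steps and $T_\text{scan}\approx\qty{64}{\milli\second}$.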

While for our development, this sequential scanning method is adequate, in a practical security mesh monitoring
application, there are two simple optimizations that would decrease the time to detection for an attack. First, in a
practical application, the range of scanned delays should be adjusted to the length of the particular security mesh in
use. For this paper, we always scanned a time range starting before one stimulus pulse and ending shortly before the
next stimulus pulse so that any waveform artifacts will be visible. In a practical application, there would be little
information gained by sampling much beyond the edges of the expected mesh response, so the scan window should be kept
small to increase scan rate.

Second, in a practical application, the feature that is most relevant for detecting tamper attempts is the trailing edge
of the mesh's response. This trailing edge corresponds to the return of the stimulus pulse's reflection at the far end
of the mesh. Any attack that affects the impedance even only of part of the mesh has a high chance to affect its delay,
and thus this trailing edge is likely to move. In a practical application, it would thus be efficient to use a heuristic
scan schedule instead of the sequential scan we are using in our research prototype. Such a heuristic schedule would
sample delays near the expected trailing edge of the particular mesh in use more frequently compared to delays that lie
somewhere else, such as in the middle of the mesh's return window.

\section{Experimental Evaluation}

To validate our design, we will perform a two-fold evaluation. First, we want to measure the performance of our sampling
circuit as a time-domain reflectometer. The most relevant figure to our mesh monitoring application is the pulse
generators' rise time, which determines the frontend's sampling speed and consequently the level of detail that we are
able to extract from a connected mesh during one scan. Since we aim at fingerprinting a connected mesh, not at
performing absolute measurements, we do not need to characterize the transfer function of our TDR frontend.

Second, we will characterize the end-to-end performance of our design on a mesh test specimen, and we will evaluate its
performance on several realistic tamper attempts. As a baseline characterization, we will show measurements of both
short and open mesh traces, allowing us to evaluate our design's capacity to spatially localize faults. Building upon
this baseline, we will then demonstrate a probing attack, in which we will measure our design's response to a standard
\qty{100}{\mega\hertz} bandwidth $\qty{10}{\mega\ohm}\parallel\qty{10}{\pico\farad}$ oscilloscope probe. Compared to the
baseline open/short test, this provides a greater challenge due to the probe's intentionally high impedance and minimal
capacitive loading.

\subsection{Rise Time Measurement}

We measure two figures of merit to characterize frontend speed. First, as shown in Section\ \ref{sec_spec_risetime}
below, we measure pulse rise time at the mesh interface using a Keysight N9020A MXA \qty{26.5}{\giga\hertz} signal
analyzer to evaluate the rise time of our pulse generator. This figure gives an indication of the raw performance of our
pulse generator. Second, we use our circuit to perform a TDR measurement of a mesh test specimen, and measure the rise
time of the sampling pulse as seen by the circuit itself. This figure gives an indication of the actual measurement
performance of our circuit. In general, this rise time will be faster than the pulse rise time because of the non-linear
characteristic of the sampling Schottky pairs. Depending on the IC, our pulse generator produces output waveforms with
\qtyrange{1200}{2400}{\milli\volt} differential voltage swing. Since the sampling diode pairs start to conduct at a
combined forward voltage of approximately \qty{500}{\milli\volt}, they will transition from high impedance to low
impedance during a corresponding \qty{500}{\milli\volt} window at the middle of the strobe pulse's edge. Thus, even if
the strobe pulse shows a low-pass response with rounding at both ends, as long as its slew rate
$\frac{\mathrm{d}V}{\mathrm{d}t}$ during the zero crossing is fast enough, the pulse will still result in a sharp
turn-on knee of the sampling diodes.
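
To first order, this argument can be written as follows, under the idealizing assumption that the strobe edge is linear
within the conduction window. With a window of $\Delta V\approx\qty{500}{\milli\volt}$, the turn-on time of the gate is
\[
    t_\text{on} \approx \frac{\Delta V}{\mathrm{d}V/\mathrm{d}t}.
\]
For a hypothetical slew rate of \qty{2.5}{\volt\per\nano\second}, this gives $t_\text{on}\approx\qty{200}{\pico\second}$.
This estimate ignores diode capacitance and amplifier output loading, so it should be read as a rough guide only.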

\subsubsection{Stimulus Pulse Rise Time at the Mesh}
\label{sec_spec_risetime}

\begin{figure}
    \begin{center}
        \begin{subfigure}{0.48\textwidth}
            \centering
            \includegraphics[width=\textwidth]{fig_spec_risetime_74lvc.pdf}
            \caption{74LVC2G157}
            \label{fig_spec_risetime_74lvc}
        \end{subfigure}
        \unskip\begin{subfigure}{0.48\textwidth}
            \centering
            \includegraphics[width=\textwidth]{fig_spec_risetime_max3748.pdf}
            \caption{MAX3748}
            \label{fig_spec_risetime_max3748}
        \end{subfigure}

        \begin{subfigure}{0.48\textwidth}
            \centering
            \includegraphics[width=\textwidth]{fig_spec_risetime_tdp0604.pdf}
            \caption{TDP0604}
            \label{fig_spec_risetime_tdp0604}
        \end{subfigure}
        \unskip\begin{subfigure}{0.48\textwidth}
            \centering
            \includegraphics[width=\textwidth]{fig_spec_risetime_pi3hdx.pdf}
            \caption{PI3HDX12211}
            \label{fig_spec_risetime_pi3hdx}
        \end{subfigure}
    \end{center}
    \caption{Spectrum measurements and re-constructed time-domain pulse edge shape of the stimulus pulse measured at the
    mesh interface for each of the four driver ICs. Amplitudes were normalized for risetime plots. The $\frac{1}{f}$
    curve in the spectrum plots shows the peak amplitude of the frequency components of an ideal infinite-bandwidth
    square wave. The horizontal gray lines in the time-domain plots show thresholds used for risetime calculation.}
    \label{fig_spec_risetime}
\end{figure}

To measure the rise time of our frontend's pulse generator, we measured the stimulus output at the mesh interface using
a Keysight N9020A MXA \qty{26.5}{\giga\hertz} signal analyzer\footnote{The spectrum analyzer used was significantly
faster than the fastest oscilloscopes we had access to, so it was the more appropriate choice of measurement
instrument.}. All measurements were taken with the prototype's mesh interface connected to the spectrum analyzer through
a bias tee configured for DC blocking followed by a \qty{20}{\deci\bel} attenuator for protection. Since both stimulus
and sampling pulses are generated using identical circuits, we can transfer those results to the sampling pulse modulo
amplifier output loading effects.

Figure\ \ref{fig_spec_risetime} and Table\ \ref{tab_edge_risetime} show the resulting measurements. For ease of
interpretation, we projected the measurements from the frequency domain (upper traces) back into the time domain (lower
traces), and extracted rise time measurements from those traces. Our measurements show that, as expected, the bare
\partno{74LVC}-series logic gate has the slowest rise time at approximately \qty{570}{\pico\second}. All three amplifier
variants we implemented showed significantly improved rise time, with the \partno{PI3HDX12211} clocking in at below
\qty{200}{\pico\second}, and the other two showing around \qty{120}{\pico\second}. A noteworthy detail is that
\partno{MAX3748} and \partno{TDP0604} only achieved a low output signal amplitude, which stems from a combination of
their low output amplitude by design and our circuit loading their outputs heavily. Since their amplitude is
only marginally within the knee region of the RF Schottky diodes used in the sampling bridges, in these variants,
sampling gates are slower than the raw pulse rise time value alone would suggest.

\subsubsection{Self-Characterization}

\begin{figure}
    \begin{center}
        \includegraphics[width=\textwidth]{fig_edge_risetime.pdf}
    \end{center}
    \caption{The trailing edge of the stimulus pulse with no mesh connected measured by the board itself, using
    different amplifier ICs. Both positive and negative channels of the differential pair are shown individually.
    Vertical scale is in Volts at the sampling amplifier output.}
    \label{fig_edge_risetime}
\end{figure}

\begin{table}
    \begin{center}
        \begin{tabular}{r|cccc}
            \textbf{IC}
            &\partno{74LVC2G157}
            &\partno{MAX3748}
            &\partno{TDP0604}
            &\partno{PI3HDX12211}\\\hline

            \textbf{$t_r$ (Self-Characterization)}&
            \qty{497}{\pico\second}&
            \qty{998}{\pico\second}&
            \qty{951}{\pico\second}&
            \qty{145}{\pico\second}\\

            \textbf{$t_r$ (Stimulus at Mesh)}&
            \qty{573}{\pico\second}&
            \qty{125}{\pico\second}&
            \qty{119}{\pico\second}&
            \qty{191}{\pico\second}\\

            \textbf{Stimulus Pulse $V_{pp}$}&
            \qty{1600}{\milli\volt}&
            \qty{236}{\milli\volt}&
            \qty{254}{\milli\volt}&
            \qty{430}{\milli\volt}\\

            \textbf{Effective Slew Rate}&
            \qty{2.79}{\volt\per\nano\second}&
            \qty{1.89}{\volt\per\nano\second}&
            \qty{2.13}{\volt\per\nano\second}&
            \qty{2.25}{\volt\per\nano\second}
        \end{tabular}
    \end{center}
    \caption{Single-ended stimulus edge rise times for different amplifier ICs. The single-ended rise times of both
    positive and negative halves of the differential pair have been averaged. External measurements are from Section\
    \ref{sec_spec_risetime}, measuring the stimulus pulse at the mesh interface. $V_{pp}$ measurements are taken at the
    mesh interface. Effective slew rates are calculated from the external measurements and pulse $V_{pp}$.}
    \label{tab_edge_risetime}
\end{table}
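
The effective slew rates in Table\ \ref{tab_edge_risetime} are consistent with dividing the peak-to-peak stimulus
amplitude by the externally measured rise time:
\[
    \frac{V_{pp}}{t_r} = \frac{\qty{430}{\milli\volt}}{\qty{191}{\pico\second}} \approx \qty{2.25}{\volt\per\nano\second}
\]
for the \partno{PI3HDX12211}, and likewise
$\qty{1600}{\milli\volt}/\qty{573}{\pico\second}\approx\qty{2.79}{\volt\per\nano\second}$ for the \partno{74LVC2G157}.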

Figure\ \ref{fig_edge_risetime} shows the result of our self-characterization experiments. In these experiments, we ran
a measurement using $256\times$ oversampling at \qty{12}{b} ADC resolution. The plots show the amplifier output voltage
against time in \unit{\nano\second}. The absolute value of the amplifier output voltage is not relevant
here; only the rise time is. Since we use some of these amplifiers--particularly the redriver ICs--well outside of
their intended application, the actual voltage they develop across the nonlinear load that our sampling gate's diode
bridge presents depends on implementation details of the amplifiers' CML output stage. To maximize ADC resolution and
minimize ringing, we tuned gain and bandwidth of each post-sampling amplifier for each IC. Ringing in the amplifier
output leads to jitter in the ADC's sampling period directly feeding through to the ADC output value. Since in
\partno{STM32} MCUs, the ADC is clocked independently of the rest of the system, its sampling timing is poorly
controlled and this jitter causes a significant error unless the amplifier is well-compensated. The key figure for us
is how fast our sampling gate turns on, not how hard, so we can largely ignore the units on the graph's vertical scale.

Table\ \ref{tab_edge_risetime} shows rise times calculated from each trace, averaged across both traces of the
differential pair. From these results and from graphs in Figure\ \ref{fig_edge_risetime} we can see that both the
optical networking limiting amplifier as well as the \partno{TDP0604} ``hybrid'' redriver produce comparatively slow
edges with almost \qty{1}{\nano\second} rise time. We suspect that in both cases, this is caused by a combination of the
slow input signal transition and these ICs' CML output structures being poorly matched to the nonlinear
impedance presented by our sampling gate's diode bridges. \partno{MAX3748} also has the lowest output voltage swing of
all parts tested with only \qty{780}{\milli\volt} typical listed in its datasheet. Surprisingly, the straight
\partno{74LVC2G157} baseline unit has a rise time of only about \qty{500}{\pico\second}, improving over both previous
parts by almost a factor of two. We suspect this is largely caused by the large output voltage swing of this part, going
from ground to its $V_{CC}$ at \qty{3.3}{\volt}. Due to the construction of our sampling gate, its switching happens in
the short period between its input differential voltage crossing zero and it rising above the combined forward voltage
of both series Schottky diodes. Thus, while the \partno{74LVC} might have rather slow edges when looking at it as a whole
including the transitions at both ends of the edge, its slew rate in the critical region in the middle of its output
transition might rival the two previously mentioned, ostensibly faster parts simply due to its large output voltage
swing.

Finally, we observed the best result overall with the \partno{PI3HDX12211} redriver, resulting in a rise time of
\qty{145}{\pico\second}. In this test specimen, we fed the pulse through the amplifier twice since we had two unused
channels, and we used \qty{200}{\pico\second} clip lines on the amplifier's output for pulse shaping. We could only use
the clip lines in this specimen because in all other specimens, the amplifiers' output did not contain sufficient
harmonic content to still turn on the sampling gate's diode bridge when used with the clip line.

\subsection{Mesh Specimen Characterization}

\begin{table}
    \begin{center}
        \begin{tabular}{r|cccc}
            \textbf{Specimen}
            &1
            &2
            &3
            &4\\\hline

            \textbf{Size}&
            $35\times\qty{70}{\milli\meter}$&
            $35\times\qty{70}{\milli\meter}$&
            $35\times\qty{70}{\milli\meter}$&
            $35\times\qty{70}{\milli\meter}$\\

            \textbf{Area}&
            $\qty{24.5}{\centi\meter^2}$&
            $\qty{24.5}{\centi\meter^2}$&
            $\qty{24.5}{\centi\meter^2}$&
            $\qty{24.5}{\centi\meter^2}$\\\hline

            \textbf{Trace width}&
            \qty{150}{\micro\meter}&
            \qty{200}{\micro\meter}&
            \qty{300}{\micro\meter}&
            \qty{500}{\micro\meter}\\

            \textbf{Trace spacing}&
            \qty{150}{\micro\meter}&
            \qty{200}{\micro\meter}&
            \qty{300}{\micro\meter}&
            \qty{500}{\micro\meter}\\

            \textbf{Trace pitch}&
            \qty{300}{\micro\meter}&
            \qty{400}{\micro\meter}&
            \qty{600}{\micro\meter}&
            \qty{1.00}{\milli\meter}\\\hline

            \textbf{Trace length}&
            \qty{1.07}{\meter}&
            \qty{1.93}{\meter}&
            \qty{2.86}{\meter}&
            \qty{3.86}{\meter}\\

            \textbf{Approximate Delay}&
            \qty{7.1}{\nano\second}&
            \qty{13}{\nano\second}&
            \qty{19}{\nano\second}&
            \qty{26}{\nano\second}\\
        \end{tabular}
    \end{center}
    \caption{Specifications of mesh test specimens used in the experiments in this paper. All four specimens were
    placed on a single, four-layer, \qty{1.0}{\milli\meter} thickness PCB. The meshes were placed two per side on the
    outer layers, and the inner layers were used as ground. Approximate signal delays were calculated using wave
    velocity $v=\frac{c}{\sqrt{\epsilon_r}}\approx\frac{c}{2}$\cite{wheelerTransmissionLinePropertiesParallel1965}
    assuming $\epsilon_r\approx 4$\cite{mumbyDielectricPropertiesFR41989} for the test specimens' \partno{FR-4}
    substrate.}
    \label{tab_mesh_spec}
\end{table}
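
The approximate delays in Table\ \ref{tab_mesh_spec} are consistent with the one-way propagation time along the trace.
For specimen 1, for example,
\[
    t_d = \frac{l}{v} = \frac{l\sqrt{\epsilon_r}}{c} \approx \frac{\qty{1.07}{\meter}}{\qty{1.5d8}{\meter\per\second}}
    \approx \qty{7.1}{\nano\second}.
\]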

To measure the practical performance of our prototype, we created a set of security mesh test specimens. The four
specimens each cover the same area at a different mesh pitch using two looped mesh traces according to the design
specifications listed in Table\ \ref{tab_mesh_spec}. The four specimens have a trace length ratio of approximately
$1:2:3:4$. As a baseline validation of our prototype as well as the mesh design, we performed TDR measurements of each
mesh specimen using each amplifier variant of our prototype. Figure\ \ref{fig_mesh_length} shows the results of these
measurements. The graphs show the step response resulting from an edge entering the mesh, and its reflection arriving
back at the start after traversing the mesh back and forth.

We validated the results from Figure\ \ref{fig_mesh_length} by calculating speed of light in our mesh specimen's
substrate based on them. The resulting measurements are shown in Table\ \ref{tab_speed_of_light}. All amplifier
configurations yield comparable measurements of approximately \qty{1.6d8}{\meter\per\second}, which corresponds well with
the expected signal propagation velocity in \partno{FR-4} PCB material of
\qty{1.5d8}{\meter\per\second}\cite{wheelerTransmissionLinePropertiesParallel1965,mumbyDielectricPropertiesFR41989}.

An interesting aspect of the graphs in Figure\ \ref{fig_mesh_length} is that all except the \partno{74LVC} graph show a
dispersion effect increasingly rounding out the trailing edge of the response with longer mesh lengths. We suspect this
effect stems from higher-frequency components coupling into adjacent trace segments further up or down the mesh more
easily, spreading high-frequency components of the response signal out over time and effectively creating a
low-pass response. We suspect the poor visibility of this effect in the \partno{74LVC} measurements is a result of this
variant's pulse amplifier output amplitude being very large, allowing reflected response components to forward-bias the
sampling gate's diode bridges, resulting in amplitude clipping.

From this dispersion effect follows a key point for the design of practical security meshes: To increase the temporal
resolution of TDR mesh monitoring, meshes should be broken up into relatively short segments that are multiplexed
through signal switching. Where this is not desirable, the mesh should be treated as a microwave circuit and optimized
by applying the same electronic CAD/electromagnetic simulation co-design approach used for such circuits.

\begin{figure}
    \begin{center}
        \includegraphics[width=\textwidth]{fig_mesh_length.pdf}
    \end{center}
    \caption{TDR responses captured using our design with each of four candidate pulse amplifier ICs and four mesh test
    specimens. The shown time range covers the primary reflection of the stimulus pulse's falling edge.
    The vertical scale of all four graphs is in Volts at the ADC.}
    \label{fig_mesh_length}
\end{figure}

\begin{table}
    \begin{center}
        \begin{tabular}{r|cccc|c}
            &\multicolumn{4}{c|}{Specimen}&\\
            Pulse amplifier IC&
            1&
            2&
            3&
            4&
            Calculated speed of light $c$
            \\\hline

            \partno{PI3HDX12211}&
            \qty{16.9}{\nano\second}&
            \qty{26.0}{\nano\second}&
            \qty{36.4}{\nano\second}&
            \qty{46.1}{\nano\second}&
            $\qty{1.59d8}{\meter\per\second}$\\

            \partno{74LVC2G157}&
            \qty{17.1}{\nano\second}&
            \qty{26.4}{\nano\second}&
            \qty{36.6}{\nano\second}&
            \qty{48.2}{\nano\second}&
            $\qty{1.55d8}{\meter\per\second}$\\

            \partno{MAX3748}&
            \qty{17.2}{\nano\second}&
            \qty{26.4}{\nano\second}&
            \qty{36.6}{\nano\second}&
            \qty{45.6}{\nano\second}&
            $\qty{1.59d8}{\meter\per\second}$\\

            \partno{TDP0604}&
            \qty{17.0}{\nano\second}&
            \qty{26.2}{\nano\second}&
            \qty{36.5}{\nano\second}&
            \qty{45.8}{\nano\second}&
            $\qty{1.59d8}{\meter\per\second}$\\
        \end{tabular}
    \end{center}
    \caption{Speed of light calculated from delays read from the graphs in Figure\
    \ref{fig_mesh_length}. $c$ is the speed of light in the mesh as determined by linear fit. The fit's constant term
    $\Delta t$, a residual time offset common to all four mesh measurements, is not listed.}
    \label{tab_speed_of_light}
\end{table}
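
The linear fit behind Table\ \ref{tab_speed_of_light} can be sketched as follows. This is a generic least-squares
formulation and not necessarily the exact procedure used to produce the table; the symbols $d_i$, $t_i$ and $N$ are
introduced here only for illustration. Writing $d_i$ for the electrical path length that the edge travels in specimen
$i$ and $t_i$ for the delay read from the graph, the model is
\begin{equation*}
    t_i = \frac{d_i}{c} + \Delta t, \qquad i = 1, \dots, N,
\end{equation*}
so that the fitted slope is $1/c$ and the intercept is the common offset $\Delta t$. The ordinary least-squares
estimates are
\begin{align*}
    \frac{1}{c} &= \frac{\sum_i (d_i - \bar{d})(t_i - \bar{t})}{\sum_i (d_i - \bar{d})^2}, &
    \Delta t &= \bar{t} - \frac{\bar{d}}{c},
\end{align*}
where $\bar{d}$ and $\bar{t}$ denote the means over the $N = 4$ specimens.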

\subsection{Tamper tests}

\begin{figure}
    \centering
    \begin{subfigure}{0.45\textwidth}
        \centering
        \includegraphics[width=0.8\textwidth]{pic_short_2_small.jpg}
        \caption{Short circuit test specimen}
        \label{fig_pic_speciments_short}
    \end{subfigure}
    \begin{subfigure}{0.45\textwidth}
        \centering
        \includegraphics[width=0.8\textwidth]{pic_cut_1_small.jpg}
        \caption{Cut trace test specimen}
        \label{fig_pic_speciments_open}
    \end{subfigure}
    \caption{Photos of the short circuit and cut trace test specimens. To measure short circuit response, one of the
    three marked locations on the test specimen was shorted using a soldering iron. To measure baseline values, the
    short circuit specimen was used without placing a short.}
    \label{fig_pic_specimens}
\end{figure}

After validating our prototype's electrical performance as well as our mesh specimen designs in the previous sections,
we conducted a series of experiments in which we tampered with a mesh specimen while monitoring it using
our TDR prototype, capturing responses both before and after tampering. We describe these experiments in the following
subsections.

\subsubsection{Short and Open Circuits}

\begin{figure}
    \begin{center}
        \includegraphics[width=\textwidth]{fig_manip_shape.pdf}
    \end{center}
    \caption{TDR responses captured using our design under three short circuit scenarios and one open circuit scenario.
    The distance from mesh start to Locations 1, 2 and 3 is \qty{558}{\milli\meter}, \qty{125}{\milli\meter} and
    \qty{850}{\milli\meter},
    respectively. The cut is approximately halfway through the mesh. Left and right plots show the positive and negative
    trace of the differential pair, respectively. Black traces show baseline measurements in between attacks. The
    baselines show vertical offsets due to temperature drift, which causes a small DC offset in our design. The vertical
    scale is in Volts at the ADC.}
    \label{fig_manip_shape}
\end{figure}

In our first experiment, we tested both short and open circuit conditions. We tested a short circuit between the two
mesh traces in each of three locations as shown in Figure\ \ref{fig_pic_specimens}, as well as a cut trace halfway
through the mesh. Figure\ \ref{fig_manip_shape} shows the result of our experiment. The graphs show a clear response of
our monitoring circuit to all four tampering scenarios. Short and open circuit conditions can clearly be distinguished
from each other, and in all cases, the fault location can be determined with sub-nanosecond precision, corresponding to
several centimeters in distance along the mesh.
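
As a rough plausibility check of this distance figure, assume that a fault at distance $x$ from the mesh start appears
in the response after the round-trip time $t = 2x/c$, and take $c \approx \qty{1.59e8}{\meter\per\second}$ from Table\
\ref{tab_speed_of_light}. Both the round-trip model and the symbols $x$, $\delta t$ and $\delta x$ are assumptions made
for this illustration. A timing uncertainty $\delta t$ then corresponds to a position uncertainty of
\begin{equation*}
    \delta x = \frac{c \, \delta t}{2}
             \approx \frac{\qty{1.59e8}{\meter\per\second} \cdot \qty{1}{\nano\second}}{2}
             \approx \qty{80}{\milli\meter}
\end{equation*}
for $\delta t = \qty{1}{\nano\second}$, so sub-nanosecond timing maps to a few centimeters along the mesh under this
model.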

\subsubsection{Probing by Oscilloscope Probe}

\begin{figure}
    \begin{center}
        \includegraphics[width=\textwidth]{fig_probe_shape.pdf}
    \end{center}
    \caption{The circuit's TDR response under a probing attack using an oscilloscope probe. Black traces are a series of
    un-probed baseline measurements taken between attacks. All traces are plotted relative to a separate baseline trace
    taken at the beginning of the experiment.}
    \label{fig_probe_shape}
\end{figure}

In our second experiment, we probed each of the three locations from the test specimen shown in Figure\
\ref{fig_pic_speciments_short} once at each trace of the trace pair using a Rigol \partno{PVP3150} $\times 1/\times 10$
oscilloscope probe set to $\times 10$ mode. We grounded the probe's ground clip to the mesh ground and used the probe
without tip attachment.

Using the \partno{PI3HDX12211} variant of our prototype, we measured the mesh's TDR response while probing. Figure\
\ref{fig_probe_shape} shows the resulting TDR traces. Oscilloscope probes are specifically designed to disturb the
circuit under test as little as possible. The probe we used is specified to present a \qty{10}{\mega\ohm} resistive
load in parallel with a \qty{10}{\pico\farad} capacitance in the $\times 10$ mode we used here. Since the
resulting disturbance to the TDR traces is smaller than those in Figure\ \ref{fig_manip_shape}, we post-processed the
traces by subtracting a baseline trace taken before the measurements. To highlight drift in the baseline trace, we
include additional baseline traces taken in between and after measurements using the same post-processing.
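In symbols, with notation introduced here purely for illustration, let $v_k(t)$ be the raw TDR trace of measurement
$k$ and $v_0(t)$ the baseline trace captured before the measurements. The plotted quantity is then
\begin{equation*}
    \tilde{v}_k(t) = v_k(t) - v_0(t).
\end{equation*}
An undisturbed mesh without drift would give $\tilde{v}_k(t) = 0$ for all $t$, so any deviation of the additional
baseline traces from zero directly indicates drift.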

For each plotted trace, the mesh was probed in one of three locations as in Figure\ \ref{fig_manip_shape}, and on one
of the two mesh traces. The time range shown in the graph covers the primary reflection of the stimulus pulse's rising
edge. We can clearly see a distinct response to each of the three probing attempts, with the only caveat being that the
response of the two mesh traces is asymmetrical due to asymmetry in our sampling frontend when measuring such low
signal levels. Interestingly, this asymmetry is fully compensated by the fact that we excite the mesh differentially.
As a result, probing either trace distorts their shared electromagnetic field and impacts measurements on \emph{both}
traces. Particularly on the first trace, we can distinguish which trace was probed, as well as where it was probed, in
a single measurement.

\subsubsection{Circumvention Through Microsoldering}

\begin{figure}
    \centering
    \begin{subfigure}{0.78\textwidth}
        \centering
        \includegraphics[width=\textwidth]{fig_drill_mod_shape.pdf}
        \caption{}
        \label{fig_drill_mod_shape_plot}
    \end{subfigure}
    \begin{subfigure}{0.2\textwidth}
        \centering
        \includegraphics[width=\textwidth]{pic_manip_microsoldering_small.jpg}
        \vspace*{2mm}
        \caption{}
        \label{fig_drill_mod_shape_pic}
    \end{subfigure}
    \caption{The circuit's TDR response under a manipulation attack attempting to bridge part of a trace to allow a
        \qty{300}{\micro\meter} drill to penetrate. The mesh pitch used is \qty{240}{\micro\meter}. Red traces show
        measurements with a looped wire patch comparable to \textcite{immlerSecurePhysicalEnclosures2018}, black traces
        show the same gap bridged with a minimally short straight piece of wire. The photo shows the looped wire patch
        with a \qty{1}{\milli\meter} pitch ruler for reference. Traces are normalized as in Figure\
        \ref{fig_probe_shape}.}
    \label{fig_drill_mod_shape}
\end{figure}

While our proposed measurement setup significantly increases the level of effort required from an attacker, as long as
standard PCBs are used, PCB rework techniques that are widely used in industry for PCB repair can be applied. If we
assume a standard PCB process with \qty{100}{\micro\meter} trace/space design rules and a drilling attack targeting a
\qty{300}{\micro\meter} hole size as proposed by \textcite{immlerSecurePhysicalEnclosures2018}, at least one trace will
need to be broken during drilling. Patching the resulting break using a wire is possible, but with increasing wire
length, the TDR response of the mesh is increasingly distorted. We experimentally performed an attack comparable to the
one shown by \textcite{immlerSecurePhysicalEnclosures2018} on a \qty{240}{\micro\meter} pitch mesh specimen. Figure\
\ref{fig_drill_mod_shape} shows our modification and the resulting change in TDR response. As we can see, adding even
just a few millimeters of wire will measurably and consistently distort the TDR response.
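
The claim that at least one trace must be broken follows from a short geometric argument. The symbols below are
introduced only for this illustration, and the mesh is idealized as parallel straight traces of width $w$ separated by
gaps of width $s$. A hole of diameter $d$ can only avoid all copper if it fits into a gap, that is, if $d \le s$. For
the values above,
\begin{equation*}
    d = \qty{300}{\micro\meter} > s = \qty{100}{\micro\meter},
\end{equation*}
so every hole position damages at least one trace. Under the same idealization, a trace is cut across its full width
for every hole position once $d \ge 2w + s$, which for $w = s = \qty{100}{\micro\meter}$ evaluates to
\qty{300}{\micro\meter} and is thus just met by the assumed hole size.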

\subsection{Countermeasures}

As shown above, PCB security meshes can be manipulated using industry-standard microsoldering techniques. If the
length of any patch wires is kept as short as possible, it is conceivable that the impact on the TDR response could be
kept below detection thresholds. Our setup provides increased resistance against such attacks since the entire attack
would have to
be carried out without electrically contacting either mesh trace. In particular, soldering would have to be done using a
minimal amount of solder as well as a bespoke, insulated soldering iron tip. While manufacturing such a tool out of a
material like sintered ceramic is conceivable, to our knowledge, no such tool exists on the market.

Furthermore, the actual drilling would have to happen with a dielectric drill bit, placing special attention on
evacuating conductive copper chips before they can create shorts to nearby traces. Again, it is conceivable that such a
tool could be manufactured, but to our knowledge, such a tool is not currently available as a standard component on the
market.

Finally, any probes penetrating the mesh would have to be placed such that their presence in the vicinity of the mesh
traces does not disturb the TDR response. In particular, we have observed that even touching the mesh will distort the
response, so modifications would have to be carried out with great care, likely using micromanipulators or similar
specialized equipment.

\textcite{pcisecuritystandardscouncilPaymentCardIndustry2021a} contains a useful framework for thinking about attacker
capabilities. Under their taxonomy, our monitoring system raises the skill level required for a patching
attack from a skilled attacker to an expert attacker, and the equipment requirement from standard equipment to bespoke
equipment such as dielectric drill bits and ceramic soldering tips.

\section{Future Work}

\paragraph{Design variants.} While the \partno{STM32G4}'s \partno{HRTIM} peripheral offers edge position control at a
precision of $\frac{1}{32}$ system clock cycle using an automatically adjusted delay-locked loop at each output driver,
due to the comparatively slow maximum system clock speed of \qty{170}{\mega\hertz}, this still only results in a timing
resolution of $1 / (32 \cdot \qty{170}{\mega\hertz}) \approx \qty{184}{\pico\second}$. While we have demonstrated this
is sufficient to detect and localize several
attack variants, it would be interesting to increase time resolution since in our measurements, we observed that the
end-to-end jitter of our sampler is low enough that our circuit would benefit from finer delay control. In our
prototype, we implemented an adjustable power supply, so far unused, for the \partno{74LVC} series buffer between
the \partno{HRTIM} outputs and the pulse amplifier. By adjusting this buffer's power supply through one of the
microcontroller's digital-to-analog converter (DAC) channels, we expect that it should be possible to exploit the supply
voltage dependency of the propagation delay of \partno{74LVC} series CMOS logic to create a digitally controllable delay
with picosecond resolution. It is likely that the internal DLL of the \partno{HRTIM} peripheral is implemented in a
similar way.
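
As a first-order sketch of the resolution such a scheme could reach, assume that the buffer's propagation delay
$t_\text{pd}$ varies approximately linearly with its supply voltage $V_\text{CC}$ over an adjustment range $\Delta V$,
and that the DAC has $n$ bits. These symbols and the linearization are assumptions made for illustration, and the
actual sensitivity would have to be characterized experimentally. One DAC step then shifts the edge by approximately
\begin{equation*}
    \delta t \approx \left| \frac{\partial t_\text{pd}}{\partial V_\text{CC}} \right| \cdot \frac{\Delta V}{2^n}.
\end{equation*}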

\paragraph{System design.} The work we presented in this paper is complementary to the work previously presented by
\textcite{gotteCantTouchThis2022}, where the authors improved security of a simple security mesh made from standard PCBs
through mechanical motion. We are currently working on a prototype combining both approaches for a cost-efficient yet
powerful physical security primitive.

\paragraph{Auxiliary applications.} In this work, we have presented a design for a low-cost, embedded TDR frontend.
Besides security mesh monitoring, this TDR frontend could, through multiplexing, be used for other system monitoring
tasks from tamper sensing to system health monitoring. For instance, \textcite{vaiSecureArchitectureEmbedded2015}
propose an approach for checking the integrity of a PCBA using an external Vector Network Analyzer (VNA) attached to
test points on the PCBA's Power Distribution Network (PDN). TDR can produce fingerprints similar to a VNA, and it would
be interesting to use the TDR frontend to measure parts of the secure subsystem other than its security mesh.

\paragraph{Heuristic Scan Scheduling.} As presented in Section\ \ref{sec_scan_schedule}, our prototype allows for
improved measurement latency using more advanced scan scheduling. In particular, it would be interesting to dynamically
adjust the TDR scan schedule based on concrete mesh characteristics, for example by re-scanning time delays near the
trailing edge of a mesh's response more frequently than those outside the primary reflection part of the response.
However, this
optimization depends on mesh lengths and signal routing in a particular application and thus is subject to future work.

\section{Conclusion}

In this paper, we presented a design for a low-cost frontend for integrity monitoring of security meshes in
applications such as HSMs based on the principles of sub-nanosecond Time-Domain Reflectometry. Our design
repurposes an inexpensive HDMI redriver IC to produce sharp edges for the TDR stimulus, and applies a microwave clip
line to form fast pulses for TDR sampling. Our design not only enables the monitoring of continuity and length of the
mesh's traces, but also allows monitoring the impedance at every point along the mesh. Beyond simply detecting faults or
manipulations that disturb the mesh without causing breaks, we have demonstrated our prototype circuit's capability to
distinguish and physically localize faults inside the mesh in several practical attack scenarios. Compared to previous
work, our approach provides an additional time dimension in its characterization of a security mesh while simultaneously
being less expensive, enabling more sophisticated tamper detection algorithms.

\section*{Availability}
This is version \texttt{\input{version.tex}\unskip} of this paper, generated on \today. The git repository with the
LaTeX source for this paper, all hardware design files, and firmware and analysis source code can be found at:

\begin{center}
    Note: URL elided for peer review
\end{center}
% \center{\url{https://git.jaseg.de/ihsm-sampling-mesh-monitor-hw.git}}

\FloatBarrier
\printbibliography[heading=bibintoc]
\end{document}