Cut down a bunch of stuff, make space for majR measurements
This commit is contained in:
parent
7642a0e3ee
commit
3a287db5e4
1 changed files with 113 additions and 190 deletions
303
paper/paper.tex
|
|
@ -29,7 +29,8 @@
|
|||
\tcbuselibrary{breakable}
|
||||
\usepackage{float}
|
||||
|
||||
\definecolor{highlightgreen}{rgb}{0.18 0.4 0.13}
|
||||
\definecolor{highlightred}{rgb}{0.6 0.1 0.1}
|
||||
\definecolor{highlightgreen}{rgb}{0.12 0.5 0.07}
|
||||
\DeclareSIUnit{\baud}{Bd}
|
||||
\DeclareSIUnit{\year}{a}
|
||||
\DeclareSIUnit{\rpm}{rpm}
|
||||
|
|
@ -408,6 +409,8 @@ multiplexers.
|
|||
|
||||
\section{Circuit Design and Driving Approach}
|
||||
|
||||
% FIXME peer review only, for major revision @ TCHES
|
||||
\color{highlightred}
|
||||
\begin{figure}
|
||||
\centering
|
||||
\hspace*{-7mm}
|
||||
|
|
@ -416,72 +419,60 @@ multiplexers.
|
|||
\label{fig_block_diagram}
|
||||
\end{figure}
|
||||
|
||||
A TDR can be broken down into three basic components. First, we need a source of fast pulses (or fast edges!) to
|
||||
stimulate the mesh. Second, we need a coupler that allows us to couple the stimulus pulses into the mesh, and their
|
||||
reflections out of it. Finally, we need a fast ADC to capture the reflections.
|
||||
A TDR can be broken down into three basic components: A source of fast stimulus pulses (or edges!), a coupler that
|
||||
separates stimulus pulses and their reflection at the output, and a fast ADC to capture the reflections.
|
||||
|
||||
Figure\ \ref{fig_block_diagram} shows a block diagram of our design\footnote{Full schematics are available in this
|
||||
paper's supplementary material.}. At the core of our design lies an equivalent time sampling setup, where two
|
||||
diode bridge sampling gates alternately sample the two traces of the mesh.
|
||||
Since physical attacks happen on a time scale of minutes or hours, we do not need a fast acquisition rate. Equivalent
|
||||
time sampling uses fast sampling gates to sample a high-frequency signal at a low frequency that is suitable for direct
|
||||
conversion through an ADC. This reduces the requirements of our data acquisition and signal processing frontend from
|
||||
gigasamples per second to mere megasamples, well within the range that a commodity microcontroller can handle.
|
||||
conversion through an ADC. Using equivalent-time sampling, we can sample \unit{\giga\hertz}-scale signals at the
|
||||
\unit{\mega\hertz}-scale sampling rate of the internal ADCs of the commodity microcontroller we use. We use two of the
|
||||
microcontroller's ADCs interleaved, each of which provides approximately \qty{1.7}{\mega Sp\per\second} at
|
||||
\qty{12}{\bit} resolution. Due to the high conversion speed of the modern ADC cores in this microcontroller, we are able
|
||||
to use up to $384\times$ oversampling for increased precision without unduly affecting measurement times.
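
As a rough illustration of what oversampling can provide, assume that the noise on individual conversions is
uncorrelated. This is an idealization made for this example, not a measured property of the frontend. Averaging $N$
conversions then reduces the noise amplitude by a factor of $\sqrt{N}$, which for $N=384$ gives
\[
  \sqrt{384} \approx 19.6, \qquad \tfrac{1}{2}\log_2 384 \approx 4.3,
\]
i.e.\ an ideal gain of roughly four bits of effective resolution over a single conversion.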
|
||||
|
||||
A challenge in equivalent time sampling is precisely phase-synchronizing the sampling pulse to the fundamental frequency
|
||||
of the input signal, which is usually implemented by using a high-speed comparator. In a TDR-style frontend like ours,
|
||||
this expensive component can be avoided because the stimulus signal is generated in the frontend, simplifying the
|
||||
challenge of generating a synchronized sampling pulse at an adjustable phase to the stimulus pulse.
|
||||
%A challenge in equivalent time sampling is precisely phase-synchronizing the sampling pulse to the fundamental
|
||||
%frequency of the input signal, which is usually implemented by using a high-speed comparator. In a TDR-style frontend
|
||||
%like ours, this expensive component can be avoided because the stimulus signal is generated in the frontend,
|
||||
%simplifying the challenge of generating a synchronized sampling pulse at an adjustable phase to the stimulus pulse.
|
||||
|
||||
Since an intact mesh has low insertion loss, the amplitude of the response of an intact mesh is large. Thus, we do not
|
||||
need a high dynamic range in either the frontend amplifiers or in the ADC, enabling the use of commodity operational
|
||||
amplifiers (opamps) and the built-in ADC of a commodity microcontroller. Further, the strong signal allows us to use a
|
||||
comparatively lossy \qty{-6}{\deci\bel} resistive tee instead of a directional coupler. A resistive tee does not provide
|
||||
directionality, but in our case, the incident pulse can never interfere with reflections at the sampling output of the
|
||||
divider because of causality.
|
||||
The mesh has low insertion loss. Thanks to the resulting large amplitude of the reflection signal, the noise floor of
|
||||
our frontend based on commodity operational amplifiers (opamps) is below the resolution limit of the built-in ADCs of
|
||||
our chosen microcontroller. The main source of frontend noise stems from timing jitter between the sampling gate and the
|
||||
ADC due to the clock generation of the ADC, which could be reduced through firmware changes. The strong signal allows us
|
||||
to use a comparatively lossy but simple \qty{-6}{\deci\bel} resistive tee instead of a directional coupler.
|
||||
|
||||
To implement our sub-nanosecond sampler, we chose a simple four-diode bridge sampling gate made from commodity
|
||||
We implemented the sub-nanosecond sampler using a simple four-diode bridge sampling gate made from commodity
|
||||
\partno{BAT17-04W} RF Schottky diodes, which offer turn-on times better than \qty{100}{\pico\second} at
|
||||
\price{0.13}{\euro} per device at quantity 1000. The four-diode configuration requires only two dual diode packages. In
|
||||
contrast to \textcite{polasekReflektometrCasoveOblasti2020,houtman1GHzSamplingOscilloscope2000}, in our system, double
|
||||
sampling is not necessary - instead, we follow the sampling gate directly with an amplifier feeding into the internal
|
||||
ADC of our microcontroller. We use an internal timer peripheral of the same microcontroller to generate both stimulus
|
||||
and sample pulses such that we can easily phase-lock the internal ADC to the same timer.
|
||||
\price{0.13}{\euro} per device at quantity 1000. In contrast to prior
|
||||
work\cite{polasekReflektometrCasoveOblasti2020,houtman1GHzSamplingOscilloscope2000}, we precisely control the timing of
|
||||
our ADC and avoid the need for a second sampling stage.
|
||||
|
||||
We base our circuit around an \partno{STM32G474RB} microcontroller, a \price{5}{\euro}-class commodity ARM
|
||||
microcontroller. Besides adequate processing speed for its price class, this microcontroller offers two features that
|
||||
are critical to our design. First, its internal ADCs are both higher resolution and faster than those of older parts.
|
||||
Second, it is one of a few parts in its series that include a \emph{high-resolution timer} (\partno{HRTIM}) peripheral
|
||||
that provides several outputs that can be controlled with better than \qty{200}{\pico\second} resolution through
|
||||
per-output, self-calibrating delay line circuitry. We use this peripheral to produce both the stimulus pulse and the
|
||||
phase-adjustable sampling pulse.
|
||||
We base our circuit around an \partno{STM32G474RB} microcontroller, a \price{5}{\euro}-class commodity ARM
|
||||
microcontroller. This is a recent part, which has internal ADCs that are both higher resolution and faster than those of
|
||||
older parts. Furthermore, it includes a \emph{high-resolution timer} (\partno{HRTIM}) peripheral that provides better
|
||||
than \qty{200}{\pico\second} timing resolution through self-calibrating delay lines. We use this peripheral to produce
|
||||
adjustable, phase-locked stimulus and sampling pulses.
|
||||
|
||||
While the HRTIM peripheral allows us to finely adjust the phase of its output waveform, the digital output structures of
|
||||
the \partno{STM32G4} series are still limited to nanosecond-scale rise and fall times with the datasheet quoting
|
||||
$t_r=t_f=\qty{1.7}{\nano\second}$ into a \qty{10}{\pico\farad} load when using the fastest GPIO output drive strength
|
||||
setting and a \qty{3.3}{\volt} supply\cite{stmicroelectronicsSTM32G474xBDatasheet2021}. We work around this issue by
|
||||
applying two circuit tricks. First, we send its output through a fast amplifier to square up the edges to a rise time
|
||||
better than \qty{500}{\pico\second}. The remaining challenge is that while we now have pulses with crisp edges, due to
|
||||
constraints of the HRTIM peripheral, at more than \qty{10}{\nano\second}, these pulses are still too wide to be useful.
|
||||
We solve this issue by applying a clip line\cite{tektronixinc.TektronixS6Sampling1982} pulse forming network at the
|
||||
output of the amplifier--i.e.\ we connect the amplifier's output to the load in parallel with a short, terminated
|
||||
transmission line stub. The length of this stub determines the pulse width.
|
||||
While the HRTIM peripheral provides sub-nanosecond phase adjustment, the digital outputs of the \partno{STM32G4} series
|
||||
are limited to a minimum transition time of $t_r=t_f=\qty{1.7}{\nano\second}$\footnote{Datasheet specification, when
|
||||
driving a \qty{10}{\pico\farad} load\cite{stmicroelectronicsSTM32G474xBDatasheet2021}.}. We work around this issue with
|
||||
two circuit tricks. First, we send the output through a fast amplifier to square up the edges to a rise time better than
|
||||
\qty{500}{\pico\second}. We then reduce the \qty{10}{\nano\second} minimum pulse width supported by the \partno{HRTIM}
|
||||
peripheral by applying a clip line\cite{tektronixinc.TektronixS6Sampling1982} pulse forming network--i.e.\ we connect
|
||||
the amplifier's output to the load in parallel with a short, terminated transmission line stub. The length of this stub
|
||||
determines the pulse width.
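
The relation between stub length and pulse width can be sketched as follows. This example assumes a short-circuited
stub and a propagation velocity of $v \approx \qty{0.15}{\meter\per\nano\second}$, about half the speed of light. Both
values are assumptions chosen for illustration and are not taken from the prototype. The inverted reflection from the
end of the stub returns after one round trip and cancels the remainder of the pulse, so a stub of length $\ell$ yields a
pulse width $t_w$ of approximately
\[
  t_w = \frac{2\ell}{v}, \qquad
  \ell = \frac{v\,t_w}{2} \approx \frac{\qty{0.15}{\meter\per\nano\second} \times \qty{1}{\nano\second}}{2}
       = \qty{7.5}{\centi\meter}
\]
for a \qty{1}{\nano\second} pulse.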
|
||||
|
||||
\subsection{Driver Selection}
|
||||
|
||||
Several types of amplifiers can be used in our pulse shaping application. Common to all options, we require differential
|
||||
outputs. In practice, for most parts, this means we are looking for a part with Current Mode Logic (CML) outputs. CML is
|
||||
a differential signaling standard that is widely used in high-speed logic. In CML, a current source feeds a pair of
|
||||
transistors that steer current between the two outputs of the differential pair. By steering current between the two
|
||||
outputs, common-mode currents are minimized which both reduces the effect of power supply impedance at the transmitter
|
||||
and reduces electromagnetic emissions from the differential pair's PCB traces. In our experiments, we considered several
|
||||
parts and settled on four parts for evaluation in this paper: A \partno{74LVC2G157} standard logic IC, two display
|
||||
protocol redrivers, \partno{PI3HDX12211} and \partno{TDP0604}, as well as \partno{MAX3748}, a limiting amplifier for
|
||||
optical networking applications. We implemented four variants of our prototype using a steady hand under a microscope as
|
||||
shown in Figure\ \ref{fig_pic_amps}.
|
||||
|
||||
One notable omission from our tests was the series of CML-output comparators made by Analog Devices due to the cost of
|
||||
these devices.
|
||||
We evaluated multiple options for the pulse shaping amplifier in our design. For both sampling and stimulus, we work
|
||||
with fully differential signals, so Current Mode Logic (CML) devices, which are widely used in high-speed logic, are a
|
||||
natural fit. We settled on four parts for evaluation in this paper: A \partno{74LVC2G157} standard logic IC, two
|
||||
HDMI/DisplayPort redrivers, \partno{PI3HDX12211} and \partno{TDP0604}, as well as \partno{MAX3748}, a limiting amplifier
|
||||
for optical networking. Figure\ \ref{fig_pic_amps} shows the four hand-soldered prototypes. We avoided specialty parts
|
||||
such as the CML-output comparators made by Analog Devices due to cost.
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
|
|
@ -505,74 +496,41 @@ these devices.
|
|||
\includegraphics[width=0.9\textwidth]{pic_pi3hdx_small.jpg}
|
||||
\caption{PI3HDX12211}
|
||||
\end{subfigure}
|
||||
\caption{Circuit-board implementation of the four pulse amplifier variants of the design. Amplifiers were mounted
|
||||
dead bug style on a piece of copper tape connected to one of the supply rails and hooked up with
|
||||
\qty{120}{\micro\meter} diameter wire according to their respective datasheets. Supply rails were hooked up using
|
||||
copper tape where possible to reduce series impedance. Additional \qty{10}{\micro\farad} MLCC power supply
|
||||
decoupling capacitors were placed close to the ICs on the copper tape to reduce loop area.}
|
||||
\caption{Implementation of the pulse amplifier variants of the design. Amplifiers were mounted dead bug style on
|
||||
copper tape and connected with \qty{120}{\micro\meter} wire. Supply rails were connected with copper tape where
|
||||
possible to reduce impedance. MLCC power supply decoupling capacitors were placed on the copper tape to reduce loop
|
||||
area.}
|
||||
\label{fig_pic_amps}
|
||||
\end{figure}
|
||||
|
||||
\paragraph{Standard logic ICs.}
|
||||
As a baseline, we evaluated the \partno{74LVC2G157} standard logic IC. This IC contains a single multiplexer, however,
|
||||
we are not interested in the multiplexer functionality. The interesting trivia about this chip is that it also is one of
|
||||
the only \partno{74} series standard logic parts that have complementary outputs. According to manufacturer
|
||||
specifications, at a comparable \qty{20}{\pico\farad} load, \partno{74LVC} series parts have slightly faster rise and
|
||||
fall times compared to our \partno{STM32} microcontroller's digital IO
|
||||
pins\cite{renesaselectronicscorporationApplicationNoteAN2242019}.
|
||||
As a baseline, we evaluated the \partno{74LVC2G157} CMOS multiplexer configured to provide complementary outputs.
|
||||
According to manufacturer specifications, this part provides slightly faster rise and fall times than
|
||||
our microcontroller\cite{renesaselectronicscorporationApplicationNoteAN2242019}.
|
||||
|
||||
\paragraph{Optical Networking Chipsets.}
|
||||
A category of CML-output drivers suitable for our application is a class of optical networking chipset ICs. While
|
||||
today, the construction of optical transmitters has moved to direct bonding of optical components and driver ICs to
|
||||
minimize parasitics, discrete driver ICs for some chipsets from the mid-2000s era are still available at reasonable
|
||||
cost. Both the laser driver used to drive the transmitter laser diode, and the limiting amplifier used to amplify the
|
||||
receiver photodiode's output can be used in our application, with the limiting amplifier part requiring less additional
|
||||
circuitry in our application due to its lack of output bias control. In our evaluation below, we include the
|
||||
\partno{MAX3748} limiting amplifier as a representative part from this category that is still commercially available. A
|
||||
drawback of relying on a part like this is that its future availability is uncertain given the evolution of the
|
||||
industry.
|
||||
Optical transceivers use CML-output limiting amplifiers and laser drivers, some of which are still available as discrete
|
||||
components despite the industry moving from PCB implementations to direct bonding. We evaluated the \partno{MAX3748}
|
||||
limiting amplifier as a representative part from this category.
|
||||
|
||||
\paragraph{Bus Redrivers.}
|
||||
The final category of amplifiers suitable for our pulse shaping needs is redrivers intended for high-speed data
|
||||
interfaces such as USB 3, PCI Express, HDMI, or DisplayPort. All of these interfaces use CML drivers, with differential
|
||||
voltage levels usually in the order of \qtyrange{600}{1000}{\milli\volt}. \emph{Redriver} ICs are intended to be used to
|
||||
amplify the sensitive high-speed bus signal at the edge of a PCBA, either before it leaves the board through a connector
|
||||
to ensure adequate signal levels at the connector, or after it enters through a connector to compensate for loss in the
|
||||
PCB traces between the connector and the signal's destination. For our application, redrivers intended for HDMI and
|
||||
DisplayPort applications are most suitable, as they can usually be configured to act as simple amplifiers without
|
||||
processing any protocol logic on the signals that are amplified. In contrast, both USB 3 and PCIe redrivers often
|
||||
implement power saving features that try to parse parts of the actual signal transmitted through them, which are hard to
|
||||
bypass in our application.
|
||||
Most modern, high-speed buses like USB 3, PCI Express, HDMI, and DisplayPort use CML drivers. \emph{Redriver} ICs
|
||||
intended to amplify such signals to compensate for loss in connectors or cables contain amplifiers that are suitable for
|
||||
our application. HDMI/DisplayPort redrivers are most suitable since they can be configured as simple amplifiers,
|
||||
turning off any signal-dependent power saving features.
|
||||
|
||||
Redrivers can be classified according to their way of operation. \emph{Retimers} include a full
|
||||
serialization/deserialization (SerDes) setup and parse the low-level protocol of the bus to reconstruct bit-level
|
||||
timing. We focus only on simpler redrivers that only contain amplifiers and (analog) equalizers here.
|
||||
|
||||
Amplifying redrivers can be separated into two classes: Limiting and linear redrivers. A limiting redriver is configured
|
||||
to have a high gain such that a small input signal will be amplified to the full output voltage swing. Limiting
|
||||
redrivers are well-suited for our application, but they have fallen out of fashion since they interfere with link training
|
||||
and with power saving features of protocols like USB 3.
|
||||
|
||||
Linear redrivers are constructed with a low gain instead. Sufficient to compensate for wiring losses, their gain is low
|
||||
enough to leave them transparent to bus protocol features such as link training or power saving features. To compensate
|
||||
for their reduced gain, linear redrivers usually contain configurable equalizers that can be used to apply targeted
|
||||
enhancements for particular signal defects, such as boosting high-frequency gain or providing a set amount of overshoot.
|
||||
Where available, in our prototype variants we set these equalization features to provide maximum gain.
|
||||
|
||||
In our evaluation below, we include \partno{PI3HDX12211} as a linear redriver intended for DisplayPort and HDMI
|
||||
applications, as well as \partno{TDP0604} as a ``hybrid'' linear or limiting redriver for HDMI applications, configured
|
||||
for limiting mode in our experiments. An attractive feature of both of these chips as well as comparable devices is that
|
||||
they usually include at least four independent channels, so only one chip is needed for both pulse paths. Additionally,
|
||||
they are consumer mass market parts, resulting in a low price. For instance, \partno{PI3HDX12211} is available at
|
||||
\price{2.11}{\euro} in single quantity and less than \price{1.30}{\euro} at a quantity of several hundred at distributor
|
||||
LCSC, and \partno{TDP0604} is available at \price{4.72}{\euro} and \price{3.44}{\euro}, respectively, at distributor
|
||||
Mouser.
|
||||
In our evaluation below, we include \partno{PI3HDX12211} and \partno{TDP0604}, two inexpensive, consumer mass market
|
||||
redrivers\footnote{
|
||||
\partno{PI3HDX12211} is available at \price{2.11}{\euro} in single quantity and less than \price{1.30}{\euro} at a
|
||||
quantity of several hundred at distributor LCSC, and \partno{TDP0604} is available at \price{4.72}{\euro} and
|
||||
\price{3.44}{\euro}, respectively, at distributor Mouser}.
|
||||
Both parts have four independent channels, so only one chip is needed for the two pulse paths.
|
||||
|
||||
\subsection{Cost Breakdown}
|
||||
|
||||
Table\ \ref{tab_bom} shows a breakdown of the cost of the main components of our prototype, resulting in a total
|
||||
component cost of less than \price{10}{\euro}. We did not include power supply components in this breakdown as our
|
||||
circuit is meant to be embedded into a payload circuit that will already have sufficient power supplies.
|
||||
Table\ \ref{tab_bom} shows a breakdown of the cost of the main components of our prototype, totalling less than
|
||||
\price{10}{\euro}. We did not include power supply components in this breakdown since our circuit is meant to be
|
||||
embedded into a payload circuit that will already have sufficient power supplies.
|
||||
|
||||
Due to its \partno{HRTIM} peripheral, the \partno{STM32G4} microcontroller is the component of our design that is
|
||||
hardest to replace. However, this part can still be replaced with a wide range of FPGAs, which commonly include
|
||||
|
|
@ -595,10 +553,8 @@ of Xilinx 7 Series FPGAs provides the same $\frac{1}{32}$ clock cycle resolution
|
|||
&25&0.01&Various resistors\\\hline
|
||||
\multicolumn{2}{r}{}&\textbf{9.67}&\textbf{Total}
|
||||
\end{tabular}
|
||||
\caption{A cost breakdown of the major components of our design. Listed prices are for 1000 pieces order quantity to
|
||||
make prices more comparable between distributors. The number of switches necessary for signal routing and
|
||||
termination depends on the specific mesh signal routing of the application. Numbers shown here are for our
|
||||
prototype, which can measure a mesh from both ends and supports short, open and matched termination.}
|
||||
\caption{Cost breakdown of our prototype design. Prices are listed at order quantity 1000 to make prices more
|
||||
comparable between distributors.}
|
||||
\label{tab_bom}
|
||||
\end{table}
|
||||
|
||||
|
|
@ -606,61 +562,34 @@ of Xilinx 7 Series FPGAs provides the same $\frac{1}{32}$ clock cycle resolution
|
|||
\label{sec_scan_schedule}
|
||||
|
||||
The goal of a time domain reflectometer is to send a pulse into the Device Under Test (DUT)--i.e.\ in our application,
|
||||
the mesh--and to record all reflections returning from the DUT afterwards. In something like a security mesh whose
|
||||
traces might only be a few meters long in total, the time span between the pulse being sent and the last reflections
|
||||
from the very end of the mesh arriving is in the order of several tens of nanoseconds. Directly recording a response at
|
||||
this timescale would be infeasible using a commodity microcontroller, so we utilize an equivalent time sampling
|
||||
approach.
|
||||
the mesh--and to record all reflections returning from the DUT afterwards. In a security mesh with a few meters of total
|
||||
trace length, the time span between the pulse being sent and the last reflections arriving from the end of the mesh is
|
||||
in the order of tens of nanoseconds. Directly recording a response at this timescale would be infeasible in a commodity
|
||||
microcontroller, so we use equivalent time sampling.
|
||||
|
||||
As shown in Figure\ \ref{fig_block_diagram}, our analog frontend contains amplifiers that produce the stimulus pulse, a
|
||||
sampling gate with amplifiers, and a coupler that couples the pulse into the mesh and couples the reflections back into
|
||||
the sampling gate. A microcontroller controls this frontend with two primary signals: A stimulus pulse, and a sampling
|
||||
the sampling gate. A microcontroller controls this frontend with two main signals: A stimulus pulse, and a sampling
|
||||
pulse. By adjusting the timing between these two pulses every time a stimulus pulse is sent, the microcontroller can
|
||||
select a particular point in time after the stimulus pulse to record using the sampling gate. By slowly sweeping across
|
||||
the whole time span, the microcontroller can reconstruct the waveform of the reflected signal at the sampling gate
|
||||
across one period of the stimulus pulse. The recording rate of this waveform is limited by the repetition rate of the
|
||||
stimulus pulse as well as the time step size.
|
||||
sample the response at any chosen point in time. By sweeping across the whole time span, the microcontroller can
|
||||
reconstruct the waveform of the reflected signal at the sampling gate.
|
||||
|
||||
The attainable repetition rate of our stimulus and sampling circuits is limited by two main components. First, the
|
||||
sampling post-amplifier's bandwidth limits the maximum sample rate. In our design, we chose an \partno{OPA1656}
|
||||
\qty{50}{\mega\hertz} Gain-Bandwidth Product (GBP) FET input low noise operational amplifier. We need a FET input part
|
||||
to avoid loading the sampling gate. The comparatively high GBP and the low noise input stage of this device allow us to
|
||||
amplify small signals that could result from weak reflections in small impedance discontinuities inside the mesh.
|
||||
In our prototype, we sample the response once after each stimulus pulse. We conservatively decided on a sampling rate of
|
||||
\qty{1}{MSps} across both channels of the mesh's differential pair. This sampling rate leaves some headroom to the
|
||||
\qty{50}{\mega\hertz} Gain-Bandwidth Product (GBP) of the \partno{OPA1656} frontend opamp, as well as the \qty{4}{MSps}
|
||||
that the ADCs can reach. The processing speed of the microcontroller allows individual control of the timing of each
|
||||
sampling pulse.
|
||||
|
||||
The second major factor limiting repetition rate is the microcontroller's ADC speed, as well as the speed of the
|
||||
software processing the ADC's output. At full \qty{12}{b} resolution, this corresponds to a sampling rate of
|
||||
approximately \qty{4}{MSps}. The microcontroller contains five ADCs, which can be interleaved to achieve higher rates.
|
||||
|
||||
Combining these factors, we conservatively decided on a sampling rate of \qty{1}{MSps} across both channels of the
|
||||
differential pair. At this sampling rate, it is feasible to control the sample timing on a sample-by-sample basis. For
|
||||
all measurements in this paper, we use a sequential sampling approach where the microcontroller takes a series of
|
||||
measurements for oversampling at a particular delay, and then increases the delay by one \partno{HRTIM} output clock
|
||||
interval.
|
||||
|
||||
In our prototype, one sweep of a \qty{188}{\nano\second} time span consisting of $1024$ data points took
|
||||
\qty{710}{\milli\second} at $256\times$ oversampling and \qty{1.1}{\second} at $384\times$ oversampling. The time span
|
||||
corresponds to \qty{28}{\meter} of mesh length, which at a \qty{200}{\micro\meter} pitch corresponds to a mesh area of
|
||||
\qty{113}{\centi\meter\squared} and at a \qty{1}{\milli\meter} pitch corresponds to
|
||||
\qty{565}{\centi\meter\squared}. Using the same microcontroller, by optimizing timing, moving oversampling processing
|
||||
out of the interrupt handler, and by interleaving four of the microcontroller's five ADC peripherals, the lower limit of
|
||||
acquisition time of a $1024$-point scan is \qty{33}{\milli\second} for $256\times$ oversampling and
|
||||
\qty{49}{\milli\second} for $384\times$ oversampling.
|
||||
|
||||
While for our development, sequential scanning is adequate, in a future practical application, two simple optimizations
|
||||
would decrease the time to detection for an attack. First, in a practical application, the range of scanned delays
|
||||
should be adjusted to the length of the particular security mesh in use. For this paper, we always
|
||||
scanned a time range of $1024$ points at \qty{184}{\pico\second} spacing starting before one stimulus pulse and ending
|
||||
shortly before the next stimulus pulse so that any waveform artifacts will be visible. In a practical application, there
|
||||
would be little information gained by sampling much beyond the edges of the expected mesh response, so the scan window
|
||||
should be kept small to increase scan rate.
|
||||
|
||||
Secondly, in a practical application, the feature that is most relevant to detect tamper attempts is the trailing edge
|
||||
of the mesh's response. This trailing edge corresponds to the return of the stimulus pulse's reflection at the far end
|
||||
of the mesh. Any attack that affects the impedance even only of part of the mesh has a high chance of affecting its
|
||||
delay, and thus this trailing edge is likely to move. In a practical application, it would thus be efficient to use a
|
||||
heuristic scan schedule instead of the sequential scan we are using in our research prototype. Such a heuristic schedule
|
||||
would sample delays near the expected trailing edge of the particular mesh in use more frequently compared to delays
|
||||
that lie somewhere else, such as in the middle of the mesh's return window.
|
||||
% major revision: Since we did all measurements for the majR with only 768 samples, we re-scaled the numbers in this
|
||||
% paragraph accordingly.
|
||||
% FIXME mention in majR letter.
|
||||
In our prototype, one sweep of a \qty{141}{\nano\second} time span consisting of $768$ data points took
|
||||
\qty{825}{\milli\second} at $384\times$ oversampling. The time span corresponds to \qty{21}{\meter} of mesh length,
|
||||
which at a \qty{200}{\micro\meter} pitch corresponds to a mesh area of \qty{85}{\centi\meter\squared} and at a
|
||||
\qty{1}{\milli\meter} pitch corresponds to \qty{426}{\centi\meter\squared}. By optimizing timing, moving oversampling
|
||||
processing out of the interrupt handler, and by interleaving four instead of two of the microcontroller's five ADC
|
||||
peripherals, the lower limit of acquisition time of a $768$-point scan is \qty{37}{\milli\second} for $384\times$
|
||||
oversampling.
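
These figures can be reproduced approximately with the following arithmetic, given here as an illustration. The time
span is the number of points times the \qty{184}{\pico\second} step of the \partno{HRTIM} peripheral. For the lower
limit, the sketch assumes that both channels are sampled, that each of the four interleaved ADCs runs at its maximum
rate of \qty{4}{MSps}, and that no other overhead remains:
\[
  768 \times \qty{184}{\pico\second} \approx \qty{141}{\nano\second}, \qquad
  \frac{2 \times 768 \times 384}{4 \times \qty{4}{MSps}} \approx \qty{37}{\milli\second}.
\]
The quoted mesh length and area are consistent with a propagation velocity of about
\qty{0.15}{\meter\per\nano\second} and an area of twice the pitch times the length, accounting for the two traces of
the differential pair. Both relations are inferred from the numbers above:
\[
  \qty{141}{\nano\second} \times \qty{0.15}{\meter\per\nano\second} \approx \qty{21}{\meter}, \qquad
  2 \times \qty{200}{\micro\meter} \times \qty{21}{\meter} \approx \qty{85}{\centi\meter\squared}.
\]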
|
||||
|
||||
\section{Experimental Evaluation}
|
||||
|
||||
|
|
@ -1109,13 +1038,8 @@ thinking about attacker capabilities. Applying their taxonomy, our monitoring sy
|
|||
a patching attack from a \emph{skilled} attacker to an \emph{expert} attacker, and the equipment requirement from
|
||||
\emph{standard} equipment to \emph{bespoke} equipment such as dielectric drill bits and ceramic soldering tips.
|
||||
|
||||
% https://tex.stackexchange.com/questions/336201/vertical-highlight-of-a-paragraph
|
||||
\begin{tcolorbox}[breakable,
|
||||
enhanced,
|
||||
colback=yellow!10!white,
|
||||
boxrule=0pt,frame hidden,
|
||||
borderline west={1mm}{-2mm}{highlightgreen}]
|
||||
|
||||
% FIXME peer review only, for major revision @ TCHES
|
||||
\color{highlightgreen}
|
||||
\begin{figure}[H]
|
||||
\begin{subfigure}{0.5\textwidth}
|
||||
\includegraphics[width=\textwidth]{fig_covar_patch_repeat_tridelta_all_the_data_p0.3.pdf}
|
||||
|
|
@ -1262,33 +1186,32 @@ a patching attack from a \emph{skilled} attacker to an \emph{expert} attacker, a
|
|||
\caption{}
|
||||
\label{}
|
||||
\end{figure}
|
||||
\end{tcolorbox}
|
||||
|
||||
% FIXME peer review only, for major revision @ TCHES
|
||||
\color{black}
|
||||
\section{Future Work}
|
||||
|
||||
\paragraph{Design variants.} The \partno{STM32G4}'s \partno{HRTIM} peripheral is limited by the comparatively slow
|
||||
maximum system clock speed of \qty{168}{\mega\hertz} to a timing resolution of \qty{184}{\pico\second}. While we have
|
||||
demonstrated that this is sufficient to detect and localize several attack variants, it would be interesting to increase
|
||||
time resolution since in our measurements, we observed that the end-to-end jitter of our frontend is low enough that our
|
||||
circuit would benefit from finer delay control. In our prototype, we implemented a--so far unused--adjustable power
|
||||
supply for the \partno{74LVC} series buffer in between the \partno{HRTIM} outputs and the pulse amplifier. By adjusting
|
||||
this buffer's power supply through one of the microcontroller's digital-to-analog converter (DAC) channels, we expect
|
||||
that it should be possible to exploit the supply voltage dependency of the propagation delay of \partno{74LVC} series
|
||||
CMOS logic to create a digitally controllable delay with picosecond resolution. The internal DLL of the \partno{HRTIM}
|
||||
peripheral is likely implemented similarly.
|
||||
\paragraph{Design variants.} We found that the timing jitter of our sampling frontend is low enough to reach the
|
||||
\qty{184}{\pico\second} resolution limit of the \partno{STM32G4} \partno{HRTIM} peripheral. In our prototype, we
|
||||
implemented a -- so far unused -- adjustable power supply for the \partno{74LVC} series buffer in between the
|
||||
\partno{HRTIM} outputs and the pulse amplifier. By adjusting this buffer's power supply through one of the
|
||||
microcontroller's digital-to-analog converter (DAC) channels, we expect that it should be possible to exploit the supply
|
||||
voltage dependency of the propagation delay of \partno{74LVC} series CMOS logic to create a digitally controllable delay
|
||||
with picosecond resolution.
|
||||
|
||||
% FIXME reword for publication
|
||||
\paragraph{System design.} The work we presented in this paper is complementary to the work previously presented by
|
||||
\textcite{gotteCantTouchThis2022}, where the authors improved security of a simple security mesh made from standard PCBs
|
||||
through mechanical motion. We are currently working on a prototype combining both approaches and incorporating heuristic
|
||||
scan scheduling as mentioned in Section\ \ref{sec_scan_schedule}.
|
||||
\paragraph{Non-sequential sampling.} Not all parts of the reflected signal are equally sensitive to tampering attempts.
|
||||
For instance, the reflection's trailing edge contains information on both the length of the mesh and on its
|
||||
attenuation. Instead of recording the response waveform in a linear scan, in a practical application, more relevant
|
||||
parts of the response such as this trailing edge could be scanned at a higher rate than other, less relevant parts.
|
||||
Similarly, fast scans at a coarse time resolution could be interleaved with slow scans at a finer time resolution to
|
||||
detect large changes more quickly.
|
||||
|
||||
\paragraph{Auxiliary applications.} In this work, we have presented a design for a low-cost, embedded TDR frontend.
|
||||
Besides security mesh monitoring, through multiplexing this TDR frontend could be used for other system monitoring
|
||||
tasks from tamper sensing to system health monitoring. For instance, \textcite{vaiSecureArchitectureEmbedded2015}
|
||||
propose an approach for checking the integrity of a PCBA using an external Vector Network Analyzer (VNA) attached to
|
||||
test points on the PCBA's Power Distribution Network (PDN). TDR can produce fingerprints similar to a VNA and it would
|
||||
be interesting to measure parts of the secure subsystem other than its security mesh using our TDR frontend.
|
||||
\paragraph{Auxiliary applications.} The low-cost, embedded TDR frontend presented in this paper could be used for other
|
||||
monitoring tasks from tamper sensing to system health monitoring. For instance,
|
||||
\textcite{vaiSecureArchitectureEmbedded2015} propose checking the integrity of a PCBA using an external Vector Network
|
||||
Analyzer (VNA) attached to test points on the PCBA's Power Distribution Network (PDN). TDR can produce fingerprints
|
||||
similar to a VNA and it would be interesting to measure parts of the secure subsystem other than its security mesh using
|
||||
our TDR frontend.
|
||||
|
||||
\section{Conclusion}
|
||||
|
||||
|
|
|
|||