WIP

2022-05-31 17:50:35 +02:00 · 2022-05-31 17:50:35 +02:00 · ed459a6fea
commit ed459a6fea
parent da4afa7354
2 changed files with 84 additions and 72 deletions
--- a/paper/safety-reset-paper.tex
+++ b/paper/safety-reset-paper.tex
@ -45,21 +45,29 @@
 %things'' \and Cyber-physical systems \and Hardware security \and Network Security \and Energy systems \and Signal theory}

 \begin{abstract}
-    Previous work has explored the scenario of an attacker compromising a large number of Smart Meters that are equipped
-    with remote disconnect switches, and using these remote-controllable switches to cause a large-scale outage.
-    Previous work focuses on attack prevention. In this paper, we will instead look at recovery after a successful
+    Previous work has explored the scenario of an attacker compromising a large number of consumer devices, and
+    modulating the power of these devices to cause large load swings at particular resonant frequencies of the
+    electrical grid's control systems that ultimately cause a large-scale outage~\cite{ctap+11,wu01}. Previous work has
+    focused on attacks using smart meters with integrated remote disconnect switches as first proposed
+    in~\cite{anderson01}, but the same attack scenario also applies to large IoT devices such as IoT-equipped air
+    conditioners or central heating systems.
+
+    Prior work on mitigation of this attack scenario includes generic firmware hardening techniques % FIXME citation
+    and reducing the susceptibility of the electrical grid towards these resonant oscillation modes~\cite{entsoe01}.
+    In this paper, we will complement these mitigation efforts by considering the recovery process after a successful
    attack. To transmission system operators (TSOs), the major challenge after such a Smart Meter-triggered outage is
    that the attacker will likely persist through the outage, and compromised Smart Meters will resume malicious
    activity after their power is restored. In the event of such an attack, TSOs would need a way to remotely put these
-    compromised devices into a \emph{safe} mode of operation. 
+    compromised devices into a \emph{safe} mode of operation. For this purpose, we propose a remote-controllable
+    \emph{Safety Reset} that is designed to remain operational even during a large-scale attack.

    Given that public telecommunications networks including the internet, cellular networks, and LoRa base stations may
-    also be disrupted during a large-scale blackout, the challenging aspect of this remote \emph{Safety Reset} is the
-    communication channel between TSO and the smart meter. For this purpose, in this paper we propose a simple yet
-    effective communication channel based on modulating grid frequency by modulating the power of a connected load or
-    generator. Our proposed communciation channel (1) requires minimal infrastructure, (2) has a reach spanning the
-    entire power grid and (3) is fully independent of other telecommunication networks and functions even under severe
-    disruption of the grid.
+    also be disrupted during a blackout, the challenging aspect of this \emph{Safety Reset} is the communication channel
+    between TSO and the smart meter. For this purpose, in this paper we propose a simple yet effective communication
+    channel based on modulating grid frequency by modulating the power of a connected load or generator. Our proposed
+    communication channel (1) requires minimal infrastructure, (2) has a reach spanning the entire power grid and (3) is
+    fully independent of other telecommunication networks and functions even under severe disruption of the grid. The
+    resulting safety reset can be applied to any grid-connected device including smart meters and IoT devices.
 \end{abstract}

 \section{Introduction}
@ -71,12 +79,12 @@ their interactions have not yet received much attention.

 In this paper, we consider the previously proposed scenario where a large number of compromised consumer devices is used
 alone or in conjunction with an attack on the grid's central SCADA systems to destabilize the grid by rapidly modulating
-the total connected load. Previous work considered compromised smart meters with integrated remote disconnect switches
-as likely candidates for such an attack, but the same attack can also be performed using compromised IoT devices.  Such
-attacks are hard to mitigate, and existing literature focuses on hardening device firmware to prevent compromise.
-Despite the infeasibility of perfect firmware security, there is little research on \emph{post-compromise} mitigation
-approaches. A core issue with post-attack mitigation is that the devices normal network connection may not work due to
-the attack and as such an out-of-band communication channel is necessary.
+the total connected load~\cite{ctap+11,wu01}. Previous work considered compromised smart meters with integrated remote
+disconnect switches as likely candidates for such an attack, but the same attack can also be performed using compromised
+IoT devices.  Such attacks are hard to mitigate, and existing literature focuses on hardening device firmware to prevent
+compromise. Despite the infeasibility of perfect firmware security, there is little research on \emph{post-compromise}
+mitigation approaches. A core issue with post-attack mitigation is that the devices' normal network connection may not
+work due to the attack and as such an out-of-band communication channel is necessary.

 We propose a \emph{safety reset} controller that is controlled through a novel, resilient, grid-wide powerline
 communication technique. Our safety reset controller can be fitted into any Smart Meter or IoT device. Its purpose is to
@ -103,11 +111,12 @@ voltage, which is quickly attenuated across long distances.

 Figure~\ref{fig_intro_flowchart} shows an overview of our concept.  Two scenarios for its application are before or
 during a cyberattack, to stop an attack on the electrical grid in its tracks, and after an attack while power is being
-restored to prevent a repeated attack. In both scenarios, our concept is fully independent of all public communication
-networks (such as the Internet or mobile networks) as well as broadcast systems (such as cable television or terrestrial
-broadcast radio). A grid frequency-based system can function as long as power is still available, or as soon as power is
-restored after the attack. One powerful function this allows is ``flushing out`` an attacker from compromised smart
-meters after an attack, before restoring smart meter internet connectivity.
+restored to prevent a repeated attack. In both scenarios, our concept is independent of telecommunication networks (such
+as the internet or cellular networks) as well as broadcast systems (such as cable television or terrestrial broadcast
+radio) while requiring only inexpensive signal processing hardware and no external antennas (such as are needed for
+satellite communication). A grid frequency-based system can function as long as power is still available, or as soon as
+power is restored after the attack. One powerful function this allows is ``flushing out'' an attacker from compromised
+smart meters after an attack, before restoring smart meter internet connectivity.

 Using simulations we have determined that control of a $\SI{25}{\mega\watt}$ load such as a large aluminium smelter,
 load bank or photovoltaic farm would allow for the transmission of a cryptographically secured \emph{reset} signal within
@ -220,45 +229,43 @@ is that their agility w.r.t.\ post-hoc mitigations through firmware updates is l
 %Another fundamental challenge in smart grid implementations is the central role of smart electricity meters in the
 %smart grid ecosystem. Smart meters are used both for highly-granular load measurement and in some countries also for
 %load switching~\cite{zheng01}.
-Smart electricity meters are effectively consumer devices built down to a certain price point. The small market served
-by a single smart meter implementation limits how much effort a vendor can spend on firmware security. Landis+Gyr, a
-large manufacturer that makes most of its revenue from utility meters state in their 2019 annual report that they
-invested \SI{36}{\percent} of their total R\&D budget on embedded software while spending only \SI{24}{\percent} on
-hardware R\&D~\cite{landisgyr01,landisgyr02}, indicating significant tension between firmware security and the vendor's
-bottom line. 
+Smart electricity meters are consumer devices built down to a price. Firmware security research and development budgets
+are limited by the high degree of market fragmentation that is caused by mutually incompatible national smart metering
+standards. Landis+Gyr, a large utility meter manufacturer, state in their 2019 annual report that they invested
+\SI{36}{\percent} of their total R\&D budget on embedded software while spending only \SI{24}{\percent} on hardware
+R\&D~\cite{landisgyr01,landisgyr02}, which indicates tension between firmware security and the manufacturer's bottom
+line. 

 % FIXME more sources!

 \subsection{The state of the art in embedded security}

-Embedded software security generally is much harder than security of higher-level systems. The primary two factors
-affecting this are that on one hand, embedded devices usually run highly customized firmware that (often by necessity)
-is rarely updated. On the other hand, embedded devices often lack advanced security mechanisms such as memory management
-units that are found in most higher-power devices. Even well-funded companies continue to have trouble securing their
-embedded systems. A spectacular example of this difficulty is the 2019 flaw in Apple's iPhone SoC first-stage ROM
-bootloader that allows for the full compromise of any iPhone before the iPhone X given physical access to the
-device~\cite{heise01}. iPhone 8, one of the affected models, was still being manufactured and sold by Apple until April
-2020.  In another instance in 2016, researchers found multiple flaws in Samsung's implementation of ARM TrustZone
-``secure world'' firmware that Samsung used for their own mobile phone SoCs.  The flaws they found were both severe
-architectural flaws such as secret user input being passed through untrusted userspace processes without any protection
-as well as shocking cryptographic flaws such as
-CVE-2016-1919\footnote{\url{http://cve.circl.lu/cve/CVE-2016-1919}}~\cite{kanonov01}.  And Samsung is not the only large
-multinational corporation having trouble securing their secure firmware implementation. In 2014 researchers found an
-embarrassing integer overflow flaw in the low-level code handling untrusted input in Qualcomm's QSEE
-firmware~\cite{rosenberg01}. For an overview of ARM TrustZone including a survey of academic work and past security
-vulnerabilities of TrustZone-based firmware see~\cite{pinto01}.
+Embedded software security has proven challenging compared to the security of larger computer systems. On one hand,
+embedded devices usually run highly customized firmware that is rarely updated. On the other hand, embedded devices
+often lack security mechanisms such as memory management units that are found in higher-power devices. As a result of
+these factors, even well-funded companies continue to have trouble securing their embedded systems. An example of this
+difficulty is the 2019 flaw in Apple's iPhone SoC first-stage ROM bootloader that allows for the full compromise of any
+iPhone older than iPhone X given physical access to the device~\cite{heise01}. iPhone 8, one of the affected models, was
+still being manufactured and sold by Apple until April 2020.  In another instance in 2016, researchers found multiple
+flaws in Samsung's implementation of ARM TrustZone ``secure world'' firmware that Samsung used for their own mobile
+phone SoCs.  The flaws they found were both architectural flaws such as secret user input being passed through untrusted
+userspace processes as well as cryptographic flaws such as
+CVE-2016-1919\footnote{\url{http://cve.circl.lu/cve/CVE-2016-1919}}~\cite{kanonov01}.  In a similar way, in 2014,
+researchers found an integer overflow flaw in the low-level code handling untrusted input in Qualcomm's QSEE
+firmware\footnote{For an overview of ARM TrustZone including a survey of academic work and past
+security vulnerabilities of TrustZone-based firmware see~\cite{pinto01}.}~\cite{rosenberg01}.

 If even companies with R\&D budgets that rival some countries' national budgets have trouble securing the embedded
 software stacks of their mass-market consumer devices, what is a much smaller smart meter manufacturer
-to do?  Especially if national standards mandate complex protocols such as TLS that are tricky to implement
+to do?  Especially if national standards mandate complex protocols such as TLS that are difficult to implement
 correctly~\cite{georgiev01}, this manufacturer will be short on options to secure their product.

 \subsection{Attack surface in the smart grid}

 From the incidents we outlined in the previous paragraphs we conclude that in smart metering technology, market
-incentives do not currently provide the conditions for a level of device security that will reliably last the coming
-decades. Considering this tension, in this paragraph we examine the cyberphysical risks that arise from attacks on the
-smart grid in the first place. These risks arise at three different infrastructure levels.
+incentives do not currently provide the conditions for a level of device security that will reliably last for decades
+after deployment. Considering this tension, in this paragraph we examine the cyberphysical risks that arise from attacks
+on the smart grid in the first place. These risks arise at three different infrastructure levels.

 The first level is that of attacks on centralized control systems. This type of attack is often cited in popular
 discourse and to our knowledge is the only type of attack against an electric grid that has ever been carried out in
@ -585,7 +592,7 @@ other simulations as well this equates to an overall transmission duration of ap
 the demodulator some time to settle and to produce more realistic conditions of signal reception we padded the modulated
 signal with unmodulated noise on both ends.

-\section{Discussion}
+\section{Lessons learned}

 For our proof of concept, before settling on the commercial smart meter we first tried to use an \texttt{EVM430-F6779}
 smart meter evaluation kit made by Texas Instruments. This evaluation kit did not turn out well for two main reasons.
@ -604,35 +611,26 @@ to be too complex and all we wanted to know we found with just a few hours of di
 Ghidra\footnote{\url{https://ghidra-sre.org/}}.

 In the firmware development phase our approach of testing every module individually (e.g. DSSS demodulator, Reed-Solomon
-decoder, grid frequency estimation) proved to be very useful. In particular debugging benefited greatly from being able
-to run several thousand tests within seconds. In case of our DSSS demodulator, this modular testing and simulation
-architecture allowed us to simulate thousands of runs of our implementation on test data and directly compare it to our
-Jupyter/Python prototype. Since we spent more time polishing our embedded C implementation it turned out to perform
-better than our Python prototype while still exhibiting the same fundamental response to changes to its parameters.
-
-In accordance with our initial estimations we did not run into any code space nor computation bottlenecks for chosing
-floating point emulation instead of porting over our algorithms to fixed point calculations. The extremely slow sampling
-rate of our systems makes even heavyweight processing such as FFT or our brute force dynamic programming approach to
-DSSS demodulation possible well within our performance constraints.
-
-The safety reset controller does not require any peripherals except for an ADC. Thus we expect code size to be the main
-factor affecting per-unit cost in an in-field deployment of our concept. At around \SI{64}{\kilo\byte}, our unoptimized
-demonstrator firmware implementation is already on the lower end of the spectrum. Especially with some optimization we
-expect safety reset controllers to be commercially viable given adequate political incentives.
+decoder, grid frequency estimation) proved particularly useful for debugging. The modular architecture allowed us to
+directly compare our demodulator implementation to our Jupyter/Python prototype, where we found that our C
+implementation outperformed the Python prototype. Despite the algorithms' complexity, the microcontroller C
+implementation has no issues processing data in real time due to the low sampling rate required.

 \section{Conclusion}
 \label{sec_conclusion} 
+\subsection{Applicability to IoT devices}

+\subsection{Discussion}
 During an emergency in the electrical grid, the ability to communicate to large numbers of end-point devices is a
 valuable tool for restoring normal operation. When a resilient communication channel is available, loads such as smart
 meters and IoT devices can be equipped with a supervisor circuit that allows for a remote ``safety reset'' that puts the
 device into a safe operating state. Using this safety reset, an attacker that uses compromised smart meters or IoT
-devices to attack grid stability can be interrupted before the conculusion of their attack. During recover from an
-outage, a safety reset can be used to reduce stress on the system during a black start by turning of non-essential loads
-such as air conditioners.
+devices to attack grid stability can be interrupted before they can conclude their attack. During recovery from an
+outage, a safety reset can be used to reduce stress on the system during a black start by temporarily disabling
+non-essential loads such as air conditioners.

-In this paper we have developed an end-to-end design of a safety reset system that provides these capabilities.  Our
-novel broadcast data transmission system is based on intentional modulation of global grid frequency. Our system is
+In this paper we have developed an end-to-end design for a safety reset system that provides these capabilities.
+Our novel broadcast data transmission system is based on intentional modulation of global grid frequency. Our system is
 independent of normal communication networks and can operate during a cyberattack. We have shown the practical viability
 of our end-to-end design through simulations. Using our purpose-designed grid frequency recorder, we can capture and
 process real-time grid frequency data in an electrically safe way. We used data captured this way as the basis for
@ -645,13 +643,17 @@ developed a simple cryptographic protocol ready for embedded implementation in r
 triggering a safety reset with a response time of less than 30 minutes.  In this demonstration we use simulated grid
 frequency data to trigger a commercial microcontroller to perform a firmware reset of an off-the-shelf smart meter. The
 next step in our evaluation will be to conduct an experimental evaluation of our modulation scheme in collaboration with
-an utility and an operator of a multi-megawatt load.  Source code and electronics CAD designs are available at the
-public repository listed at the end of this document.
+a utility and an operator of a multi-megawatt load.
+
+The safety reset controller does not require any peripherals except for an ADC. Thus we expect code size to be the main
+factor affecting per-unit cost in an in-field deployment of our concept. At around \SI{64}{\kilo\byte}, our demonstrator
+firmware implementation fits on low-end microcontrollers, so we expect safety reset controllers to be commercially
+viable.
+
+Source code and EDA designs are available at the public repository listed at the end of this document.

 \printbibliography[heading=bibintoc]

-%%% FIXME remove appendix and work into text.
-
 \center{
    \center{This is version \texttt{\input{version.tex}\unskip} of this paper, generated on \today. The git repository
    can be found at:}
--- a/paper/safety-reset.bib
+++ b/paper/safety-reset.bib
@ -1756,3 +1756,13 @@
 	year = {2017}
 }

+@inproceedings{ctap+11,
+	author = {Mihai Costache and Valentin Tudor and Magnus Almgren and Marina Papatriantafilou and Christopher Saunders},
+	booktitle = {2011 Seventh European Conference on Computer Network Defense},
+	month = {dec},
+	publisher = {IEEE},
+	title = {Remote control of smart meters: friend or foe?},
+	url = {https://www.syssec-project.eu/m/page-media/3/costache-ec2nd11.pdf; https://doi.org/10.1109/EC2ND.2011.14},
+	year = {2011}
+}
+
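Note on the module-level testing described in the ``Lessons learned'' hunk above: the following is a minimal Python sketch of how a DSSS demodulator could be exercised in isolation against simulated data. It is not the paper's implementation. All names (`make_chips`, `dsss_modulate`, `dsss_demodulate`, `run_trial`) are hypothetical, the spreading sequence is a seeded pseudo-random stand-in for whatever code the paper actually uses, chip timing is assumed to be perfectly synchronized, and grid frequency noise is modeled as plain additive Gaussian noise on a normalized frequency deviation.

```python
import random


def make_chips(n_chips, seed=1):
    """Return a pseudo-random +/-1 spreading sequence (assumed stand-in code)."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n_chips)]


def dsss_modulate(bits, chips, amplitude):
    """Spread each data bit over the whole chip sequence.

    The result is a list of normalized frequency deviation samples,
    one sample per chip.
    """
    samples = []
    for bit in bits:
        sign = 1 if bit else -1
        samples.extend(sign * amplitude * chip for chip in chips)
    return samples


def dsss_demodulate(samples, chips):
    """Recover bits by correlating each symbol-length window with the chips.

    Assumes the receiver already knows where each symbol starts.
    """
    n = len(chips)
    bits = []
    for start in range(0, len(samples) - n + 1, n):
        window = samples[start:start + n]
        correlation = sum(s * c for s, c in zip(window, chips))
        bits.append(1 if correlation > 0 else 0)
    return bits


def run_trial(seed, n_bits=16, n_chips=63, amplitude=1.0, noise_sigma=0.5):
    """Run one simulated transmission; return (sent_bits, received_bits)."""
    rng = random.Random(seed)
    chips = make_chips(n_chips)
    sent = [rng.randint(0, 1) for _ in range(n_bits)]
    clean = dsss_modulate(sent, chips, amplitude)
    noisy = [s + rng.gauss(0.0, noise_sigma) for s in clean]
    return sent, dsss_demodulate(noisy, chips)


if __name__ == "__main__":
    # Many cheap, seeded trials make regressions in the demodulator easy to spot.
    failures = sum(1 for seed in range(1000) if len(set(map(tuple, run_trial(seed)))) > 1)
    print("failed trials:", failures)
```

Because every trial is seeded, a failing case can be replayed exactly, and the same seeded inputs could in principle be fed to both a prototype and an embedded implementation to compare their outputs sample by sample.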