jaseg 2022-05-31 17:50:35 +02:00
parent da4afa7354
commit ed459a6fea
2 changed files with 84 additions and 72 deletions


@ -45,21 +45,29 @@
%things'' \and Cyber-physical systems \and Hardware security \and Network Security \and Energy systems \and Signal theory}
\begin{abstract}
Previous work has explored the scenario of an attacker compromising a large number of Smart Meters that are equipped
with remote disconnect switches, and using these remote-controllable switches to cause a large-scale outage.
Previous work focuses on attack prevention. In this paper, we will instead look at recovery after a successful
Previous work has explored the scenario of an attacker compromising a large number of consumer devices, and
modulating the power of these devices to cause large load swings at particular resonant frequencies of the
electrical grid's control systems that ultimately cause a large-scale outage~\cite{ctap+11,wu01}. Previous work has
focused on attacks using smart meters with integrated remote disconnect switches as first proposed
in~\cite{anderson01}, but the same attack scenario also applies to large IoT devices such as IoT-equipped air
conditioners or central heating systems.
Prior work on mitigation of this attack scenario includes generic firmware hardening techniques % FIXME citation
and reducing the susceptibility of the electrical grid towards these resonant oscillation modes~\cite{entsoe01}.
In this paper, we will complement these mitigation efforts by considering the recovery process after a successful
attack. To transmission system operators (TSOs), the major challenge after such a Smart Meter-triggered outage is
that the attacker will likely persist through the outage, and compromised Smart Meters will resume malicious
activity after their power is restored. In the event of such an attack, TSOs would need a way to remotely put these
compromised devices into a \emph{safe} mode of operation.
compromised devices into a \emph{safe} mode of operation. For this purpose, we propose a remote-controllable
\emph{Safety Reset} that is designed to remain operational even during a large-scale attack.
Given that public telecommunications networks including the internet, cellular networks, and LoRa base stations may
also be disrupted during a large-scale blackout, the challenging aspect of this remote \emph{Safety Reset} is the
communication channel between TSO and the smart meter. For this purpose, in this paper we propose a simple yet
effective communication channel based on modulating grid frequency by modulating the power of a connected load or
generator. Our proposed communication channel (1) requires minimal infrastructure, (2) has a reach spanning the
entire power grid and (3) is fully independent of other telecommunication networks and functions even under severe
disruption of the grid.
also be disrupted during a blackout, the challenging aspect of this \emph{Safety Reset} is the communication channel
between TSO and the smart meter. For this purpose, in this paper we propose a simple yet effective communication
channel based on modulating grid frequency by modulating the power of a connected load or generator. Our proposed
communication channel (1) requires minimal infrastructure, (2) has a reach spanning the entire power grid and (3) is
fully independent of other telecommunication networks and functions even under severe disruption of the grid. The
resulting safety reset can be applied to any grid-connected device including smart meters and IoT devices.
\end{abstract}
\section{Introduction}
@ -71,12 +79,12 @@ their interactions have not yet received much attention.
In this paper, we consider the previously proposed scenario where a large number of compromised consumer devices is used
alone or in conjunction with an attack on the grid's central SCADA systems to destabilize the grid by rapidly modulating
the total connected load. Previous work considered compromised smart meters with integrated remote disconnect switches
as likely candidates for such an attack, but the same attack can also be performed using compromised IoT devices. Such
attacks are hard to mitigate, and existing literature focuses on hardening device firmware to prevent compromise.
Despite the infeasibility of perfect firmware security, there is little research on \emph{post-compromise} mitigation
approaches. A core issue with post-attack mitigation is that the device's normal network connection may not work due to
the attack and as such an out-of-band communication channel is necessary.
the total connected load~\cite{ctap+11,wu01}. Previous work considered compromised smart meters with integrated remote
disconnect switches as likely candidates for such an attack, but the same attack can also be performed using compromised
IoT devices. Such attacks are hard to mitigate, and existing literature focuses on hardening device firmware to prevent
compromise. Despite the infeasibility of perfect firmware security, there is little research on \emph{post-compromise}
mitigation approaches. A core issue with post-attack mitigation is that the device's normal network connection may not
work due to the attack and as such an out-of-band communication channel is necessary.
We propose a \emph{safety reset} controller that is controlled through a novel, resilient, grid-wide powerline
communication technique. Our safety reset controller can be fitted into any Smart Meter or IoT device. Its purpose is to
@ -103,11 +111,12 @@ voltage, which is quickly attenuated across long distances.
Figure~\ref{fig_intro_flowchart} shows an overview of our concept. Two scenarios for its application are before or
during a cyberattack, to stop an attack on the electrical grid in its tracks, and after an attack while power is being
restored to prevent a repeated attack. In both scenarios, our concept is fully independent of all public communication
networks (such as the Internet or mobile networks) as well as broadcast systems (such as cable television or terrestrial
broadcast radio). A grid frequency-based system can function as long as power is still available, or as soon as power is
restored after the attack. One powerful function this allows is ``flushing out'' an attacker from compromised smart
meters after an attack, before restoring smart meter internet connectivity.
restored to prevent a repeated attack. In both scenarios, our concept is independent of telecommunication networks (such
as the internet or cellular networks) as well as broadcast systems (such as cable television or terrestrial broadcast
radio) while requiring only inexpensive signal processing hardware and no external antennas (such as are needed for
satellite communication). A grid frequency-based system can function as long as power is still available, or as soon as
power is restored after the attack. One powerful function this allows is ``flushing out'' an attacker from compromised
smart meters after an attack, before restoring smart meter internet connectivity.
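As a rough illustration of how modulating a load can act on grid frequency at all, the quasi-steady-state frequency shift caused by a load step can be approximated through the grid's overall power-frequency characteristic. The symbols $\Delta P$ (load step), $\lambda$ (power-frequency characteristic) and $\Delta f$ (resulting frequency deviation), as well as the round numbers used in the example, are assumptions chosen for illustration and are not results of this paper.
\[
    \Delta f \approx -\frac{\Delta P}{\lambda}
\]
For example, assuming a hypothetical $\lambda = \SI{25}{\giga\watt\per\hertz}$, a load increase of $\Delta P = \SI{25}{\mega\watt}$ would lower grid frequency by roughly $\SI{1}{\milli\hertz}$. A deviation this small is one reason why a receiver would need considerable processing gain to recover the signal from ambient frequency noise.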
Using simulations we have determined that control of a $\SI{25}{\mega\watt}$ load such as a large aluminium smelter,
load bank or photovoltaic farm would allow for the transmission of a cryptographically secured \emph{reset} signal within
@ -220,45 +229,43 @@ is that their agility w.r.t.\ post-hoc mitigations through firmware updates is l
%Another fundamental challenge in smart grid implementations is the central role of smart electricity meters in the
%smart grid ecosystem. Smart meters are used both for highly-granular load measurement and in some countries also for
%load switching~\cite{zheng01}.
Smart electricity meters are effectively consumer devices built down to a certain price point. The small market served
by a single smart meter implementation limits how much effort a vendor can spend on firmware security. Landis+Gyr, a
large manufacturer that makes most of its revenue from utility meters, state in their 2019 annual report that they
invested \SI{36}{\percent} of their total R\&D budget on embedded software while spending only \SI{24}{\percent} on
hardware R\&D~\cite{landisgyr01,landisgyr02}, indicating significant tension between firmware security and the vendor's
bottom line.
Smart electricity meters are consumer devices built down to a price. Firmware security research and development budgets
are limited by the high degree of market fragmentation that is caused by mutually incompatible national smart metering
standards. Landis+Gyr, a large utility meter manufacturer, state in their 2019 annual report that they invested
\SI{36}{\percent} of their total R\&D budget on embedded software while spending only \SI{24}{\percent} on hardware
R\&D~\cite{landisgyr01,landisgyr02}, which indicates tension between firmware security and the manufacturer's bottom
line.
% FIXME more sources!
\subsection{The state of the art in embedded security}
Embedded software security generally is much harder than security of higher-level systems. The primary two factors
affecting this are that on one hand, embedded devices usually run highly customized firmware that (often by necessity)
is rarely updated. On the other hand, embedded devices often lack advanced security mechanisms such as memory management
units that are found in most higher-power devices. Even well-funded companies continue to have trouble securing their
embedded systems. A spectacular example of this difficulty is the 2019 flaw in Apple's iPhone SoC first-stage ROM
bootloader that allows for the full compromise of any iPhone before the iPhone X given physical access to the
device~\cite{heise01}. iPhone 8, one of the affected models, was still being manufactured and sold by Apple until April
2020. In another instance in 2016, researchers found multiple flaws in Samsung's implementation of ARM TrustZone
``secure world'' firmware that Samsung used for their own mobile phone SoCs. The flaws they found were both severe
architectural flaws such as secret user input being passed through untrusted userspace processes without any protection
as well as shocking cryptographic flaws such as
CVE-2016-1919\footnote{\url{http://cve.circl.lu/cve/CVE-2016-1919}}~\cite{kanonov01}. And Samsung is not the only large
multinational corporation having trouble securing their secure firmware implementation. In 2014 researchers found an
embarrassing integer overflow flaw in the low-level code handling untrusted input in Qualcomm's QSEE
firmware~\cite{rosenberg01}. For an overview of ARM TrustZone including a survey of academic work and past security
vulnerabilities of TrustZone-based firmware see~\cite{pinto01}.
Embedded software security has proven challenging compared to the security of larger computer systems. On one hand,
embedded devices usually run highly customized firmware that is rarely updated. On the other hand, embedded devices
often lack security mechanisms such as memory management units that are found in higher-power devices. As a result of
these factors, even well-funded companies continue to have trouble securing their embedded systems. An example of this
difficulty is the 2019 flaw in Apple's iPhone SoC first-stage ROM bootloader that allows for the full compromise of any
iPhone older than iPhone X given physical access to the device~\cite{heise01}. iPhone 8, one of the affected models, was
still being manufactured and sold by Apple until April 2020. In another instance in 2016, researchers found multiple
flaws in Samsung's implementation of ARM TrustZone ``secure world'' firmware that Samsung used for their own mobile
phone SoCs. The flaws they found included both architectural flaws, such as secret user input being passed through
untrusted userspace processes, and cryptographic flaws such as
CVE-2016-1919\footnote{\url{http://cve.circl.lu/cve/CVE-2016-1919}}~\cite{kanonov01}. In a similar way, in 2014,
researchers found an integer overflow flaw in the low-level code handling untrusted input in Qualcomm's QSEE
firmware\footnote{For an overview of ARM TrustZone including a survey of academic work and past
security vulnerabilities of TrustZone-based firmware see~\cite{pinto01}.}~\cite{rosenberg01}.
If even companies whose R\&D budgets for mass-market consumer devices rival some countries' national budgets
have trouble securing their secure embedded software stacks, what is a much smaller smart meter manufacturer
to do? Especially if national standards mandate complex protocols such as TLS that are tricky to implement
to do? Especially if national standards mandate complex protocols such as TLS that are difficult to implement
correctly~\cite{georgiev01}, this manufacturer will be short on options to secure their product.
\subsection{Attack surface in the smart grid}
From the incidents we outlined in the previous paragraphs we conclude that in smart metering technology, market
incentives do not currently provide the conditions for a level of device security that will reliably last the coming
decades. Considering this tension, in this paragraph we examine the cyberphysical risks that arise from attacks on the
smart grid in the first place. These risks arise at three different infrastructure levels.
incentives do not currently provide the conditions for a level of device security that will reliably last for decades
after deployment. Considering this tension, in this paragraph we examine the cyberphysical risks that arise from attacks
on the smart grid in the first place. These risks arise at three different infrastructure levels.
The first level is that of attacks on centralized control systems. This type of attack is often cited in popular
discourse and to our knowledge is the only type of attack against an electric grid that has ever been carried out in
@ -585,7 +592,7 @@ other simulations as well this equates to an overall transmission duration of ap
the demodulator some time to settle and to produce more realistic conditions of signal reception we padded the modulated
signal with unmodulated noise on both ends.
\section{Discussion}
\section{Lessons learned}
For our proof of concept, before settling on the commercial smart meter we first tried to use an \texttt{EVM430-F6779}
smart meter evaluation kit made by Texas Instruments. This evaluation kit did not turn out well for two main reasons.
@ -604,35 +611,26 @@ to be too complex and all we wanted to know we found with just a few hours of di
Ghidra\footnote{\url{https://ghidra-sre.org/}}.
In the firmware development phase our approach of testing every module individually (e.g. DSSS demodulator, Reed-Solomon
decoder, grid frequency estimation) proved to be very useful. In particular debugging benefited greatly from being able
to run several thousand tests within seconds. In case of our DSSS demodulator, this modular testing and simulation
architecture allowed us to simulate thousands of runs of our implementation on test data and directly compare it to our
Jupyter/Python prototype. Since we spent more time polishing our embedded C implementation it turned out to perform
better than our Python prototype while still exhibiting the same fundamental response to changes to its parameters.
In accordance with our initial estimations we did not run into any code space or computation bottlenecks from choosing
floating point emulation instead of porting over our algorithms to fixed point calculations. The extremely slow sampling
rate of our systems makes even heavyweight processing such as FFT or our brute force dynamic programming approach to
DSSS demodulation possible well within our performance constraints.
The safety reset controller does not require any peripherals except for an ADC. Thus we expect code size to be the main
factor affecting per-unit cost in an in-field deployment of our concept. At around \SI{64}{\kilo\byte}, our unoptimized
demonstrator firmware implementation is already on the lower end of the spectrum. Especially with some optimization we
expect safety reset controllers to be commercially viable given adequate political incentives.
decoder, grid frequency estimation) proved useful particularly for debugging. The modular architecture allowed us to
directly compare our demodulator implementation to our Jupyter/Python prototype, where we found that our C
implementation outperformed the Python prototype. Despite the algorithms' complexity, the microcontroller C
implementation has no issues processing data in real-time due to the low sampling rate necessary.
\section{Conclusion}
\label{sec_conclusion}
\subsection{Applicability to IoT devices}
\subsection{Discussion}
During an emergency in the electrical grid, the ability to communicate to large numbers of end-point devices is a
valuable tool for restoring normal operation. When a resilient communication channel is available, loads such as smart
meters and IoT devices can be equipped with a supervisor circuit that allows for a remote ``safety reset'' that puts the
device into a safe operating state. Using this safety reset, an attacker that uses compromised smart meters or IoT
devices to attack grid stability can be interrupted before the conclusion of their attack. During recovery from an
outage, a safety reset can be used to reduce stress on the system during a black start by turning off non-essential loads
such as air conditioners.
devices to attack grid stability can be interrupted before they can conclude their attack. During recovery from an
outage, a safety reset can be used to reduce stress on the system during a black start by temporarily disabling
non-essential loads such as air conditioners.
In this paper we have developed an end-to-end design of a safety reset system that provides these capabilities. Our
novel broadcast data transmission system is based on intentional modulation of global grid frequency. Our system is
In this paper we have developed an end-to-end design for a safety reset system that provides these capabilities.
Our novel broadcast data transmission system is based on intentional modulation of global grid frequency. Our system is
independent of normal communication networks and can operate during a cyberattack. We have shown the practical viability
of our end-to-end design through simulations. Using our purpose-designed grid frequency recorder, we can capture and
process real-time grid frequency data in an electrically safe way. We used data captured this way as the basis for
@ -645,13 +643,17 @@ developed a simple cryptographic protocol ready for embedded implementation in r
triggering a safety reset with a response time of less than 30 minutes. In this demonstration we use simulated grid
frequency data to trigger a commercial microcontroller to perform a firmware reset of an off-the-shelf smart meter. The
next step in our evaluation will be to conduct an experimental evaluation of our modulation scheme in collaboration with
a utility and an operator of a multi-megawatt load. Source code and electronics CAD designs are available at the
public repository listed at the end of this document.
a utility and an operator of a multi-megawatt load.
The safety reset controller does not require any peripherals except for an ADC. Thus we expect code size to be the main
factor affecting per-unit cost in an in-field deployment of our concept. At around \SI{64}{\kilo\byte}, our demonstrator
firmware implementation is viable on low-end microcontrollers. Thus, we expect safety reset controllers to be
commercially viable.
Source code and EDA designs are available at the public repository listed at the end of this document.
\printbibliography[heading=bibintoc]
%%% FIXME remove appendix and work into text.
\center{
\center{This is version \texttt{\input{version.tex}\unskip} of this paper, generated on \today. The git repository
can be found at:}


@ -1756,3 +1756,13 @@
year = {2017}
}
@inproceedings{ctap+11,
author = {Mihai Costache and Valentin Tudor and Magnus Almgren and Marina Papatriantafilou and Christopher Saunders},
booktitle = {2011 Seventh European Conference on Computer Network Defense},
month = {dec},
publisher = {IEEE},
title = {Remote control of smart meters: friend or foe?},
url = {https://www.syssec-project.eu/m/page-media/3/costache-ec2nd11.pdf},
doi = {10.1109/EC2ND.2011.14},
year = {2011}
}