This commit is contained in:
jaseg 2022-04-07 17:59:50 +02:00
parent 1dfe76a4ce
commit 6fac195a97


@ -1,4 +1,6 @@
\documentclass[runningheads]{llncs}
\documentclass[letterpaper,twocolumn,10pt]{article}
\usepackage{usenix}
\usepackage[T1]{fontenc}
\usepackage[
backend=biber,
@ -32,165 +34,64 @@
% https://eepublicdownloads.entsoe.eu/clean-documents/pre2015/publications/entsoe/Operation_Handbook/Policy_1_Appendix%20_final.pdf
\date{}
\title{Ripples in the Pond: Transmitting Information through Grid Frequency Modulation}
\titlerunning{Ripples in the Pond: Transmitting Information through Grid Frequency}
\author{Jan Sebastian Götte \and Liran Katzir \and Björn Scheuermann}
\institute{TU Darmstadt\\ Communication Networks Lab\\ \email{safetyreset@jaseg.de}
\and Tel Aviv University\\ Faculty of Engineering\\ \email{lirankat@tau.ac.il}
\and TU Darmstadt\\ Communication Networks Lab\\ \email{scheuermann@informatik.hu-berlin.de}}
%\institute{TU Darmstadt\\ Communication Networks Lab\\ \email{safetyreset@jaseg.de}
%\and Tel Aviv University\\ Faculty of Engineering\\ \email{lirankat@tau.ac.il}
%\and TU Darmstadt\\ Communication Networks Lab\\ \email{scheuermann@informatik.hu-berlin.de}}
\maketitle
\keywords{Security, privacy and resilience in critical infrastructures \and Security and privacy in ``internet of
things'' \and Cyber-physical systems \and Hardware security \and Network Security \and Energy systems \and Signal theory}
%\keywords{Security, privacy and resilience in critical infrastructures \and Security and privacy in ``internet of
%things'' \and Cyber-physical systems \and Hardware security \and Network Security \and Energy systems \and Signal theory}
\begin{abstract}
With the rollout of the smart grid, the IT security of electrical infrastructure has attracted increased attention
in recent years. Smart Grid IT security has two major components: the security of central SCADA systems, and
the security of equipment at the consumer premises such as smart meters and IoT devices. While there is previous
work on both sides, their interactions have not yet received much attention.
Previous work has explored the scenario of an attacker compromising a large number of Smart Meters that are equipped
with remote disconnect switches, and using these remote-controllable switches to cause a large-scale outage.
Previous work focuses on attack prevention. In this paper, we will instead look at recovery after a successful
attack. To transmission system operators (TSOs), the major challenge after such a Smart Meter-triggered outage is
that the attacker will likely persist through the outage, and compromised Smart Meters will resume malicious
activity after their power is restored. In the event of such an attack, TSOs would need a way to remotely put these
compromised devices into a \emph{safe} mode of operation.
In this paper, we consider the previously proposed scenario where a large number of compromised consumer devices is
used alone or in conjunction with an attack on the grid's central SCADA systems to destabilize the grid by rapidly
modulating the total connected load. Such attacks might include IoT devices, but they might also target Smart
Meters, which in many parts of the world now contain remote-controlled disconnect switches. Such attacks are hard to
mitigate, and existing literature focuses on hardening device firmware to prevent compromise. Although perfect
firmware security is not practically achievable, there is little research on \emph{post-compromise} mitigation
approaches. A core issue of any post-attack mitigation is that the devices' normal network connection may not work
due to the attack, so an out-of-band communication channel is necessary.
We propose a \emph{safety reset} controller that is controlled through a novel, resilient, grid-wide powerline
communication technique. Our safety reset controller can be fitted into any Smart Meter or IoT device. Its purpose
is to await an out-of-band command to put the device into a safe state (e.g. \emph{relay on} or \emph{light on})
that interrupts attacker control over the device. The safety reset controller is separated from the system's main
application controller and does not have any conventional network connections to reduce attack surface and cost.
Our proposed resilient communication channel is a grid-wide broadcast channel based on modulating grid frequency. It
can be operated by transmission system operators (TSOs) even during black-start recovery procedures and in this
situation bridges the gap between the TSO's private network and the consumer devices. To demonstrate our proposed
channel, we have implemented a system that transmits error-corrected and cryptographically secured commands.
Our approach differs from traditional Powerline Communication (PLC) systems in that it reaches every device within
the same synchronous area as the signal is embedded into the fundamental grid frequency. Traditional PLC uses a
superimposed voltage, which is quickly attenuated across long distances.
Using simulations we have determined that control of a $\SI{25}{\mega\watt}$ load such as a large aluminium smelter,
load bank or photovoltaic farm would allow for the transmission of a cryptographically secured \emph{reset} signal
within $15$ minutes. We have designed and constructed a proof-of-concept prototype receiver that demonstrates the
feasibility of decoding such signals on a resource-constrained microcontroller.
Given that public telecommunications networks including the internet, cellular networks, and LoRa base stations may
also be disrupted during a large-scale blackout, the challenging aspect of this remote \emph{Safety Reset} is the
communication channel between TSO and the smart meter. For this purpose, in this paper we propose a simple yet
effective communication channel based on modulating grid frequency by modulating the power of a connected load or
generator. Our proposed communication channel (1) requires minimal infrastructure, (2) has a reach spanning the
entire power grid and (3) is fully independent of other telecommunication networks and functions even under severe
disruption of the grid.
\end{abstract}
\section{Introduction}
% FIXME This is meh.
% Maybe *start* with "the recovery from a blackout bla bla..."?
The power grids of the world are some of the most complex man-made technological systems. Their operation is essential
for modern human life and with the proliferation of ransomware and state-sponsored attacks their IT security has come
under close scrutiny. To grid operators, there are two main challenges that complicate IT security efforts. First, all
parts of the electrical grid are physically coupled and faults can have consequences far from their source. Second, many
of the networked devices used in grid applications are special-purpose devices built in low volumes, which limits the
amount of engineering effort that could have been spent on their firmware security.
With the rollout of the smart grid, the IT security of electrical infrastructure has attracted increased attention in
recent years. Smart Grid security has two major components: the security of central SCADA systems, and the security
of equipment at the consumer premises such as smart meters and IoT devices. While there is previous work on both sides,
their interactions have not yet received much attention.
We expect that a serious compromise can never fully be ruled out since the combined attack surface of a large number of
diverse devices is too large to effectively secure, and perimeter security measures are only effective to a point when
devices are spread out across a vast geographical area. Thus, in this paper we focus not on the prevention of an attack,
but on the recovery from one.
%The IT security of the power grid is a complicated issue. Transmission system operators are faced with multiple
%challenges.
In this paper, we consider the previously proposed scenario where a large number of compromised consumer devices is used
alone or in conjunction with an attack on the grid's central SCADA systems to destabilize the grid by rapidly modulating
the total connected load. Previous work considered compromised smart meters with integrated remote disconnect switches
as likely candidates for such an attack, but the same attack can also be performed using compromised IoT devices. Such
attacks are hard to mitigate, and existing literature focuses on hardening device firmware to prevent compromise.
Despite the infeasibility of perfect firmware security, there is little research on \emph{post-compromise} mitigation
approaches. A core issue with post-attack mitigation is that the devices' normal network connection may not work due to
the attack, so an out-of-band communication channel is necessary.
%First, the grid is composed of myriad different devices that are interconnected on a continental scale. Since all these
%devices are physically coupled, faults in one system can have ripple effects far away. In other critical infrastructure
%such as the water supply, transportation or the public health system, a number of fundamentally independent sub-systems
%are only linked at an organizational level, which means faults due to either natural disasters or hacking attacks are
%likely to be localized. In contrast, a transmission system operator has to make sure no faults happen anywhere in the
%system for the system to be stable. Ensuring faultless operation across thousands of devices is hard.
We propose a \emph{safety reset} controller that is controlled through a novel, resilient, grid-wide powerline
communication technique. Our safety reset controller can be fitted into any Smart Meter or IoT device. Its purpose is to
await an out-of-band command to put the device into a safe state (e.g. \emph{relay on} or \emph{light on}) that
interrupts attacker control over the device. The safety reset controller is separated from the system's main application
controller and does not have any conventional network connections to reduce attack surface and cost.
%Like any other complex technological system, the components that make up the power grid are increasingly being outfitted
%with networked computer systems for monitoring and control.
%They have to secure a large and diverse fleet of networked systems, many of which are special-purpose devices customized
%for this particular application. Small production quantities
%mean that the limit of economically achievable security is already low. Coupled with the high complexity of each of
%these devices, this results in
We propose a resilient grid-wide broadcast channel based on modulating grid frequency. This channel can be operated by
transmission system operators (TSOs) even during black-start recovery procedures and in this situation bridges the gap
between the TSO's private network and the consumer devices. To demonstrate our proposed channel, we have implemented a
system that transmits error-corrected and cryptographically secured commands.
\subsection{The digitalization of the grid}
In the power grid, as in many other engineered systems, we can observe an ongoing diffusion of information systems into
the domain of industrial control. Automation of these control systems has already been practiced for the better part of a
century. Throughout the 20th century this automation was mostly limited to core components of the grid. Generators in
power stations are computer-controlled according to electromechanical and economic models. Switching in substations is
automated to allow for fast failure recovery. Human operators are still vital to these systems, but their tasks have
shifted from pure operation to engineering, maintenance and surveillance~\cite{crastan03,anderson02}.
With the turn of the century came a large-scale trend in power systems to move from a model of centralized generation,
built around massive large-scale fossil and nuclear power plants, towards a more heterogeneous model of smaller-scale
generators working together. In this new model large-scale fossil power plants still serve a major role, but new
factors come into play. One such factor is the advance of renewable energies. The large-scale use of wind and solar
power in particular seems unavoidable for continued human life on this planet. For the electrical
grid these systems constitute a significant challenge. Fossil-fueled power plants can be controlled in a precise and
quick way to match energy consumption. This tracking of consumption with production is vital to the stability of the
grid. Renewable energies such as wind and solar power do not provide the same degree of controllability, and they
introduce a larger degree of uncertainty due to the unpredictability of the forces of nature~\cite{crastan03}.
Along with this change in dynamic behavior, renewable energies have brought forth the advance of distributed generation.
In distributed generation end customers that previously only consumed energy have started to feed energy into the grid
from small solar installations on their property. Distributed generation is a chance for customers to gain autonomy and
shift from a purely passive role to being active participants of the electricity market~\cite{crastan03}.
% FIXME the following paragraph is weird.
To match this new landscape of unpredictable renewable resources and decentralized generation, the utility industry has
had to adapt itself in major ways. One aspect of this adaptation that is particularly visible to energy consumers is the
computerization of end-user energy metering. Despite the widespread use of industrial control systems inside the
electrical grid and the far-reaching diffusion of computers into people's everyday lives, the energy meter has long been
one of the last remnants of an offline, analog time. Until the 2010s many households were still served through
electromechanical Ferraris-style meters that have their origin in the late 19th
century~\cite{borlase01,ukgov04,bnetza02}. Today, under the umbrella term \emph{Smart Metering}, the shift towards fully
computerized, often networked meters is well underway. The rollout of these \emph{Smart Meters} has not been very
smooth overall, with some countries severely lagging behind. As a safety-critical technology, smart metering technology
is usually standardized on a per-country basis.
\subsection{Perfect firmware security}
% FIXME join these paragraphs
This leads to an inhomogeneous landscape with---in some
instances---wildly incompatible systems. Often vendors only serve a single country or have separate models of a meter
for each country. This complex standardization landscape and market situation has led to a proliferation of highly
complex, custom-coded microcontroller firmware. The complexity and scale of this---often network-connected---firmware
makes for a ripe substrate for bugs to surface.
A remotely exploitable flaw inside the firmware of a component of a smart metering system could have consequences
ranging from impaired billing functionality to an existential threat to grid stability~\cite{anderson01,anderson02}. In a
country where meters commonly include disconnect switches for purposes such as prepaid tariffs, a coordinated attack
could at worst cause widespread activation of grid safety systems through oscillations caused by repeated cycling of
megawatts of load capacity at just the wrong frequency~\cite{wu01}.
Mitigation of these attacks through firmware security measures is unlikely to yield satisfactory results. The enormous
complexity of smart meter firmware makes firmware security extremely labor intensive. The diverse standardization
landscape makes a coordinated, comprehensive response unlikely.
In this paper, we introduce \emph{Grid Frequency Modulation}, a new communication channel that can be used for grid-wide
broadcast without relying on any other telecommunication networks being operational. Grid Frequency Modulation uses
Direct Sequence Spread Spectrum (DSSS) modulation carried out on grid frequency through a large controllable load such
as an aluminium smelter. Note that Grid Frequency Modulation is \emph{changing the grid frequency itself}. This is
fundamentally different in both generation and detection from systems such as traditional PLC that superimpose a signal
on grid voltage, but leave the underlying grid frequency itself unaffected. As it requires high-fidelity control over a
large load or producer connected to the grid, Grid Frequency Modulation provides a degree of implicit sender
authentication.
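As a minimal sketch of the DSSS scheme described above, assume data bits $b_k \in \{-1, +1\}$, a spreading sequence
$c_0, \dots, c_{N-1} \in \{-1, +1\}$ of $N$ chips, a chip duration $T_c$ and a modulation amplitude $A$. These symbols
are illustrative notation introduced only for this sketch. The transmitter would then steer its load such that the
deviation of grid frequency from its nominal value is approximately
\begin{equation}
    \Delta f(t) = A \, b_{\lfloor t / (N T_c) \rfloor} \, c_{\lfloor t / T_c \rfloor \bmod N} .
\end{equation}
A receiver that is synchronized to the chip clock and holds one frequency measurement $\Delta \hat{f}_j$ per chip could
recover bit $k$ from the sign of the correlation
\begin{equation}
    r_k = \frac{1}{N} \sum_{n=0}^{N-1} c_n \, \Delta \hat{f}_{kN + n} .
\end{equation}
In the noise-free case $\Delta \hat{f}_{kN + n} = A \, b_k \, c_n$, so $r_k = A \, b_k$ because $c_n^2 = 1$. Measurement
noise that is uncorrelated with the spreading sequence is averaged over $N$ chips, which suggests why a modulation
amplitude well below the ambient frequency noise may still be decodable.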
To illustrate the utility of Grid Frequency Modulation we propose a pragmatic solution to the---in our opinion
likely---scenario of a large-scale compromise of smart meter firmware. Instead of improving firmware security or
resiliency of public telecommunication infrastructure, both of which are hard problems, we introduce the \emph{safety
reset controller} as a fail-safe that allows a utility to flush an attacker out of its deployed smart meters even
during large-scale disruption of telecommunication networks. In our concept the components of the smart meter that are
threatened by remote compromise are equipped with a physically separate microcontroller that listens for a ``reset''
command transmitted through the electrical grid's frequency and on reception forcibly resets the smart meter's entire
firmware through a low-level programming interface such as JTAG to a known-good state that disables all network
functionality to prevent re-compromise and lock out the attackers until the device can be programmed with a patched
firmware by a service technician. As part of our prototype reset controller we have developed a simple cryptographic
command protocol based on the Lamport and Winternitz One-time Signature (OTS) schemes that our prototype reset
controller uses to receive an authenticated command to re-flash the smart meter's main microcontroller over the
standard JTAG interface. The safety reset controller is an off-the-shelf microcontroller much smaller than the one used
for the meter's main application controller. To receive grid frequency-modulated commands, it measures grid frequency
from a voltage waveform acquired using its internal analog-to-digital-converter (ADC) directly connected to the mains
voltage input through a resistive divider chain. By using an off-the-shelf microcontroller we keep the implementation
overhead of our solution low in engineering cost compared to an ASIC. By keeping its firmware small, we can use a
simpler and less expensive microcontroller, keeping per-unit implementation cost low.
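For illustration, the Lamport scheme mentioned above can be summarized by its textbook construction. The hash function
$H$ and the message length $n$ are placeholders in this sketch and are not meant to describe the parameters of the
prototype. The signer draws $2n$ random secrets $x_{i,b}$ with $i \in \{1, \dots, n\}$ and $b \in \{0, 1\}$ and
publishes the values
\begin{equation}
    y_{i,b} = H(x_{i,b}) .
\end{equation}
To sign an $n$-bit message $m = m_1 \dots m_n$, the signer reveals $\sigma_i = x_{i,m_i}$ for each bit, and the verifier
accepts if
\begin{equation}
    H(\sigma_i) = y_{i,m_i} \quad \mbox{for all } i \in \{1, \dots, n\} .
\end{equation}
Each key pair may sign only one message, since every signature discloses half of the secrets. Verification requires
only $n$ hash evaluations and comparisons, which is one reason such schemes are a plausible fit for a small
microcontroller.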
Our approach differs from traditional Powerline Communication (PLC) systems in that it reaches every device within one
synchronous area as the signal is embedded into the fundamental grid frequency. Traditional PLC uses a superimposed
voltage, which is quickly attenuated across long distances.
\begin{figure}
\centering
@ -208,6 +109,59 @@ broadcast radio). A grid frequency-based system can function as long as power is
restored after the attack. One powerful function this allows is ``flushing out'' an attacker from compromised smart
meters after an attack, before restoring smart meter internet connectivity.
Using simulations we have determined that control of a $\SI{25}{\mega\watt}$ load such as a large aluminium smelter,
load bank or photovoltaic farm would allow for the transmission of a cryptographically secured \emph{reset} signal within
$15$ minutes. We have designed and constructed a proof-of-concept prototype receiver that demonstrates the feasibility
of decoding such signals on a resource-constrained microcontroller.
\subsection{Motivation}
Consumer devices are increasingly becoming \emph{smart}. Large numbers of IoT devices are connected through the public
internet, and in several countries internet-connected Smart Meters can disconnect entire households from the grid in
case of unpaid bills. The increasing proliferation of smart devices on the consumer side presents an opportunity to grid
operators, who rely on forecasts for the cost-optimized control of generation and power flow. The core of the
\emph{Smart Grid} vision is that utilities can now gather detailed data for more accurate consumption forecasts, and in
some cases can even adjust parameters of large devices like water heaters to smooth out load spikes.
However, this increased degree of visibility and control comes with an increased IT security risk. In this paper we
focus on scenarios where an attacker compromises a large number of grid-connected remote-controllable devices. This may
be simple smart home devices such as IoT light bulbs, but it may also include Smart Meters that are outfitted with a
remote disconnect switch as is common in some countries. By rapidly switching large numbers of such devices in a
coordinated manner, the attacker has the opportunity to de-stabilize the electrical grid. % FIXME citation
Previous work on IoT and Smart Grid security has focused on the prevention of attacks through firmware security measures.
While research on prevention is undoubtedly important, we estimate that its practical impact will be limited by the vast
diversity of implementations found in the field combined with the slow update cycles inherent to non-functional firmware
enhancements for consumer devices. We predict that it would be a Sisyphean task to secure sufficiently many devices
to deny an attacker the critical mass needed to cause trouble. For this reason, in this paper we focus on recovery after
an attack.
\subsection{Black-start recovery}
The recovery from a large-scale power outage is a complex operational challenge. Large outages are caused by cascading
failures. Since all consumers and producers that are connected to the electrical grid are physically coupled through the
electromotive force, a fault in one part of the grid affects all devices connected across the grid. To function, the
grid relies on a delicate balance between electricity generation, transmission and consumption. When this balance is
disturbed, cascading failures can occur. A transmission line shutting off can lead other, nearby lines to overload and
shut off. Due to the electromechanical coupling of all machines connected to the grid, a generator or consumer suddenly
shutting off causes a transient in the grid's frequency. If the frequency goes too far out of bounds, protection devices
take power plants and large industrial loads offline.
The recovery from a large-scale outage requires the grid's operators to bring generators and loads back online one by
one while continuously maintaining balance between generation and consumption to avoid their protection devices shutting
them down again. To coordinate this process, transmission system operators cannot rely on the public internet or
cellular networks, as they may not work during a large-scale power outage. Instead, they maintain private communication
infrastructure using dedicated lines rented from telecommunications providers, fibers run along transmission lines, and
dedicated radio links.
To start from a complete outage, first a number of \emph{black start}-capable power stations that can start by
themselves without any external power are brought online. With their help, other power stations and consumers are
gradually brought online until a part of the grid has been restored to nominal operation. This process can be performed
simultaneously in different parts of the grid. After these \emph{islands} have been restored, they can then be joined to
restore the grid to its normal state.
\subsection{Contents}
Starting from a high level architecture, we have carried out simulations of our concept's performance under real-world
conditions. Based on these simulations we implemented an end-to-end prototype of our proposed safety reset controller as
part of a realistic smart meter demonstrator. Finally, we experimentally validated our results and we will conclude with
@ -222,7 +176,7 @@ This work contains the following contributions:
\item We carry out extensive simulations of our system to determine its performance characteristics.
\end{enumerate}
\section{Notation}
\subsection{Notation}
To a computer scientist there is one confusing aspect to the theory of grid frequency modulation. GFM can be seen as a
frequency modulation (FM) with a baseband signal in the band below approximately $f_m = \SI{5}{\hertz}$ that is
@ -242,53 +196,38 @@ signal and its properties such as $f_m$.
\section{Related work}
\label{sec_related_work}
% FIXME: intro here
Previous work has analyzed Smart Grid security from numerous angles and made several suggestions towards its
improvement. Apart from the critical location that Smart Grid devices occupy, they are computer systems like many
others. Thus, for IT security purposes the Smart Grid is simply an aggregation of embedded control and measurement
devices that are part of a large control system. These devices share the same security concerns that apply to embedded
systems in general.
\subsection{Security and Privacy in the Smart Grid}
\subsection{Smart Meter Security}
The smart grid in practice is nothing more or less than an aggregation of embedded control and measurement devices that
are part of a large control system. This implies that all the same security concerns that apply to embedded systems in
general also apply to the components of a smart grid. Where programmers have been struggling for decades now with issues
such as input validation~\cite{leveson01}, the same potential issue raises security concerns in smart grid scenarios as
well~\cite{mo01, lee01}. In the smart grid, however, two complicating factors are present: many components are embedded
systems, and as such inherently hard to update. Also, the smart grid and its control algorithms act as a large partially
distributed system making problems such as input validation or authentication harder~\cite{blaze01} and adding a host of
distributed systems problems on top~\cite{lamport01}.
Where programmers have been struggling for decades now with issues such as input validation~\cite{leveson01}, the same
potential issue raises security concerns in smart grid scenarios as well~\cite{mo01, lee01}. In the smart grid, however,
two complicating factors are present: many components are embedded systems, and as such inherently hard to update.
Also, the smart grid and its control algorithms act as a large (partially) distributed system making problems such as
input validation or authentication harder~\cite{blaze01} and adding a host of distributed systems problems on
top~\cite{lamport01}.
Given that the electrical grid is essential infrastructure in our modern civilization, these problems amount to
significant issues. Attacks on the electrical grid may have grave consequences~\cite{anderson01,lee01} while the long
replacement cycles of various components make the system slow to adapt. Thus, components for the smart grid need to be
built to a much higher standard of security than most consumer devices to ensure they can withstand well-funded attackers
even decades down the road. This requirement intensifies the challenges of embedded security and distributed systems
security among others that are inherent in any modern complex technological system. The safety-critical nature of the
modern smart metering ecosystem in particular was quickly recognized~\cite{anderson01}.
Given that the electrical grid is essential infrastructure, these issues are significant. Attacks on the electrical grid
may have grave consequences~\cite{anderson01,lee01} while the long replacement cycles of various components make the
system slow to adapt. Thus, components for the smart grid need to be built to a higher standard of security than e.g.\
IoT devices to withstand well-funded attackers decades down the road. Another implication of their long service life
is that their agility w.r.t.\ post-hoc mitigations through firmware updates is limited.
A point we will not consider in much depth in this work is theft of electricity. While in publications aimed towards the
general public the introduction of smart metering is always motivated with potential cost savings and ecological
benefits, in industry-internal publications the reduction of electricity theft is often cited as an
incentive~\cite{czechowski01}. Likewise, academic publications tend to either focus on other benefits such as generation
efficiency gains through better forecasting or rationalize the consumer-unfriendly aspects of smart metering with social
benefits~\cite{mcdaniel01}. They do not usually point out revenue protection mechanisms as
incentives~\cite{anderson01,anderson02}.
%Another fundamental challenge in smart grid implementations is the central role of smart electricity meters in the
%smart grid ecosystem. Smart meters are used both for highly-granular load measurement and in some countries also for
%load switching~\cite{zheng01}.
Smart electricity meters are effectively consumer devices built down to a certain price point. The small market served
by a single smart meter implementation limits how much effort a vendor can spend on firmware security. Landis+Gyr, a
large manufacturer that makes most of its revenue from utility meters, states in its 2019 annual report that it
invested \SI{36}{\percent} of its total R\&D budget on embedded software while spending only \SI{24}{\percent} on
hardware R\&D~\cite{landisgyr01,landisgyr02}, indicating significant tension between firmware security and the vendor's
bottom line.
A serious issue in smart metering setups is customer privacy. Even though the meter ``only'' collects aggregate energy
consumption of a whole household, this data is highly sensitive~\cite{markham01}. This counterintuitive fact was
initially overlooked in smart meter deployments leading to outrage, delays and reduced features~\cite{cuijpers01}. The
root cause of this problem is that given sufficient time resolution these aggregate measurements contain ample
entropy. Through disaggregation algorithms, individual loads can be identified and through pattern matching even complex
usage patterns can be discerned with alarming accuracy~\cite{greveler01} in the same way that similar privacy issues
arise in many other areas of modern life through other kinds of pervasive tracking and surveillance~\cite{zuboff01}.
Another fundamental challenge in smart grid implementations is the central role of smart electricity meters in the smart
grid ecosystem. Smart meters are used both for highly-granular load measurement and in some countries also for load
switching~\cite{zheng01}. Smart electricity meters are effectively consumer devices. They are built down to a certain
price point that is measured by the burden it puts on consumers and that is divided by the relatively small market
served by a single smart meter implementation. Such cost requirements can preclude security features such as the use of
a standard hardened software environment on a high powered embedded system. Landis+Gyr, a large manufacturer that makes
most of its revenue from utility meters, states in its 2019 annual report that it invested \SI{36}{\percent} of its
total R\&D budget on embedded software while spending only \SI{24}{\percent} on hardware
R\&D~\cite{landisgyr01,landisgyr02}, indicating a significant tension between firmware security and a smart meter
vendor's bottom line.
% FIXME more sources!
\subsection{The state of the art in embedded security}