Paper: WIP

jaseg 2022-06-14 18:43:27 +02:00
parent 45972013f6
commit f5c1695898


@ -1,14 +1,8 @@
\documentclass[letterpaper,twocolumn,10pt]{article}
\usepackage{usenix}
\usepackage{amssymb,amsmath}
\usepackage{eurosym}
\usepackage{wasysym}
\documentclass[sigconf,anonymous]{acmart}
\usepackage[binary-units]{siunitx}
\DeclareSIUnit{\baud}{Bd}
\DeclareSIUnit{\year}{a}
\usepackage{commath}
\usepackage{graphicx,color}
\usepackage{subcaption}
\usepackage{array}
@ -24,16 +18,6 @@
% https://eepublicdownloads.entsoe.eu/clean-documents/pre2015/publications/entsoe/Operation_Handbook/Policy_1_Appendix%20_final.pdf
\date{}
\title{\large\bf Ripples in the Pond:\\Transmitting Information through Grid Frequency Modulation}
\author{{\rm Jan Sebastian Götte}\\TU Darmstadt \and {\rm Liran Katzir}\\Tel Aviv University\and {\rm Björn Scheuermann}\\TU Darmstadt}
%\institute{TU Darmstadt\\ Communication Networks Lab\\ \email{safetyreset@jaseg.de}
%\and Tel Aviv University\\ Faculty of Engineering\\ \email{lirankat@tau.ac.il}
%\and TU Darmstadt\\ Communication Networks Lab\\ \email{scheuermann@informatik.hu-berlin.de}}
\maketitle
%\keywords{Security, privacy and resilience in critical infrastructures \and Security and privacy in ``internet of
%things'' \and Cyber-physical systems \and Hardware security \and Network Security \and Energy systems \and Signal theory}
\begin{abstract}
The dependence of the electrical grid on networked control systems is steadily rising. While utilities are defending
their side of the grid effectively through rigorous IT security measures such as physically separated control
@ -56,6 +40,16 @@
equipped with a prototype safety reset system based on an inexpensive commodity microcontroller.
\end{abstract}
\date{}
\title{\large\bf Ripples in the Pond:\\Transmitting Information through Grid Frequency Modulation}
\author{{\rm Jan Sebastian Götte}\\TU Darmstadt \and {\rm Liran Katzir}\\Tel Aviv University\and {\rm Björn Scheuermann}\\TU Darmstadt}
%\institute{TU Darmstadt\\ Communication Networks Lab\\ \email{safetyreset@jaseg.de}
%\and Tel Aviv University\\ Faculty of Engineering\\ \email{lirankat@tau.ac.il}
%\and TU Darmstadt\\ Communication Networks Lab\\ \email{scheuermann@informatik.hu-berlin.de}}
\maketitle
%\keywords{Security, privacy and resilience in critical infrastructures \and Security and privacy in ``internet of
%things'' \and Cyber-physical systems \and Hardware security \and Network Security \and Energy systems \and Signal theory}
\section{Introduction}
With the rollout of the smart grid, the IT security of electrical infrastructure has attracted increased attention in
@ -143,14 +137,83 @@ However, this increased degree of visibility and control comes with an increased
focus on scenarios where an attacker compromises a large number of grid-connected remote-controllable devices. This may
be simple smart home devices such as IoT light bulbs, but it may also include Smart Meters that are outfitted with a
remote disconnect switch as is common in some countries. By rapidly switching large numbers of such devices in a
coordinated manner, the attacker has the opportunity to de-stabilize the electrical grid. % FIXME citation
coordinated manner, the attacker has the opportunity to de-stabilize the electrical
grid~\cite{zlmz+21,kgma21,smp18,hcb19}.
Previous work on IoT and Smart Grid security has focused on the prevention of attacks through firmware security measures.
While research on prevention is undoubtedly important, we estimate that its practical impact will be limited by the vast
diversity of implementations found in the field combined with the slow update cycles inherent to non-functional firmware
enhancements for consumer devices. We predict that it would be a Sisyphean task to secure sufficiently many devices
to deny an attacker the critical mass needed to cause trouble. For this reason, in this paper we focus on recovery after
an attack.
In this paper, we focus on assisting the recovery procedure after a successful attack because we estimate that this
approach will yield a better return on investment in overall grid stability versus resources spent on security
measures. Previous work on IoT and Smart Grid security has focused on the prevention of attacks through firmware security
measures. While research on prevention is important, we estimate that its practical impact will be limited by the
diversity of implementations found in the field~\cite{nbck+19,zlmz+21}. We predict that it would be a Sisyphean task to
secure the firmware of sufficiently many devices to deny an attacker the critical mass needed to cause trouble. Even if
all flaws in the firmware of a broad range of devices were fixed, users would still have to update. In smart grid and IoT
devices, this presents a difficult problem since user awareness is low~\cite{nbck+19}.
\subsection{Contents}
Starting from a high level architecture, we have carried out simulations of our concept's performance under real-world
conditions using measured grid frequency data. Based on these simulations we implemented an end-to-end prototype of our
proposed safety reset controller as part of a realistic smart meter demonstrator. Finally, we experimentally validated
our results based on a simulated mains voltage signal and we will conclude with an outline of further steps towards a
practical implementation.
This work contains the following contributions:
\begin{enumerate}[topsep=4pt]
\item We introduce Grid Frequency Modulation (GFM) as a communication primitive. % FIXME done before in that one paper
\item We elaborate the fundamental physics underlying GFM and theorize on the constraints of a practical
implementation.
\item We design a communication system based on GFM.
\item We carry out extensive simulations of our system to determine its performance characteristics.
\end{enumerate}
\subsection{Notation}
% FIXME drop or rework this section ; actually update notation to be consistent throughout
To a computer scientist there is one confusing aspect to the theory of grid frequency modulation. GFM can be seen as a
frequency modulation (FM) with a baseband signal in the band below approximately $f_m = \SI{5}{\hertz}$ that is
modulated on top of a carrier signal at $f_c = \SI{50}{\hertz}$ in case of the European electrical grid. The frequency
deviation $f_\Delta$ by which the modulated carrier departs from its nominal value of $f_c$ is very small at only a few
millihertz.
When grid frequency is measured by first digitizing the mains voltage waveform, then de-modulating digitally, the FM's
SNR is very high and is dominated by the ADC's quantization noise and nearby mains voltage noise sources such as
resistive droop due to large inrush current of nearby machines.
Note that the carrier signal at $f_c$ and the modulation signal at $f_m$ both have the unit Hertz. To disambiguate
them, in this paper we will use \textbf{bold} letters to refer to the carrier waveform $\mathbf{U}$ or frequency
$\mathbf{f_c}$ as well as its deviation $\mathbf{f_\Delta}$, and we will use normal weight for the actual modulation
signal and its properties such as $f_m$.
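
As a sketch of this convention, and assuming an idealized sinusoidal mains voltage with amplitude $\hat{U}$ and a
normalized modulation signal $m(t)$ with $|m(t)| \le 1$ (both symbols are introduced here for illustration only), the
modulated mains voltage can be written in the textbook FM form
\[
    \mathbf{U}(t) = \hat{U} \cos\left(2\pi \mathbf{f_c} t
        + 2\pi \mathbf{f_\Delta} \int_0^t m(\tau) \,\mathrm{d}\tau\right),
\]
so that the instantaneous grid frequency is $\mathbf{f_c} + \mathbf{f_\Delta}\, m(t)$, where $m(t)$ only contains
spectral components below approximately $f_m$.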
\section{Background on the electrical grid}
\subsection{Components and interactions}
The electrical grid transmits alternating current electrical power from generators to loads. Any device that is
connected to the grid must run ``synchronously'' with the grid, i.e.\ it must produce or consume power following the
grid's voltage waveform. In generators and motors, the electromotive force acts to synchronize the device with the grid.
Connecting a generator that has not been synchronized to the grid leads to large currents flowing through the
generator's windings, inducing extreme forces that can mechanically destroy the generator. Similarly, if the inverters
of a solar power station would try to fight the grid, the grid would win and the inverters' power semiconductors would
release their magic smoke.
Originally, all power sources on the grid were synchronous rotating generators. Today, the shift towards renewable
energies and the introduction of high-voltage DC links has led to some of the grid's generating capacity being replaced
with inverters that electronically emulate the grid's voltage waveform to efficiently convert a DC input to the grid's
alternating current.
The generators and loads on the grid are linked through a complex network of transmission lines. Transformers are used
to couple between transmission lines operating at different voltage levels, and several types of switches allow
utilities to steer power flow throughout this network. Through the electromotive force, all synchronous generators
connected to the grid are electromechanically coupled. Transmission lines introduce a (small) phase delay to the
electric fields traversing the grid, but besides local differences in phase, all parts of the grid are synchronous.
\subsection{Grid frequency behavior}
On the electrical grid, generation and consumption of energy must be precisely matched at all times for the grid to stay
at a constant, synchronous frequency. If generation outpaces consumption, generators would provide less mechanical
resistance to their source of mechanical power, or \emph{prime mover}, which would lead the generators to spin faster
and faster. Similarly, if consumption outpaced production, the increased mechanical load would slow down generators,
ultimately leading to a collapse.
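
This relationship can be sketched with a simplified energy balance. Assuming that all synchronous machines are lumped
into a single rotating mass with moment of inertia $J$ and angular velocity $\omega$ that is proportional to the grid
frequency, and neglecting losses and control action (these are simplifying assumptions of this illustration, and
$P_\mathrm{gen}$ and $P_\mathrm{load}$ are symbols introduced here), the kinetic energy stored in the rotating mass
changes according to
\[
    \frac{\mathrm{d}}{\mathrm{d}t}\left(\frac{1}{2} J \omega^2\right)
    = J \omega \frac{\mathrm{d}\omega}{\mathrm{d}t}
    = P_\mathrm{gen} - P_\mathrm{load}.
\]
A surplus of generation ($P_\mathrm{gen} > P_\mathrm{load}$) thus accelerates the generators and raises the frequency,
while a deficit slows them down.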
The frequency of the electrical grid is maintained at a fixed, stable level through several layers of measures.
\subsection{Black-start recovery}
@ -176,229 +239,182 @@ gradually brought online until a part of the grid has been restored to nominal o
simultaneously in different parts of the grid. After these \emph{islands} have been restored, they can then be joined to
restore the grid to its normal state.
\subsection{Contents}
\subsection{Demand-side response and Smart Metering}
Starting from a high level architecture, we have carried out simulations of our concept's performance under real-world
conditions using measured grid frequency data. Based on these simulations we implemented an end-to-end prototype of our
proposed safety reset controller as part of a realistic smart meter demonstrator. Finally, we experimentally validated
our results based on a simulated mains voltage signal and we will conclude with an outline of further steps towards a
practical implementation.
Maintaining the balance between electricity generation and consumption under varying load conditions is critical.
Utilities can access different energy sources, each of which has its own trade-off in response speed versus energy
cost. For instance, the availability of wind and solar power cannot be controlled at all, while hydroelectric power
plants can quickly regulate the speed and power output of their turbines. Combined with the complex layout of the grid's
infrastructure such as transmission lines, these economic factors lead to a complex optimization problem, the quality
of whose solution directly manifests itself in the utility's bottom line.
This work contains the following contributions:
\begin{enumerate}[topsep=4pt]
\item We introduce Grid Frequency Modulation (GFM) as a communication primitive. % FIXME done before in that one paper
\item We elaborate the fundamental physics underlying GFM and theorize on the constraints of a practical
implementation.
\item We design a communication system based on GFM.
\item We carry out extensive simulations of our system to determine its performance characteristics.
\end{enumerate}
For decades, one solution to this issue has been demand-side response (DSR)~\cite{rs48}. In DSR, large loads such as
water heaters are centrally controlled by the utility to switch on outside of peak demand. Since the precise timing of
these loads is of no consequence to their user, users are happy to get slightly better prices from their utility while
utilities gain a degree of control allowing them to optimize their network's performance. As part of the smart grid
vision, DSR will be utilized in a larger fraction of consumer devices.
\subsection{Notation}
A core component of the smart grid is the rollout of ``Advanced Metering Infrastructure'' (AMI), colloquially known as
smart meters. Smart meters are electricity meters that use a real-time communication interface to automatically transmit
high-resolution measurements to the utility. In contrast to the yearly reading schedule of traditional electricity
meters, smart meters can provide near-realtime data that the utility can use for more accurate load forecasting.
To a computer scientist there is one confusing aspect to the theory of grid frequency modulation. GFM can be seen as a
frequency modulation (FM) with a baseband signal in the band below approximately $f_m = \SI{5}{\hertz}$ that is
modulated on top of a carrier signal at $f_c = \SI{50}{\hertz}$ in case of the European electrical grid. The frequency
deviation $f_\Delta$ by which the modulated carrier departs from its nominal value of $f_c$ is very small at only a few
millihertz.
\subsection{Powerline Communication (PLC)}
When grid frequency is measured by first digitizing the mains voltage waveform, then de-modulating digitally, the FM's
SNR is very high and is dominated by the ADC's quantization noise and nearby mains voltage noise sources such as
resistive droop due to large inrush current of nearby machines.
A core issue in smart metering is the communication channel from the meter to the greater world. Smart meters are
cost-constrained devices, which limits the use of landline internet or cellular connections. Additionally, electricity
meters are often installed in basements, far away from the customer's router and with soil and concrete blocking radio
signals. For these reasons, in some AMI deployments, powerline communication (PLC) has been chosen for the meters'
uplink.
Note that the carrier signal at $f_c$ and the modulation signal at $f_m$ both have the unit Hertz. To disambiguate
them, in this paper we will use \textbf{bold} letters to refer to the carrier waveform $\mathbf{U}$ or frequency
$\mathbf{f_c}$ as well as its deviation $\mathbf{f_\Delta}$, and we will use normal weight for the actual modulation
signal and its properties such as $f_m$.
Since the early days of the electrical grid, powerline communication has been used to control devices spread throughout
the grid from a central transmitter~\cite{rs48}. PLC systems super-impose a modulated high-frequency signal on top of
the grid voltage. When the carrier frequency of this modulation is in the audible frequency range, low data rates can be
transmitted over distances of several tens of kilometers. By using a radio frequency carrier, higher data rates can be
achieved across shorter distances. Audio frequency PLC, called ``ripple control'', is still used today by utilities to
enable ``demand-side response'', i.e.\ the remote switching of loads such as water heaters to avoid times of peak
electricity demand.
Usually, such powerline communication systems are uni-directional, but there are instances of bi-directional powerline
communication for smart meter reading such as the Italian smart meter deployment~\cite{ec03,rs48,gungor01,agf16}.
\section{Related work}
\label{sec_related_work}
Previous work has analyzed Smart Grid security from numerous angles and made several suggestions towards its
improvement. Apart from the critical location that Smart Grid devices occupy, they are computer systems like many
others. Thus, for IT security purposes the Smart Grid is simply an aggregation of embedded control and measurement
devices that are part of a large control system. These devices share the same security concerns that apply to embedded
systems in general.
\subsection{IoT and Smart Grid security}
\subsection{Smart Meter Security}
The security of IoT devices as well as the smart grid has received extensive attention in the
literature~\cite{nbck+19,acsc20,smp18,ykll17,anderson01,anderson02,zlmz+21,kgma21,hcb19,mpdm10,lzlw+20,chl20,lam21,olkd20,yomu+20}.
The challenges of IoT device security and the security of smart meters and other smart grid devices are similar because
smart grid devices are essentially IoT devices in a particularly sensitive location~\cite{acsc20}. In both device types,
the challenge is that securing embedded firmware is difficult, and adding network interfaces and cost constraints only
makes the task harder.
Where programmers have been struggling for decades now with issues such as input validation~\cite{leveson01}, the same
potential issue raises security concerns in smart grid scenarios as well~\cite{mo01, lee01}. Only, in smart grid we
have two complicating factors present: many components are embedded systems, and as such inherently hard to update.
Also, the smart grid and its control algorithms act as a large (partially) distributed system making problems such as
input validation or authentication harder~\cite{blaze01} and adding a host of distributed systems problems on
top~\cite{lamport01}.
In~\cite{smp18}, Soltan, Mittal and Poor investigated an attack scenario where an attacker first gains control over a
large number of high wattage devices through an IoT security vulnerability, then uses this control to cause rapid load
spikes. The researchers performed computer simulations for a range of parameters and concluded that given sufficiently
many compromised devices, an attacker can cause issues up to a large-scale blackout.
Given that the electrical grid is essential infrastructure, these issues are significant. Attacks on the electrical grid
may have grave consequences~\cite{anderson01,lee01} while the long replacement cycles of various components make the
system slow to adapt. Thus, components for the smart grid need to be built to a higher standard of security than e.g.\
IoT devices to live up to well-funded attackers decades down the road. Another implication of their long service life
is that their agility w.r.t.\ post-hoc mitigations through firmware updates is limited.
In~\cite{hcb19}, Huang, Cardenas and Baldick raised a counter-point to the conclusions of Soltan et al., finding that
limitations of their simulations in~\cite{smp18} have led them to over-estimate the severity of an attack. Using a more
accurate model, they confirmed that such attacks can cause problems such as localized blackouts and the decay of the
grid into islands, but they found that overall the electrical grid is less vulnerable than previously assumed and
particularly large-scale blackouts are very unlikely, primarily due to the action of protection systems such as load
shedding and over frequency protection.
%Another fundamental challenge in smart grid implementations is the central role of smart electricity meters in the
%smart grid ecosystem. Smart meters are used both for highly-granular load measurement and in some countries also for
%load switching~\cite{zheng01}.
Smart electricity meters are consumer devices built down to a price. Firmware security research and development budgets
From the literature, we get the overall impression that both IoT and Smart Grid security are challenging. Both lag behind
the security standard of state of the art desktop, server and smartphone operating systems. Reasons for this are the
relatively recent nature of the IoT software ecosystem and the large number of independent implementations. A unique
challenge to Smart Grid security is that due to the fragmentation of markets along national borders, certain devices
such as smart meters or DSR implementations exist in large monocultures.
Compared to IoT and Smart Grid devices, the embedded firmware foundations of modern smartphones have received more
attention both from the industry and from academia. Pinto and Santos in~\cite{pinto01} conducted a survey of
implementations based on ARM's TrustZone embedded virtualization architecture and found a significant number of reported
vulnerabilities across different implementations. For instance, Rosenberg in~\cite{rosenberg01} found critical issues in
Qualcomm's QSEE hypervisor, and Kanonov and Wool in~\cite{kanonov01} identified a number of design weaknesses and
security vulnerabilities in Samsung's competing KNOX virtualization product. To us, the state of the field of embedded
security indicates that even if significant effort is spent on the security of IoT and Smart Grid devices to catch up
with desktop, server and smartphone security, significant vulnerabilities are likely to remain for some time to come.
In this instance, market forces do not align with the interest of the public at large. Vulnerabilities remain likely,
especially in code implementing complex network protocols such as TLS~\cite{georgiev01}, which may even be mandated by
national standards in some devices such as smart electricity meters.
\subsection{Oscillations in the electrical grid}
Common to the attacks on the electrical grid proposed in the papers discussed above is their approach of overloading
parts of the grid. However, scenarios have been proposed that go beyond a simple overload condition, and in which an
attacker exploits the physical characteristics of the grid to cause oscillations of increasing amplitude, ultimately
triggering a cascade of protection mechanisms. The purpose of this type of attack is to use a small controllable load to
cause outsized damage.
Electro-mechanical oscillation modes between different geographical areas of an electrical grid are a well-known
phenomenon. In their book~\cite{rogers01}, Rogers and Graham provide an in-depth analysis of these oscillations and
their mitigation. In~\cite{grebe01}, Grebe, Kabouris, López Barba et al.\ analyzed modes inherent to the
continental European grid. A report on an event where an oscillation on one such mode caused a problem can be found in
\cite{entsoe01}.
In~\cite{zlmz+21}, Zou, Liu, Ma et al.\ analyzed the possibility of a modal attack in which electric vehicle chargers
rapidly modulate their power to force an oscillation of a poorly damped wide-area electromechanical mode. Using
mathematical analysis, small-scale simulations and practical experiments they validated the attack scenario and
developed a countermeasure that can be implemented as part of generator control systems and that when activated can
suppress forced oscillations of wide-area electromechanical modes.
On the device side of the smart grid, research has concentrated on smart meter security. Smart meters are
architecturally similar to IoT devices~\cite{zheng01,ifixit01}, but come with different challenges. Similar to a
high-power IoT device, an attacker could use a built-in off-switch as part of an attack, a scenario that was investigated
by Anderson and Fuloria in~\cite{anderson01}. Unique to smart meters, an attacker could, however, also use their control
to manipulate the meter's energy accounting, quickly leading to potentially severe financial impact on the meter's
operating utility company. This scenario has received research attention~\cite{anderson02,mcdaniel01} and this is where
industry incentives are the strongest.
Smart electricity meters are consumer devices built down to a price and manufacturers' firmware security R\&D budgets
are limited by the high degree of market fragmentation that is caused by mutually incompatible national smart metering
standards. Landis+Gyr, a large utility meter manufacturer, state in their 2019 annual report that they invested
\SI{36}{\percent} of their total R\&D budget on embedded software while spending only \SI{24}{\percent} on hardware
R\&D~\cite{landisgyr01,landisgyr02}, which indicates tension between firmware security and the manufacturers' bottom
line.
\subsection{Proposed Countermeasures}
% FIXME more sources!
\subsection{The state of the art in embedded security}
Embedded software security has proven challenging compared to the security of larger computer systems. On one hand,
embedded devices usually run highly customized firmware that is rarely updated. On the other hand, embedded devices
often lack security mechanisms such as memory management units that are found in higher-power devices. As a result of
these factors, even well-funded companies continue to have trouble securing their embedded systems. An example of this
difficulty is the 2019 flaw in Apple's iPhone SoC first-stage ROM bootloader that allows for the full compromise of any
iPhone older than iPhone X given physical access to the device~\cite{heise01}. iPhone 8, one of the affected models, was
still being manufactured and sold by Apple until April 2020. In another instance in 2016, researchers found multiple
flaws in Samsung's implementation of ARM TrustZone ``secure world'' firmware that Samsung used for their own mobile
phone SoCs. The flaws they found were both architectural flaws such as secret user input being passed through untrusted
userspace processes as well as cryptographic flaws such as
CVE-2016-1919\footnote{\url{http://cve.circl.lu/cve/CVE-2016-1919}}~\cite{kanonov01}. In a similar way, in 2014,
researchers found an integer overflow flaw in the low-level code handling untrusted input in Qualcomm's QSEE
firmware\footnote{For an overview of ARM TrustZone including a survey of academic work and past
security vulnerabilities of TrustZone-based firmware see~\cite{pinto01}.}~\cite{rosenberg01}.
If even companies whose R\&D budgets for mass-market consumer devices rival some countries' national budgets
have trouble securing their mass-market embedded software stacks, what is a much smaller smart meter manufacturer
to do? Especially if national standards mandate complex protocols such as TLS that are difficult to implement
correctly~\cite{georgiev01}, this manufacturer will be short on options to secure their product.
\subsection{Attack surface in the smart grid}
From the incidents we outlined in the previous paragraphs we conclude that in smart metering technology, market
incentives do not currently provide the conditions for a level of device security that will reliably last for decades
after deployment. Considering this tension, in this paragraph we examine the cyberphysical risks that arise from attacks
on the smart grid in the first place. These risks arise at three different infrastructure levels.
The first level is that of attacks on centralized control systems. This type of attack is often cited in popular
discourse and to our knowledge is the only type of attack against an electric grid that has ever been carried out in
practice at scale~\cite{lee01}. Despite their severity, these attacks do not pose a strictly \emph{scientific} challenge
since they are generic to any industrial control system. Their causes and countermeasures are generally well-understood
and the hardest challenge in their prevention is likely to be budgetary constraints.
Beyond the centralized control systems, the next target for an attacker may be the communication links between those
control systems and other smart grid components. While in some countries such as Italy special-purpose systems such as
PLC are common~\cite{ec03}, overall, IP-based technologies have proliferated in line with the larger trend towards
IP-based communications. This proliferation of IP links brings along the possibility for the application of generic
network security measures from the IP world to the smart grid domain. In this way, a standardized, IP-based protocol
stack unlocks decades of network security improvements at little cost.
Beyond these layers towards the core of the smart grid's control infrastructure, an attacker might also corrupt the
network from the edges and target the endpoint devices themselves. The large scale deployment of networked smart meters
creates an environment that is favorable to such attacks.
% FIXME cite RECESSIM landis+gyr protocol hacking wiki/youtube
\subsection{Cyberphysical threats in the smart grid}
Assuming that an attacker has compromised devices on any of these levels of smart grid infrastructure, what could they
do with their newly gained power? The obvious action would be to switch off everything. Of all scenarios,
this is the most likely in practice, as it is exactly what happened in the Russian cyberattacks on the Ukrainian
grid~\cite{lee01}, but it is also the easiest to mitigate since the vulnerable components are few and centralized.
Mitigations include the installation of fail safes as well as a defense in depth approach to hardening the grid's
cyber infrastructure.
Another possible action for an attacker would be to forge energy measurements in an attempt to cause financial mayhem.
Both individual consumers as well as the utility could be targeted by such an attack. While such an attack might have
localized success, larger-scale discrepancies will likely quickly be caught by monitoring systems. For example, if a
large number of meters in an area systematically under- or over-reported their energy readings, meter readings across
the affected area would no longer add up with those of monitoring devices in other locations in the transmission and
distribution grid.
In some countries, smart meter functionality goes beyond mere monitoring devices and also includes remotely controlled
switches. There are two types of these switches: Switches to support \emph{Demand-Side Management} (DSM) and cut-off
switches that are used to punish defaulting customers. Demand-Side Management is when a grid operator can remotely
control the timing of large, non-time-critical loads on the customer's premises~\cite{dzung01}. A typical example of this
is a customer using an electric water heater: The heater is outfitted with a large hot water storage tank and is
hooked up to the utility's DSM system. The customer does not care when exactly their water is heated as long
as there is enough of it, and the utility offers them cheaper rates for the electricity used for heating in exchange for
control over its precise timing. The utility uses this control to even out peaks in the consumption/production
imbalance, remotely enabling DSM systems during off-peak times and disabling them during peak hours. In contrast to
DSM, cut-off switches are switches placed in between the grid and the entire customer's household such that the utility
can disconnect non-paying customers without incurring the expense of sending a technician to the customer's premises.
Unlike DSM systems, cut-off switches are not opt-in~\cite{anderson01,temple01}. An attack that uses cut-off switches
would obviously immediately cause severe mayhem. Attacks on DSM may have more limited immediate impact as affected
consumers may not notice an interruption for several hours.
Instead of switching off loads outright, an attack employing DSM switches (and potentially also cut-off switches) could
choose to target the grid's stability. By synchronizing many compromised smart meters to switch on and off a large
load capacity, an attacker might cause the entire electrical grid to oscillate~\cite{kosut01,wu01,kim01}. As a large
system of coupled mechanical systems, the electrical grid exhibits a complex frequency-domain behavior. Resonance
effects, colloquially called ``modes'', are well-studied in power system
engineering~\cite{rogers01,grebe01,entsoe01,crastan03}. As they can cause issues even under normal operating conditions,
a large effort is invested in damping these resonances. However, fully eliminating them under changing load conditions
may not be achievable.
\subsection{Communication Channels on the Grid}
A core part of intervening with any such cyberattack is the ability to communicate remediary actions to the devices
under attack. There is a number of well-established technologies for communication on or along power lines. We can
distinguish three basic system categories: systems using separate wires (such as DSL over landline telephone wiring),
wireless radio systems (such as LTE) and \emph{Power Line Communication} (PLC) systems that reuse the existing mains
wiring and superimpose data transmissions onto the \SI{50}{\hertz} mains sine~\cite{gungor01,kabalci01}.
During a large-scale cyberattack, availability of internet and cellular connectivity cannot be relied upon. An attacker
may already have disabled such systems in a separate attack, or they may go down along with parts of the electrical
grid. Traditional powerline communication systems or a utility's proprietary wireless systems would work, but at a range
of no more than several tens of kilometers, reaching all meters in a country would require a large upfront infrastructure
investment.
\section{Grid Frequency as a Communication Channel}
We propose to approach the problem of broadcasting an emergency signal to all smart meters within a synchronous area by
using grid frequency as a communication channel. Despite the technological complexity of the grid, the physics
underlying its response to changes in load and generation is surprisingly simple. Individual machines (loads and
generators) can be approximated by a small number of differential equations and the entire grid can be modelled by
aggregating these approximations into a large system of nonlinear differential equations. As a consequence, small signal
changes in generation/consumption power balance cause an approximately proportional change in
frequency~\cite{kundur01,crastan03,entsoe02,entsoe04}. This \emph{Power Frequency Characteristic} is about
\SI{25}{\giga\watt\per\hertz} for the continental European synchronous area according to the European electricity grid
authority, ENTSO-E.
During a large-scale cyberattack, availability of internet and cellular connectivity cannot be relied upon. An attacker
may already have disabled such systems in a separate attack, or they may go down along with parts of the electrical
grid. Powerline communication systems will likely be unaffected by an attack, but at a range of no more than several
tens of kilometers, covering the entire grid would require a large upfront infrastructure investment for transmitters.
If we modulate the power consumption of a large load such as a multi-megawatt aluminium smelter, this modulation will
result in a small change in frequency according to this characteristic. As long as we stay within the operational limits
set by ENTSO-E~\cite{entsoe02,entsoe03}, this change will not degrade the operation of other parts of the grid. The
advantages of grid frequency modulation are the fact that a single transmitter can cover an entire synchronous area as
well as low receiver hardware complexity.
We propose to approach the problem of broadcasting an emergency signal to all grid-connected devices such as smart
meters or IoT appliances within a synchronous area by using grid frequency as a communication channel. Despite the
technological complexity of the grid, the physics underlying its response to changes in load and generation is
surprisingly simple. Individual machines (loads and generators) can be approximated by a small number of differential
equations describing their control systems' interaction with the machine's physics, and the entire grid can be modelled
by aggregating these approximations into a large system of differential equations. As a consequence, small signal
changes in generation/consumption power balance cause an approximately proportional change in
frequency~\cite{kundur01,crastan03,entsoe02,entsoe04}. The slope of this first-order approximation is known as the
\emph{Power Frequency Characteristic}, and in the case of the continental European synchronous area is about
\SI{25}{\giga\watt\per\hertz} according to the European electricity grid authority, ENTSO-E.
If we modulate the power consumption of a large load, this modulation will result in a small change in frequency
according to this characteristic. As long as we stay within the operational limits set by
ENTSO-E~\cite{entsoe02,entsoe03}, this change will not degrade the operation of other parts of the grid. The advantages
of grid frequency modulation are the fact that a single transmitter can cover an entire synchronous area as well as low
receiver hardware complexity.
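As a back-of-the-envelope sketch of this relationship (an illustration, not a measured result), let $K$ denote the power frequency characteristic and $\Delta P$ the modulated load power. Both symbols are introduced here for illustration only. The first-order approximation then gives the magnitude of the frequency deviation as
\[
  \Delta f \approx \frac{\Delta P}{K}.
\]
Assuming, purely as an example, a modulation amplitude of $\Delta P = \SI{10}{\mega\watt}$ and the value of $K$ quoted above, this evaluates to
\[
  \Delta f \approx \frac{\SI{10}{\mega\watt}}{\SI{25}{\giga\watt\per\hertz}} = \SI{0.4}{\milli\hertz}.
\]
The \SI{10}{\mega\watt} figure is an assumption chosen for easy arithmetic. The actual deviation depends on the current grid state, since $K$ is itself only an approximate value.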
To the best of the authors' knowledge, grid frequency modulation has only ever been proposed as a communication channel
at very small scales in microgrids before~\cite{urtasun01} and has not yet been considered for large-scale application.
Compared to traditional channels such as DSL, LTE or LoRaWAN, grid frequency as a communication channel has a large
resiliency advantage: If there is power, a grid frequency modulation system is operational. Both DSL and LTE systems not
only require power but also require large amounts of centralized infrastructure to operate. Mesh networks such as
Compared to traditional channels such as DSL, LTE or LoRaWAN, grid frequency as a communication channel has a resiliency
advantage: If there is power, a grid frequency modulation system is operational. Both DSL and LTE systems not only
require power at their base stations, but also require centralized infrastructure to operate. Mesh networks such as
LoRaWAN can cover short distances up to $\SI{20}{\kilo\meter}$ without requiring infrastructure to be available, but for
longer distances LoRaWAN relies on the public internet for its network backbone. Additionally, systems such as DSL, LTE
and LoRaWAN are built around a point-to-point communication model and usually do not support a generic broadcast
primitive. During times when a large number of devices must be reached simultaneously this can lead to congestion of
local cellular towers or gateways.
Therefore, during an ongoing cyberattack, grid frequency is promising as a communication channel as only a single
transmitter facility must be operational for it to function, and this single transmitter can reach all connected devices
simultaneously. After a power outage, it can function as soon as electrical power is restored, even while the public
internet and mobile networks are still offline and it is unaffected by cyberattacks that target telecommunication
networks.
cellular towers and servers. Therefore, during an ongoing cyberattack, grid frequency is promising as a communication
channel because only a single transmitter facility must be operational for it to function, and this single transmitter
can reach all connected devices simultaneously. After a power outage, it can resume operation as soon as electrical
power is restored, even while the public internet and mobile networks are still offline. It is unaffected by
cyberattacks that target telecommunication networks.
\subsection{Characterizing Grid Frequency}
\label{grid-freq-characterization}
To collect ground truth measurements for our analysis of grid frequency as a communication channel, we developed a
device to safely record mains voltage waveforms. Our system consists of an \texttt{STM32F030F4P6} ARM Cortex M0
microcontroller that records mains voltage using its internal 12-bit ADC and transmits measured values through a
galvanically isolated USB/serial bridge to a host computer. We derive our system's sampling clock from a crystal oven to
avoid frequency measurement noise due to thermal drift of a regular crystal: \SI{1}{ppm} of crystal drift would cause a
grid frequency error of $\SI{50}{\micro\hertz}$. We compared our oven-stabilized clock against a GPS 1 pps reference and
found that over a time span of 20 minutes both stayed stable within 5 ppb of each other, which corresponds to the drift
specification of a typical crystal oven.
Before analyzing grid frequency as a communication channel, we developed a device that allows us to collect ground truth
for our analysis by safely recording the grid voltage waveform. Our system consists of an \texttt{STM32F030F4P6} ARM
Cortex M0 microcontroller that records mains voltage using its internal 12-bit ADC and transmits measured values through
a galvanically isolated USB/serial bridge to a host computer. We derive our system's sampling clock from a crystal oven
to avoid frequency measurement noise due to thermal drift of a regular crystal: \SI{1}{ppm} of crystal drift would cause
a grid frequency error of $\SI{50}{\micro\hertz}$. We compared our oven-stabilized clock against a GPS 1 pps reference
and found that over a time span of 20 minutes both stayed stable within 5 ppb of each other, which corresponds to the
drift specification of a typical crystal oven.
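The sensitivity figure above follows from a one-line calculation, sketched here with $\varepsilon$ denoting the fractional error of the sampling clock and $f_0 = \SI{50}{\hertz}$ the nominal grid frequency. Both symbols are introduced for illustration only. Assuming that a clock error maps linearly onto the frequency estimate,
\[
  \Delta f_\text{err} \approx \varepsilon \, f_0 = 10^{-6} \cdot \SI{50}{\hertz} = \SI{50}{\micro\hertz}.
\]
Under the same assumption, the 5 ppb agreement observed against the GPS reference would correspond to
\[
  \Delta f_\text{err} \approx 5 \cdot 10^{-9} \cdot \SI{50}{\hertz} = \SI{0.25}{\micro\hertz}.
\]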
In utility SCADA systems, Phasor Measurement Units (PMUs, also called \emph{synchrophasors}) are used to precisely
measure grid frequency among other parameters. Details on the inner workings of commercial phasor measurement units are
scarce but there is a large amount of academic research on measurement. PMUs employ complex signal analysis algorithms
to provide fast and precise measurements even when given a heavily distorted input
signal~\cite{narduzzi01,derviskadic01,belega01}.
In utility SCADA systems, Phasor Measurement Units (PMUs) are used to precisely measure grid frequency among other
parameters. Details on the inner workings of commercial phasor measurement units are scarce but there is a large amount
of academic research on their measurement algorithms. PMUs employ complex signal analysis algorithms to provide fast
and precise measurements even when given a heavily distorted input signal~\cite{narduzzi01,derviskadic01,belega01}.
In our application, we do not need the same level of precision. For the sake of simplicity, we use the universal
frequency estimation approach of Gasior and Gonzalez~\cite{gasior01}. In this algorithm, the windowed input signal is
@ -425,7 +441,8 @@ self-regulating effect of loads. %FIXME citation Above a $\SI{10}{\second}$ peri
thus the $1/f$ noise we observe is the result of the interaction between primary control and consumer demand. On top of
this $1/f$ behavior, the spectrum shows several sharp peaks at time intervals with a ``round'' number such as
$\SI{10}{\second}$, $\SI{60}{\second}$ or multiples of $\SI{300}{\second}$. These peaks are due to loads turning on or
off depending on wall-clock time. Besides the narrow peaks caused by this effect we can also observe two wider bumps at
off depending on wall-clock time, and demand forecasting not being able to precisely match the amplitude of these large
changes in load. Besides the narrow peaks caused by this effect we can also observe two wider bumps at
$\SI{7.0}{\second}$ and $\SI{4.7}{\second}$. These bumps closely correlate with the continental European synchronous area's
oscillation modes at $\SI{0.15}{\hertz}$ (east-west) and $\SI{0.25}{\hertz}$ (north-south)~\cite{grebe01}.
@ -439,27 +456,27 @@ by repurposing a large industrial load as a transmitter. Going through a
list of energy-intensive industries in Europe~\cite{ec01}, we found that an aluminium smelter would be a good candidate.
In aluminium smelting, aluminium is electrolytically extracted from alumina solution. High-voltage mains power is
transformed, rectified and fed into about 100 series-connected electrolytic cells forming a \emph{potline}. Inside these
pots alumina is dissolved in molten cryolite electrolyte at about \SI{1000}{\degreeCelsius} and electrolysis is
pots, alumina is dissolved in molten cryolite electrolyte at about \SI{1000}{\degreeCelsius} and electrolysis is
performed using a current of tens or hundreds of kiloamperes. The resulting pure aluminium settles at the bottom of the
cell and is tapped off for further processing.
Aluminium smelters are operated around the clock, and due to the high financial stakes their behavior under power
outages has been carefully characterized. Power outages of tens of minutes up to two hours reportedly do
not cause problems in aluminium potlines~\cite{eisma01,oye01}. Recently, even techniques for intentional power modulation
without affecting cell lifetime or product quality have been developed to take advantage of variable energy
outages has been carefully characterized. Power outages of tens of minutes up to two hours reportedly do not cause
problems in aluminium potlines~\cite{eisma01,oye01}. Recently, even techniques for intentional power modulation without
affecting cell lifetime or product quality have been developed to take advantage of variable energy
prices~\cite{duessel01,eisma01,depree01}. An aluminium plant's power supply is controlled to constantly keep all
smelter cells under optimal operating conditions. Modern power supply systems employ large banks of diodes or thyristors to
rectify low-voltage AC to DC to be fed into the potline~\cite{ayoub01}. Potline voltage is controlled through a
smelter cells under optimal operating conditions. Modern power supply systems employ large banks of diodes or thyristors
to rectify low-voltage AC to DC to be fed into the potline~\cite{ayoub01}. Potline voltage is controlled through a
combination of a tap changer and a transductor. Individual cell voltages are controlled by changing the physical
distance between anode and cathode. In this setup, power can be electronically modulated using the thyristor
rectifier. Since the system does not have any mechanical inertia, high modulation rates are possible.
In~\cite{depree01}, the authors describe a setup where a large aluminium smelter in continental Europe is used as
primary control reserve for frequency \emph{regulation}. In this setup, a rise time of $\SI{15}{\second}$ was achieved
to meet the $\SI{30}{\second}$ requirement posed by local standards for primary control. In their conclusion, the
authors note that for their system, an energy storage capacity of $\SI{7.7}{\giga\watt\hour}$ is possible if all plants
of a single operator are used. Given the maximum modulation depth of $\SI{100}{\percent}$ for up to one hour that is
mentioned by the authors, this results in an effective modulation power of $\SI{7.7}{\giga\watt}$. Over a longer
primary control reserve for frequency regulation. In this setup, a rise time of $\SI{15}{\second}$ was achieved to meet
the $\SI{30}{\second}$ requirement posed by local standards for primary control. In their conclusion, the authors note
that for their system, an effective thermal energy storage capacity of $\SI{7.7}{\giga\watt\hour}$ is possible if all
plants of a single operator are used. Given the maximum modulation depth of $\SI{100}{\percent}$ for up to one hour that
is mentioned by the authors, this results in an effective modulation power of $\SI{7.7}{\giga\watt}$. Over a longer
timespan of $\SI{48}{\hour}$, they have demonstrated a $\SI{33}{\percent}$ modulation depth which would correspond to a
modulation power of $\SI{2.5}{\giga\watt}$. We conclude that a modulation of part of an aluminium smelter's power
consumption is possible at no significant production impact and at low infrastructure cost. Aluminium smelters are
@ -483,11 +500,17 @@ spread-spectrum technique. By spreading signal energy throughout a wide band, bo
minimized and the risk of mode excitation is reduced since spread-spectrum techniques minimize energy in any particular
sub-band.
In this paper, we chose to perform simulations using Direct Sequence Spread Spectrum for its simple implementation and
good overall performance. DSSS chip timing should be as fast as the transmitter's physics allow to exploit the low-noise
The spread-spectrum technique that we chose is Direct Sequence Spread Spectrum for its simple implementation and good
overall performance. DSSS chip timing should be as fast as the transmitter's physics allow to exploit the low-noise
region between $\SI{0.2}{\hertz}$ and $\SI{2.0}{\hertz}$ in Figure~\ref{fig_freq_spec}. Going past
$\approx\SI{2}{\hertz}$ would complicate frequency measurement at the receiver side.
\paragraph{Direct Sequence Spread Spectrum (DSSS) modulation}
% FIXME quickly explain DSSS here.
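In its generic textbook form, which may differ in detail from the parametrization evaluated below, DSSS multiplies each data symbol with a fast pseudo-random chip sequence known to both transmitter and receiver. The following is a minimal sketch under simplifying assumptions. The symbols $A$, $d$, $c[n]$, $N$, $w[n]$ and $\sigma^2$ are introduced here for illustration only. With a bipolar data symbol $d \in \{-1, +1\}$, a chip sequence $c[n] \in \{-1, +1\}$ of length $N$ and a modulation amplitude $A$, the transmitted signal is
\[
  s[n] = A \, d \, c[n], \qquad n = 0, \dots, N-1.
\]
The receiver observes $y[n] = s[n] + w[n]$, where $w[n]$ is the background noise, and correlates it with the known chip sequence. Since $c[n]^2 = 1$,
\[
  \frac{1}{N} \sum_{n=0}^{N-1} y[n] \, c[n] = A \, d + \frac{1}{N} \sum_{n=0}^{N-1} w[n] \, c[n],
\]
so the sign of the correlation yields an estimate of $d$. If the noise samples were uncorrelated with variance $\sigma^2$, the residual noise term would have variance $\sigma^2 / N$, which is the processing gain of the spreading. Grid frequency noise is not white, so this figure should be read as a rough guide only.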
\paragraph{DSSS parametrization}
We simulated a proof-of-concept modulator and demodulator using data captured from our grid frequency sensor. Our
simulations covered a range of parameters in modulation amplitude, DSSS sequence bit depth, chip duration and detection
threshold. Figure~\ref{fig_ser_nbits} shows our simulation results for symbol error rate (SER) as a function of
@ -509,7 +532,8 @@ from $\SI{0.2}{\hertz}$ to $\SI{2}{\hertz}$.
\begin{figure}
\centering
\hspace*{-1cm}\includegraphics[width=0.5\textwidth]{../notebooks/fig_out/dsss_thf_amplitude_5678}
\hspace*{-5mm}\includegraphics[width=0.5\textwidth]{../notebooks/fig_out/dsss_thf_amplitude_5678}
\vspace*{-5mm}
\caption{SER vs.\ Amplitude and detection threshold. Detection threshold is set as a factor of background noise
level.}
\label{fig_ser_thf}
@ -517,8 +541,8 @@ from $\SI{0.2}{\hertz}$ to $\SI{2}{\hertz}$.
\begin{figure}
\centering
\hspace*{-1cm}\includegraphics[width=0.5\textwidth]{../notebooks/fig_out/chip_duration_sensitivity_6}
\vspace*{-1cm}
\hspace*{-5mm}\includegraphics[width=0.5\textwidth]{../notebooks/fig_out/chip_duration_sensitivity_6}
\vspace*{-5mm}
\caption{SER vs.\ DSSS chip duration.}
\label{fig_ser_chip}
\end{figure}
@ -664,6 +688,7 @@ Source code and EDA designs are available at the public repository listed at the
\bibliography{\jobname}
\center{
\footnotesize
\center{This is version \texttt{\input{version.tex}\unskip} of this paper, generated on \today. The git repository
can be found at:}