\documentclass[12pt,a4paper,notitlepage]{report}
\usepackage[utf8]{inputenc}
\usepackage[a4paper,textwidth=17cm, top=2cm, bottom=3.5cm]{geometry}
\usepackage[T1]{fontenc}
\usepackage[
    backend=biber,
    style=numeric,
    natbib=true,
    url=true,
    doi=true,
    eprint=false
    ]{biblatex}
\addbibresource{safety_reset.bib}
\usepackage{amssymb,amsmath}
\usepackage{listings}
\usepackage{eurosym}
\usepackage{wasysym}
\usepackage{amsthm}
\usepackage{tabularx}
\usepackage{multirow}
\usepackage{multicol}
\usepackage{tikz}

\usetikzlibrary{arrows}
\usetikzlibrary{backgrounds}
\usetikzlibrary{calc}
\usetikzlibrary{decorations.markings}
\usetikzlibrary{decorations.pathreplacing}
\usetikzlibrary{fit}
\usetikzlibrary{patterns}
\usetikzlibrary{positioning}
\usetikzlibrary{shapes}

\usepackage{hyperref}
\usepackage{commath}
\usepackage{graphicx,color}
\usepackage{subcaption}
\usepackage{float}
\usepackage{footmisc}
\usepackage{array}
\usepackage[underline=false]{pgf-umlsd}
%\usepackage[pdftex]{graphicx,color}
%\usepackage{epstopdf}
% Needed for murks.tex
\usepackage{setspace}
\usepackage[draft=false,babel,tracking=true,kerning=true,spacing=true]{microtype} % optical margin alignment etc.
% For German quotation marks

\newcommand{\foonote}[1]{\footnote{#1}}
\newcommand{\degree}{\ensuremath{^\circ}}
\newcolumntype{P}[1]{>{\centering\arraybackslash}p{#1}}

\begin{document}

% Example usage of the title page template (please adapt):
\input{murks}
\titel{FIXME} % Title of the thesis
\typ{Masterarbeit} % Type of thesis: Diplomarbeit, Masterarbeit, Bachelorarbeit
\grad{Master of Science (M. Sc.)} % Academic degree attained
% e.g.: Master of Science (M. Sc.), Master of Education (M. Ed.), Bachelor of Science (B. Sc.), Bachelor of Arts (B. A.), Diplominformatikerin
\autor{Jan Sebastian Götte}
\gebdatum{Not printed for data protection reasons} % Author's date of birth
\gebort{Not printed for data protection reasons} % Author's place of birth
\gutachter{Prof. Dr. Björn Scheuermann}{FIXME} % First and second reviewer of the thesis
\mitverteidigung % remove if there is no defense
\makeTitel
\selbstaendigkeitserklaerung{31.03.2020}
\newpage

% The actual thesis follows here (on a new sheet for double-sided printing):
\tableofcontents
\newpage

\chapter{Introduction}
\section{Structure and operation of the electrical grid}
\subsection{Structure of the electrical grid}
\subsubsection{Generators and loads}
\subsubsection{Transformers}
\subsubsection{Tie lines}

\subsection{Operational concerns}
\subsubsection{Modelling the electrical grid}
\subsubsection{Generator controls}
\subsubsection{Load shedding}
\subsubsection{System stability}
\subsubsection{Power System Stabilizers}

\subsubsection{Smart metering}

\section{Smart meter technology}
\subsubsection{Common components}

Smart meters are usually built around a standard microcontroller. \label{sm-cpu}
\subsubsection{Cryptographic coprocessors}
\subsubsection{Physical structure}
\subsubsection{Physical installation}

\section{Regulatory frameworks around the world}
\subsection{International standards}
\subsection{Regulations in Europe}
\subsection{The regulatory situation in Germany}
\subsection{The regulatory situation in France}
\subsection{The regulatory situation in the UK}
\subsection{The regulatory situation in Italy}
\subsection{The regulatory situation in North America}
\subsection{The regulatory situation in Japan}
\subsection{Common themes}

\section{Security in smart grids}
The smart grid in practice is nothing more or less than an aggregation of embedded control and measurement devices that
are part of a large control system. This implies that all the same security concerns that apply to embedded systems in
general also apply to most components of a smart grid in some way. Where programmers have been struggling for decades
now with input validation\cite{leveson01}, the same potential issue raises security concerns in smart grid scenarios as
well\cite{mo01, lee01}. In the smart grid, however, two complicating factors are present: Many components are embedded
systems, and as such inherently hard to update. Also, the smart grid and its control algorithms act as a large
(partially) distributed system, making measures such as input validation or authentication difficult to
implement\cite{blaze01} and adding a host of distributed systems problems on top\cite{lamport01}.

Given that the electrical grid is a major piece of essential infrastructure in modern civilization, these problems
amount to significant issues in practice. Attacks on the electrical grid may have grave consequences\cite{lee01}, all the
while the long maintenance cycles of various components make the system slow to adapt. Thus, components for the smart
grid need to be built to a much higher standard of security than most consumer devices to ensure they stand up to
well-funded attackers even decades down the road. This requirement intensifies the challenges of embedded security and
distributed systems security, among others, that are inherent in any modern complex technological system.

A point we will not consider in much depth is theft of electricity. A large part of the motivation for the introduction
of smart meters seems to be % TODO weak statement
to reduce the level of fraud by consumers. Academic papers tend to either focus on other benefits such as generation
efficiency gains through better forecasting or try to rationalize the fundamentally anti-consumer nature of smart
metering with strenuous claims of ``enormous social benefits''\cite{mcdaniel01}. We will focus entirely on grid
stability and disregard electricity theft in the context of this work for two reasons: One, billing inaccuracies of
electricity companies are of very low urgency compared to grid stability, and a stable grid is a precondition for any
billing in the first place.
Two, utility companies can already put strong bounds on the amount of theft by simply cross-referencing meter readings
against trusted readings from upstream sections of the grid. This capability works even without smart meters and only
gains speed from them. Conversely, smart meters cannot prevent the old exploit of bypassing the meter with a section
of wire.

Due to these bounds on its volume, electricity theft using smart meter hacking would not scale. Hackers would simply be
rooted out one by one with no damage to consumers and very limited damage to utility companies. Damage in these
scenarios would fall far short of what an exponentially growing botnet could cause.
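
The bound obtained from cross-referencing can be written as a simple energy balance. The following is only a sketch in
notation of our own, assuming that all energy reaching a grid section passes a trusted upstream meter and that no
consumer in that section feeds energy back: $E_\mathrm{up}$ is the trusted upstream reading over one interval, $E_i$ is
the reading reported by the $i$-th of $N$ downstream meters, $E_\mathrm{loss} \ge 0$ are technical losses and
$E_\mathrm{theft}$ is the energy consumed but not reported.
\begin{equation*}
    E_\mathrm{up} - \sum_{i=1}^{N} E_i = E_\mathrm{loss} + E_\mathrm{theft}
    \qquad\Longrightarrow\qquad
    E_\mathrm{theft} \le E_\mathrm{up} - \sum_{i=1}^{N} E_i
\end{equation*}
The right-hand side uses only quantities the utility company already measures. It bounds the total amount of theft in
the section regardless of whether the theft is committed by manipulating a smart meter or by bypassing it with a wire.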
\subsection{Smart grid components as embedded devices}
A fundamental challenge in smart grid implementations is the central role smart electricity meters play. Smart meters
are used both for highly-granular load measurement and (in some countries) load switching\cite{zheng01}.
Smart electricity meters are effectively consumer devices. They are built down to a certain price point that is
measured by the burden it puts on consumers and that is generally fixed by regulatory authorities. % FIXME cite
This requirement precludes some hardware features such as the use of a standard hardened software environment on a
high-powered embedded system (such as a virtualized embedded Linux setup running under a hypervisor) that would both
increase resilience against attacks and simplify updates. Combined with the small market sizes in smart grid
deployments\footnote{
    Most vendors of smart electricity meters only serve a handful of markets. For the most part, smart meter development
    cost lies in the meter's software. % TODO cite?
    There exist multiple competing standards applicable to various parts of a smart electricity meter. In addition,
    most countries have their own certification regime\cite{cenelec01}. This complexity creates a large development
    burden for new market entrants\cite{perez01}.
}
this produces a high cost pressure on the software development process for smart electricity meters.

\subsection{The state of the art in embedded security}
Embedded security generally is much harder than security of higher-level systems. This is due to a combination of the
unique constraints of embedded devices (hard to update, usually small quantity) and their lack of capabilities
(processing power, memory protection functions, user interface devices). Even very well-funded companies continue to
have serious problems securing their embedded systems. A spectacular example of this difficulty is the recently-exposed
flaw in Apple's iPhone SoC first-stage ROM bootloader\footnote{
    Modern system-on-chips integrate one or several CPUs with a multitude of peripherals, from memory and DMA
    controllers through 3D graphics accelerators down to general-purpose IO modules for controlling things like indicator
    LEDs. Most SoCs boot from one of several boot devices such as flash memory, ethernet or USB according to a
    configuration set e.g.\ by connecting some SoC pins a certain way or by device-internal write-once fuse bits.

    Physically, one of the processing cores of the SoC (usually one of the main CPU cores) is connected such that it is
    taken out of reset before all other devices, and is tasked with switching on and configuring all other devices of
    the SoC. In order to run later initialization code or more advanced bootloaders, this core on startup runs a very
    small piece of code hard-burned into the SoC in the factory. This ROM loader initializes the most basic peripherals
    such as internal SRAM memory and selects a boot device for the next bootloader stage.

    Apple's ROM loader performs some authorization checks to ensure no unauthorized software is loaded. The present
    flaw allows an attacker to circumvent these checks, booting code not authorized by Apple on a USB-connected iPhone,
    compromising Apple's chain of trust from ROM loader to userland right at its root.
}, which allows a full compromise of any iPhone up to and including the iPhone X. The iPhone 8, one of the affected
models, is still being manufactured and sold by Apple today\footnote{
    i.e.\ at the time this paragraph was written % FIXME date
}. In another instance, a flaw was discovered in the secure-world firmware Samsung uses to protect sensitive credentials
in their mobile phone SoCs. % FIXME year
If both of these very large companies have trouble securing parts of their secure embedded software stacks measuring a
mere few hundred bytes in Apple's case or a few kilobytes in Samsung's, what is a smart electricity meter manufacturer
to do? For their mass-market phones, these two companies have R\&D budgets that dwarf some countries' national budgets.
% FIXME hyperbole?
% FIXME cite

Since thorough formal verification of code is not yet within reach for either large-scale software development or
code heavy in side-effects such as embedded firmware or industrial control software\cite{pariente01},
the two most effective measures for embedded security are reducing the amount of code on the one hand, and
labour-intensively checking and double-checking this code on the other hand. A smart electricity meter manufacturer
does not have a say in the former since it is bound by the official regulations it has to comply with, and will almost
certainly not have sufficient resources for the latter.
% FIXME expand?
% FIXME cite some figures on code size in smart meter firmware?

\subsection{Attack avenues in the smart grid}
If we model the smart grid as a control system responding to changes in inputs by regulating outputs, on a very high
level we can see two general categories of attacks: Attacks that directly change the state of the outputs, and attacks
that try to influence the outputs indirectly by changing the system's view of its inputs. An example of the former is
an attack that shuts down a power plant to decrease generation capacity. An example of the latter is an attack
that forges grid frequency measurements where they enter a power plant's control systems to provoke increasing
oscillation in the amount of power generated by the plant according to the control systems' directions.
% FIXME cite
% FIXME expand
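
These two categories can be stated compactly. As a minimal sketch in notation of our own, which is not taken from the
cited works, let $y$ denote the inputs the control system measures, $u$ the outputs it commands and $f$ its control
law, so that $u = f(y)$ in the absence of an attack. Assuming for simplicity that the attacker adds perturbations $a_u$
or $a_y$ of their choosing, the resulting output $u'$ is:
\begin{align*}
    \text{direct attack on the outputs:}  \quad & u' = f(y) + a_u \\
    \text{indirect attack via the inputs:} \quad & u' = f(y + a_y)
\end{align*}
In the first case the attacker needs control over an actuator, such as the power plant in the example above. In the
second case control over a measurement suffices, and the system's own control law $f$ turns the forged input into a
harmful output.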
\subsubsection{Communication channel attacks}
Communication channel attacks are attacks on the communication links between smart grid components. These could be
attacks on IP-connected parts of the core network or attacks on shared buses between smart meters and IP gateways in
substations. Generally, these attacks can be mitigated by securing the aforementioned communication links using modern
cryptography. IP links can be protected using TLS, and more low-level buses can be protected using more lightweight
Noise-based protocols. % FIXME cite
Cryptographic security reduces an attacker's ability to manipulate communication contents to a mere denial of
service attack. Thus, in addition to cryptographic security, safety under DoS conditions must be guaranteed to ensure
continued system performance under attack. This safety property is identical to the safety required to withstand
random outages of components, such as communications link outages due to physical damage from storms, flooding etc.
% FIXME cite papers on attack impact, on countermeasures and on attack realization
In general, attacks at the meter level may be hard to weaponize % may be -> weak statement?
since meters are used mostly for billing and forecasting purposes % FIXME cite
and for more critical grid control purposes there exist several additional layers of sensors above smart meters that
limit how much an attacker can falsify smart meter readings without the manipulation being obvious. In order for an
attack to have more far-reaching consequences the attacker would need to compromise additional grid
infrastructure\cite{kim01,kosut01}.

\subsubsection{Exploiting centralized control systems}
The type of smart grid attack most often cited in popular discourse, and to the author's knowledge % FIXME verify, cite
the only type that has so far been conducted in practice, is a direct attack on centralized control systems. In this
attack, computer components of control systems are compromised by the same techniques used to compromise any other kind
of computer system, such as exploiting insecure services running on internet-exposed ports and using one compromised
system to compromise other systems connected to it through an ostensibly secure internal network. These attacks are
very powerful as they yield the attacker direct control over whatever outputs the control systems are controlling. If an
attacker manages to compromise a power station's control computers, they may be able to influence generation output or
even cause an emergency shutdown. % FIXME

Despite their potentially large impact, these attacks are only moderately interesting from a scientific perspective. For
one, their mitigation mostly consists of a straightforward application of security practices well-known for decades.
Though there is room for the implementation of genuinely new, application-specific security systems in this field, the
general state of the art is lagging behind the rest of the computer industry such that the low-hanging fruit should take
priority. % FIXME cite this bold claim very properly

In addition, given political will these systems can readily be secured since there is only a comparatively small number
of them and driving a technician to every one of them in turn to install some security update is perfectly feasible.

\subsubsection{Control function exploits}
Control function exploits are attacks on the mathematical control loops used by the centralized control system. One
example of such an attack would be resonance attacks as described in \textcite{wu01}.
In this kind of attack, inputs from peripheral sensors indicating grid load to the centralized control system are
carefully modified to cause a disproportionately large oscillation in control system action. This type of attack relies
on complex resonance effects that arise when mechanical generators are electrically coupled. These resonances,
colloquially called ``modes'', are well-studied in power system engineering\cite{rogers01,grebe01,entsoe01}.
% FIXME: refer to section on stability control above here
Even disregarding modern attack scenarios, electrical grids are designed for stability with measures in place to dampen
any resonances inherent to grid structure. Still, since their analysis requires an accurate grid model, these
resonances are hard to analyze and unlikely to be noticed under normal operating conditions.

Mitigation of these attacks is most easily done by on the one hand ensuring unmodified sensor inputs to the control
systems in the first place, and on the other hand carefully designing control systems not to exhibit exploitable
behavior such as oscillations.
% FIXME cite mitigation approaches
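
As a rough illustration of why a resonance amplifies small manipulations and why damping counters it, a single mode can
be approximated by a driven, damped second-order system. This is a textbook simplification that we assume here for
illustration only and not the model used in \textcite{wu01}: $x$ is the deviation of the mode from its operating point,
$M$, $D$ and $K$ are effective inertia, damping and restoring coefficients, and $\Delta P \cos(\omega t)$ is a periodic
disturbance caused by the attacker.
\begin{align*}
    M \ddot{x} + D \dot{x} + K x &= \Delta P \cos(\omega t) \\
    \hat{x}(\omega) &= \frac{\Delta P}{\sqrt{(K - M\omega^2)^2 + (D\omega)^2}} \\
    \hat{x}(\omega_0) &= \frac{\Delta P}{D\,\omega_0} \qquad \text{for } \omega_0 = \sqrt{K/M}
\end{align*}
Here $\hat{x}(\omega)$ is the steady-state oscillation amplitude. When the disturbance is applied at $\omega_0$, the
amplitude is inversely proportional to the damping $D$: in a weakly damped mode a small $\Delta P$ suffices for a large
oscillation, whereas additional damping bounds the response.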
\subsubsection{Endpoint exploits}
One rather interesting attack on smart grid systems is one exploiting the grid's endpoint devices such as smart
electricity meters.\footnote{
    Though potentially this could also aim at other kinds of devices distributed on a large scale such as sensors in
    unmanned substations. % FIXME cite verify
}
These meters are deployed on a massive scale, with several thousand meters deployed for every substation.
% FIXME cite (this should be straightforward)
Thus, once meters are compromised, restoring them to an uncompromised state can be very difficult if it requires
physical access to thousands of devices located inaccessibly in private homes.

By compromising smart electricity meters, an attacker can trivially forge the distributed energy measurements these
devices perform. In a best-case scenario, this might only affect billing and lead to customers being under- or
over-charged if the attack is not noticed in time. However, in a less ideal scenario the energy measurements taken by
these devices might be used to inform the grid's centralized control systems % FIXME cite
and a falsification of these measurements might lead to inefficiency.

In some countries and for some customers, these smart meters have one additional function that is highly useful to an
attacker: They contain high-current load switches to disconnect the entire household or business in case electricity
bills are left unpaid for a certain period. In countries that use these kinds of systems, the load disconnect is often
simply hooked up to one of the general-purpose IO pins of the smart meter's central microcontroller, allowing anyone
compromising this microcontroller's firmware to actuate the load switch at will. % FIXME validate cite add pictures

Given control over a large number of network-connected smart meters, an attacker might thus be able to cause large-scale
disruptions of power consumption by repeatedly disconnecting and re-connecting a large number of consumers.
% FIXME cite some analysis of this
Combined with an attack method such as the resonance attack from \textcite{wu01}
that was mentioned above, this scenario poses a serious danger to grid stability.

% FIXME add small-scale load shedding for heaters etc.

\subsection{Attacker models in the smart grid}
\subsection{Practical attacks}
\subsection{Practical threats}
\subsection{Conclusion, or why we are doomed}
We can conclude that a compromise of a large number of smart electricity meters cannot be ruled out. The complexity of
network-connected smart meter firmware makes it exceedingly unlikely that it is in fact flawless. Large-scale
deployments of these devices under some circumstances, such as where they are used with load disconnect relays, make
them an attractive target for attackers interested in causing grid instability. The attacker model for these devices
definitely includes hostile states, who have considerable resources at their disposal.

For a reasonable guarantee that no large-scale compromises of hardware and software built today will happen over a span
of some decades, we would have to radically simplify its design and limit attack surface. Unfortunately, the complexity
of smart electricity meter implementations mostly stems from the large list of requirements these devices have to
conform to. Additionally, standards have already been written and changes that reduce scope or functionality have become
exceedingly unlikely at this point.

A general observation with smart grid systems of any kind is that they constitute a zealous departure from the
decentralized control structure of yesterday's dumb grid and usher in centralization at an enormous scale. This modern,
centralized infrastructure has been carefully designed to defend against malicious actors % FIXME cite
and all involved parties have an interest in keeping it secure. Still, like in any other system this centralization also
makes a very attractive target for attackers since an attacker can likewise employ this centralized control for their
own goals. Fundamentally, decentralized systems tend to make attacks of any kind a lot more costly and one might
question whether security has truly been gained during smart grid rollout. % FIXME hot take maybe

\chapter{Restoring endpoint safety in an age of smart devices}
If, as laid out in the previous chapter, we cannot rule out a large-scale compromise of smart energy meters, we have to
rephrase our claim to security. If we cannot rule out exploitation, we have to limit its impact. If we assume that we
cannot strip any functionality from smart meters since it may be required by standards or for enormous social
benefits\cite{mcdaniel01}, % FIXME is sarcasm ok here?
all we can do is to flush out an attacker once they are in.

In a worst-case scenario an attacker would gain unconstrained code execution e.g.\ by exploiting a flaw in a network
protocol implementation. Since smart meters use standard microcontrollers that do not have advanced memory protection
functions (see p.~\pageref{sm-cpu}), at this point we can assume the attacker has full control over the main
microcontroller. With this control they can actuate the load switch if present, transmit data through the device's
communication interfaces or use the user interface components such as LEDs and the LCD. Using the self-programming
capabilities of modern flash microcontrollers, an attacker may even gain persistence without much trouble. Note that in
systems separating cryptographic functions into some form of cryptographic module, such as systems used in Germany,
% TODO list other countries as well? FIXME cite BSI standard requiring this
we can be optimistic and assume the attacker has not in fact compromised this cryptographic co-processor yet and does
not have access to any cryptographic secrets yet.

Given that the attacker has complete control over the meter's core microcontroller and given that due to cost
constraints we are bound to use whatever microcontroller the meter OEM has chosen for their design, we cannot rely on
software running on the core microcontroller to restore system integrity.

Our solution to this problem is to add another, very small microcontroller to the smart meter design. This
microcontroller, the reset controller, will contain a small piece of software to receive cryptographically authenticated
commands from utility companies and on demand reset the meter's core microcontroller to a known-good state. We have to
assume the code in the core controller's flash memory has been compromised, so our only option to flush out an attacker
is to re-program the core microcontroller in its entirety. We propose using JTAG to re-program the core microcontroller
% TODO get terminology consistent. Is "core microcontroller" a good term here?
with a known-good firmware image read from a sufficiently large SPI flash connected to the reset controller. JTAG is
supported by most microcontrollers complex enough to end up in a smart meter design % TODO colloquialism
and given adequate documentation JTAG programming functionality can be ported to new microcontrollers with relatively
little work.

On the microcontroller side our solution requires the JTAG interface to be activated (i.e.\ not fused shut). For our
solution to work, core microcontroller firmware also must not be able to permanently disable the JTAG interface from
within. In microcontrollers that do not yet provide this functionality this is a minor change that could be added to a
custom microcontroller variant at low cost. On most microcontrollers keeping JTAG open should not interfere with code
readout protection. Code secrecy should be of no concern for security\cite{schneier01}, but manufacturers have strong
preferences for it regardless due to fear of copyright infringement.

\section{The theory of endpoint safety}
\label{sec_criteria}
In order to gain anything by adding our reset controller to the smart meter's already complex design we must satisfy two
interrelated conditions.
\begin{enumerate}
    \item \textsc{security} means our reset controller itself does not have any remotely exploitable flaws.
    \item \textsc{safety} means our reset controller will perform its job as intended.
\end{enumerate}

Note that our \textsc{security} property includes only remote exploitation, and excludes any form of hardware attack.
Even though most smart meters provide some level of physical security, we do not wish to make any assumptions on this.
In the following section we will elaborate our attacker model and it will become apparent that sufficient physical
security to defend against all attackers in our model would be infeasible, and thus we will design our overall system
to remain secure even assuming some number of physically compromised devices.
% FIXME expand

\subsection{Attack characteristics}
The attacker model these two conditions must hold under is as follows. We assume three angles of attack: Attacks by the
customer themselves, attacks by an insider within the utility company controlling the metering systems, and lastly
attacks from third parties. Examples of these third parties are hobbyist hackers or outside cyber-criminals on the one
hand, but also other companies participating in the smart grid infrastructure besides the utility company, such as
intermediary providers of meter-reading services.

Due to the critical nature of the electrical grid, we have to include hostile state actors in our attacker model. When
acting directly, these would be classified as third-party attackers by the above schema, but they can reasonably be
expected to be able to assume either of the other two roles as well e.g.\ through infiltration or bribery.
\textcite{fraunholz01} in their elaboration of their generalized attacker model give some classification of attackers
and provide a nice taxonomy of attacker properties. In their threat/capability rating, criminals are still considered
to have a higher threat rating than state-sponsored attackers. The New York Times reported in 2016 that some states
recruit their hacking personnel in part from cyber-criminals. If this report is true, in a worst-case scenario we have
to assume a state-sponsored attacker to combine the worst of both types. Comparing this against the other attacker
types in \textcite{fraunholz01}, this state-sponsored attacker is strictly worse than any other type in both variables.
We are left with a highly-skilled, very well-funded, highly intentional and motivated attacker.

Based on the above classification of attack angles and our observations on state-sponsored attacks, we can adapt
\textcite{fraunholz01} to our problem, yielding the following new attacker types:

\begin{enumerate}
    \item \textbf{Utility company insiders controlled by a state actor}
        We can ignore the other internal threats described in \textcite{fraunholz01} since an insider cooperating with a
        state actor is strictly worse in every respect.
    \item \textbf{State-sponsored external attackers}
        A state actor can obviously directly attack the system through the internet.
    \item \textbf{Customers controlled by a state actor}
        A state actor can very well compromise some customers for their purposes. They might either physically
        infiltrate the system posing as legitimate customers, or they might simply deceive or bribe existing customers
        into cooperation.
    \item \textbf{Regular customers}
        Though a hostile state actor might gain control of some number of customers through means such as voluntary
        cooperation, bribery or infiltration, they are limited in attack scale since they do not want to arouse
        premature attention. Though regular customers may not have the motivation, skill or resources of a
        state-sponsored attacker, potentially large numbers of them may try to attack a system out of financial
        incentives. To allow for this possibility, we consider regular customers separate from state actors posing as
        customers in some way.
\end{enumerate}
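
The notion of one attacker type being ``strictly worse'' than another can be made precise. The following is only a
sketch, under the assumption that both ratings of \textcite{fraunholz01} can be mapped to ordered numeric scales; the
symbols are our own. Writing an attacker type $A$ as a pair $(t_A, c_A)$ of threat and capability rating, we say that
$A$ dominates $B$ if it is at least as bad in both variables, and we model the worst-case state-sponsored attacker $S$
as the component-wise maximum of the criminal type $C$ and the state-sponsored type $G$:
\begin{align*}
    A \succeq B \;&\Longleftrightarrow\; t_A \ge t_B \;\wedge\; c_A \ge c_B \\
    S &= \bigl(\max(t_C, t_G),\; \max(c_C, c_G)\bigr)
\end{align*}
By construction $S \succeq C$ and $S \succeq G$. Any defense that holds against a dominating attacker type also holds
against the types it dominates, which is why the list above can omit the non-state variants of insiders and external
attackers.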
|
|
|
|
\subsection{Overall structural system security}
|
|
Considering overall security, we first introduce the \emph{reset authority}, a trusted party acting as the single
|
|
authority for issuing reset commands in our system. In practice this trusted party may be part of the utility company,
|
|
part of an external regulatory body or a hybrid setup requiring both to cooperate. We assume this party will be designed
|
|
to be secure against all of the above attacker types. The precise design of this trusted party is out of scope for this
|
|
work but we will list some practical suggestions on how to achieve security below. % FIXME do the list
|
|
% FIXME put up a large box on this limitation
|
|
|
Using an asymmetric cryptographic design centered around the \emph{reset authority}, we rule out all attacks except for
denial-of-service attacks on our system by any of the four attacker types. All reset commands in our system originate
from the \emph{reset authority} and are cryptographically secured to provide authentication and tamper detection.
Under this model, attacks on the electrical grid components between the \emph{reset authority} and the customer device
degrade into man-in-the-middle attacks. To ensure the \textsc{safety} criterion from \ref{sec_criteria} holds, we must
% FIXME check whether this \ref displays as intended
make sure our cryptography is secure against man-in-the-middle attacks and we must try to harden the system against
denial-of-service attacks by the attacker types listed above. Given our attacker model we cannot fully guard against
this sort of attack, but we can at least choose a communication channel that is resilient against denial-of-service
attacks under the above model.
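
The authentication argument above can be stated compactly. The following is only a sketch: it assumes a generic digital
signature scheme $(\mathsf{Gen}, \mathsf{Sign}, \mathsf{Verify})$, and the symbols $sk$, $pk$, $m$ and $\sigma$ are
notation introduced here for illustration, not a concrete scheme selected in this work. The \emph{reset authority}
generates a key pair and signs each reset command $m$:
\begin{align*}
    (sk, pk) &\leftarrow \mathsf{Gen}(), &
    \sigma &\leftarrow \mathsf{Sign}_{sk}(m).
\end{align*}
A customer device that knows only $pk$ receives a possibly modified pair $(m', \sigma')$ and performs a reset if and
only if
\begin{equation*}
    \mathsf{Verify}_{pk}(m', \sigma') = 1.
\end{equation*}
If the scheme is unforgeable, a man-in-the-middle who does not know $sk$ cannot produce a pair that passes this check
for a command the \emph{reset authority} never signed. Such an attacker can still suppress or corrupt $(m, \sigma)$ in
transit so that no valid command arrives, which is the denial-of-service case. This sketch does not address replay of
previously signed commands.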
|
|
Finally, we have to consider the issue of hardware security. We will solve the problem of physical attacks on some small
number of devices by simply not programming any secret information into these devices. This also simplifies hardware
production. We explicitly exclude any form of supply-chain attack from consideration in this work as out of scope.
% FIXME include considerations on production testing somewhere (is the device working? is the right key programmed?)
|
|
\subsection{Complex microcontroller firmware}
The \textsc{security} property from \ref{sec_criteria} relies in large part on the security of our reset
controller firmware. The best method to increase firmware security is to reduce attack surface by limiting external
interfaces as much as possible and by reducing code complexity as much as possible.
% FIXME formalize this as something like "Design Goal DG-023-42-1" ?
If we avoid the complexity of most modern microcontroller firmware, we gain another benefit beyond implicitly reduced
attack surface: if the resulting design is small enough, we may attempt formal verification of our security property.
Though formal verification tools are not yet suitable for highly complex tasks, they are already adequate, if only
barely, for small amounts of code and simple interfaces.
|
|
\subsection{Modern microcontroller hardware}
Microcontrollers have gained enormously both in performance and efficiency and in peripheral support. Alas, these
gains have largely been driven by insatiable customer demand for faster, more powerful chips, and for a long time
security has not been considered important outside of some specific niches such as smartcards. Traditionally a
microcontroller would spend its entire lifetime without ever being exposed to any networks. Though this trend has been
reversing with the increasing adoption of internet-of-things devices
and more advanced security features have started appearing in general-purpose microcontrollers, most still lack even
basic functionality found in processors for computers or smartphones.
|
|
One of the components lacking from most microcontrollers is strong memory protection or even a memory management unit
as found in all modern computer processors and SoCs for applications such as smartphones. Without an MPU or MMU, some
mitigations for memory safety violations cannot be implemented. This and the absence of virtualization tools such as
ARM's TrustZone make hardening microcontroller firmware a considerable task. It is therefore important to ensure memory
safety in microcontroller firmware through tools such as defensive coding, extensive testing and formal verification.
|
|
In our design we achieve simplicity on two levels. First, we isolate the very complex metering firmware from our reset
controller by having both run on separate microcontrollers. Second, we keep the reset controller firmware itself
extremely simple to reduce attack surface there.
|
|
\subsection{Regulatory and economic constraints}
\subsection{Safety vs. Security: Opting for restoration instead of prevention}


\subsection{Technical outline of a safety reset}

\section{Communication channels on the grid}
\subsection{Powerline communication systems and their use}
\subsection{Proprietary wireless systems}
\subsection{Landline IP}
\subsection{IP-based wireless systems}
\subsection{Frequency modulation as a communication channel}
|
|
For our system, we chose grid frequency modulation (henceforth GFC) as a low-bandwidth unidirectional communications
channel. Compared to traditional powerline communication (PLC), GFC requires no additional hardware, works reliably
throughout the grid and is harder for a malicious actor to manipulate.
% FIXME \cite{urtasun01}
|
|
\subsubsection{The frequency dependence of grid frequency}
\subsubsection{Control systems coupled to grid frequency}
\subsubsection{Avoiding dangerous modes}
\subsubsection{Overall system parameters}
\subsubsection{An outline of practical implementation}

\section{From grid frequency to a reliable communications channel}
\subsection{Channel properties}
\subsection{Modulation and its parameters}
\subsection{Error-correcting codes}
\subsection{Cryptographic security}
|
|
\chapter{Practical implementation}
\section{Cryptographic validation}

\section{Data collection for channel validation}
\subsection{Frequency sensor hardware design}
\subsection{Frequency sensor measurement results}

\section{Channel simulation and parameter validation}

\section{Implementation of a demonstrator unit}

\section{Experimental results}

\section{Lessons learned}

\chapter{Future work}
|
\section{Technical standardization}
The description of a safety reset system provided in this work could be translated into a formalized technical standard
with relatively little effort. Our system is very simple compared to e.g.\ a full smart meter communication standard and
thus can conceivably be described in a single, concise document. The much more complicated side of standardization would
be the standardization of the backend operation including key management, coordination and command authorization.
|
|
\section{Regulatory adoption}
Since the proposed system adds significant cost and development overhead with no immediate benefit to either consumer
or utility company, it is unlikely that it would be adopted voluntarily. Market forces limit what long-term planning
utility companies can do. An advanced mitigation such as this one might be out of their reach on their own and might
require regulatory intervention to be implemented. To regulatory authorities, a system such as this one provides a
powerful primitive to guard against attacks. Due to its low-level approach, our system might allow a regulatory
authority to restore meters to a safe state without the need for fine-grained control of implementation details such as
application network protocols.
|
|
A regulatory authority might specify that all smart meters must use a standardized reset controller that, on command,
resets the meter to a minimal firmware image that disables external communication, continues basic billing functions
and enables any disconnect switches. This system would enable the \emph{reset authority} to directly preempt a
large-scale attack irrespective of implementation details of the various smart meter implementations.
|
|
Cryptographic key management for the safety reset system is not much different from the management of highly privileged
signing keys as they are used in many other systems already. If the safety reset system is implemented with a
regulatory authority as the \emph{reset authority}, this authority would likely be able to find a public entity that is
already managing root keys for other government systems to also manage safety reset keys. Availability and security
requirements of safety reset keys do not differ significantly from those for other types of root keys.
|
|
\section{Practical implementation}


\section{Zones of trust}
In our design, we opted for a safety reset controller
% FIXME is "safety reset" the proper name here? We need some sort of branding, but is this here really about "safety"?
in the form of a microcontroller entirely separate from whatever application microcontroller the smart meter design
is already using. This design nicely separates the meter into an untrusted application (the core microcontroller) and
the trusted reset controller. Since the interface between the two is simple and logically one-way, it can be validated
to a high standard of security.
|
|
Despite these security benefits, the cost of such a separate hardware device might prove high in a mass-market rollout.
In this case, one might attempt to integrate the reset controller into the core microcontroller in some way. There are
two main ways to accomplish this. One is a solution that physically integrates an additional microcontroller
core into the main application microcontroller package either as a submodule on the same die or as a separate die in a
multi-chip module (MCM) with the main application microcontroller. A full-custom solution integrating both on a single
die might be a viable path for very large-scale deployments, but will most likely be too expensive in tooling costs
alone to justify its use. More likely for a medium- to large-scale deployment (millions of meters) would be an MCM
integrating an off-the-shelf smart metering microcontroller die with the reset controller running on another, much
smaller off-the-shelf microcontroller die. This solution might potentially save some cost compared to a solution using a
discrete microcontroller for the reset controller.
|
|
The more likely approach to reducing cost overhead of the reset controller would be to employ virtualization
technologies such as ARM's TrustZone in order to incorporate the reset controller firmware into the application firmware
on the same chip without compromising the reset controller's security or disturbing the application firmware's
operation.
|
|
TrustZone is a virtualization technology that provides a hardware-assisted privileged execution domain on at least one
of the microcontroller's cores. In traditional virtualization setups, a privileged hypervisor manages several
unprivileged applications that share resources between them. Separation between applications in this setup is
longitudinal between adjacent virtual machines. Two applications would both be running in unprivileged mode sharing the
same CPU and the hypervisor would merely schedule them, configure hardware resource access and coordinate communication.
This longitudinal virtualization simplifies application development since from the application's perspective the virtual
machine looks very similar to a physical one. In addition, this setup generally isolates the two applications from each
other, with neither one being able to gain control over the other.
|
|
In contrast to this, a TrustZone-like system in general does not provide several application virtual machines and
longitudinal separation. Instead, it provides lateral separation between two domains: the unprivileged application
firmware and a privileged hypervisor. Application firmware may communicate with the hypervisor through defined
interfaces, but due to TrustZone's design it need not even be aware of the hypervisor's existence. This makes it a
perfect fit for our reset controller. The reset controller firmware would be running in privileged mode without exposing
any communication interfaces to application firmware. The application firmware would be running in unprivileged mode
without any modification. The main hurdles to the implementation of a system like this are the requirement for a
microcontroller providing this type of virtualization on the one hand and the complexity of correctly employing this
virtualization on the other hand. Configuring virtualization systems such as TrustZone correctly is still orders of
magnitude more complex than simply using separate hardware and securing the interfaces in between.
|
|
\chapter{Conclusion}

\newpage
\appendix
\chapter{Acknowledgements}
\newpage

\chapter{References}
\nocite{*}
\printbibliography
\newpage

\chapter{Demonstrator schematics and code}

\chapter{Economic viability of countermeasures}
\section{Attack cost}
\section{Countermeasure cost}

% FIXME maybe include a standard for the technical side of a safety reset system here, e.g. in the style of an IETF draft?

\end{document}