\chapter{Concept}
\label{cha:concept}
In this chapter we define the constraints for the \emph{Biometric Sensor} (BS) and outline a generic
approach for a prototype.
The constraints include a discussion of the attack vectors against the BS.
We explain which requirements can and will be addressed and how sensitive data is
processed in the BS.

\section{Definition of the Biometric Sensor}
\label{sec:definition-of-the-biometric-sensor}
The BS is defined as an edge device within the Digidow network.
According to the schema shown in \autoref{fig:globalview}, the BS will be placed in a public area
(e.g. a checkpoint in an airport or an access control system at a building) to interact directly
with the Digidow users.
There, the BS acts as the interface to the Digidow network.
By providing a biometric property, users should be able to authenticate themselves, and the network may then trigger the desired action, such as granting access or logging presence.
Depending on the biometric property, the sensor may not be active all the time, but only activated when an authentication process is started.

The following enumeration shows the steps of the BS for identifying the interacting person.
\begin{enumerate}
	\item \emph{Listen}: Either the sensor hardware itself (e.g. a detection in a fingerprint sensor) or another electrical signal will start the authentication process.
	\item \emph{Collect}: Measure sensor data (picture, fingerprint) and calculate a biometric
	representation (attribute).
	\item \emph{Discover}: Start a network discovery in the Digidow network and find the PIA
	corresponding to the present person. It may be necessary to interact with more than one PIA
	within this and the next steps.
	\item \emph{Transmit}: Create a trusted and secure channel to the PIA and transmit the attribute.
	\item \emph{Reset}: Restore the system to the state it was in before this transaction.
\end{enumerate}
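
The cycle above can be sketched as a small state machine.
The following Python fragment is only an illustration under stated assumptions: the names \texttt{Step}, \texttt{Sensor} and \texttt{advance} are hypothetical and not part of any Digidow specification, and the sensor, discovery and transport logic is reduced to placeholders.

```python
from enum import Enum, auto
from typing import Optional


class Step(Enum):
    LISTEN = auto()
    COLLECT = auto()
    DISCOVER = auto()
    TRANSMIT = auto()
    RESET = auto()


class Sensor:
    """Hypothetical model of one BS transaction cycle."""

    def __init__(self) -> None:
        self.step = Step.LISTEN
        self.attribute: Optional[bytes] = None  # biometric representation
        self.pia: Optional[str] = None          # address of the discovered PIA

    def advance(self, data: Optional[bytes] = None) -> Step:
        """Perform the current step and move on to the next one."""
        if self.step is Step.LISTEN:
            self.step = Step.COLLECT      # a trigger signal arrived
        elif self.step is Step.COLLECT:
            self.attribute = data         # placeholder for feature extraction
            self.step = Step.DISCOVER
        elif self.step is Step.DISCOVER:
            self.pia = "pia.example"      # placeholder for network discovery
            self.step = Step.TRANSMIT
        elif self.step is Step.TRANSMIT:
            self.step = Step.RESET        # placeholder for the secure transfer
        else:  # Step.RESET
            self.attribute = None         # drop all transaction data
            self.pia = None
            self.step = Step.LISTEN
        return self.step
```

The point of the sketch is the last branch: after a transaction, no attribute and no PIA address remain in the sensor state.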

Since the BS handles biometric data, which must be kept confidential outside the defined use cases, a number of potential threats must be considered when designing the BS.

\section{Attack Vectors and Threat Model}
As mentioned before, the BS will work in an exposed environment.
Neither the user providing biometric data nor the network environment should be assumed to behave correctly.
There should only be a connection to the Digidow network for transmitting the recorded data.
This assumption of autonomy makes the BS independent of the presumably diverse target environments and use cases.

In addition to autonomy, the BS should also ensure proper handling of received and generated data.
The recorded dataset from a sensor is \emph{sensitive data} due to its ability to identify an individual.
Because sensitive data is narrowly defined, protecting it is feasible.
In contrast, \emph{metadata} is information generated during the whole transaction phase.
Timestamps, host information, connection lists, hashes, log entries
and much more (What? Where? When?) are metadata.
There exists no exact definition or list of metadata, which makes it hard to prevent any exposure of it.
Metadata does not directly identify an individual.
However, large network providers are able to combine large amounts of metadata into traces of individuals.
Eventually, an action of such a traced individual might reveal their identity.
Consequently, a central goal of Digidow is to minimize the amount of metadata in order to minimize the risk of such traces.

Privacy is the ability of individuals to keep information about themselves private from others.
In the context of the BS, this relates to the recorded biometric data.
Furthermore, to prevent tracking, no interaction with a sensor should be linkable to personal
information.
Only the intended and trusted way of identification within the Digidow network should be possible.

\subsection{Threat Model}
\label{ssec:threatmodel}

To fulfill the sensor's use case, we need to consider the following attack vectors:
\begin{itemize}
	\item \emph{Rogue Hardware Components}: Modified components of the BS could, depending on their
	contribution to the system, collect data or create a gateway to the internal processes of the
	system.
		Even if the manufactured hardware itself is sound, the firmware on it may act maliciously.
		This threat concerns the manufacturing and installation of the system.
	\item \emph{Hardware Modification}: Similar to rogue hardware components, the system could be modified in the target environment by attaching additional hardware.
		With this attack, adversaries may get direct access to memory or to data transferred from or to
		attached devices.
	\item \emph{Metadata Extraction}: The actual sensor, such as a camera or fingerprint sensor, is usually attached via USB or a similar cable connection.
		It is possible to log the protocol of those attached devices via a man-in-the-middle attack on
		the USB cable.
	\item \emph{Attribute Extraction}: The actual sensor, such as a camera or fingerprint sensor, is
	usually attached via USB or a similar cable connection.
		It is possible to log the protocol of those attached devices by wiretapping the USB cable.
		With that attack, an adversary is able to directly access the attributes that identify individuals.
	\item \emph{Modification or aggregation of sensitive data within the BS}: The program which prepares
	the sensor data for transmission could modify the data before sealing it.
		The program could also simply store the sensitive data for other purposes.
	\item \emph{Metadata extraction on network}: During transmission of data from the sensor into the
	Digidow network, there will be some metadata generated.
		An adversary could use these datasets to generate tracking logs and eventually match these logs
		to individuals.
	\item \emph{Replay of sensor data by a rogue BS}: By retransmitting recorded sensor data, the
	authentication of an individual could be proven again.
		Any grants provided to the successfully identified individual could then be given to another
		person.
	\item \emph{Rogue Biometric Sensor blocks transmission}: By blocking any transmission of sensor data, any transaction within the Digidow network could be blocked and therefore the whole authentication process is stopped.
	\item \emph{Rogue Personal Identity Agent}: A rogue PIA might receive the sensor data instead of the honest one.
		As a result, a wrong identity and therefore false claims could be derived from the sensor data.
\end{itemize}

\section{Prototype Concept}%
\label{sec:prototype_concept}
Given the threat model and the use cases described in \autoref{sec:definition-of-the-biometric-sensor}, we will introduce a prototype which will address many of the defined requirements.
Any threats addressing the physical integrity of the BS will, however, be omitted.
These threats can be addressed with physical intrusion and vandalism protection such as that available for ATMs.
We will instead focus on the integrity of the system when the BS is operating.

\subsection{Integrity and Trust up to the Kernel}%
\label{sub:integrity_and_trust_up_to_the_kernel}

We decided to use the PC platform as the hardware base for the prototype.
Many different form factors are available, and the system can be extended with a broad variety of sensors.
Furthermore, the platform supports TPMs, which enables integrity analysis of the system.
Finally, the platform can run almost all Linux variants and supports the relevant pieces of software for this project.
A flavor of Linux supporting all features described in this chapter will be used as the OS platform.
The ARM platform seems to be capable of all these features as well; however, TPM support, the amount of available software and the ease of installation are better on the PC platform.

As described in \autoref{sec:trusted_platform_module_tpm_}, the TPM functions can be delivered in three different flavors: as a dedicated device, as a mounted device, or as part of the processor's firmware (fTPM).
The fTPM is part of a large proprietary environment from AMD or Intel, which introduces, besides implementation flaws, additional attack surfaces for the TPM.
Hence, we will use dedicated, pluggable TPM chips on the platform to retain the most control over the functionality.

Any recent PC platform supports TPMs and consequently Trusted Boot as mentioned in \autoref{sec:trusted_boot}.
The system will describe its hardware state in the PCRs 0\,--\,7 when the UEFI\,/\,BIOS hands over to the bootloader.
We use these PCR values to detect any unauthorized modifications on hardware or firmware level.
It is important to also include \emph{empty} PCRs to detect, for example, added hardware on the PCI bus with an Option ROM.

With these PCR values we can seal a passphrase in the TPM.
The disk, secured with Full Disk Encryption (FDE), can then only be accessed when the hardware underneath has not been tampered with.
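
The following Python sketch illustrates why PCR values bind the sealed passphrase to the platform state.
It models the TPM extend operation for a SHA-256 bank, in which the new PCR value is the hash of the old value concatenated with the digest of the measured component.
The helper names \texttt{pcr\_extend}, \texttt{measure} and \texttt{unseal} are hypothetical, and \texttt{unseal} only imitates the PCR policy check; a real TPM enforces this check inside the chip.

```python
import hashlib
from typing import Optional

PCR_RESET = bytes(32)  # PCRs 0-15 of a SHA-256 bank start as all zeros


def pcr_extend(pcr: bytes, component: bytes) -> bytes:
    """New PCR value = SHA-256(old PCR value || SHA-256(component))."""
    digest = hashlib.sha256(component).digest()
    return hashlib.sha256(pcr + digest).digest()


def measure(components: list[bytes]) -> bytes:
    """Extend a freshly reset PCR with every component, in boot order."""
    pcr = PCR_RESET
    for component in components:
        pcr = pcr_extend(pcr, component)
    return pcr


def unseal(secret: bytes, sealed_to: bytes, current_pcr: bytes) -> Optional[bytes]:
    """Imitate a PCR policy: release the secret only for the expected value."""
    return secret if current_pcr == sealed_to else None
```

A PCR that is never extended keeps its reset value.
Sealing against such an empty PCR as well therefore means that any later extend, for example by an added Option ROM, changes the value and prevents unsealing.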

To further reduce the attack surface, the prototype will not use a bootloader like GRUB.
Instead, the kernel should be run directly from the UEFI\,/\,BIOS.
To this end, the kernel is packed directly into an EFI file, together with its command line
parameters and the initial file system for booting.
This \emph{Unified Kernel} is measured directly by the UEFI\,/\,BIOS and is also capable of decrypting the disk, given the correct PCR values.

This setup starts with two sources of trust that are formally defined:
\begin{itemize}
	\item \emph{TPM}: The TPM acts as a certified Root of Trust for holding the PCRs and for the cryptographic function that modifies them.
	\item \emph{RTM}: The Root of Trust for Measurement is part of the mainboard's firmware.
		This small program just measures all parts of the firmware and feeds the TPM with the results.
		However, the program is maintained by the mainboard manufacturer and the source is not available to the public.
		We have to trust that this piece of software is working correctly.
\end{itemize}
We implicitly assume that the CPU, executing all these instructions and interacting with the TPM, is working correctly.

All parts contributing to the boot phase will be measured into one of the PCRs before any instruction is executed.
Decrypting the disk can then be interpreted as an authorization procedure against the encrypted disk.
Consequently, only a \emph{known} kernel with a \emph{known} hardware and firmware setup underneath
can access the disk and finish the boot process in the OS.

The disk encryption is, however, only an optional feature which can be omitted in a production
environment when there is no sensitive data on the disk that must not be revealed to the public.
The system needs to check its integrity on the OS level and summarize the result by publishing an attestation message before any transaction data is used.

\begin{figure}
	\centering
	\includegraphics[width=0.8\linewidth]{../resources/measurements.pdf}
	\caption{Extending trust from the Roots of Trust up to the kernel}%
	\label{fig:measuements}
\end{figure}

\autoref{fig:measuements} illustrates how the above processes extend trust in the system.
The TPM is the cryptographic root of trust, storing all measurement results and the target values for validation.
Since the RTM is the only piece of code that lives in the platform firmware and is executed \emph{before} it is measured, it is an important part of the trust architecture of the system.
An honest RTM will measure the binary representation of itself, which makes the code at least verifiable afterwards.
Finally, the CPU is assumed to execute all the code according to its specification.
Proving correctness of the instruction set cannot be done during the boot process.

When the roots of trust are honest, the trusted environment can be constructed from the PCR measurements while the platform boots.
We then obtain a system in which all active parts of the boot process are trusted up to the Linux kernel with its extensions and execution parameters.

\subsection{Integrity and Trust on OS Level}%
\label{sub:integrity_and_trust_on_os_level}

With the trusted kernel and IMA, we can include the file system into the trusted environment.
According to \autoref{sec:integrity_measurement_architecture}, every file will be hashed once IMA is activated and configured accordingly.
By enforcing IMA, the kernel allows access to only those files having a valid hash.
Consequently, every file which is required for proper execution needs to be hashed before IMA is enforced.
The IMA policy in place should be \texttt{appraise\_tcb}, to analyze kernel modules, executable memory mapped files, executables and all files opened by root for reading.
This policy should also cover drivers and kernel modules for external hardware such as a camera attached via USB.

\subsection{Proving Trust with DAA}%
\label{sub:prove_trust_with_daa}

The features described above take care of building a trusted environment on the system level.
DAA will take care of demonstrating this \emph{trust} to a third party which has no particular knowledge about the BS.
In the Digidow context, the PIA should receive, together with the biometric measurements, a proof that
the BS is a trusted system acting honestly.

To reduce the complexity of this problem, we make two assumptions:
\begin{enumerate}
	\item \emph{Network Discovery}: The PIA is already identified over the Digidow network and there
	exists a bidirectional channel between BS and PIA.
	\item \emph{Secure Communication Channel}: The bidirectional channel is assumed to be hardened against wiretapping, metadata extraction and tampering.
		The prototype will take no further action to encrypt any payload besides the cryptographic features that come along with DAA itself.
\end{enumerate}
The DAA protocol should be applied on a simple LAN, where all parties are connected locally.
The BS will eventually become a member of the group of sensors managed by the Issuer.
During signup, Issuer and BS (Member) negotiate the membership credentials over the network.
Once the BS is a member of the DAA group, the Issuer fully trusts that the BS is honest and acts according to the specification.
The Issuer will not check group members afterwards, since they now act independently of the Issuer.

When the BS then authenticates an individual, the process illustrated in \autoref{fig:daa-attestation} will be executed.
\begin{figure}
	\centering
	\includegraphics[width=0.7\textwidth]{../resources/tpmattest}
	\caption[DAA Attestation procedure]{The DAA attestation process requires five steps. The PIA may trust the Biometric Sensor afterwards.}
	\label{fig:daa-attestation}
\end{figure}
\begin{enumerate}
	\item The PIA obtains the public key of the BS group once, independently of any transaction.
	\item During the transaction, the PIA will eventually ask the BS for attestation together with a \texttt{nonce}.
	\item The BS will collect the PCR values, the integrity log and the \texttt{nonce} into an attestation message signed with the Member SK.
	\item The attestation message will be sent back to the PIA.
	\item The PIA checks the signature of the message, checks the entries of the integrity log against known values, and verifies the PCR values accordingly.
\end{enumerate}
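
The role of the \texttt{nonce} in steps 2 to 5 can be illustrated with a short Python sketch.
It is a simplification under explicit assumptions: the DAA signature is replaced by an HMAC with a shared key, which is \emph{not} a group signature and provides no anonymity; it merely stands in for signing with the Member SK and verifying with the group public key.
All function names and the JSON message layout are hypothetical, and the comparison of PCR values and log entries against known values is left out.

```python
import hashlib
import hmac
import json
import secrets


def new_nonce() -> bytes:
    """PIA side: a fresh random challenge for every transaction."""
    return secrets.token_bytes(16)


def sign(key: bytes, payload: bytes) -> bytes:
    """Stand-in for a DAA signature (assumption: HMAC-SHA256, shared key)."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def attest(key: bytes, nonce: bytes, pcrs: dict[str, str], log: list[str]) -> dict:
    """BS side: bind PCR values, integrity log and nonce into one signed message."""
    payload = json.dumps(
        {"nonce": nonce.hex(), "pcrs": pcrs, "log": log}, sort_keys=True
    ).encode()
    return {"payload": payload, "signature": sign(key, payload)}


def verify(key: bytes, nonce: bytes, message: dict) -> bool:
    """PIA side: check the signature first, then the freshness of the nonce."""
    expected = sign(key, message["payload"])
    if not hmac.compare_digest(expected, message["signature"]):
        return False
    return json.loads(message["payload"])["nonce"] == nonce.hex()
```

Because the nonce is part of the signed payload, an attestation message recorded in an earlier transaction fails verification against a new nonce.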

\autoref{fig:chainoftrust} shows how the sources of trust will be represented in the final attestation message.
\begin{figure}
	\centering
	\includegraphics[width=0.8\linewidth]{../resources/chainoftrust.pdf}
	\caption{Overview of the Chain of Trust of the BS}%
	\label{fig:chainoftrust}
\end{figure}
The four sources of trust are defined as groups which deliver parts of the prototype, but cannot be verified on a cryptographic level.
Hence, suppliers must be manually added to these groups by using a well-defined check for trustworthiness.
Any TPM manufacturer has to implement the well-defined standard of the TCG.
There exists, however, no such exact definition for hardware and firmware parts of the platform.
Consequently, these parts should undergo a functional analysis before they are trusted.
Trust means that, once the platform is deemed trustworthy, the corresponding PCR values should be published.

The same procedure should be applied to the kernel, the OS environment in use and, of course, the
software in use.
There, only the kernel with its parameters has a corresponding PCR value.
Furthermore, a hash value should be published for any relevant file on the file system.

We can then build a cryptographic representation of the chain of trust in \autoref{fig:chainoftrust}.
The TPM holds a certificate signed by its manufacturer, which vouches for its Endorsement Key (EK).
When all of the above checks of platform, OS and TPM succeed, the DAA Issuer will assign the platform to the group of trusted BS.
The BS now has a member SK for signing its attestation message.

The Verifier can now check the membership by verifying the signature of the message against the Issuer's PK.
Furthermore, it can check the state of the platform by comparing the PCR values against known values.
Finally, it can check the integrity of the running software by checking the hashes in the IMA log against known values.
PCR 10 therefore represents the end of the hash chain fed by the IMA log entries.
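
How the IMA log and PCR 10 fit together can be sketched in Python.
The fragment assumes a strongly simplified log in which each entry is a pair of file path and SHA-256 file hash, and in which PCR 10 is extended with the file hash directly; the real IMA measurement list uses template entries with additional fields.
The names \texttt{replay} and \texttt{verify\_log} are hypothetical.

```python
import hashlib

Entry = tuple[str, bytes]  # (file path, file hash); simplified log entry


def replay(entries: list[Entry]) -> bytes:
    """Recompute the hash chain that PCR 10 should hold after all entries."""
    pcr = bytes(32)  # reset value of the PCR
    for _path, file_hash in entries:
        pcr = hashlib.sha256(pcr + file_hash).digest()
    return pcr


def verify_log(entries: list[Entry], pcr10: bytes, known: dict[str, bytes]) -> bool:
    """Accept the log only if it matches PCR 10 and every hash is known."""
    if replay(entries) != pcr10:
        return False  # log was truncated, reordered or altered
    return all(known.get(path) == file_hash for path, file_hash in entries)
```

The first check ties the log to the signed PCR value, so entries cannot be removed or reordered unnoticed; the second check compares each reported hash with the published value for that file.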

If all values are good, the BS can be trusted and the Digidow transaction can be continued at the
PIA.