masterthesis/thesis/05_testing.tex


								\chapter{Testing}

								\label{cha:testing}

								With the setup described in the prevoius chapter, we created a system which is able to capture and process biometric data.

								The system encapsulates this data into an attestation message and sends it to the PIA which is the DAA verifier in this configuration.

								We show in the following how well each part of the setup work and which performance the tested devices show.

								Furthermore we analyze the footprint in memory as well as on the disk.

								The tests are only applied to System 1 and 3 since System 2 has a comparable hardware configuration to System 3 but uses a CPU of an older generation.

								Furthermore, only two TPMs support the cryptographic parts required for ECDAA.

								We discuss this issue in further detail in \autoref{sec:limitations}.

								Consequently, System 2 was used as DAA verifier since this host does not require a TPM.


								\section{Trusted Boot}

								The first part of the setup is trusted boot which is well integrated in recent releases of kernel and GRUB bootloader.

								Furthermore, unlocking the optional disk encryption with the TPM works seamless with the kernel, even when using the manually generated unified kernel without GRUB.

								Only when performing an update for GRUB, it will modify the entries in the EFI boot manager.

								Consequently, we recommend a check of the boot entries after a system upgrade.


								Having a backup boot option with GRUB is useful for maintenance tasks.

								Especially for modifying the IMA configuration via kernel command line, it may be necessary to boot with a backup kernel.

								Hence, a backup boot process is strongly recommended for test setups.

								This requires a backup password for the disk encryption since which allows to bypass the TPM during booting.

								Otherwise there are no updates possible with the current setup since the affected PCRs are used by the EFI bootloader and cannot be precomputed.


								GRUB already supports trusted boot and activation requires a line in the corresponding config file.

								Unlike GRUB, the unified kernel does not perform any measurements.

								Only when asking the TPM for the disk encryption key, the initramfs must have the TPM stack available.

								These files use about 62\,KB of space in initramfs which is negligible compared with the complete image using about 80\,MB.

								Furthermore, only in this case exists a measurable difference in the boot performace since asking the TPM takes around 4\,s.

								This is the time when the boot process posts \texttt{unlocking via TPM} and stays there until the disk is unlocked.

								The used amount of memory was not tested here, but it should be comparable to the results when generating a TPM key for the member.

								According to these results, we assume that less than 10\,KB are used to hold the TPM stack in memory.


								\section{IMA}

								IMA appears to have an easy setup but a complex set of consequences.

								When setting IMA in fixing or enforcing mode, the system's performance drop is significant.

								We show in \autoref{tab:bootperformance} the boot performance of the used test systems given a setup for a biometric sensor described in \autoref{cha:implementation} with TPM backed disk encryption enabled.


								\begin{table}

								  \renewcommand{\arraystretch}{1.2}

								  \centering

								  \caption{Booting and rebooting times with IMA (n=10)}

								  \label{tab:bootperformance}

								  \begin{tabular}{rl|ccc|ccc}

								    \toprule

								    \textbf{Performance} &&\multicolumn{3}{c|}{\textit{System 1}} &\multicolumn{3}{c}{\textit{System 3}} \\

								    &&\rotatebox{90}{IMA off} &\rotatebox{90}{IMA fix} &\rotatebox{90}{IMA enf} &\rotatebox{90}{IMA off} &\rotatebox{90}{IMA fix} &\rotatebox{90}{IMA enf}\\

								    \midrule

								    \textit{Boot} [s] &min   & & & &0.03 &0.03 & \\

								    &avg   & & & &0.03 &0.03 & \\

								    &max   & & & &0.06 &0.06 & \\\hline

								    \textit{Reboot} [s] &min   & & & &0.35 &0.35 & \\

								    &avg   & & & &0.37 &0.37 & \\

								    &max   & & & &0.37 &0.41 & \\

								    \bottomrule

								  \end{tabular}

								\end{table}


								The boot procedure is an example for a reproducible file intensive process.

								It starts with pressing the power button when the system is in off state and ends when the CLI login mask appears.

								Rebooting starts when the \texttt{reboot} command is entered in the shell.

								Again, the CLI login mask ends the rebooting process.

								Further timing tests comparing the different IMA states are shown in the next section.


								When enabling IMA, the file hashes will be stored as extended file attributes.

								According to the xattr man page\cite{XattrMan2021}, the ext4 file system creates therefore a single block per file where all attributes must find place.

								The block size of the used filesystem is 4\,KB.

								The number of files is determined with the following command

								\begin{lstlisting}[numbers=none]

								  find / -fstype ext4 -type f -uid 0 | wc -l

								\end{lstlisting}

								System 1 holds after setup as Biometric Sensor and several system updates 156947 files.

								This results in estimated additional disk usage of 613\,MB when adding the \texttt{xattr} block for each file.

								Further attributes, however, will not require more disk space on ext4 since every extended attribute must exist in this single block.


								IMA holds all runtime information as virtual files in memory, including the IMA log.

								In context of memory usage, it is clear that the size of the virtual files without the redundant information (like PCR number and log format) is a rough estimation for the memory usage within the kernel.

								Both memory and disk usage do not change between IMA's fixing and enforcing mode since enforcing only adds file access control.


								Saving the file to disk, takes about 2\,MB of size per 10000 lines in the log.

								This indicates a linear dependency to the number of accessed files for the log and the kernel's memory usage.

								The log can easily get over 100000 entries when the system is running long enough.

								System 1, for example, had a log file with 214561 lines after about 15 days of uptime, resulting in about 40\,MB of size.

								Saving that file on disk took several seconds which indicates that extracting the log from the kernel is very inefficient.

								The log file showed that a major impact on the size was the performed system update.

								Generating the iniramfs blob and compiling programs from source generates combined several 10000 lines in the log.


								\section{Processing and Sending Biometric Data}

								Capturing and processing biometric data from the user is quite seamless and the cryptographic part of ECDAA is reliable enough for this prototype.

								During the following tests, all software and hardware parts worked as expected.

								Neither TPM nor software errors were encountered.


								Analyzing disk and memory usage is only meaningful for the DAA member.

								The implemented protoype of the DAA issuer does only negotiate the membership key.

								Revocation lists and group management are not implemented yet, although the ECDAA library provides datastrucutres and functions for that.

								Similarly, the DAA verifier only checks the signature.

								In production use, both entities must hold the revocation list and perform further checks to trust the DAA member.


								We split the tasks of a Digidow sensor in several parts to document the conrtibution of each.

								\begin{itemize}

								  \item \emph{DAA TPM key generation}: Clear the TPM, generate a new EK and DAA key pair and persist the DAA key in the TPM's NVRAM

								  \item \emph{DAA TPM join w/o keygen}: Use the DAA key which is already in place and negotiate a group membership with the DAA issuer.

								  \item \emph{DAA TPM keygen \& join}: This combines the two steps above to give comparable time measurements to the join procedure without TPM.

								  \item \emph{DAA keygen \& join}: Generate the DAA keypair and save it to disk.

								  Join the DAA group by negotiating a secret with the DAA issuer.

								  \item \emph{Digidow sensor capture}: Create an image using \texttt{bs-capture} and save it to disk.

								  \item \emph{Digidow sensor embed}: Extract a face embedding using the tensorflow application \texttt{img2emb}.

								  \item \emph{Digidow sensor collect}: Collect the IMA log and save it to disk.

								  Create a sha512sum of the file and put it together with all PCRs and the face embedding data into one message.

								  Calculate another sha512sum from the message itself and save it to disk.

								  \item \emph{Digidow sensor send}: Sign the message's hash with the TPM DAA key and send it together with the message to the DAA verifier.

								  The verifier saves message and hash for further procedures on its disk.

								\end{itemize}


								First, we look into the memory footprint of each part by executing them via \texttt{valgrind}.

								It measures the allocated heap space in memory which is shown in \autoref{tab:memoryusage}.

								\begin{table}

								  \renewcommand{\arraystretch}{1.2}

								  \centering

								  \caption{Memory usage measured with Valgrind}

								  \label{tab:memoryusage}

								  \begin{tabular}{lrr}

								    \toprule

								    \textit{Task} &\textit{System 1} &\textit{System 3} \\

								    \midrule

								    {DAA TPM key generation} &10,160 &10,160 \\

								    {DAA TPM join w/o keygen} &23,864 &23,864 \\

								    {DAA keygen \& join} &19,296 &19,296 \\

								    {Digidow sensor capture}   &93,703	&93,703 \\

								    {Digidow sensor embed}  &1,318,722,747	&1,385,416,573\\

								    {Digidow sensor collect} &1,115,639 &1,115,597 \\

								    {Digidow sensor send}   &36,072 &36,072 \\

								    \bottomrule

								  \end{tabular}

								\end{table}

								The memory usage is constant over all procedures but creating the DAA message itself.

								This step's memory footprint depends on the size of the files which it summarizes, especially when taking the IMA log into account.

								In this case the memory usage is measured while IMA is off, representing a lower bound of memory usage for this part.

								Besides calculating the face embedding of the captured image, the whole transaction can be executed using about 1.2\,MB of heap memory.

								This would fit on most embedded devices running a Linux kernel.

								However, the face embedding algorithm uses over 1.3\,GB and requres the majority of the computation time as shown below.

								The slight difference between the two systems at the processing part seems to be consistent over several runs.


								\autoref{tab:wholeperformance} shows the time consumption for each relevant task for the Digidow sensor with its minimum, average and maximum results over 10000 runs.

								\begin{table}

								  \renewcommand{\arraystretch}{1.2}

								  \centering

								  \caption{Performance results of joining a DAA group and sending a Digidow transaction message (n=10000)} \label{tab:wholeperformance}

								  \begin{tabular}{ll|ccc|ccc}

								    \toprule

								    \multicolumn{2}{l|}{\textit{Task}} &\multicolumn{3}{c|}{\textit{System 1}} &\multicolumn{3}{c}{\textit{System 3}} \\

								    &&\rotatebox{90}{IMA off} &\rotatebox{90}{IMA fix} &\rotatebox{90}{IMA enf} &\rotatebox{90}{IMA off} &\rotatebox{90}{IMA fix} &\rotatebox{90}{IMA enf}\\

								    \midrule

								    \textit{DAA keygen \& join} [s] &min   &0.03 &0.03 & &0.03 &0.03 &0.03 \\

								    &avg   &0.05 &0.05 & &0.03 &0.03 &0.03 \\

								    &max   &0.07 &0.06 & &0.06 &0.06 &0.28 \\

								    &first &0.04 &0.07 & &0.04 &0.13 &0.04 \\\hline

								    \textit{DAA TPM keygen \& join} [s] &min   &0.33 &0.33 & &0.35 &0.35 &0.35 \\

								    &avg   &0.34 &0.34 & &0.37 &0.37 &0.37 \\

								    &max   &0.34 &0.36 & &0.37 &0.41 &0.40 \\

								    &first &0.37 &0.41 & &0.40 &0.42 &0.35 \\\hline\hline

								    \textit{Digidow sensor capture} [s] &min   &0.92 &0.91 & &0.91 &0.91 &0.91 \\

								    &avg   &1.07 &1.05 & &1.06 &1.06 &1.06 \\

								    &max   &1.14 &1.14 & &1.12 &12.48 &1.12 \\

								    &first &1.36 &1.42 & &1.34 &1.46 &1.45 \\\hline

								    \textit{Digidow sensor embed} [s] &min   &3.48 &3.51 & &4.07 &4.09 &4.10 \\

								    &avg   &3.53 &3.53 & &4.12 &4.14 &4.14 \\

								    &max   &4.11 &4.11 & &4.74 &4.46 &4.53 \\

								    &first &5.41 &19.93 & &5.99 &40.21 &40.23 \\\hline

								    \textit{Digidow sensor collect} [s] &min   &0.07 &0.14 & &0.09 &0.19 &0.19 \\

								    &avg   &0.08 &n/a &n/a &0.10 &n/a &n/a \\

								    &max   &0.09 &n/a &n/a &0.11 &n/a &n/a \\

								    &first &0.09 &0.18 & &0.11 &0.24 &0.25 \\\hline

								    \textit{Digidow sensor send} [s] &min   &0.25 &0.25 & &0.26 &0.27 &0.27 \\

								    &avg   &0.25 &0.26 & &0.28 &0.27 &0.28 \\

								    &max   &0.26 &0.27 & &0.28 &0.29 &0.29 \\

								    &first &0.26 &0.32 & &0.28 &0.40 &0.40 \\\hline\hline

								    \textit{Digidow sensor transaction} [s] &min   &4.75 &4.84 & &5.38 &5.50 &5.49 \\

								    &avg   &4.92 &n/a &n/a &5.56 &n/a &n/a \\

								    &max   &5.52 &n/a &n/a &6.14 &n/a &n/a \\

								    &first &7.12 &21.92 & &7.72 &42.31 &42.33 \\

								    \bottomrule

								  \end{tabular}

								\end{table}

								The \emph{first} run is stated separately since it is done immediately after a system reboot where the resources cached by the kernel are not loaded yet.

								Depending on the number of resources a single step needs, the overtime might be smaller or larger.


								When IMA is enabled, the kernel has to check the hash of each file accessed for reading.

								This hash must be extended into PCR 10 which makes the first run of each part significantly longer.

								Especially the tensorflow application requires significantly more time for the first run.


								With IMA set to enforcing, the kernel furthermore manages access to the file asked to read.

								This decision does not require additional resources compared to the fixing mode.

								The file must be hashed in any case.

								As long as the file \emph{integrity} is intact, PCR 10 and the IMA log file have to be written as well.

								Consequently, the difference between fixing and enforcing mode is to compare the computed filehash with the value in the extended attributes and the decision depending on that result.


								Since IMA measures every loaded resource, the corresponding log file will constantly increase during testing.

								Unfortunately the IMA log is required to collect the data for the Digidow attestation message.

								Furthermore, it is clear that every Digidow transaction contributes to the log since the handover beweent the single tasks is done based on files.

								Consequently, we expected a runtime dependent on the number of runs and the average and maximum runtime in \autoref{tab:wholeperformance} remains unavailable when IMA is enabled.


								The graphs of \autoref{fig:time-digidow-transaction} show the runtime of each of the runs on both tested systems and with IMA in fixing or enforcing mode respectively.

								\begin{figure}

								  \centering

								  \includegraphics[width=0.48\textwidth]{../resources/plots/amd1-ima-fix.png}

								  \includegraphics[width=0.48\textwidth]{../resources/plots/intel2-ima-fix.png}

								  \includegraphics[width=0.48\textwidth]{../resources/plots/amd1-ima-enf.png}

								  \includegraphics[width=0.48\textwidth]{../resources/plots/intel2-ima-enf.png}

								  \caption{Time consumption of a Digidow transaction on the tested systems}

								  \label{fig:time-digidow-transaction}

								\end{figure}

								Each run is split into the four parts of a Digidow transaction.

								The graphs clearly show that our expectation of a linear relation between runtime and number of runs were not satisfied.

								It seems that collecting the IMA log has at least the complexity of $O(n^2)$.

								Furthermore, it is interesting, that System 1 with the newer AMD processor seems to be faster in the beginning.

								When the number of runs reach 10000, the system need significantly more time than System 3 with the Intel processor.

								Since the software setup on both systems is comparable (Kernel version, Linux distro, installed programs, setup with respect to \autoref{cha:implementation}), the reason for this difference can probably be found either in the microarchitectural implementation or in (less) optimized code for the AMD CPU.


								When IMA is in fixing or enforcing mode, the corresponding log will be filled with information about every accessed file.

								The numbers in \autoref{tab:imalogentries} are taken from the IMA log after 10000 Digidow transaction tests.

								IMA was set to enforcing and the DAA member key was already in the TPM.

								\begin{table}

								  \renewcommand{\arraystretch}{1.2}

								  \centering

								  \caption{Number of additional entries in the IMA log}

								  \label{tab:imalogentries}

								  \begin{tabular}{lrr}

								    \toprule

								    \textit{no. of entries} &\textit{System 1} &\textit{System 3} \\

								    \midrule

								    \textit{Root login after boot}	&1912 \\

								    \textit{Digidow sensor capture}   &5 \\

								    \textit{Digidow sensor embed}	&2561 \\

								    \textit{Digidow sensor collect} &6 \\

								    \textit{Digidow sensor send}   &12 \\

								    \textit{Every other Digidow transaction}   &5 \\

								    \bottomrule

								  \end{tabular}

								\end{table}


								This means---given that the (very slow) hardware TPM had to extend PCR 10 for every line in the log---the slowdown is mainly caused by the interaction with the TPM itself.


								Since the IMA log file is also essential for remote attestation, the information of this file must be transmitted to the DAA verifier.


								\begin{itemize}

								  \item payload without IMA log about 15KB

								  \item No encryption for payload, but doable -- depends on the way how Sensor and PIA can communicate together

								  \item IMA log much too large

								  \item Test results how long the process of capturing takes -- with and without IMA

								\end{itemize}


								\section{Limitations}

								\label{sec:limitations}

								\begin{itemize}

								  \item older TPM does not support ECDAA

								  \item Documentation available for TPM APIs, but no changelog for \texttt{tpm2-tools}.

								  \item Trusted boot and IMA can just handle static resources like files, kernel modules and firmware of hardware components.

								  Code transmitted over network or otherwse dynamically generated can not be recognized.

								  This is an open door for non-persistent attacks.

								  \item Documentation on IMA is mostly outdated and so are some tools.

								  Further customization of rules may be useful to reduce log size.

								  However major Linux distributions support IMA by default on recent releases.

								  \item Complexity of verifying system state is too high and is connected to system complexity.

								  Reducing number of dependencies and relevant file count is key for this problem.

								  \item Implemented DAA does not support a full dynamic group scheme.

								  This might be useful in the future, maybe with a custom implementation of a recent DAA version.

								\end{itemize}


								During these memory tests, valgrind showed a large number of possible memory leaks in the python binary itself.

								The following example is executed:

								\begin{lstlisting}[numbers=none]

								  root@amd1:~/jetson-nano-facerecognition# valgrind python3 img2emb.py data/test-images/test2.jpg

								\end{lstlisting}

								Valgrind ends with the report as follows:

								\begin{lstlisting}[numbers=none]

								==1648== HEAP SUMMARY:

								==1648==     in use at exit: 32,608,730 bytes in 227,287 blocks

								==1648==   total heap usage: 810,162 allocs, 582,875 frees, 1,385,416,573 bytes allocated

								==1648==

								==1648== LEAK SUMMARY:

								==1648==    definitely lost: 3,144 bytes in 28 blocks

								==1648==    indirectly lost: 0 bytes in 1 blocks

								==1648==      possibly lost: 523,629 bytes in 12,842 blocks

								==1648==    still reachable: 32,081,957 bytes in 214,416 blocks

								==1648==                       of which reachable via heuristic:

								==1648==                         stdstring          : 537,414 bytes in 11,917 blocks

								==1648==                         newarray           : 8,920 bytes in 5 blocks

								==1648==         suppressed: 0 bytes in 0 blocks

								==1648== Rerun with --leak-check=full to see details of leaked memory

								==1648==

								==1648== Use --track-origins=yes to see where uninitialised values come from

								==1648== For lists of detected and suppressed errors, rerun with: -s

								==1648== ERROR SUMMARY: 58173 errors from 914 contexts (suppressed: 0 from 0)

								\end{lstlisting}

								This report shows that the Python binary (here Python 3.8 from Ubuntu 20.04) is not memory safe, which is a significant drawback for the system and software integrity.

								Any binary which is directly involved in the DAA protocol frees every allocated block.

								Furthrmore any binary in the TPM2 software stack is memory safe according to valgrind.

								The used shell commands may not free every allocated block, however valgrind still finds no errors in these programs.


								When IMA is set to enforcing, some unexpected problems appeared during updating Ubuntu.

								During \texttt{apt upgrade}, the package manager downloads the deb packages into its cache folder at  \texttt{/var/cache/apt/}.

								These files, however, do not have the \texttt{security.ima} attribute when the download is finished.

								The kernel prevents due to theses missing attributes any access and breaks the upgrade process.

								It is not clear why the files are not hashed although apt is run by root and every file created by root must be hashed with the active IMA policy.

								Creating a text file via text editor shows exactly this behaviour.


								The missing attributes are also a problem when attemping to turn IMA off again.

								This requires an updated command line and hence an updated unified kernel.

								Generating the new kernel work fine but moving the blob into the EFI partition fails due to the missing support of extended file attributes.

								The copy process seems to create the file, but it fails when trying to write the first chunks.

								As a result, the only way to update the kernel on this system is to boot a backup kernel and moving the file without IMA set to enforcing.

								Applying customized IMA rules might solve both problems.


								Another relevant part for attestation is to use the IMA log file do recalculate the produces hashes.

								This means on one hand recalculating the hash in the IMA log entry.

								On the other hand the chain of hashes should result into the value held in PCR 10.

								A small script shown in \autoref{code:verifyimash} tries to recalculate this value.

								\begin{lstlisting}[numbers=none,float, caption={Attempt to recalculate the value of PCR 10}, label={code:verifyimash}]

								  #!/usr/bin/bash

								  set -e

								  tpm2_pcrreset 16

								  cut -d ' ' -f 2 /sys/kernel/security/ascii_runtime_measurements > ima.hashes

								  while read i ;

								  do tpm2_pcrextend 16:sha1=$i ;

								  done < ima.hashes

								  tpm2_pcrread sha1:10,16

								\end{lstlisting}

								It uses the debug register PCR 16 which is resetable without reboot and has the same initial value as the first 10 PCRs.

								When IMA is off, the log holds only one entry of the boot aggregate.

								Then the SHA1 value can be computed with that script.

								However, comprehending the PCR 10 value with IMA enabled was not possible.

								Our tests took into account that PCR and log file could be modified when loading programs to read these resources the first time.

								Loading the log several times eventually ends up in a stable log and PCR value (it does not change anymore even when reading the log another time).

								The value of PCR 10 was still not reproducible.


								Furthermore the documentation of calculating these vaules did not mention how the sha256 hash in PCR 10 is calculated.

								\texttt{tpm2\_pcrextend} requires a sha256 hash as input for the corresponding PCR bank, but the IMA log only provides sha1 hashes.

								Any verification procedures regarding the sha256 bank of PCR 10 are currently not implemented.