
added final tests

master
Michael Preisach 4 years ago
parent
commit
a30ee11be8
Changed files:
  1. resources/plots/.~lock.stats.ods# (1 line)
  2. resources/plots/amd1-ima-enf.dat (10001 lines)
  3. resources/plots/amd1-ima-enf.png (BIN)
  4. resources/plots/stats.ods (BIN)
  5. thesis/05_testing.tex (242 lines)
  6. thesis/06_conclusion.tex (22 lines)
  7. thesis/MAIN.pdf (BIN)

resources/plots/.~lock.stats.ods# (1 line)

@@ -0,0 +1 @@
,michael,luna,25.09.2021 10:56,file:///home/michael/.config/libreoffice/4;

resources/plots/amd1-ima-enf.dat (10001 lines)

File diff suppressed because it is too large

resources/plots/amd1-ima-enf.png (BIN)

Binary file not shown.

Image size: 129 KiB before, 188 KiB after.

resources/plots/stats.ods (BIN)

Binary file not shown.

thesis/05_testing.tex (242 lines)

@@ -89,11 +89,11 @@ Capturing and processing biometric data from the user is quite seamless and the
During the following tests, all software and hardware parts worked as expected.
Neither TPM nor software errors were encountered.
Analyzing disk and memory usage is only meaningful for the DAA member.
Analyzing the occupied resources is only meaningful for the DAA member.
The implemented prototype of the DAA issuer only negotiates the membership key.
Revocation lists and group management are not implemented yet, although the ECDAA library provides data structures and functions for them.
Similarly, the DAA verifier only checks the signature.
In production use, both entities must hold the revocation list and perform further checks to trust the DAA member.
Similarly, the DAA verifier only checks the signature of the received message.
In a production setup, both entities must hold the revocation list and perform further checks to trust the DAA member and its messages.
We split the tasks of a Digidow sensor into several parts to document the contribution of each.
\begin{itemize}
@@ -111,36 +111,76 @@ We split the tasks of a Digidow sensor in several parts to document the conrtibu
The verifier saves the message and its hash on its disk for further processing.
\end{itemize}
First, we look into the memory footprint of each part by executing them via \texttt{valgrind}.
It measures the allocated heap space in memory which is shown in \autoref{tab:memoryusage}.
\subsection{Disk Usage}
In this early stage of the prototype, statistics about how much disk space is required to run this setup are not very useful.
All programs besides the face embedding application are rather small; the binaries themselves are less than 100\,kB in size.
The installation process is still manual, requiring a local build environment for C and Rust.
Furthermore, the programs require a number of dependencies which need to be installed with the package manager.
Hence, neither the size of the executables nor the total disk occupation is informative for production estimates.
Similarly, the face embedding application should be seen as an example of a biometric sensor, making a detailed discussion of its time and space efficiency less meaningful.
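For completeness, the sizes quoted above can be checked directly on the device; the install path below is only a placeholder, the face recognition checkout path is the one used in the later Valgrind example:
\begin{lstlisting}[numbers=none]
root@amd1:~# ls -lh /usr/local/bin/               # individual helper binaries
root@amd1:~# du -sh ~/jetson-nano-facerecognition # face embedding checkout
\end{lstlisting}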
\subsection{Memory Usage}
First, we look into the memory footprint of each part by executing it via \texttt{/usr/bin/time}.
It measures the maximum resident set size in memory during the process lifetime, which includes the stack, heap and data sections.
\autoref{tab:memoryusage} shows the maximum usage of each task during 10000 runs in the different IMA configurations \texttt{off}, \texttt{fix}, and \texttt{enforcing}.
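As a minimal sketch of how such a value is obtained (the binary name is only a placeholder), GNU time can print the maximum resident set size directly:
\begin{lstlisting}[numbers=none]
# %M prints the maximum resident set size in kB
root@amd1:~# /usr/bin/time -f "%M" ./sensor-capture
\end{lstlisting}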
\begin{table}
\renewcommand{\arraystretch}{1.2}
\centering
\caption{Memory usage measured with Valgrind}
\caption{Maximum memory usage measured with \texttt{/usr/bin/time} in KB}
\label{tab:memoryusage}
\begin{tabular}{lrr}
\toprule
\textit{Task} &\textit{System 1} &\textit{System 3} \\
\midrule
{DAA TPM key generation} &10,160 &10,160 \\
{DAA TPM join w/o keygen} &23,864 &23,864 \\
{DAA keygen \& join} &19,296 &19,296 \\
{Digidow sensor capture} &93,703 &93,703 \\
{Digidow sensor embed} &1,318,722,747 &1,385,416,573\\
{Digidow sensor collect} &1,115,639 &1,115,597 \\
{Digidow sensor send} &36,072 &36,072 \\
{DAA keygen \& join} &2,000 &2,024 \\
{DAA TPM keygen \& join} &2,344 &2,324 \\
{Digidow sensor capture} &3,748 &3,688 \\
{Digidow sensor embed} &809,616 &847,712\\
{Digidow sensor collect} &5,128 &5,172 \\
{Digidow sensor send} &2,604 &2,628 \\
\bottomrule
\end{tabular}
\end{table}
The memory usage is constant over all procedures but creating the DAA message itself.
This step's memory footprint depends on the size of the files which it summarizes, especially when taking the IMA log into account.
In this case the memory usage is measured while IMA is off, representing a lower bound of memory usage for this part.
Besides calculating the face embedding of the captured image, the whole transaction can be executed using about 1.2\,MB of heap memory.
The memory allocation is constant for all parts in this table.
Besides calculating the face embedding of the captured image, the whole transaction can be executed using a few megabytes of heap memory.
This would fit on most embedded devices running a Linux kernel.
However, the face embedding algorithm uses over 1.3\,GB and requires the majority of the computation time as shown below.
The slight difference between the two systems at the processing part seems to be consistent over several runs.
However, the face embedding algorithm uses over 800\,MB and requires the majority of the computation time as shown below.
\autoref{tab:wholeperformance} shows the time consumption for each relevant task for the Digidow sensor with its minimum, average and maximum results over 10000 runs.
\subsection{Memory Safety}
During these memory tests, Valgrind showed a large number of possible memory leaks in the Python binary itself.
The following command was executed:
\begin{lstlisting}[numbers=none]
root@amd1:~/jetson-nano-facerecognition# valgrind python3 img2emb.py data/test-images/test2.jpg
\end{lstlisting}
Valgrind ends with the following report:
\begin{lstlisting}[numbers=none]
==1648== HEAP SUMMARY:
==1648== in use at exit: 32,608,730 bytes in 227,287 blocks
==1648== total heap usage: 810,162 allocs, 582,875 frees, 1,385,416,573 bytes allocated
==1648==
==1648== LEAK SUMMARY:
==1648== definitely lost: 3,144 bytes in 28 blocks
==1648== indirectly lost: 0 bytes in 1 blocks
==1648== possibly lost: 523,629 bytes in 12,842 blocks
==1648== still reachable: 32,081,957 bytes in 214,416 blocks
==1648== of which reachable via heuristic:
==1648== stdstring : 537,414 bytes in 11,917 blocks
==1648== newarray : 8,920 bytes in 5 blocks
==1648== suppressed: 0 bytes in 0 blocks
==1648== Rerun with --leak-check=full to see details of leaked memory
==1648==
==1648== Use --track-origins=yes to see where uninitialised values come from
==1648== For lists of detected and suppressed errors, rerun with: -s
==1648== ERROR SUMMARY: 58173 errors from 914 contexts (suppressed: 0 from 0)
\end{lstlisting}
This report shows that the Python binary (here Python 3.8 from Ubuntu 20.04) is not memory safe, which is a significant drawback for system and software integrity.
Every binary directly involved in the DAA protocol frees every allocated block.
Furthermore, every binary in the TPM2 software stack is memory safe according to Valgrind.
The shell commands used may not free every allocated block; however, Valgrind still finds no errors in these programs.
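These claims can be re-checked with a full leak report; the binary name below is only a placeholder for the respective DAA or TPM2 program:
\begin{lstlisting}[numbers=none]
root@amd1:~# valgrind --leak-check=full ./daa-member-join
\end{lstlisting}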
\subsection{Performance}
\autoref{tab:wholeperformance} shows the time consumption for each task with its minimum, average and maximum results over 10000 runs.
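The following sketch shows how such timings can be gathered; the loop, binary name and output file are illustrative only:
\begin{lstlisting}[numbers=none]
# append the elapsed wall-clock time of each run to a file
for i in $(seq 1 10000); do
  /usr/bin/time -f "%e" -a -o capture.times ./sensor-capture
done
# compute minimum, average and maximum over all runs
awk 'NR==1{min=max=$1} {sum+=$1; if($1<min)min=$1; if($1>max)max=$1}
     END{printf "min %.2f avg %.2f max %.2f\n", min, sum/NR, max}' capture.times
\end{lstlisting}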
\begin{table}
\renewcommand{\arraystretch}{1.2}
\centering
@@ -150,42 +190,40 @@ The slight difference between the two systems at the processing part seems to be
\multicolumn{2}{l|}{\textit{Task}} &\multicolumn{3}{c|}{\textit{System 1}} &\multicolumn{3}{c}{\textit{System 3}} \\
&&\rotatebox{90}{IMA off} &\rotatebox{90}{IMA fix} &\rotatebox{90}{IMA enf} &\rotatebox{90}{IMA off} &\rotatebox{90}{IMA fix} &\rotatebox{90}{IMA enf}\\
\midrule
\textit{DAA keygen \& join} [s] &min &0.03 &0.03 & &0.03 &0.03 &0.03 \\
&avg &0.05 &0.05 & &0.03 &0.03 &0.03 \\
&max &0.07 &0.06 & &0.06 &0.06 &0.28 \\
&first &0.04 &0.07 & &0.04 &0.13 &0.04 \\\hline
\textit{DAA TPM keygen \& join} [s] &min &0.33 &0.33 & &0.35 &0.35 &0.35 \\
&avg &0.34 &0.34 & &0.37 &0.37 &0.37 \\
&max &0.34 &0.36 & &0.37 &0.41 &0.40 \\
&first &0.37 &0.41 & &0.40 &0.42 &0.35 \\\hline\hline
\textit{Digidow sensor capture} [s] &min &0.92 &0.91 & &0.91 &0.91 &0.91 \\
&avg &1.07 &1.05 & &1.06 &1.06 &1.06 \\
&max &1.14 &1.14 & &1.12 &12.48 &1.12 \\
&first &1.36 &1.42 & &1.34 &1.46 &1.45 \\\hline
\textit{Digidow sensor embed} [s] &min &3.48 &3.51 & &4.07 &4.09 &4.10 \\
&avg &3.53 &3.53 & &4.12 &4.14 &4.14 \\
&max &4.11 &4.11 & &4.74 &4.46 &4.53 \\
&first &5.41 &19.93 & &5.99 &40.21 &40.23 \\\hline
\textit{Digidow sensor collect} [s] &min &0.07 &0.14 & &0.09 &0.19 &0.19 \\
\textit{DAA keygen \& join} [s] &min &0.03 &0.03 &0.03 &0.03 &0.03 &0.03 \\
&avg &0.05 &0.05 &0.05 &0.03 &0.03 &0.03 \\
&max &0.07 &0.06 &0.06 &0.06 &0.06 &0.28 \\
&first &0.04 &0.07 &0.07 &0.04 &0.13 &0.04 \\\hline
\textit{DAA TPM keygen \& join} [s] &min &0.33 &0.33 &0.33 &0.35 &0.35 &0.35 \\
&avg &0.34 &0.34 &0.34 &0.37 &0.37 &0.37 \\
&max &0.34 &0.36 &0.35 &0.37 &0.41 &0.40 \\
&first &0.37 &0.41 &0.41 &0.40 &0.42 &0.35 \\\hline\hline
\textit{Digidow sensor capture} [s] &min &0.92 &0.91 &0.92 &0.91 &0.91 &0.91 \\
&avg &1.07 &1.05 &1.05 &1.06 &1.06 &1.06 \\
&max &1.14 &1.14 &1.14 &1.12 &12.48 &1.12 \\
&first &1.36 &1.42 &1.47 &1.34 &1.46 &1.45 \\\hline
\textit{Digidow sensor embed} [s] &min &3.48 &3.51 &3.51 &4.07 &4.09 &4.10 \\
&avg &3.53 &3.53 &3.55 &4.12 &4.14 &4.14 \\
&max &4.11 &4.11 &4.09 &4.74 &4.46 &4.53 \\
&first &5.41 &19.93 &19.88 &5.99 &40.21 &40.23 \\\hline
\textit{Digidow sensor collect} [s] &min &0.07 &0.14 &0.14 &0.09 &0.19 &0.19 \\
&avg &0.08 &n/a &n/a &0.10 &n/a &n/a \\
&max &0.09 &n/a &n/a &0.11 &n/a &n/a \\
&first &0.09 &0.18 & &0.11 &0.24 &0.25 \\\hline
\textit{Digidow sensor send} [s] &min &0.25 &0.25 & &0.26 &0.27 &0.27 \\
&avg &0.25 &0.26 & &0.28 &0.27 &0.28 \\
&max &0.26 &0.27 & &0.28 &0.29 &0.29 \\
&first &0.26 &0.32 & &0.28 &0.40 &0.40 \\\hline\hline
\textit{Digidow sensor transaction} [s] &min &4.75 &4.84 & &5.38 &5.50 &5.49 \\
&first &0.09 &0.18 &0.22 &0.11 &0.24 &0.25 \\\hline
\textit{Digidow sensor send} [s] &min &0.25 &0.25 &0.25 &0.26 &0.27 &0.27 \\
&avg &0.25 &0.26 &0.26 &0.28 &0.27 &0.28 \\
&max &0.26 &0.27 &0.27 &0.28 &0.29 &0.29 \\
&first &0.26 &0.32 &0.32 &0.28 &0.40 &0.40 \\\hline\hline
\textit{Digidow sensor transaction} [s] &min &4.75 &4.84 &4.85 &5.38 &5.50 &5.49 \\
&avg &4.92 &n/a &n/a &5.56 &n/a &n/a \\
&max &5.52 &n/a &n/a &6.14 &n/a &n/a \\
&first &7.12 &21.92 & &7.72 &42.31 &42.33 \\
&first &7.12 &21.92 &21.89 &7.72 &42.31 &42.33 \\
\bottomrule
\end{tabular}
\end{table}
The \emph{first} run is stated separately since it is done immediately after a system reboot where the resources cached by the kernel are not loaded yet.
Depending on the number of resources a single step needs, the overhead might be smaller or larger.
When IMA is enabled, the kernel has to check the hash of each file accessed for reading.
This hash must be extended into PCR 10 which makes the first run of each part significantly longer.
The \emph{first} run is stated separately because it is performed immediately after a system reboot, when the resources cached by the kernel are not loaded yet.
The major delay in the first run is caused by the face embedding program, especially when IMA is enabled.
As stated in \autoref{ssub:integrity_log}, each resource has to be hashed and the hash extended into PCR 10 before access is granted, making the first access significantly longer.
The TensorFlow application in particular requires significantly more time for the first run.
With IMA set to enforcing, the kernel additionally controls access to each file that is requested for reading.
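For reference, the measurement list maintained by the kernel can be inspected at runtime; the entries shown below are only schematic (hashes omitted):
\begin{lstlisting}[numbers=none]
# each line: PCR (10), template hash, template name, file hash, file path
root@amd1:~# head -n 2 /sys/kernel/security/ima/ascii_runtime_measurements
10 <template-hash> ima-ng sha256:<file-hash> boot_aggregate
10 <template-hash> ima-ng sha256:<file-hash> /usr/bin/...
\end{lstlisting}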
@@ -218,7 +256,6 @@ Since the software setup on both systems is comparable (Kernel version, Linux di
When IMA is in fixing or enforcing mode, the corresponding log will be filled with information about every accessed file.
The numbers in \autoref{tab:imalogentries} are taken from the IMA log after 10000 Digidow transaction tests.
IMA was set to enforcing and the DAA member key was already in the TPM.
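As a reminder of how these modes map to the kernel configuration (an illustrative sketch, not the exact parameters of the test systems), the appraisal mode is typically selected via kernel boot parameters:
\begin{lstlisting}[numbers=none]
# in /etc/default/grub, followed by update-grub and a reboot
GRUB_CMDLINE_LINUX="ima_appraise=enforce ima_policy=appraise_tcb"
\end{lstlisting}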
\begin{table}
\renewcommand{\arraystretch}{1.2}
\centering
@@ -228,79 +265,52 @@ IMA was set to enforcing and the DAA member key was already in the TPM.
\toprule
\textit{no. of entries} &\textit{System 1} &\textit{System 3} \\
\midrule
\textit{Root login after boot} &1912 \\
\textit{Digidow sensor capture} &5 \\
\textit{Digidow sensor embed} &2561 \\
\textit{Digidow sensor collect} &6 \\
\textit{Digidow sensor send} &12 \\
\textit{Every other Digidow transaction} &5 \\
\textit{Root login after boot} &1912 &2159\\
\textit{Digidow sensor capture} &5 &5\\
\textit{Digidow sensor embed} &2561 &2162\\
\textit{Digidow sensor collect} &6 &6\\
\textit{Digidow sensor send} &12 &12\\
\textit{Every other Digidow transaction} &6 &6\\
\bottomrule
\end{tabular}
\end{table}
IMA was set to enforcing and the DAA member key was already in the TPM.
The root login happens when the \texttt{inputrc} file is loaded.
In the IMA log, this file appears at roughly line 2000.
Unfortunately, neither the number nor the sequence of entries in the IMA log is predictable in this setup.
The number depends on services and daemons started during the boot sequence.
Not every service is consistently loaded every time and---depending on its individual state---the number of files loaded by the service may differ over several boot attempts.
Predicting the sequence of entries in the log is currently not possible since the kernel is taking advantage of the multicore CPU by starting services and daemons in parallel when possible.
In the example of \autoref{tab:imalogentries}, System 3 loaded parts of the Python environment before root could log in.
These resources were partly used by the TensorFlow application.
Consequently, these two values are very volatile and hence hard to predict.
However, the contributions of capture, collect and send, as well as of every further Digidow transaction, are consistent.
The six entries per Digidow transaction are the changing working files for each run and one file in \texttt{/tmp}.
Thus, after 10000 Digidow transaction runs, the IMA log ended up with about 65000 entries, and simply outputting the file already takes the kernel several seconds.
In the current setup, the only reliable information from the IMA log is that, if one of the programs was executed on the system, the corresponding entries must be somewhere in the log.
Furthermore, one has to uniquely identify a program or script by the single file defining it.
This means---given that the (very slow) hardware TPM had to extend PCR 10 for every line in the log---the slowdown is mainly caused by the interaction with the TPM itself.
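In practice, this check boils down to searching the measurement list for the file that defines the program; the example below reuses the script name from the Valgrind example above:
\begin{lstlisting}[numbers=none]
# total number of log entries
root@amd1:~# wc -l /sys/kernel/security/ima/ascii_runtime_measurements
# entries referring to a specific program
root@amd1:~# grep -c img2emb.py /sys/kernel/security/ima/ascii_runtime_measurements
\end{lstlisting}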
Since the IMA log file is also essential for remote attestation, the information of this file must be transmitted to the DAA verifier.
\begin{itemize}
\item payload without IMA log about 15KB
\item No encryption for payload, but doable -- depends on how Sensor and PIA can communicate with each other
\item IMA log much too large
\item Test results how long the process of capturing takes -- with and without IMA
\end{itemize}
\section{Limitations}
\label{sec:limitations}
\begin{itemize}
\item older TPMs do not support ECDAA
\item Documentation is available for the TPM APIs, but there is no changelog for \texttt{tpm2-tools}.
\item Trusted boot and IMA can only handle static resources like files, kernel modules and firmware of hardware components.
Code transmitted over the network or otherwise dynamically generated cannot be recognized.
This is an open door for non-persistent attacks.
\item Documentation on IMA is mostly outdated and so are some tools.
Further customization of rules may be useful to reduce the log size.
However, major Linux distributions support IMA by default in recent releases.
\item The complexity of verifying the system state is too high and is tied to the overall system complexity.
Reducing the number of dependencies and the relevant file count is key to this problem.
\item The implemented DAA does not support a full dynamic group scheme.
This might be useful in the future, maybe with a custom implementation of a recent DAA version.
\end{itemize}
\begin{table}
\renewcommand{\arraystretch}{1.2}
\centering
\caption{Estimation of performance of creating an IMA log entry}
\label{tab:imalogspeed}
\begin{tabular}{lrrr}
\toprule
&{slowdown [s]} &{log entries} &{per entry} \\
\midrule
\textit{System 1} &14.87 &2,162 &6.88\,ms\\
\textit{System 3} &34.24 &2,561 &13.43\,ms\\
\bottomrule
\end{tabular}
\end{table}
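The per-entry values in \autoref{tab:imalogspeed} appear to be the first-run slowdown divided by the number of log entries listed there, e.g. for System 1:
\[ 14.87\,\mathrm{s} / 2162 \approx 6.88\,\mathrm{ms} \]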
\section{Further Test Experiences}
When IMA is set to enforcing, some unexpected problems appeared while updating Ubuntu.
During \texttt{apt upgrade}, the package manager downloads the deb packages into its cache folder at \texttt{/var/cache/apt/}.
While executing \texttt{apt upgrade}, the package manager downloads the deb packages into its cache folder at \texttt{/var/cache/apt/}.
These files, however, do not have the \texttt{security.ima} attribute when the download is finished.
Due to these missing attributes, the kernel prevents any access and breaks the upgrade process.
It is not clear why the files are not hashed, although apt is run by root and every file created by root should be hashed under the active IMA policy.
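One way to investigate this (a sketch assuming the \texttt{attr} and \texttt{ima-evm-utils} packages are installed) is to dump the extended attributes of the cached packages and, if necessary, write the missing hash manually:
\begin{lstlisting}[numbers=none]
root@amd1:~# getfattr -m security.ima -d /var/cache/apt/archives/*.deb
root@amd1:~# evmctl ima_hash /var/cache/apt/archives/<package>.deb
\end{lstlisting}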

thesis/06_conclusion.tex (22 lines)

@@ -1,6 +1,24 @@
\chapter{Conclusion and Outlook}
\chapter{State of Work and Outlook}
\label{cha:conclusion}
\section{Limitations}
\label{sec:limitations}
\begin{itemize}
\item older TPMs do not support ECDAA
\item Documentation is available for the TPM APIs, but there is no changelog for \texttt{tpm2-tools}.
\item Trusted boot and IMA can only handle static resources like files, kernel modules and firmware of hardware components.
Code transmitted over the network or otherwise dynamically generated cannot be recognized.
This is an open door for non-persistent attacks.
\item Documentation on IMA is mostly outdated and so are some tools.
Further customization of rules may be useful to reduce the log size.
However, major Linux distributions support IMA by default in recent releases.
\item The complexity of verifying the system state is too high and is tied to the overall system complexity.
Reducing the number of dependencies and the relevant file count is key to this problem.
\item The implemented DAA does not support a full dynamic group scheme.
This might be useful in the future, maybe with a custom implementation of a recent DAA version.
\end{itemize}
\section{Future Work}
\begin{itemize}
\item Remove build tools from the target device -- just deliver binaries
@@ -17,6 +35,6 @@ Activate a credential with to certify that the Membership key is in the Endorsem
Further integration into the Digidow environment, if DAA turns out to be useful for it.
\section{Outlook}
\section{Conclusion}
Hardening of the system beyond IMA would be useful.
Minimization would also be useful, because the log gets shorter.

thesis/MAIN.pdf (BIN)

Binary file not shown.